CN113127060A - Software function point identification method based on natural language pre-training model (BERT) - Google Patents

Info

Publication number
CN113127060A
Authority
CN
China
Prior art keywords
module
word segmentation
named entity
requirement description
function point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110386325.8A
Other languages
Chinese (zh)
Inventor
仲兆祥
袁华新
张笑闻
郭琼琼
朱玉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Communication Service Application And Solution Technology Co ltd
Original Assignee
China Communication Service Application And Solution Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Communication Service Application And Solution Technology Co ltd filed Critical China Communication Service Application And Solution Technology Co ltd
Priority to CN202110386325.8A priority Critical patent/CN113127060A/en
Publication of CN113127060A publication Critical patent/CN113127060A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/70 Software maintenance or management
    • G06F8/77 Software metrics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/268 Morphological analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a software function point identification method based on a natural language pre-training model (BERT), which comprises the following steps: acquiring at least one requirement description statement; inputting the at least one requirement description statement into the trained named entity recognition model to obtain at least one named entity; performing word segmentation processing on the at least one requirement description sentence to obtain a word segmentation set, wherein the word segmentation set comprises at least one word segmentation; merging the at least one named entity and the participles in the participle set, and performing part-of-speech tagging on a merged result; and processing the part-of-speech tagging result to identify the functional point. The method does not need manual evaluation, and has high functional point identification speed and high accuracy.

Description

Software function point identification method based on natural language pre-training model (BERT)
Technical Field
The invention belongs to the technical field of function point identification, and particularly relates to a function point identification method, a function point identification device, a function point identification application and a computer device.
Background
The function point method estimates the size of a software project: it measures the scale of the software by quantifying the functions of the system from the user's point of view, and the measurement is based mainly on the logical design of the system. Function point size measurement is widely applied internationally and has replaced lines of code as the most mainstream software size measurement method. The function point method is a decomposition-type size measurement method, that is, a complex system is decomposed into smaller subsystems for evaluation. Its core idea is to decompose the software system by component in order to determine the number of function points of the system.
Estimation with the function point method requires manually reading the software requirement document. The evaluator uses professional knowledge to identify the function points in the requirement document, and the software research and development cost is then calculated according to industry productivity benchmark data and the like.
Identifying software function points with the existing function point method places high demands on evaluators, who must master professional knowledge of the function point method as well as of the software's application field and of software development, so the overall evaluation efficiency is low.
Disclosure of Invention
To solve the existing problems, the invention provides a function point identification method, device, application and computer device. A named entity recognition model identifies the named entities in a requirement description sentence, and the named entities are merged with the participles to identify the function points. No manual evaluation is needed, and function point identification is fast and accurate.
The invention is realized by the following technical scheme:
in a first aspect, the present disclosure provides a method for identifying a function point, including:
acquiring at least one requirement description statement;
inputting the at least one requirement description statement into the trained named entity recognition model to obtain at least one named entity;
performing word segmentation processing on the at least one requirement description sentence to obtain a word segmentation set, wherein the word segmentation set comprises at least one word segmentation;
merging the at least one named entity and the participles in the participle set, and performing part-of-speech tagging on a merged result;
and processing the part-of-speech tagging result to identify the functional point.
According to the method, the named entities in the requirement description sentences are identified by the named entity recognition model, the named entities are merged with the participles of the requirement description sentences, and the part-of-speech tagging result of the merged result is analyzed to identify the function points.
In one possible design, the processing the part-of-speech tagging result to identify the functional point includes:
performing dependency syntax analysis on the part of speech tagging result to obtain the dependency relationship between words;
the functional points are identified according to the dependency relationships.
In one possible design, the named entity recognition model includes a bidirectional pre-training language model Bert and a conditional random field CRF, which are connected in sequence. Software covers a wide range of fields, and in some fields the available corpus is limited, so the recognition effect is unstable. The scheme therefore adopts a named entity recognition model built from the bidirectional pre-training language model Bert and a conditional random field CRF: a CRF layer is added on top of the Bert backbone. The conditional probability of named entity labels in the professional field is learned through Bert, and the transition probability between named entity labels is learned through the CRF. The conditional random field is an undirected graph model and a typical discriminant model that combines characteristics of the maximum entropy model and the hidden Markov model; because it uses global normalization, it avoids the label bias problem that affects locally normalized models such as the maximum entropy Markov model. Combining the two effectively improves recognition stability and cross-domain recognition capability. Compared with existing models based on technologies such as maximum entropy, conditional random fields and recurrent neural networks, the model has a novel structure and strong recognition stability and cross-domain recognition capability.
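The division of labor described above, where the encoder supplies per-token label scores and the CRF layer supplies label-to-label transition scores, can be illustrated with a small Viterbi decoder. This is a minimal sketch rather than the patented implementation: the label set, the scores and the function name viterbi_decode are hypothetical, and in a real model the emission scores would come from the Bert encoder.

```python
from typing import Dict, List, Tuple


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    labels: List[str],
) -> List[str]:
    """Return the label sequence with the highest total score.

    emissions[t][y] is the score of label y at position t (the encoder's role);
    transitions[(a, b)] is the score of moving from label a to label b
    (the CRF layer's role). Missing transitions score 0.0.
    """
    if not emissions:
        return []
    # best[y] = best score of any path that ends in label y at the current position
    best = {y: emissions[0][y] for y in labels}
    backpointers: List[Dict[str, str]] = []
    for scores in emissions[1:]:
        new_best: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in labels:
            prev = max(labels, key=lambda p: best[p] + transitions.get((p, y), 0.0))
            new_best[y] = best[prev] + transitions.get((prev, y), 0.0) + scores[y]
            pointers[y] = prev
        best = new_best
        backpointers.append(pointers)
    # Trace the best path back from the best final label
    path = [max(labels, key=lambda y: best[y])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


LABELS = ["O", "B-OP", "I-OP"]
# Hypothetical scores: a strong penalty makes "O" followed by "I-OP" unattractive
TRANSITIONS = {("O", "I-OP"): -10.0}
EMISSIONS = [
    {"O": 1.0, "B-OP": 0.9, "I-OP": 0.0},
    {"O": 0.0, "B-OP": 0.0, "I-OP": 1.0},
]
print(viterbi_decode(EMISSIONS, TRANSITIONS, LABELS))
```

Choosing the best label at each position independently would give "O" followed by "I-OP" here; scoring the whole sequence with the transition penalty yields "B-OP" followed by "I-OP" instead, which is the kind of label consistency the CRF layer contributes.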
In one possible design, the method for training the bi-directional pre-training language model Bert includes:
pre-training a bidirectional pre-training language model Bert by utilizing a training set, wherein the training set comprises multi-field corpus data, and each corpus data in the multi-field corpus data comprises at least one requirement description statement;
and adjusting the pre-trained bidirectional pre-training language model Bert by utilizing the corpus data of a professional field, wherein the corpus data of the professional field comprises at least one requirement description sentence of the professional field.
In this scheme, the bidirectional pre-training language model Bert is pre-trained with multi-field requirement description sentences and then fine-tuned with requirement description sentences of a professional field, which improves the accuracy with which Bert estimates the conditional probability of named entity labels in that professional field.
In one possible design, the functional points include at least one of an internal logic file ILF, an external interface file EIF, an external input EI, an external output EO, and an external query EQ.
In a second aspect, the present invention provides a software development cost estimation method, including the following steps:
identifying the function points in the software requirement document by adopting any one of the function point identification methods in the first aspect;
and calculating software research and development cost based on the function points and the industry benchmark data.
In a third aspect, the invention provides a function point identification device, which comprises a sentence acquisition module, a named entity acquisition module, a merging module, a part-of-speech tagging module and a function point identification module, which are sequentially connected in a communication manner, wherein a word segmentation module is also connected between the sentence acquisition module and the merging module in a communication manner;
the statement acquisition module is used for acquiring at least one requirement description statement;
the named entity obtaining module is used for inputting the at least one requirement description sentence into the trained named entity recognition model to obtain at least one named entity;
the word segmentation module is used for performing word segmentation processing on the at least one requirement description sentence to obtain a word segmentation set, and the word segmentation set comprises at least one word segmentation;
the merging module is used for merging the at least one named entity and at least one participle in the participle set;
the part-of-speech tagging module is used for carrying out part-of-speech tagging on the merged result;
and the functional point identification module is used for processing the part-of-speech tagging result and identifying the functional point.
In a fourth aspect, the present invention provides a software development cost estimation device, which comprises a function point identification device and a cost accounting module, which are sequentially connected in a communication manner,
the function point identification device is the function point identification device provided in the third aspect;
and the cost accounting module is used for calculating the software research and development cost according to the function points and the industry benchmark data.
In a fifth aspect, the present invention provides a computer device, comprising a memory and a processor, wherein the memory is used for storing a computer program, and the processor is used for reading the computer program and executing the function point identification method according to the first aspect or the software development cost estimation method according to the second aspect.
In a sixth aspect, the present invention provides a computer-readable storage medium, having stored thereon instructions, which, when executed on a computer, perform the method for identifying a function point according to the first aspect or the method for estimating a software development cost according to the second aspect.
Compared with the prior art, the invention at least has the following advantages and beneficial effects:
according to the method, the named entities in the requirement description sentences are identified by the named entity recognition model, the named entities are merged with the participles of the requirement description sentences, and the part-of-speech tagging result of the merged result is analyzed to identify the function points.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of a functional point identification method of the present invention;
FIG. 2 is an identification diagram of a specific requirement description statement of the present invention;
FIG. 3 is a diagram of a named entity recognition model architecture;
FIG. 4 is a flow chart of a software development cost estimation method of the present invention;
FIG. 5 is a schematic diagram of the function point identification device.
Detailed Description
The invention is further described with reference to the following figures and specific embodiments. It should be noted that the description of the embodiments is provided to help understanding of the present invention, but the present invention is not limited thereto. Specific structural and functional details disclosed herein are merely illustrative of example embodiments of the invention. This invention may, however, be embodied in many alternate forms and should not be construed as limited to the embodiments set forth herein.
It will be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of example embodiments of the present invention.
It should be understood that the term "and/or", where it appears herein, merely describes an association between associated objects and means that three relationships may exist; for example, A and/or B may mean: A exists alone, B exists alone, or A and B exist at the same time. The term "/and", where it appears herein, describes another association and means that two relationships may exist; for example, A/and B may mean: A exists alone, or A and B exist at the same time. In addition, the character "/", where it appears herein, generally means that the associated objects before and after it are in an "or" relationship.
It will be understood that when an element is referred to herein as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may be present. Conversely, if a unit is referred to herein as being "directly connected" or "directly coupled" to another unit, no intervening units are present. In addition, other words used to describe the relationship between elements should be interpreted in a similar manner (e.g., "between …" versus "directly between …", "adjacent" versus "directly adjacent", etc.).
It is to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments of the invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises," "comprising," "includes" and/or "including," when used herein, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, numbers, steps, operations, elements, components, and/or groups thereof.
It should also be noted that, in some alternative designs, the functions/acts noted may occur out of the order noted in the figures. For example, two figures shown in succession may, in fact, be executed substantially concurrently, or the figures may sometimes be executed in the reverse order, depending upon the functionality/acts involved.
It should be understood that specific details are provided in the following description to facilitate a thorough understanding of example embodiments. However, it will be understood by those of ordinary skill in the art that the example embodiments may be practiced without these specific details. For example, systems may be shown in block diagrams in order not to obscure the examples in unnecessary detail. In other instances, well-known processes, structures and techniques may be shown without unnecessary detail in order to avoid obscuring example embodiments.
A first aspect of the present embodiment provides a method for identifying a function point, where the method may be performed by an identifying device, where the identifying device may be software, or a combination of software and hardware, and the identifying device may be integrated in a server, a terminal device, or the like. Specifically, as shown in fig. 1, the method for identifying a function point includes the following steps S101 to S105:
Step S101, acquiring at least one requirement description statement. The requirement description statement may be a clean requirement description statement that has already been labeled and refined, or one obtained by preprocessing a requirement document. If a requirement document is obtained, the requirement document is preprocessed and its functional requirement part is read to obtain the requirement description statements. The following description uses the requirement description statement in fig. 2 as an example: "The resource management system provides a function of performing calendar card management on each professional in-network device, and allows an operator to record, modify, delete, query and view corresponding information."
Step S102, inputting the at least one requirement description sentence into the trained named entity recognition model to obtain at least one named entity. Named entities fall into five categories: entity, operation, attribute, function and other. An entity is a business entity or a corresponding general term, such as a user or an account. An operation is an operation on an entity, including data operations such as adding, deleting, modifying and querying, and professional operation terms such as opening an account and closing an account. An attribute is the name of an attribute of a business entity, such as a user's password or phone number. A function is a composite business function, such as a daily report. Other covers modifiers and qualifiers, such as "consumer" in "consumer work orders".
The named entity recognition model used in step S102 may be implemented in various ways, and in this embodiment, the named entity recognition model is preferably formed by using a two-way pre-training language model Bert and a conditional random field CRF, and the structure of the named entity recognition model formed by the model is shown in fig. 3.
The named entity recognition model needs to be trained before use, and it can be trained in several ways. In the first way, the bidirectional pre-training language model Bert is pre-trained with a training set that comprises multi-field corpus data, each corpus data comprising at least one requirement description statement. The training set can use massive Wikipedia data; its corpus covers many fields and is large, so the bidirectional pre-training language model Bert acquires a large amount of prior knowledge, which helps improve the recognition performance and cross-domain recognition capability of the model. In the second way, the bidirectional pre-training language model Bert is first pre-trained with the same kind of multi-field training set, and the pre-trained model is then adjusted (fine-tuned) with at least one corpus data of a professional field, each of which comprises at least one requirement description sentence of that professional field. The professional field is the field of the requirement description statements in the actual application, that is, the field of the requirement description statement in step S101. For example, if function point identification is performed on a requirement document for the building field, requirement description sentences from the building field are used as the adjustment training set of the bidirectional pre-training language model Bert.
By taking the requirement description statement in fig. 2 as an example, the requirement description statement is input into the named entity recognition model obtained in the second training mode, and the obtained named entity includes entity, operation, attribute, function and others, and specifically includes a machine calendar card, management, recording, modification, deletion, query and viewing.
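A sequence-labeling model such as the one described typically emits one tag per token, which then has to be grouped into named entities. The following sketch assumes a BIO tagging scheme (B- opens an entity, I- continues it, O is outside); the patent does not state which tagging scheme is used, and the function name bio_to_entities, the tag names and the sample tokens are hypothetical.

```python
from typing import List, Tuple


def bio_to_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group tokens tagged B-X / I-X into (entity text, category) pairs.

    An I- tag that does not continue an open entity of the same category
    is treated like an O tag in this simplified sketch.
    """
    entities: List[Tuple[str, str]] = []
    current: List[str] = []
    category = ""
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                entities.append((" ".join(current), category))
            current, category = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == category:
            current.append(token)
        else:
            if current:
                entities.append((" ".join(current), category))
            current, category = [], ""
    if current:
        entities.append((" ".join(current), category))
    return entities


tokens = ["query", "work", "order", "status"]
tags = ["B-OPERATION", "B-ENTITY", "I-ENTITY", "B-ATTRIBUTE"]
print(bio_to_entities(tokens, tags))
```

The categories in the tags mirror the entity, operation and attribute categories of step S102; joining tokens with a space suits English, whereas Chinese tokens would be joined without a separator.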
Step S103, performing word segmentation processing on the at least one requirement description sentence to obtain a word segmentation set, wherein the word segmentation set comprises at least one word segmentation.
In the step, if the requirement description statement is Chinese, Chinese word segmentation processing is carried out on the requirement description statement; and if the requirement description sentence is English, carrying out English word segmentation on the requirement description sentence.
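The language-dependent choice of segmenter can be sketched as follows. The patent does not name a segmentation algorithm, so this example makes two assumptions: English is split with a regular expression, and Chinese is segmented by forward maximum matching against a small vocabulary. The function names and the vocabulary are hypothetical.

```python
import re
from typing import List, Set


def contains_chinese(text: str) -> bool:
    """True if the text has at least one CJK Unified Ideograph."""
    return any("\u4e00" <= ch <= "\u9fff" for ch in text)


def segment_english(text: str) -> List[str]:
    """Keep runs of letters, digits and apostrophes as words."""
    return re.findall(r"[A-Za-z0-9']+", text)


def segment_chinese(text: str, vocabulary: Set[str], max_len: int = 4) -> List[str]:
    """Forward maximum matching: take the longest vocabulary word at each position."""
    words: List[str] = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback
            if size == 1 or piece in vocabulary:
                words.append(piece)
                i += size
                break
    return words


def segment(text: str, vocabulary: Set[str]) -> List[str]:
    """Dispatch to the Chinese or the English segmenter based on the text."""
    if contains_chinese(text):
        return segment_chinese(text, vocabulary)
    return segment_english(text)


VOCABULARY = {"资源", "管理", "系统"}
print(segment("资源管理系统", VOCABULARY))
print(segment("Operators can query records.", VOCABULARY))
```

A production system would more likely use a trained segmenter; dictionary matching is shown only because it fits in a few lines.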
Step S104, merging the at least one named entity and the participles in the participle set, and performing part-of-speech tagging on the merged result. Part-of-speech tagging may be based on a hidden Markov model (HMM), a conditional random field (CRF), a maximum entropy Markov model (MEMM), a recurrent neural network (RNN) or a support vector machine (SVM).
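One way to read the merging step is that a recognized named entity should survive as a single token even when the segmenter split it into several words. The sketch below implements that reading with character offsets; it is an interpretation, not the patent's stated procedure, and the function name merge_entities and the sample sentence are hypothetical.

```python
from typing import List, Set, Tuple


def merge_entities(text: str, words: List[str], entities: List[Tuple[int, int]]) -> List[str]:
    """Replace words that overlap an entity span with the entity text.

    words must occur in text in order (text.index raises ValueError otherwise);
    entities are non-overlapping (start, end) character offsets, end exclusive.
    """
    # Locate each word in the text to get its character span
    spans: List[Tuple[int, int]] = []
    pos = 0
    for word in words:
        start = text.index(word, pos)
        spans.append((start, start + len(word)))
        pos = start + len(word)

    merged: List[str] = []
    emitted: Set[Tuple[int, int]] = set()  # entity spans already written to the output
    for start, end in spans:
        hit = next((e for e in entities if start < e[1] and e[0] < end), None)
        if hit is None:
            merged.append(text[start:end])
        elif hit not in emitted:
            merged.append(text[hit[0]:hit[1]])
            emitted.add(hit)
    return merged


text = "operator can modify work order status"
entity_start = text.index("work order")
entity_span = (entity_start, entity_start + len("work order"))
print(merge_entities(text, text.split(), [entity_span]))
```

The merged token list is what a part-of-speech tagger would then receive. A word that only partly overlaps an entity is absorbed into it, which is a simplification of this sketch.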
Step S105, processing the part-of-speech tagging result to identify the functional point. Specifically, dependency syntax analysis is performed on the part-of-speech tagging result to obtain the dependency relationships between words, and the functional points are identified according to the dependency relationships.
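To make the role of the dependency relationships concrete, the sketch below pairs each operation with the object it acts on, given dependency triples. It assumes relation labels in the style of Universal Dependencies ("obj" for a direct object, "conj" for a coordinated word attached to the first conjunct); the patent does not specify a label set, and the triples are written by hand rather than produced by a parser.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head word, relation, dependent word)


def operation_object_pairs(triples: List[Triple]) -> Set[Tuple[str, str]]:
    """Pair each operation with its object, sharing objects across coordinated operations."""
    objects: Dict[str, List[str]] = {}    # verb -> its direct objects
    conjuncts: Dict[str, List[str]] = {}  # verb -> verbs coordinated with it
    for head, relation, dependent in triples:
        if relation == "obj":
            objects.setdefault(head, []).append(dependent)
        elif relation == "conj":
            conjuncts.setdefault(head, []).append(dependent)

    pairs: Set[Tuple[str, str]] = set()
    for verb, objs in objects.items():
        # The verb itself and every verb coordinated with it share the objects
        for operation in [verb] + conjuncts.get(verb, []):
            for obj in objs:
                pairs.add((operation, obj))
    return pairs


# Hand-written triples for "record, modify and delete information"
TRIPLES: List[Triple] = [
    ("record", "obj", "information"),
    ("record", "conj", "modify"),
    ("record", "conj", "delete"),
]
print(sorted(operation_object_pairs(TRIPLES)))
```

Each resulting (operation, object) pair is a candidate for a transactional function point on the named object.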
The function points identified in step S105 include at least one of an internal logic file ILF, an external interface file EIF, an external input EI, an external output EO, and an external query EQ. Taking the requirement description statement in fig. 2 as an example, the identified function points include an internal logic file ILF, an external input EI, an external output EO, and an external query EQ; the detailed identification result is shown in fig. 2.
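The five function point types can be represented as an enumeration, and a simple rule table can map operation words to transaction types. The rule table below is purely illustrative and is not taken from the patent: it assumes that operations which change data suggest an external input, that queries suggest an external query, that exports suggest an external output, and that data maintained through an external input counts as an internal logic file.

```python
from enum import Enum
from typing import Dict, Iterable, List, Tuple


class FunctionPointType(Enum):
    ILF = "internal logic file"
    EIF = "external interface file"
    EI = "external input"
    EO = "external output"
    EQ = "external query"


# Hypothetical rule table: operation word -> suggested transaction type
OPERATION_RULES: Dict[str, FunctionPointType] = {
    "record": FunctionPointType.EI,
    "modify": FunctionPointType.EI,
    "delete": FunctionPointType.EI,
    "query": FunctionPointType.EQ,
    "export": FunctionPointType.EO,
}


def classify(pairs: Iterable[Tuple[str, str]]) -> List[Tuple[str, FunctionPointType]]:
    """Turn (operation, entity) pairs into named function points."""
    results: List[Tuple[str, FunctionPointType]] = []
    maintained: List[str] = []  # entities changed by at least one external input
    for operation, entity in pairs:
        fp_type = OPERATION_RULES.get(operation)
        if fp_type is None:
            continue  # unknown operation: no function point is proposed
        results.append((f"{operation} {entity}", fp_type))
        if fp_type is FunctionPointType.EI and entity not in maintained:
            maintained.append(entity)
    # Data that the system itself maintains is counted once as an ILF
    results.extend((entity, FunctionPointType.ILF) for entity in maintained)
    return results


PAIRS = [("record", "card"), ("modify", "card"), ("query", "card"), ("export", "report")]
for name, fp_type in classify(PAIRS):
    print(fp_type.name, name)
```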
According to the function point identification method detailed in the steps S101 to S105, the named entity in the requirement description sentence is identified through the named entity identification model, the named entity and the participle of the requirement description sentence are combined, and the function point is identified through analyzing the part-of-speech tagging result of the combined result.
A second aspect of the present embodiment provides a software development cost estimation method, which may also be executed by an estimation device, where the estimation device may be software, or a combination of software and hardware, and the estimation device may be integrally disposed in a server, a terminal device, or the like. Specifically, as shown in fig. 4, the software development cost estimation method includes the following steps S201 to S202.
Step S201, identifying a function point in the software requirement document by using any one of the function point identification methods in steps S101 to S105 in the first aspect.
Step S202, calculating the software research and development cost based on the function points and the industry benchmark data.
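The patent does not give the cost formula, so the following is only a sketch of one common approach: weight each function point type, sum the weights into an unadjusted function point count, and multiply by benchmark productivity and labor rate. The weights are the IFPUG values for average-complexity components, used here as an assumption; the productivity and rate figures in the example are made-up placeholders, not industry data.

```python
from typing import Dict

# IFPUG weights for average-complexity components (an assumption of this sketch)
AVERAGE_WEIGHTS: Dict[str, int] = {"ILF": 10, "EIF": 7, "EI": 4, "EO": 5, "EQ": 4}


def unadjusted_function_points(counts: Dict[str, int]) -> int:
    """Sum of (number of function points of a type) x (weight of that type)."""
    return sum(AVERAGE_WEIGHTS[kind] * number for kind, number in counts.items())


def development_cost(counts: Dict[str, int], hours_per_fp: float, cost_per_hour: float) -> float:
    """Size in function points x benchmark productivity x labor rate."""
    return unadjusted_function_points(counts) * hours_per_fp * cost_per_hour


# Placeholder counts and benchmark values, for illustration only
COUNTS = {"ILF": 1, "EI": 3, "EO": 1, "EQ": 2}
print(unadjusted_function_points(COUNTS))                              # 35
print(development_cost(COUNTS, hours_per_fp=8.0, cost_per_hour=50.0))  # 14000.0
```

With these placeholder inputs the size is 10 + 12 + 5 + 8 = 35 function points, and 35 x 8 hours x 50 per hour gives 14000.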
A third aspect of this embodiment provides a function point identification device, as shown in fig. 5, where the function point identification device includes a sentence acquisition module, a named entity acquisition module, a merging module, a part-of-speech tagging module, and a function point identification module, which are sequentially connected in a communication manner, and a word segmentation module is further connected between the sentence acquisition module and the merging module in a communication manner;
the statement acquisition module is used for acquiring at least one requirement description statement;
the named entity obtaining module is used for inputting the at least one requirement description sentence into the trained named entity recognition model to obtain at least one named entity;
the word segmentation module is used for performing word segmentation processing on the at least one requirement description sentence to obtain a word segmentation set, and the word segmentation set comprises at least one word segmentation;
the merging module is used for merging the at least one named entity and at least one participle in the participle set;
the part-of-speech tagging module is used for carrying out part-of-speech tagging on the merged result;
and the functional point identification module is used for processing the part-of-speech tagging result and identifying the functional point.
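The module wiring listed above can be sketched as a small pipeline object in which every module is a pluggable callable. Note that the named entity module and the word segmentation module both receive the sentence, and their outputs meet in the merging module. The class name, field names and the stand-in lambdas are hypothetical; real modules would wrap a trained model, a segmenter, a tagger and a dependency-based identifier.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Tagged = List[Tuple[str, str]]  # (word, part-of-speech tag) pairs


@dataclass
class FunctionPointIdentifier:
    recognize_entities: Callable[[str], List[str]]      # named entity acquisition module
    segment: Callable[[str], List[str]]                  # word segmentation module
    merge: Callable[[List[str], List[str]], List[str]]  # merging module
    tag: Callable[[List[str]], Tagged]                   # part-of-speech tagging module
    identify: Callable[[Tagged], List[str]]              # function point identification module

    def run(self, sentence: str) -> List[str]:
        """Pass one requirement description sentence through all modules."""
        entities = self.recognize_entities(sentence)
        words = self.segment(sentence)
        merged = self.merge(entities, words)
        return self.identify(self.tag(merged))


# Trivial stand-ins so that the wiring can be exercised end to end
pipeline = FunctionPointIdentifier(
    recognize_entities=lambda s: [w for w in s.split() if w in {"query", "card"}],
    segment=lambda s: s.split(),
    merge=lambda entities, words: words,
    tag=lambda words: [(w, "VERB" if w == "query" else "NOUN") for w in words],
    identify=lambda tagged: [w for w, pos in tagged if pos == "VERB"],
)
print(pipeline.run("operator can query card"))
```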
In a possible design, the recognition apparatus further includes a storage module, and the storage module is configured to store information such as the requirement description statement, the training set of the named entity recognition model, and the like.
In one possible design, the function point identification module comprises a dependency analysis module and an identification module which are sequentially connected in a communication manner, wherein the dependency analysis module performs dependency syntax analysis on the part-of-speech tagging result to obtain the dependency relationship between words; the identification module identifies the functional points according to the dependency relationships.
In one possible design, the named entity acquisition module includes a bidirectional pre-trained language model Bert and a conditional random field CRF, which are signal-connected in sequence.
A fourth aspect of the present embodiment provides a software development cost estimation device, which includes a function point identification device and a cost accounting module, which are sequentially connected in communication,
the function point identification device is the function point identification device of the third aspect or of any one of its possible designs;
and the cost accounting module is used for calculating the software research and development cost according to the function points and the industry benchmark data.
A fifth aspect of the present embodiment provides a computer device, including a memory and a processor, where the memory is used for storing a computer program, and the processor is used for reading the computer program and executing the method for identifying a function point according to the first aspect or the method for estimating a software development cost according to the second aspect. For example, the memory may include, but is not limited to, a Random-Access Memory (RAM), a Read-Only Memory (ROM), a Flash Memory, a First-in First-out (FIFO) memory, and/or a First-in Last-out (FILO) memory, and the like; the processor may be, but is not limited to, a microprocessor of the STM32F105 family. In addition, the computer device may also include, but is not limited to, a power module, a display screen, and other necessary components.
A sixth aspect of the present embodiment provides a computer-readable storage medium, which stores instructions that, when executed on a computer, perform the function point identification method according to the first aspect or the software development cost estimation method according to the second aspect.
The embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The above embodiments are only intended to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may be modified, or some of their technical features may be replaced by equivalents, and that such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Finally, it should be noted that the present invention is not limited to the above alternative embodiments, and that anyone may derive products in various other forms in light of the present invention. The above detailed description should not be taken as limiting the scope of protection of the invention, which is defined in the claims; the description may be used to interpret the claims.

Claims (10)

1. A function point identification method, characterized by comprising the following steps:
acquiring at least one requirement description sentence;
inputting the at least one requirement description sentence into a trained named entity recognition model to obtain at least one named entity;
performing word segmentation processing on the at least one requirement description sentence to obtain a word segmentation set, wherein the word segmentation set comprises at least one word segment;
merging the at least one named entity with the word segments in the word segmentation set, and performing part-of-speech tagging on the merged result;
and processing the part-of-speech tagging result to identify the function point.
2. The method according to claim 1, wherein the processing of the part-of-speech tagging result to identify the function point comprises:
performing dependency syntax analysis on the part-of-speech tagging result to obtain the dependency relationships between words;
and identifying the function point according to the dependency relationships.
3. The method according to claim 1, wherein the named entity recognition model comprises a bidirectional pre-trained language model (BERT) and a conditional random field (CRF) that are signal-connected in sequence.
4. The method according to claim 3, wherein the training method of the bidirectional pre-trained language model BERT comprises:
pre-training the bidirectional pre-trained language model BERT by utilizing a training set, wherein the training set comprises multi-field corpus data, and each item of the multi-field corpus data comprises at least one requirement description sentence;
and adjusting the pre-trained bidirectional language model BERT by utilizing at least one item of corpus data of a professional field, wherein each item of the corpus data of the professional field comprises at least one requirement description sentence of the professional field.
5. The method of claim 1, wherein the function point comprises at least one of an internal logical file (ILF), an external interface file (EIF), an external input (EI), an external output (EO), and an external inquiry (EQ).
6. A software development cost estimation method, characterized by comprising the following steps:
identifying the function points in a software requirement document by adopting the function point identification method of any one of claims 1-5;
and calculating the software research and development cost based on the function points and the industry benchmark data.
7. A function point identification device, characterized by comprising a sentence acquisition module, a named entity acquisition module, a merging module, a part-of-speech tagging module and a function point identification module that are communicatively connected in sequence, wherein a word segmentation module is further communicatively connected between the sentence acquisition module and the merging module;
the statement acquisition module is used for acquiring at least one requirement description statement;
the named entity obtaining module is used for inputting the at least one requirement description sentence into the trained named entity recognition model to obtain at least one named entity;
the word segmentation module is used for performing word segmentation processing on the at least one requirement description sentence to obtain a word segmentation set, and the word segmentation set comprises at least one word segmentation;
the merging module is used for merging the at least one named entity with the word segments in the word segmentation set;
the part-of-speech tagging module is used for carrying out part-of-speech tagging on the merged result;
and the function point identification module is used for processing the part-of-speech tagging result and identifying the function point.
8. A software development cost estimation device, characterized by comprising a function point identification device and a cost accounting module that are communicatively connected in sequence, wherein:
the function point identifying apparatus is the function point identifying apparatus of claim 7;
and the cost accounting module is used for calculating the software research and development cost according to the function points and the industry benchmark data.
9. A computer device comprising a memory and a processor that are communicatively connected, wherein the memory is configured to store a computer program, and the processor is configured to read the computer program and perform the function point identification method according to any one of claims 1 to 5 or the software development cost estimation method according to claim 6.
10. A computer-readable storage medium having stored thereon instructions which, when run on a computer, perform the function point identification method according to any one of claims 1-5 or the software development cost estimation method according to claim 6.
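The merging step recited in claims 1 and 7 can be sketched as follows, under the assumption that named entities are available as character spans over the sentence and that an entity takes precedence over any word boundary inside it. The function name, the span representation, and the example sentence are hypothetical; the claims do not fix how the merge is carried out.

```python
def merge_entities(sentence, segments, entity_spans):
    """Merge word segments so that each named entity becomes a single token.

    segments     : words that concatenate to the sentence
    entity_spans : (start, end) character offsets of named entities, end exclusive
    """
    if "".join(segments) != sentence:
        raise ValueError("segments must concatenate to the sentence")
    # Collect every word boundary as a cut point.
    cuts = {0}
    pos = 0
    for seg in segments:
        pos += len(seg)
        cuts.add(pos)
    for start, end in entity_spans:
        # Drop boundaries that fall inside an entity, then keep the entity edges.
        cuts = {c for c in cuts if not start < c < end}
        cuts.update((start, end))
    bounds = sorted(cuts)
    return [sentence[a:b] for a, b in zip(bounds, bounds[1:])]


# Illustrative sentence: "administrator adds user information"
sentence = "管理员新增用户信息"
segments = ["管理员", "新增", "用户", "信息"]  # output of a word segmenter
entity_spans = [(5, 9)]                        # "用户信息" found by the NER model
merged = merge_entities(sentence, segments, entity_spans)
```

The merged tokens would then be passed to part-of-speech tagging, so that an entity such as "用户信息" is tagged as one unit rather than as two separate nouns.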

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110386325.8A CN113127060A (en) 2021-04-09 2021-04-09 Software function point identification method based on natural language pre-training model (BERT)

Publications (1)

Publication Number Publication Date
CN113127060A true CN113127060A (en) 2021-07-16

Family

ID=76775937

Country Status (1)

Country Link
CN (1) CN113127060A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107330011A (en) * 2017-06-14 2017-11-07 北京神州泰岳软件股份有限公司 The recognition methods of the name entity of many strategy fusions and device
US20180203843A1 (en) * 2017-01-13 2018-07-19 Yahoo! Inc. Scalable Multilingual Named-Entity Recognition
CN109271527A (en) * 2018-09-27 2019-01-25 华东师范大学 A kind of appellative function point intelligent identification Method
CN109376353A (en) * 2018-09-04 2019-02-22 国家电网公司华东分部 A kind of power grid start-up operation ticket generating means and method based on natural language processing
CN109684645A (en) * 2018-12-29 2019-04-26 北京泰迪熊移动科技有限公司 Chinese word cutting method and device
CN111274817A (en) * 2020-01-16 2020-06-12 北京航空航天大学 Intelligent software cost measurement method based on natural language processing technology
CN111563383A (en) * 2020-04-09 2020-08-21 华南理工大学 Chinese named entity identification method based on BERT and semi CRF

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Lü Yunxiang et al.: "Python Deep Learning" (《Python深度学习》), 31 January 2020, Beijing: China Machine Press, pages: 100 - 101 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115113919A (en) * 2022-08-30 2022-09-27 四川赛闯检测股份有限公司 Software scale measurement intelligent informatization system based on BERT model and Web technology
CN115113919B (en) * 2022-08-30 2023-04-25 四川赛闯检测股份有限公司 Software scale measurement intelligent informatization system based on BERT model and Web technology
CN117635243A (en) * 2023-11-27 2024-03-01 中安启成科技有限公司 Intelligent software cost assessment method and system for enabling large language model
CN117635243B (en) * 2023-11-27 2024-06-25 中安启成科技有限公司 Intelligent software cost assessment method and system for enabling large language model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination