CN113282336B - Code abstract integration method based on quality assurance framework - Google Patents

Code abstract integration method based on quality assurance framework

Publication number
CN113282336B
Authority
CN
China
Prior art keywords
code
candidate
word
follows
abstract
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110656618.3A
Other languages
Chinese (zh)
Other versions
CN113282336A (en)
Inventor
鄢萌
胡予星
毕霁超
刘忠鑫
陈秋远
王备
雷晏
徐玲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University
Original Assignee
Chongqing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University
Priority to CN202110656618.3A
Publication of CN113282336A
Application granted
Publication of CN113282336B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00: Arrangements for software engineering
    • G06F8/70: Software maintenance or management
    • G06F8/72: Code refactoring
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a code summary integration method based on a quality assurance framework. The method comprises the following steps: generating I candidate code summaries using existing code summarization methods; based on a collaborative filtering component, computing two quality scores, Precision_i and Recall_i, for each candidate code summary, and, based on a retrieval component, computing a quality score REScore_i; using the quality scores Precision_i and Recall_i of each candidate code summary to compute its harmonic mean F1score_i; and comparing the harmonic means and the REScore_i values of the candidate code summaries to select the best candidate as the final output sum_best. The method can effectively combine the advantages of different models, thereby improving the effectiveness of code summaries.

Description

Code abstract integration method based on quality assurance framework
Technical Field
The invention relates to the field of software quality assurance, in particular to a code abstract integration method based on a quality assurance framework.
Background
A code summary is a natural-language description of a code fragment that helps a developer understand what the code does without reading the entire source. Because developers spend a great deal of time on source code comprehension, high-quality code summaries are essential for software development and maintenance. Writing code summaries by hand, however, is tedious and time-consuming, which increases the need for automatic code summarization methods.
To address this problem, a number of code summarization methods have been proposed. With the development of deep learning and the steady growth of available source code data, using deep learning models to learn automatically from large numbers of code-summary pairs and generate code summaries has become a very popular research topic. Existing neural code summarization methods perform well and can generate many high-quality code summaries. According to previous studies, however, some of the summaries they generate have a BLEU-4 score below 40 and are considered low quality. Such summaries may not only mislead developers but also force them to spend considerable extra time on screening.
In fact, almost all document generation methods based on neural networks share this problem. To address it, researchers have proposed several quality assurance methods for document generation tasks. However, previous work has not investigated whether quality assurance methods for document generation tasks can be applied to improving code summaries.
Disclosure of Invention
In view of the problems in the prior art, the technical problem to be solved by the invention is how to assure the quality of code summaries and improve their effectiveness.
To solve this technical problem, the invention adopts the following technical scheme: a code summary integration method based on a quality assurance framework, comprising the following steps:
S100: For a code segment under test code_i, select I existing code summarization methods to generate the corresponding I candidate code summaries.
S200: Based on the collaborative filtering component, compute two quality scores, Precision_i and Recall_i, for each candidate code summary.
Based on the retrieval component, compute a quality score REScore_i for each candidate code summary.
S300: Use the quality scores Precision_i and Recall_i of each candidate code summary to compute the harmonic mean F1score_i of that candidate.
S400: Select the best-quality candidate from the I candidate code summaries as the final output sum_best. The specific process is as follows:
Compare the F1score_i values of the I candidate code summaries, and take the candidate with the highest F1score_i value as the final code summary sum_best of the code segment under test code_i.
If the compared candidate code summaries have equal F1score_i values, compare their REScore_i values and take the candidate with the highest REScore_i value as the final code summary sum_best of the code segment under test code_i.
If the compared candidate code summaries have equal F1score_i values and equal REScore_i values, select any one of them as the final code summary sum_best of the code segment under test code_i.
Preferably, in S200, the collaborative filtering component computes the two quality scores Precision_i and Recall_i for each candidate code summary as follows:
S210: Acquire historical code data, which consists of code segments code_h, reference summaries sum_ref, and generated summaries sum_gen.
S211: For each word w_d contained in the historical code segments code_h, construct an N-dimensional word vector V_{w_d}, defined as follows:
$$V_{w_d} = \left(v_{w_d,1}, v_{w_d,2}, \ldots, v_{w_d,N}\right), \qquad v_{w_d,j} = \begin{cases} 1, & w_d \in \mathit{code}_h^{j} \\ 0, & \text{otherwise} \end{cases}$$
where v_{w_d,j} indicates whether the j-th historical code segment code_h^j contains word w_d, and N is the number of historical data items.
For each word w_s contained in the reference summaries sum_ref, construct an N-dimensional word vector V_{w_s}, defined as follows:
$$V_{w_s} = \left(v_{w_s,1}, v_{w_s,2}, \ldots, v_{w_s,N}\right), \qquad v_{w_s,j} = \begin{cases} 1, & w_s \in \mathit{sum}_{ref}^{j} \\ 0, & \text{otherwise} \end{cases}$$
where v_{w_s,j} indicates whether the reference summary sum_ref^j of the j-th historical code segment contains word w_s, and N is the number of historical data items.
S212: Compute the relevance Rel(w_d, w_s) between words w_d and w_s as the cosine similarity of their word vectors:
$$Rel(w_d, w_s) = \frac{V_{w_d} \cdot V_{w_s}}{\lVert V_{w_d} \rVert \, \lVert V_{w_s} \rVert}$$
S213: Build the mapping table of word w_d, defined as follows:
S214: Compute the two quality scores Precision_i and Recall_i for each candidate code summary, by the following expressions:
where |·| denotes the size of a set.
The collaborative filtering component addresses two failure cases in summary generation: under-translation and over-translation. Under-translation means the generated summary is missing words that appear in the reference. Over-translation means the generated summary contains words that are not in the reference, or redundant words. The two scores are tailored to these cases: Precision targets over-translation, and Recall targets under-translation.
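Steps S211 to S213 can be illustrated with a minimal Python sketch. It rests on several assumptions that are not confirmed by the text above: word vectors are binary occurrence indicators, relevance is cosine similarity, and the mapping table keeps the k most relevant summary words for each code word. The function names, the toy history, and whitespace tokenization are hypothetical.

```python
import math

def word_vectors(docs):
    """Map each word to an N-dimensional 0/1 vector: 1 where document j contains it."""
    n = len(docs)
    vectors = {}
    for j, tokens in enumerate(docs):
        for w in set(tokens):
            vectors.setdefault(w, [0] * n)[j] = 1
    return vectors

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def mapping_table(code_vecs, sum_vecs, k=10):
    """Assumed reading of S213: keep the k summary words most relevant to each code word."""
    table = {}
    for w_d, v_d in code_vecs.items():
        ranked = sorted(sum_vecs, key=lambda w_s: cosine(v_d, sum_vecs[w_s]), reverse=True)
        table[w_d] = ranked[:k]
    return table

# Toy history of (code tokens, reference summary tokens); purely illustrative.
history = [
    ("open file read".split(), "read file".split()),
    ("open socket send".split(), "send data".split()),
    ("file close".split(), "close file".split()),
]
code_vecs = word_vectors([code for code, _ in history])
sum_vecs = word_vectors([summary for _, summary in history])
table = mapping_table(code_vecs, sum_vecs, k=2)
print(table["file"])
```

In this toy history the code word "file" and the summary word "file" occur in exactly the same entries, so their relevance is 1 and "file" ranks first in the table entry for "file".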
Preferably, in S200, the retrieval component computes the quality score REScore_i for each candidate code summary as follows:
S220: Using term frequency-inverse document frequency (TF-IDF), represent each code segment in the historical data as a vector, by the following expression:
where #w denotes the total number of words, and #{code_h | w_d ∈ code_h} denotes the number of code segments in the historical data that contain word w_d.
Represent the code segment under test code_i as a vector d_i, by the following expression:
where #{code_i | w_d ∈ code_i} denotes the number of code segments in the data under test that contain word w_d.
S221: Compute the similarity between the code segment under test code_i and each historical code segment, obtaining J similarity values, by the following expression:
S222: Sort the J similarity values obtained in S221 in descending order, and select the historical code segments corresponding to the first n similarity values.
S223: Compute the relevance score between the code segment under test code_i and the first n historical code segments. The result is recorded as the quality score REScore_i of code_i, by the following expression:
The retrieval component is used because a newly generated summary is likely to resemble historical summaries. For this case a retrieval quality score is computed, namely the BLEU score between the current code and similar historical code, to obtain the final quality score. BLEU is an evaluation metric commonly used in the field of summary generation.
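The retrieval steps S220 to S222 can be sketched in Python as follows. This is only an illustration under assumptions: the exact TF-IDF and similarity expressions are not reproduced in the text, so the sketch uses the textbook form tf × log(N / df) and cosine similarity. The function names and toy data are hypothetical, and the final scoring step S223 is deliberately left out.

```python
import math
from collections import Counter

def tfidf_vectors(history, query):
    """Build TF-IDF vectors over the vocabulary of the historical code segments."""
    n = len(history)
    # Document frequency: number of historical code segments containing each word.
    df = Counter(w for doc in history for w in set(doc))
    vocab = sorted(df)

    def vec(tokens):
        tf = Counter(tokens)
        total = len(tokens) or 1
        return [tf[w] / total * math.log(n / df[w]) for w in vocab]

    return [vec(doc) for doc in history], vec(query)

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_n_similar(history, query, n=5):
    """Return (similarity, index) pairs for the n most similar historical segments."""
    hist_vecs, q_vec = tfidf_vectors(history, query)
    sims = [(cosine(q_vec, h), j) for j, h in enumerate(hist_vecs)]
    sims.sort(key=lambda pair: pair[0], reverse=True)
    return sims[:n]

# Toy data; tokens are split on whitespace for illustration only.
history = ["open file read".split(), "open socket send".split(), "close file".split()]
query = "read file".split()
ranked = top_n_similar(history, query, n=2)
print([j for _, j in ranked])
```

The n retained segments would then feed the relevance score of S223, which the text describes as BLEU-based.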
Compared with the prior art, the invention has at least the following advantages:
1. The integration method used by the invention can effectively combine the advantages of different models, thereby improving the effectiveness of code summaries.
2. The method outperforms the most advanced existing code summary integration method.
Drawings
Fig. 1 is an overall frame diagram of the present invention.
Detailed Description
The present invention will be described in further detail below.
The invention describes a code summary integration method based on a quality assurance framework. The core idea is, given one code segment and several code summarization methods, to automatically predict the quality of the summaries generated by the most advanced code summarization methods and to select the one with the best predicted quality as the final summary. The invention consists of two parts: computing quality scores for code summaries, and integrating code summarization methods. The first part comprises a collaborative-filtering-based component and a retrieval-based component, which compute the quality scores of the summaries; the second part consists of the candidate code summarization methods to be integrated.
Specifically, first, given a code segment code_i, several of the most advanced current code summarization methods are used to generate multiple candidate summaries for code_i. Second, the collaborative filtering component builds a mapping table between the words in the code and the words of the corresponding reference summaries, and the quality scores Precision_i and Recall_i of each candidate are computed from this mapping table. Third, the retrieval component computes the similarity score between the current code_i and the historical code as a third quality score. Finally, the quality scores of the candidate code summaries are compared and the best-quality candidate is selected as the final result.
Referring to Fig. 1, a code summary integration method based on a quality assurance framework comprises the following steps:
S100: For a code segment under test code_i, select I existing code summarization methods to generate the corresponding I candidate code summaries.
S200: Based on the collaborative filtering component, compute two quality scores, Precision_i and Recall_i, for each candidate code summary.
In a specific implementation, the collaborative filtering component computes the two quality scores Precision_i and Recall_i for each candidate code summary as follows:
S210: Acquire historical code data, which consists of code segments code_h, reference summaries sum_ref, and generated summaries sum_gen.
S211: For each word w_d contained in the historical code segments code_h, construct an N-dimensional word vector V_{w_d}, defined as follows:
$$V_{w_d} = \left(v_{w_d,1}, v_{w_d,2}, \ldots, v_{w_d,N}\right), \qquad v_{w_d,j} = \begin{cases} 1, & w_d \in \mathit{code}_h^{j} \\ 0, & \text{otherwise} \end{cases}$$
where v_{w_d,j} indicates whether the j-th historical code segment code_h^j contains word w_d (1 when it does, 0 when it does not), and N is the number of historical data items.
For each word w_s contained in the reference summaries sum_ref, construct an N-dimensional word vector V_{w_s}, defined as follows:
$$V_{w_s} = \left(v_{w_s,1}, v_{w_s,2}, \ldots, v_{w_s,N}\right), \qquad v_{w_s,j} = \begin{cases} 1, & w_s \in \mathit{sum}_{ref}^{j} \\ 0, & \text{otherwise} \end{cases}$$
where v_{w_s,j} indicates whether the reference summary sum_ref^j of the j-th historical code segment contains word w_s (1 when it does, 0 when it does not), and N is the number of historical data items.
S212: Compute the relevance Rel(w_d, w_s) between words w_d and w_s. The relevance is expressed as the cosine similarity between the word vectors of w_d and w_s:
$$Rel(w_d, w_s) = \frac{V_{w_d} \cdot V_{w_s}}{\lVert V_{w_d} \rVert \, \lVert V_{w_s} \rVert}$$
S213: Build the mapping table M of word w_d, defined as follows:
In actual computation, to reduce the size of M and speed up the calculation, k (the number of most relevant words w_s kept for each w_d) is set to 10 by default.
S214: Compute the two quality scores Precision_i and Recall_i for each candidate code summary, by the following expressions:
where |·| denotes the size of a set.
Based on the retrieval component, compute a quality score REScore_i for each candidate code summary.
In a specific implementation, the retrieval component in S200 computes the quality score REScore_i for each candidate code summary as follows:
S220: Using term frequency-inverse document frequency (TF-IDF), which is prior art, represent each code segment in the historical data as a vector, by the following expression:
where #w denotes the total number of words, and #{code_h | w_d ∈ code_h} denotes the number of code segments in the historical data that contain word w_d.
Represent the code segment under test code_i as a vector d_i, by the following expression:
where #{code_i | w_d ∈ code_i} denotes the number of code segments in the data under test that contain word w_d.
S221: Compute the similarity between the code segment under test code_i and each historical code segment, obtaining J similarity values, by the following expression:
S222: Sort the J similarity values obtained in S221 in descending order, and select the historical code segments corresponding to the first n similarity values.
S223: Compute the relevance score between the code segment under test code_i and the first n historical code segments. The result is recorded as the quality score REScore_i of code_i, by the following expression:
where the value of n is set to 5 by default.
S300: Use the quality scores Precision_i and Recall_i of each candidate code summary to compute the harmonic mean F1score_i of that candidate:
$$F1score_i = \frac{2 \cdot Precision_i \cdot Recall_i}{Precision_i + Recall_i}$$
S400: Select the best-quality candidate from the I candidate code summaries as the final output sum_best. The specific process is as follows:
Compare the F1score_i values of the I candidate code summaries, and take the candidate with the highest F1score_i value as the final code summary sum_best of the code segment under test code_i.
If the compared candidate code summaries have equal F1score_i values, compare their REScore_i values and take the candidate with the highest REScore_i value as the final code summary sum_best of the code segment under test code_i.
If the compared candidate code summaries have equal F1score_i values and equal REScore_i values, select any one of them as the final code summary sum_best of the code segment under test code_i.
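The selection in S300 and S400 amounts to a lexicographic comparison: first by F1 score, then by REScore, with any remaining tie broken arbitrarily. A minimal Python sketch follows; the `Candidate` class, its field names, and the sample scores are hypothetical, and the quality scores are assumed to have been computed already.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    summary: str
    precision: float
    recall: float
    re_score: float

    @property
    def f1(self) -> float:
        """Harmonic mean of precision and recall (0.0 when both are zero)."""
        denom = self.precision + self.recall
        return 2 * self.precision * self.recall / denom if denom else 0.0

def select_best(candidates):
    """Pick the highest F1; break F1 ties by REScore.

    max() returns the first maximal element, so a full tie resolves to the
    earliest candidate, which is one valid way to "select any one of them".
    """
    return max(candidates, key=lambda c: (c.f1, c.re_score))

candidates = [
    Candidate("returns the file name", precision=0.5, recall=0.5, re_score=0.2),
    Candidate("gets the name of the file", precision=0.5, recall=0.5, re_score=0.4),
    Candidate("reads data", precision=0.9, recall=0.1, re_score=0.9),
]
best = select_best(candidates)
print(best.summary)
```

Here the first two candidates tie on F1 (0.5 each), so the higher REScore decides; the third candidate has the highest REScore but loses on F1, which is compared first.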
Experimental data:
In the experiments, the invention selects three of the most advanced code summary generation methods, Deepcom, Rencos, and NMT, to verify its performance in improving code summaries. Deepcom uses a neural network model that combines the textual and structural information of the code to generate a summary; Rencos generates a summary for the code by combining a neural network with a retrieval method; NMT uses a neural machine translation model to convert the code into a summary.
The code summary integration method based on the quality assurance framework is called Ensum. The data used in the experiments come from GitHub and comprise two commonly used data sets, a same-project data set and a cross-project data set. Both were provided by the authors of the Deepcom method, are drawn from 9,714 GitHub projects, and consist of 588,108 code-summary pairs. The same-project data set does not separate projects; its training set consists of 445,812 code-summary pairs, and its validation set and test set consist of 20,000 code-summary pairs each. In the cross-project data set, the validation set and test set do not overlap with the training set; the training set consists of 455,000 code-summary pairs, and the validation set and test set each contain 15,606 code-summary pairs.
And (3) experimental verification:
The invention uses both manual evaluation and automatic evaluation to verify its effectiveness.
Manual evaluation: this step verifies that the three selected code summarization methods are complementary and therefore suitable for improving code summary quality through method integration. The invention uses the results of manual evaluation for the complementarity analysis. Four participants were invited to evaluate the experimental results manually. All participants have a software engineering background and 4 years of Java programming experience, and were asked to judge the quality of the generated summaries by checking the semantic relatedness between the reference summary and the candidate summaries generated by Deepcom, Rencos, and NMT. Specifically, 100 items were randomly selected from each data set, and each item was scored by 3 participants. Participants gave each generated summary a quality score from 1 to 5 measuring its semantic relatedness to the reference summary, where 1 means the two summaries have no semantic association and 5 means they have the same meaning. A summary is considered high quality when its score is 4 or 5; all other scores are considered low quality.
Automatic evaluation: the invention uses automatic evaluation metrics to measure the quality of the generated code summaries. The metrics used are BLEU, METEOR, and ROUGE-L.
The BLEU score is computed as
$$BLEU = BP \cdot \exp\left(\sum_{n=1}^{N} w_n \log p_n\right)$$
where p_n is the modified n-gram precision of the text, w_n is the weight of the n-gram order, and the brevity penalty is
$$BP = \begin{cases} 1, & c > r \\ e^{\,1 - r/c}, & c \le r \end{cases}$$
where c is the length of the generated summary and r is the length of the reference summary.
METEOR is computed as METEOR = (1 - pen) × F_mean, where pen is a penalty factor that penalizes differences between the word order of the candidate summary and that of the reference summary, and
$$F_{mean} = \frac{P \cdot R}{\alpha P + (1 - \alpha) R}, \qquad P = \frac{m}{c}, \qquad R = \frac{m}{r}$$
where α is a tunable parameter, m is the number of matched tuples in the candidate summary, and c and r are the same as in BLEU.
ROUGE-L is based on the length of the longest common subsequence of the generated summary and the reference summary; the longer the subsequence, the higher the score. It is computed as an F-measure:
$$F_{lcs} = \frac{(1 + \beta^2)\, R_{lcs}\, P_{lcs}}{R_{lcs} + \beta^2 P_{lcs}}, \qquad R_{lcs} = \frac{LCS(X, Y)}{m}, \qquad P_{lcs} = \frac{LCS(X, Y)}{n}$$
where X is the generated summary, Y is the reference summary, LCS(X, Y) is the length of their longest common subsequence, m is the length of the reference summary, n is the length of the candidate summary, and β weights recall relative to precision.
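As a worked illustration of two of these metrics, the following Python sketch computes ROUGE-L from the longest common subsequence and the BLEU brevity penalty. It follows the standard textbook definitions and is not the evaluation script used in the experiments; the parameter `beta` (default 1.0) and the example sentences are assumptions.

```python
import math

def lcs_length(x, y):
    """Length of the longest common subsequence of two token lists (dynamic programming)."""
    prev = [0] * (len(y) + 1)
    for xi in x:
        cur = [0]
        for j, yj in enumerate(y, start=1):
            cur.append(prev[j - 1] + 1 if xi == yj else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def rouge_l(generated, reference, beta=1.0):
    """ROUGE-L F-measure; beta weights recall relative to precision (assumed 1.0)."""
    lcs = lcs_length(generated, reference)
    if lcs == 0:
        return 0.0
    recall = lcs / len(reference)      # LCS(X, Y) / m
    precision = lcs / len(generated)   # LCS(X, Y) / n
    return (1 + beta ** 2) * recall * precision / (recall + beta ** 2 * precision)

def brevity_penalty(c, r):
    """BLEU brevity penalty for generated length c and reference length r."""
    if c == 0:
        return 0.0
    return 1.0 if c > r else math.exp(1 - r / c)

generated = "returns the file name".split()
reference = "return the name of the file".split()
print(lcs_length(generated, reference))
print(round(rouge_l(generated, reference), 3))
print(round(brevity_penalty(len(generated), len(reference)), 3))
```

For this pair the longest common subsequence has length 2 (for example "the name"), so recall is 2/6, precision is 2/4, and the F-measure with beta = 1 is 0.4. The generated summary is shorter than the reference, so the brevity penalty is exp(1 - 6/4), about 0.607.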
The three code summarization methods selected by the invention are strongly complementary to one another. The detailed complementarity analysis is shown in Table 1.
TABLE 1 Complementarity analysis of Deepcom, Rencos, and NMT
Good (only) means that only the summary generated by the current method is of high quality relative to its reference summary; Good (all) means that the summaries generated by all three methods are of high quality relative to the reference summary. For example, in the same-project data set, 14 unique high-quality summaries come from Deepcom, 17 from Rencos, and 8 from NMT. These results show that the three code summarization methods are complementary, so the Ensum method provided by the invention integrates them to exploit this complementarity.
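The Good (only) and Good (all) counts can be tallied as in the Python sketch below. It assumes each sampled item carries one aggregated human score per method and that a score of 4 or 5 counts as high quality; how the three raters' scores were aggregated is not stated, and the sample scores here are invented purely for illustration.

```python
from collections import Counter

def complementarity(samples, threshold=4):
    """Count items where exactly one method is high quality, and items where all are."""
    good_only = Counter()
    good_all = 0
    for scores in samples:
        good = [method for method, score in scores.items() if score >= threshold]
        if len(good) == 1:
            good_only[good[0]] += 1
        elif len(good) == len(scores):
            good_all += 1
    return good_only, good_all

# Hypothetical scores for four sampled items (not data from the experiments).
samples = [
    {"Deepcom": 5, "Rencos": 2, "NMT": 3},
    {"Deepcom": 4, "Rencos": 4, "NMT": 5},
    {"Deepcom": 1, "Rencos": 4, "NMT": 2},
    {"Deepcom": 4, "Rencos": 5, "NMT": 1},
]
good_only, good_all = complementarity(samples)
print(dict(good_only), good_all)
```

A large Good (only) count for each method is what indicates complementarity: each method succeeds on items where the others fail.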
The results of automatic evaluation of Ensum on code digest promotion are shown in Table 2.
TABLE 2 Automatic evaluation results of Ensum when integrated with the three most advanced code summary generation methods
The invention is compared with the three selected code summarization methods and with Codesum, a state-of-the-art integration method based on code comment classification. The experimental results show that the integrated result of the invention is superior to the results of all other methods. For example, on the same-project data set, the integrated result reaches 0.406, 0.289, and 0.557 on BLEU-4, METEOR, and ROUGE-L respectively, which meets the standard for high-quality summaries. On BLEU-4, METEOR, and ROUGE-L, Ensum improves by 25%, 16%, and 9% respectively over Deepcom, the individual method with the highest scores. Furthermore, on the same-project data set, Ensum improves over Codesum by 26%, 17%, and 9% on BLEU-4, METEOR, and ROUGE-L respectively; on the cross-project data set, Ensum improves over Codesum by 11%, 6%, and 5% on BLEU-4, METEOR, and ROUGE-L respectively. Thus, compared with Codesum, Ensum combines the advantages of the three code summarization methods more effectively and produces higher-quality code summaries.
In short, the experimental results show that the code summary integration method based on the quality assurance framework can effectively improve the quality of code summaries. The method can also be widely applied in real working scenarios and helps make existing code summarization methods more practical.
Finally, it is noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made thereto without departing from the spirit and scope of the technical solution of the present invention, which is intended to be covered by the scope of the claims of the present invention.

Claims (1)

1. A code summary integration method based on a quality assurance framework, characterized in that the method comprises the following steps:
S100: For a code segment under test code_i, select I existing code summarization methods to generate the corresponding I candidate code summaries;
S200: Based on the collaborative filtering component, compute two quality scores, Precision_i and Recall_i, for each candidate code summary;
Based on the retrieval component, compute a quality score REScore_i for each candidate code summary;
The specific steps for computing the two quality scores Precision_i and Recall_i for each candidate code summary are as follows:
S210: Acquire historical code data, which consists of code segments code_h, reference summaries sum_ref, and generated summaries sum_gen;
S211: For each word w_d contained in the historical code segments code_h, construct an N-dimensional word vector V_{w_d}, defined as follows:
$$V_{w_d} = \left(v_{w_d,1}, v_{w_d,2}, \ldots, v_{w_d,N}\right), \qquad v_{w_d,j} = \begin{cases} 1, & w_d \in \mathit{code}_h^{j} \\ 0, & \text{otherwise} \end{cases}$$
where v_{w_d,j} indicates whether the j-th historical code segment code_h^j contains word w_d, and N is the number of historical data items;
For each word w_s contained in the reference summaries sum_ref, construct an N-dimensional word vector V_{w_s}, defined as follows:
$$V_{w_s} = \left(v_{w_s,1}, v_{w_s,2}, \ldots, v_{w_s,N}\right), \qquad v_{w_s,j} = \begin{cases} 1, & w_s \in \mathit{sum}_{ref}^{j} \\ 0, & \text{otherwise} \end{cases}$$
where v_{w_s,j} indicates whether the reference summary sum_ref^j of the j-th historical code segment contains word w_s, and N is the number of historical data items;
S212: Compute the relevance Rel(w_d, w_s) between words w_d and w_s as the cosine similarity of their word vectors:
$$Rel(w_d, w_s) = \frac{V_{w_d} \cdot V_{w_s}}{\lVert V_{w_d} \rVert \, \lVert V_{w_s} \rVert}$$
S213: Build the mapping table of word w_d, defined as follows:
S214: Compute the two quality scores Precision_i and Recall_i for each candidate code summary, by the following expressions:
where |·| denotes the size of a set;
The specific steps for computing the quality score REScore_i for each candidate code summary are as follows:
S220: Using term frequency-inverse document frequency (TF-IDF), represent each code segment in the historical data as a vector, by the following expression:
where #w denotes the total number of words, and #{code_h | w_d ∈ code_h} denotes the number of code segments in the historical data that contain word w_d;
Represent the code segment under test code_i as a vector d_i, by the following expression:
where #{code_i | w_d ∈ code_i} denotes the number of code segments in the data under test that contain word w_d;
S221: Compute the similarity between the code segment under test code_i and each historical code segment, obtaining J similarity values, by the following expression:
S222: Sort the J similarity values obtained in S221 in descending order, and select the historical code segments corresponding to the first n similarity values;
S223: Compute the relevance score between the code segment under test code_i and the first n historical code segments; the result is recorded as the quality score REScore_i of code_i, by the following expression:
S300: Use the quality scores Precision_i and Recall_i of each candidate code summary to compute the harmonic mean F1_i of that candidate;
S400: Select the best-quality candidate from the I candidate code summaries as the final output sum_best, as follows:
Compare the F1_i values of the I candidate code summaries, and take the candidate with the highest F1_i value as the final code summary sum_best of the code segment under test code_i;
If the compared candidate code summaries have equal F1_i values, compare their REScore_i values and take the candidate with the highest REScore_i value as the final code summary sum_best of the code segment under test code_i;
If the compared candidate code summaries have equal F1_i values and equal REScore_i values, select any one of them as the final code summary sum_best of the code segment under test code_i.
CN202110656618.3A 2021-06-11 2021-06-11 Code abstract integration method based on quality assurance framework Active CN113282336B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110656618.3A CN113282336B (en) 2021-06-11 2021-06-11 Code abstract integration method based on quality assurance framework

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110656618.3A CN113282336B (en) 2021-06-11 2021-06-11 Code abstract integration method based on quality assurance framework

Publications (2)

Publication Number Publication Date
CN113282336A CN113282336A (en) 2021-08-20
CN113282336B true CN113282336B (en) 2023-11-10

Family

ID=77284599

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110656618.3A Active CN113282336B (en) 2021-06-11 2021-06-11 Code abstract integration method based on quality assurance framework

Country Status (1)

Country Link
CN (1) CN113282336B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108491459A (en) * 2018-03-05 2018-09-04 中国人民解放军国防科技大学 Optimization method for software code abstract automatic generation model
CN108519890A (en) * 2018-04-08 2018-09-11 武汉大学 A kind of robustness code abstraction generating method based on from attention mechanism
KR20180115921A (en) * 2017-04-14 2018-10-24 박태영 Conversion method of programming language
CN110427483A (en) * 2019-08-05 2019-11-08 腾讯科技(深圳)有限公司 Text snippet evaluating method, device, system and evaluation and test server
WO2020158409A1 (en) * 2019-01-28 2020-08-06 日本電信電話株式会社 Abstract generation device, method, program, and recording medium
CN112270358A (en) * 2020-10-29 2021-01-26 南京航空航天大学 Code annotation generation model robustness improving method based on deep learning
CN112527769A (en) * 2020-12-09 2021-03-19 重庆大学 Automated quality assurance framework for software change log generation method

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Fret: Functional Reinforced Transformer With BERT for Code Summarization;Ruyun Wang等;《IEEE Access》;第8卷;第135591-135604页 *
基于LDA的软件代码主题摘要自动生成方法;李文鹏等;《计算机科学》(第04期);第35-38页 *
基于seq2seq框架的代码注释生成方法研究;封雯婷;《中国优秀硕士学位论文全文数据库 (信息科技辑)》(第03期);I138-180 *
基于关键词的代码自动摘要;张世琨等;《计算机研究与发展》(第09期);第1987-2000页 *

Also Published As

Publication number Publication date
CN113282336A (en) 2021-08-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant