CN108763486A - Paper duplicate checking method, terminal and storage medium based on terminal - Google Patents

Publication number: CN108763486A
Application number: CN201810534771.7A
Authority: CN (China)
Other languages: Chinese (zh)
Inventors: 张勇, 李威
Original Assignee: Hunan Write State Science And Technology Co Ltd
Current Assignee: Hunan Write State Science And Technology Co Ltd (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Prior art keywords: paper, text, similar, content, fingerprint
Application filed by: Hunan Write State Science And Technology Co Ltd
Priority to: CN201810534771.7A
Publication of: CN108763486A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a terminal-based paper duplicate checking method, a terminal, and a storage medium. The method comprises the steps of: responding to a paper duplicate checking request from a user; performing fingerprint matching between the paper and text content in a preset database to determine the similarity between the paper and the text content; and displaying at least three pages simultaneously on the terminal display interface. The at least three pages include: a first paper document page for receiving the user's editing instructions; a second paper document page dedicated to displaying the sentences of the paper that are marked when the similarity exceeds a preset threshold; and a page for indexing and displaying the text content corresponding to those sentences. The user can thus revise the paper directly against the positions marked in the original text, without having to search the paper document for the sentences named in a duplicate checking report. Repeated switching between documents is also unnecessary, so operation is simple and revision time is reduced.

Description

Paper duplicate checking method, terminal and storage medium based on terminal
Technical field
The present invention relates to the field of computer technology, and in particular to a terminal-based paper duplicate checking method, a terminal, and a computer-readable storage medium.
Background
University students and researchers who obtain technical results in their field of study need to publish papers. Before publication, the repetition rate of a paper must be checked repeatedly, so that large portions of it do not duplicate published literature and thereby diminish the value of the paper.
Existing paper duplicate checking software mainly checks the paper uploaded by the user and generates a duplicate checking report when detection completes. The report records in detail which text in the submitted paper is too similar to published literature, along with the corresponding similar sources. The user must download the report from the software, revise the relevant content according to it, and upload the paper again, repeating until the repetition rate meets the requirement. This approach has the following problem:
When revising against the report, the user has to locate in the paper document the paragraphs or sentences that need changing, and must also switch back and forth between two separate documents. Revision is therefore cumbersome and time-consuming.
Summary of the invention
In view of this, the present invention provides a terminal-based paper duplicate checking method, a terminal, and a computer-readable storage medium to solve the above problem.
In a first aspect, the present invention provides a terminal-based paper duplicate checking method, comprising the steps of:
responding to a paper duplicate checking request from a user;
performing fingerprint matching between the paper and text content in a preset database to determine the similarity between the paper and the text content;
displaying at least three pages simultaneously on the terminal display interface, the at least three pages including: a first paper document page for receiving the user's editing instructions; a second paper document page dedicated to displaying the sentences of the paper that are marked when the similarity exceeds a preset threshold; and a page for indexing and displaying the text content corresponding to those sentences.
Optionally, a duplicate checking button is also provided on the terminal display interface;
after the step of displaying at least three pages simultaneously on the terminal display interface, the method further comprises:
detecting a click operation of the user through the duplicate checking button and, when the click is detected, triggering a redetermination of the similarity between the paper and the text content;
updating the second paper document page according to the redetermined similarity.
Optionally, the step of performing fingerprint matching between the paper and text content in a preset database to determine the similarity between the paper and the text content comprises:
obtaining the fingerprints corresponding to all text content in the preset database, and the total word count of the paper;
segmenting the paper into paper units and, taking the paper units as input data, computing the paper unit fingerprint of each paper unit with a similarity hash algorithm;
searching the fingerprints of all text content for all fingerprints similar to the paper unit fingerprints;
loading the corresponding similar text content according to all similar fingerprints found;
calculating the similarity between the paper and the text content according to the similar text content, the paper units, and the total word count of the paper.
Optionally, the fingerprint corresponding to the text content is an N-bit fingerprint;
after the step of obtaining the fingerprints corresponding to all text content in the preset database and the total word count of the paper, the method further comprises:
dividing the fingerprint corresponding to each text content into M blocks to form M fingerprint blocks, each fingerprint block having N/M bits;
using the N/M-bit fingerprint blocks as keys, building an inverted index for each of the M fingerprint blocks;
the step of loading the corresponding similar text content according to all similar fingerprints found comprises:
determining the fingerprint block to which each similar fingerprint belongs;
using the similar fingerprint as a key, looking up the similar text content corresponding to the similar fingerprint in the inverted index of the fingerprint block to which the key belongs.
Optionally, the paper unit fingerprint is an N-bit fingerprint;
the step of searching the fingerprints of all text content for all fingerprints similar to the paper unit fingerprints comprises:
dividing each paper unit fingerprint into M blocks to form M paper fingerprint blocks, each paper fingerprint block having N/M bits;
comparing each paper fingerprint block with each fingerprint block in turn to find all similar fingerprints.
Optionally, the step of calculating the similarity between the paper and the text content according to the similar text content, the paper units, and the total word count of the paper comprises:
according to the similar text content, finding among all paper units the similar paper unit corresponding to each similar text content;
segmenting each similar text content and its corresponding similar paper unit into words, to obtain a text word set for each similar text content and a paper word set for each similar paper unit, where a text word set consists of the words in one similar text content and a paper word set consists of the words in one similar paper unit;
obtaining the text length of each similar text content and the text length of its corresponding similar paper unit;
calculating the similar word count between the paper and the text content from the text length and text word set of each similar text content and the text length and paper word set of each similar paper unit;
taking the quotient of the similar word count divided by the total word count as the similarity between the paper and the text content.
Optionally, the step of calculating the similar word count between the paper and the text content from the text length and text word set of each similar text content and the text length and paper word set of each similar paper unit comprises:
calculating the similarity between each similar text content and its corresponding similar paper unit by similar = factor * editSimilar + (1 - factor) * jaccardSimilar, where similar is the similarity between each similar text content and its corresponding similar paper unit; factor is a preset weight factor between each similar text content and its corresponding similar paper unit, 0 ≤ factor ≤ 1; editSimilar is the edit similarity, editSimilar = 1 - editDistance(a, b) / Max, where a is the text length of the similar text content, b is the text length of the similar paper unit, and editDistance is the edit distance; jaccardSimilar is the Jaccard similarity, jaccardSimilar = |A ∩ B| / |A ∪ B|, where A is the text word set and B is the paper word set;
calculating the similar word count between the paper and the text content by $S = \sum_{i=1}^{n} similar_i \cdot b_i$ (the formula image is missing from this copy and is reconstructed here from the definitions that follow), where S is the similar word count, i denotes the i-th similar paper unit, n is the total number of similar paper units, similar is the similarity between each similar text content and its corresponding similar paper unit, and b is the text length of the similar paper unit.
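The weighted similarity above can be sketched in Python as follows. This is an illustrative sketch only, not the patent's implementation: it assumes that the edit distance is the Levenshtein distance between the two strings, that Max (which the text does not define) is the longer of the two text lengths, that the similar word count is the sum of each pair's similarity multiplied by the paper unit's length, and that word segmentation is done by the caller. All function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (assumed meaning of editDistance)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def pair_similarity(text, unit, text_words, unit_words, factor=0.5):
    """similar = factor * editSimilar + (1 - factor) * jaccardSimilar."""
    # Assumption: Max is the longer of the two text lengths.
    longest = max(len(text), len(unit)) or 1
    edit_similar = 1 - edit_distance(text, unit) / longest
    a, b = set(text_words), set(unit_words)
    jaccard_similar = len(a & b) / len(a | b) if (a | b) else 0.0
    return factor * edit_similar + (1 - factor) * jaccard_similar


def similar_word_count(pairs, factor=0.5):
    """Sum of similarity times paper-unit length over all similar pairs.

    pairs: iterable of (text, unit, text_words, unit_words) tuples.
    """
    return sum(pair_similarity(t, u, tw, uw, factor) * len(u)
               for t, u, tw, uw in pairs)
```

Dividing the value returned by `similar_word_count` by the paper's total word count would then give the paper-level similarity described in the preceding step.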
In a second aspect, the present invention provides a terminal for paper duplicate checking, the terminal comprising:
a response module for responding to the user's paper duplicate checking request;
a determining module for performing fingerprint matching between the paper and text content in a preset database to determine the similarity between the paper and the text content;
a display module for displaying at least three pages simultaneously on the display interface, the at least three pages including: a first paper document page for receiving the user's editing instructions; a second paper document page dedicated to displaying the sentences of the paper that are marked when the similarity exceeds a preset threshold; and a page for indexing and displaying the text content corresponding to those sentences.
In a third aspect, the present invention provides a terminal comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, where the computer program, when executed by the processor, implements the steps of the terminal-based paper duplicate checking method described above.
In a fourth aspect, the present invention provides a computer-readable storage medium storing a computer program, where the computer program, when executed by a processor, implements the steps of the terminal-based paper duplicate checking method described above.
The present invention displays at least three pages simultaneously on the terminal display interface: a first paper document page for receiving the user's editing instructions; a second paper document page dedicated to displaying the sentences of the paper that are marked when the similarity exceeds a preset threshold; and a page for indexing and displaying the text content corresponding to those sentences. On the terminal display interface the user can consult the sentences marked in the second paper document page, together with the page showing the text content corresponding to each marked sentence, and make the corresponding edits in the first paper document page. The user does not need to search the paper for the sentences or paragraphs named in a detection report, which is simple and intuitive. Because the page in which the user revises the paper, the paper document page with the sentences to be revised marked, and the page of text content indexed from those sentences are all shown at the same time, revising against the references is convenient and the time needed to revise the paper is reduced.
Brief description of the drawings
Fig. 1 is a flowchart of one embodiment of the terminal-based paper duplicate checking method of the present invention;
Fig. 2 is a detailed flowchart of step S20 in another embodiment of the terminal-based paper duplicate checking method of the present invention;
Fig. 3 is a detailed flowchart of step S250 in yet another embodiment of the terminal-based paper duplicate checking method of the present invention;
Fig. 4 is a functional block diagram of the terminal of the present invention.
The realization of the object, the functional features, and the advantages of the present invention will be further described with reference to the accompanying drawings and embodiments.
Detailed description
To facilitate a better understanding of the present invention, it is further explained below with reference to the drawings of the related embodiments. The drawings show embodiments of the invention, but the invention is not limited to these preferred embodiments. Rather, the embodiments are provided so that the disclosure of the invention will be more thorough.
The logic and/or steps represented in the flowcharts or otherwise described herein, which may for example be considered an ordered list of executable instructions for implementing logical functions, may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from the instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate, or transport the program for use by, or in connection with, the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). The computer-readable medium may even be paper or another suitable medium on which the program is printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that each part of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following techniques well known in the art: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
In this specification, descriptions referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" mean that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. Such schematic expressions do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Referring to Fig. 1, a flowchart of the terminal-based paper duplicate checking method provided by one embodiment of the present invention, the method includes steps S10 to S30.
Step S10: respond to the user's paper duplicate checking request.
When the user needs to check a paper for duplication, the user can initiate a paper duplicate checking request, and the terminal responds to it. For example, a duplicate checking button may be provided on the display interface of the terminal; after the paper document has been uploaded successfully, the user can trigger the button to initiate the request. After the processor of the terminal receives the paper duplicate checking instruction reported by the duplicate checking button in the terminal display module, it can respond to the user's request.
Step S20: perform fingerprint matching between the paper and text content in a preset database to determine the similarity between the paper and the text content.
It should be noted that the preset database may consist of a local database and a network database. The local database may collect published papers in .doc and/or .pdf format. When a collected paper is in .pdf format, its content can be extracted from the .pdf file with pdfbox, an open-source tool provided by Apache. The content of the network database can be obtained by crawling document resources from the internet with crawler technology; these document resources may be denoised.
During fingerprint matching, if two files are completely identical, their fingerprints are identical. Fingerprints are usually computed with a hash algorithm, but a traditional hash algorithm only ensures that the fingerprints computed from the original content are as uniformly random as possible. Two identical documents necessarily have identical fingerprints; for two different fingerprints, however, a traditional hash algorithm provides no information beyond the fact that the original contents differ, which makes it hard to compute paper similarity. In practice, even if the original contents of two files differ by a single byte, the corresponding fingerprints are likely to differ greatly. In this embodiment, therefore, a similarity hash algorithm can replace the traditional hash algorithm to compute the fingerprint of the paper and the fingerprints of the text content in the preset database. The main idea of a similarity hash algorithm is dimensionality reduction: high-dimensional feature vectors are mapped to low-dimensional feature vectors, and fingerprint matching is performed using the Hamming distance between two feature vectors to determine whether the paper is similar to the text content.
It should be noted that when the similarity of the paper and the text content is evaluated with a similarity hash algorithm, the Hamming distance criterion for judging whether texts are similar depends on the number of bits in the fingerprint. Taking a 64-bit similarity hash fingerprint as an example, experience suggests that a paper and text content whose Hamming distance is within 3 are highly similar.
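As an illustration of fingerprinting with a similarity hash and comparing by Hamming distance, the sketch below uses the widely known SimHash construction. The text does not name a specific similarity hash algorithm, so SimHash itself, the use of MD5 to hash individual tokens, equal token weights, and all function names are assumptions made for this example.

```python
import hashlib


def simhash(tokens, bits=64):
    """Compute a SimHash-style fingerprint from a list of tokens."""
    weights = [0] * bits
    for token in tokens:
        # Hash each token to `bits` bits (MD5 is an arbitrary choice here).
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:bits // 8], "big")
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    # Each output bit is the sign of the accumulated weight for that position.
    fingerprint = 0
    for i, w in enumerate(weights):
        if w > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_similar(fp_a, fp_b, max_distance=3):
    """Apply the 'Hamming distance within 3' rule for 64-bit fingerprints."""
    return hamming_distance(fp_a, fp_b) <= max_distance
```

Because every token contributes to every bit, texts sharing most of their tokens tend to produce fingerprints that differ in only a few bits, which is the property an ordinary hash lacks.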
Step S30: display at least three pages simultaneously on the terminal display interface. The at least three pages include: a first paper document page for receiving the user's editing instructions; a second paper document page dedicated to displaying the sentences of the paper that are marked when the similarity exceeds a preset threshold; and a page for indexing and displaying the text content corresponding to those sentences.
After the similarity between the paper and the text content has been determined, it can be compared with a preset threshold; one threshold or several may be set. The similarity is a value between 0 and 1 inclusive, as is the threshold, although the threshold is generally shown as a percentage, for example 30% or 50%.
When the terminal display interface is shown, at least three pages can be displayed. One of them is the first paper document page, which receives editing instructions issued by the user through a keyboard, microphone, mouse, and/or touch screen, and then modifies the content of the paper and/or adds new content according to those instructions. The second paper document page and the first paper document page show the same paper that the user submitted for duplicate checking. Through the terminal display interface the user can adjust the preset thresholds, so that the marked sentences shown in the second paper document page follow the thresholds actually set. For example, the second paper document page may show the sentences of the paper marked because their similarity exceeds 50%, highlighted by color, bold type, or similar means. Further, when several preset thresholds are set, the marked sentences can be distinguished by color: for example, sentences whose similarity is above 50% and up to 80% can be highlighted with a yellow background, and sentences whose similarity exceeds 80% with a red background. As for the page for indexing and displaying the text content corresponding to a sentence, the corresponding text content is indexed and displayed according to the marked sentence that the user selects in the second paper document page. Besides the similar text content itself, this page can also show at least one of: the author of the similar text content, the title of the journal in which the paper was published, the article title, the publication time, the similarity of the similar text, and revision suggestions for the user.
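The multi-threshold highlighting in the example above can be sketched as a small lookup. The function name, the threshold table layout, and the color strings are hypothetical; only the 50% and 80% thresholds and the yellow and red colors come from the example in the text.

```python
def highlight_color(similarity, thresholds=((0.8, "red"), (0.5, "yellow"))):
    """Return the background color for a sentence, or None if it is not marked.

    thresholds must be ordered from the highest threshold to the lowest,
    so that the first threshold exceeded determines the color.
    """
    for threshold, color in thresholds:
        if similarity > threshold:
            return color
    return None
```

With the default table, a sentence at exactly 80% similarity falls in the yellow band, matching the "above 50% and up to 80%" range in the example.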
It should also be noted that the pages on the terminal display interface are displayed by loading web page templates with FreeMarker, which generates the HTML web page data for the displayed pages.
In this embodiment, the terminal responds to the user's paper duplicate checking request, performs fingerprint matching between the paper and text content in a preset database to determine the similarity between the paper and the text content, and displays at least three pages simultaneously on the terminal display interface: a first paper document page for receiving the user's editing instructions; a second paper document page dedicated to displaying the sentences of the paper that are marked when the similarity exceeds a preset threshold; and a page for indexing and displaying the text content corresponding to those sentences. On the terminal display interface the user can consult the sentences marked in the second paper document page, together with the page showing the text content corresponding to each marked sentence, and make the corresponding edits in the first paper document page, without searching the paper for the sentences or paragraphs named in a detection report, which is simple and intuitive. Because the page in which the user revises the paper, the paper document page with the sentences to be revised marked, and the page of text content indexed from those sentences are all shown at the same time, revising against the references is convenient and the time needed to revise the paper is reduced.
In other embodiments, a duplicate checking button is also provided on the terminal display interface, and the following steps may be included after step S30:
detecting a click operation of the user through the duplicate checking button and, when the click is detected, triggering a redetermination of the similarity between the paper and the text content;
updating the second paper document page according to the redetermined similarity.
It should be noted that in traditional paper duplicate checking schemes, once the user finishes revising the paper, the paper must be uploaded again to the duplicate checking software, so the user cannot know at once whether the repetition rate meets the requirement and has to revise the paper repeatedly. With the duplicate checking button, the similarity of the revised paper is recalculated directly against the network database and the local database, on the basis of the three pages shown on the terminal display interface, to find the similarity and the similar content between the revised paper and the text content. The user thus receives updated duplicate checking results and can continue revising the paper, which makes operation convenient and intuitive.
Referring to Fig. 2, a detailed flowchart of step S20 in the terminal-based paper duplicate checking method provided by another embodiment of the present invention, step S20 includes steps S21 to S25.
Step S21: obtain the fingerprints corresponding to all text content in the preset database, and the total word count of the paper.
The similarity is obtained by dividing the total similar word count by the total word count of the paper. The total word count of the paper can be obtained directly by the software supporting the terminal display interface; what mainly needs to be computed is the similar word count. This involves comparing fingerprints against the text content in the local database and the network database, so the fingerprint corresponding to each text content must be obtained.
Specifically, the fingerprint of each text content can be obtained by segmenting the text content according to a preset standard and then computing the fingerprint of each resulting unit with the similarity hash algorithm. Optionally, the text content can be segmented sentence by sentence.
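A sentence-level segmentation of the kind mentioned above might look like the sketch below. The set of sentence terminators is an assumption chosen for mixed Chinese and English text; the English full stop is deliberately left out so that decimal numbers are not split, and the function name is hypothetical.

```python
import re

# Assumed sentence terminators: Chinese full stop plus full-width and ASCII
# exclamation and question marks.
_SENTENCE = re.compile(r"[^。！？!?]+[。！？!?]?")


def split_sentences(text):
    """Split text into sentence units, keeping each terminator with its sentence."""
    return [s.strip() for s in _SENTENCE.findall(text) if s.strip()]
```

Each returned unit could then be fingerprinted separately, so that a match points to a specific sentence rather than to the whole document.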
Further, when the fingerprint corresponding to the text content is an N-bit fingerprint, the following steps may be included after step S21:
dividing the fingerprint corresponding to each text content into M blocks to form M fingerprint blocks, each fingerprint block having N/M bits;
using the N/M-bit fingerprint blocks as keys, building an inverted index for each of the M fingerprint blocks.
To improve the real-time response speed of paper duplicate checking, multiple indexes can be built over the massive number of fingerprints corresponding to all text content in the preset database. For example, if the fingerprint corresponding to the text content is an N-bit fingerprint with N equal to 64, the fingerprint can be divided into 4 blocks when the multiple indexes are built, that is, M equals 4, and each fingerprint block has 16 bits. Using the 16 bits of each fingerprint block as the key, an inverted index associating keys with text content can be built. An inverted index is equivalent to a mapping table from a key to the text content corresponding to that key, and the number of inverted indexes equals the number of fingerprint blocks.
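The block-wise inverted indexes described above can be sketched as follows for N = 64 and M = 4. The data layout (one dictionary per block position, mapping a 16-bit key to a list of fingerprint and text pairs) and the function names are assumptions for illustration, as is the choice to number blocks from the least significant bits.

```python
from collections import defaultdict


def split_blocks(fingerprint, n_bits=64, m_blocks=4):
    """Split an n_bits fingerprint into m_blocks equal blocks, lowest bits first."""
    width = n_bits // m_blocks
    mask = (1 << width) - 1
    return [(fingerprint >> (width * i)) & mask for i in range(m_blocks)]


def build_indexes(fingerprint_to_text, n_bits=64, m_blocks=4):
    """Build one inverted index per block position.

    indexes[i][key] lists every (fingerprint, text) pair whose i-th block
    equals key.
    """
    indexes = [defaultdict(list) for _ in range(m_blocks)]
    for fp, text in fingerprint_to_text.items():
        for i, key in enumerate(split_blocks(fp, n_bits, m_blocks)):
            indexes[i][key].append((fp, text))
    return indexes
```

A 16-bit key narrows the search to the stored fingerprints sharing that block, so a query does not have to be compared against every fingerprint in the database.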
Step S22 carries out cutting to form paper unit, and using the paper unit as input number to the paper According to calculating the corresponding paper unit fingerprint of each paper unit by similitude hash algorithm;
The fingerprint for calculating paper can also be similar with the fingerprint of content of text is calculated, when carrying out specific fingerprint matching, If the corresponding fingerprint of content of text is N fingerprints in presetting database, the corresponding fingerprint of paper also should be N fingerprints, N For the integer more than 0.
Step S23: from the fingerprints corresponding to all the text content, search for all fingerprints similar to the paper unit fingerprints;
When fingerprint matching is performed, the fingerprint corresponding to the paper also needs to be divided into equal blocks. Again taking N-bit fingerprints as an example, the fingerprint corresponding to the paper can be divided into M blocks, consistent with the division of the fingerprints corresponding to the text content, to form M paper fingerprint blocks, each having N/M bits. Since the fingerprints corresponding to the text content have also been divided into M blocks, similar fingerprints can be found by comparing each fingerprint block with each paper fingerprint block.
Based on the above analysis, when the fingerprints corresponding to the text content are N-bit fingerprints, step S23 includes the steps of:
dividing every paper unit fingerprint into M blocks to form M paper fingerprint blocks, each paper fingerprint block having N/M bits;
comparing each paper fingerprint block with each fingerprint block in turn to find all similar fingerprints.
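The block-wise comparison of step S23 can be sketched as below. The text does not state when two fingerprints count as similar; this example assumes a Hamming-distance threshold of 3 and uses the fact that two 64-bit fingerprints differing in at most 3 bits must agree exactly on at least one of 4 blocks. The function names and the threshold are assumptions of this sketch.

```python
def split_blocks(fp, n_bits=64, m_blocks=4):
    """Split an integer fingerprint into m_blocks equal-width blocks."""
    width = n_bits // m_blocks
    mask = (1 << width) - 1
    return [(fp >> (width * (m_blocks - 1 - i))) & mask for i in range(m_blocks)]


def find_similar(paper_fp, db_fps, max_distance=3):
    """Return the database fingerprints considered similar to paper_fp.

    A database fingerprint is a candidate when at least one block equals the
    paper block at the same position; candidates are then confirmed with the
    full Hamming distance (max_distance is an assumed threshold).
    """
    paper_blocks = split_blocks(paper_fp)
    similar = []
    for db_fp in db_fps:
        db_blocks = split_blocks(db_fp)
        if any(p == d for p, d in zip(paper_blocks, db_blocks)):
            if bin(paper_fp ^ db_fp).count("1") <= max_distance:
                similar.append(db_fp)
    return similar
```

In a real system the candidate test would be served by the per-block inverted indexes rather than by scanning every database fingerprint; the linear scan here only keeps the example short.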
Step S24: load the corresponding similar text content according to all the similar fingerprints found;
It can be understood that if similar fingerprints exist between the fingerprints corresponding to the text content and the fingerprints corresponding to the paper, then similar text content necessarily exists between the text content and the paper in the original text data. The similar text content can therefore be loaded by looking it up with the similar fingerprints through the correspondence between fingerprints and text content.
Still taking the inverted indexes established in the present embodiment as an example, step S24 may include the steps of:
determining the fingerprint block to which each similar fingerprint belongs;
using the similar fingerprint as a keyword, looking up the similar text content corresponding to the similar fingerprint in the inverted index of the fingerprint block to which the keyword belongs.
Since inverted indexes have been established for the fingerprints corresponding to the text content, the fingerprint block to which a similar fingerprint belongs can be found first, and then, using the similar fingerprint as a keyword, the corresponding similar text content can be found quickly from that fingerprint block.
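The lookup in step S24 can be illustrated with a small sketch. It assumes each similar fingerprint is represented as a pair of block position and block value, that each inverted index maps block values to text ids, and that a separate store maps text ids to content; these data shapes and all names are hypothetical.

```python
def load_similar_texts(hits, indexes, texts):
    """Load similar text content for the matched fingerprint blocks.

    hits:    list of (block_no, block_key) pairs for the similar fingerprints
    indexes: list of dicts, one inverted index per block position
    texts:   dict mapping text id to text content
    """
    loaded = {}
    for block_no, key in hits:
        # Only the index of the block the keyword belongs to is consulted.
        for text_id in indexes[block_no].get(key, []):
            loaded[text_id] = texts[text_id]
    return loaded


# Hypothetical sample data.
texts = {"t1": "fingerprint matching finds similar text", "t2": "an unrelated sentence"}
indexes = [{0x0123: ["t1"]}, {0x4567: ["t1"], 0xFFFF: ["t2"]}, {}, {}]
hits = [(0, 0x0123), (1, 0x4567)]
similar_texts = load_similar_texts(hits, indexes, texts)
```

Collecting results in a dictionary keyed by text id avoids loading the same text twice when several blocks of one fingerprint match.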
Step S25: calculate the similarity between the paper and the text content according to the similar text content, the paper units and the total word count of the paper.
It should be noted that the calculation of the similar word count involves the distance between feature vectors of texts; the size of the distance indicates how similar the texts are. The calculation of the similar word count also depends on how similar the words are between each piece of similar text content and its corresponding paper unit. The similar word count can therefore be calculated from the similar text content and the corresponding paper units, and the similarity between the paper and the text content can then be calculated in combination with the total word count of the paper.
The present embodiment gives a specific method of obtaining the similarity between the text content and the paper by fingerprint matching, which provides the technical basis for implementing paper duplicate checking. In addition, multiple indexes are established for the fingerprints corresponding to the text content, which helps speed up both the real-time response of paper duplicate checking and the delivery of the duplicate checking result.
Referring to Fig. 3, Fig. 3 is a detailed flow diagram of step S25 in the terminal-based paper duplicate checking method proposed by a further embodiment of the present invention. Step S25 includes steps S251 to S255.
Step S251: according to the similar text content, find, among all the paper units, the similar paper unit corresponding to each piece of similar text content;
The similar paper units in the present embodiment may include the sentences marked in the paper when the similarity shown on the second paper document page is greater than the preset threshold, and the sentences in the paper whose similarity is less than the preset threshold; that is, the similar paper units taken together are the set of content associated between the text content and the paper.
Step S252: segment each piece of similar text content and its corresponding similar paper unit into words, obtaining a text word set for each piece of similar text content and a paper word set for each similar paper unit; a text word set consists of the words in one piece of similar text content, and a paper word set consists of the words in one similar paper unit;
Since determining the similar word count involves calculating the distance between feature vectors of texts, and depends on how similar the words are between each piece of similar text content and its corresponding paper unit, the similar text content and the corresponding paper units need to be segmented into words. The segmentation can use existing word segmentation tools and algorithms, which are not described again here. After each piece of similar text content and its corresponding similar paper unit has been segmented, the sets of resulting words are obtained: segmenting a single similar paper unit yields a paper word set, and segmenting a single piece of similar text content yields a text word set. From the paper word set and the text word set, the Jaccard similarity, also called the Jaccard similarity coefficient, between the similar paper unit and the similar text content can be calculated.
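The Jaccard similarity between a text word set A and a paper word set B, |A ∩ B| / |A ∪ B|, can be sketched as follows. Whitespace splitting stands in for a real word segmenter here, and returning 0.0 for two empty sets is a convention chosen for this example, not something the text specifies.

```python
def jaccard_similarity(text_words, paper_words):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections of words."""
    a, b = set(text_words), set(paper_words)
    if not a and not b:
        return 0.0  # assumed convention for two empty word sets
    return len(a & b) / len(a | b)


# Hypothetical segmented inputs (whitespace split used only for illustration).
text_words = "similar text is found by fingerprint".split()
paper_words = "similar text is located by fingerprint".split()
score = jaccard_similarity(text_words, paper_words)
```

Because the inputs are converted to sets, repeated words count once, so the measure reflects vocabulary overlap rather than word order or frequency.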
Step S253: obtain the text length of each piece of similar text content and the text length of the corresponding similar paper unit;
It should be noted that the text length of a piece of similar text content or of a similar paper unit is the word count of that piece of similar text content or that similar paper unit. The text lengths allow the edit distance and the edit similarity between a similar paper unit and the similar text content to be determined.
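The edit similarity used in step S254, editSimilar = 1 - editDistance(a, b)/max, can be sketched as below. The text does not define `max` or the kind of edit distance; this example assumes Levenshtein distance between the two strings and takes `max` to be the longer of the two text lengths.

```python
def edit_distance(s, t):
    """Levenshtein distance between strings s and t (row-by-row dynamic programming)."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        cur = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution or match
        prev = cur
    return prev[-1]


def edit_similarity(s, t):
    """1 - editDistance / max, with max assumed to be the longer text length."""
    longest = max(len(s), len(t))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1 - edit_distance(s, t) / longest
```

Dividing by the longer length keeps the result between 0 and 1, since the Levenshtein distance never exceeds the length of the longer string.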
Step S254: calculate the similar word count between the paper and the text content from the text length and text word set of each piece of similar text content and the text length and paper word set of each similar paper unit;
Specifically, step S254 may include:
calculating the similarity between each piece of similar text content and its corresponding similar paper unit by similar=factor*editSimilar+(1-factor)*jaccardSimilar; where similar is the similarity between each piece of similar text content and its corresponding similar paper unit; factor is a preset weight factor between each piece of similar text content and its corresponding similar paper unit, 0≤factor≤1; editSimilar is the edit similarity, editSimilar=1-editDistance(a, b)/max, a is the text length of the similar text content, b is the text length of the similar paper unit, and editDistance is the edit distance; jaccardSimilar is the Jaccard similarity, jaccardSimilar=|A∩B|/|A∪B|, A is the text word set, and B is the paper word set;
calculating the similar word count between the paper and the text content by S=Σ_{i=1}^{n}(similar_i*b_i); where S is the similar word count, i denotes the i-th similar paper unit, n is the total number of similar paper units, similar_i is the similarity between the i-th similar paper unit and its corresponding similar text content, and b_i is the text length of the i-th similar paper unit.
It should be noted that the Jaccard similarity is used to compare the similarity and difference between finite sample sets; the larger the Jaccard similarity, the more similar the sample sets. In the present embodiment, the similar word count is obtained by multiplying the Jaccard similarity and the edit similarity by their respective weight factors and adding the results.
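The weighted combination and the resulting similar word count can be sketched as follows. The sketch assumes that the similar word count S is the sum, over the similar paper units, of each unit's similarity multiplied by its text length, which is what the symbol definitions suggest; the default factor of 0.5 and the function names are likewise assumptions of this example.

```python
def combined_similarity(edit_similar, jaccard_similar, factor=0.5):
    """similar = factor * editSimilar + (1 - factor) * jaccardSimilar."""
    if not 0 <= factor <= 1:
        raise ValueError("factor must lie in [0, 1]")
    return factor * edit_similar + (1 - factor) * jaccard_similar


def similar_word_count(units):
    """Assumed form of S: sum of similar_i * b_i over the similar paper units.

    units: iterable of (similar, b) pairs, where b is the unit's text length.
    """
    return sum(similar * b for similar, b in units)


def paper_similarity(units, total_word_count):
    """Step S255: similar word count divided by the paper's total word count."""
    if total_word_count <= 0:
        raise ValueError("total_word_count must be positive")
    return similar_word_count(units) / total_word_count
```

For instance, two similar paper units with similarities 0.5 and 0.25 and lengths 100 and 40 contribute 50 and 10 words, so a paper of 1000 words would receive a similarity of 60/1000 under these assumptions.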
Step S255: take the quotient of the similar word count divided by the total word count as the similarity between the paper and the text content.
It should be noted that a paper includes a table of contents, titles, formulas, charts, references and so on, but when software identifies the total word count, these non-body parts are generally skipped, so the actual total word count of the paper can be greater than the detected total word count. Content such as titles and the table of contents generally has little relevance to the actual content of the paper, so excluding it from the total word count also helps determine the true similarity. The present embodiment gives the way the similar word count is calculated in the similarity calculation, which facilitates the concrete implementation of paper duplicate checking.
Referring to Fig. 4, Fig. 4 is a structural schematic diagram of the terminal proposed by an embodiment of the present invention. The terminal includes:
a response module 10, configured to respond to a user's paper duplicate checking request;
a determining module 20, configured to perform fingerprint matching between the paper and the text content in a preset database, to determine the similarity between the paper and the text content;
a display module 30, configured to display at least three pages simultaneously on the display interface; the at least three pages include: a first paper document page for receiving the user's edit and modification instructions, a second paper document page dedicated to displaying the sentences marked in the paper when the similarity is greater than a preset threshold, and a page for indexing and displaying the text content corresponding to those sentences.
Further, in another embodiment, a duplicate check button is also provided on the display interface of the terminal;
The terminal further includes:
a detecting module 40, configured to detect the user's click operation through the duplicate check button and, upon detecting the click operation, trigger re-determination of the similarity between the paper and the text content;
an update module 50, configured to update the second paper document page according to the re-determined similarity.
Further, in another embodiment, the determining module 20 includes:
an acquiring unit 21, configured to obtain the fingerprints corresponding to all the text content in the preset database and the total word count of the paper;
a computing unit 22, configured to segment the paper to form paper units and, taking the paper units as input data, calculate the paper unit fingerprint corresponding to each paper unit by a similarity hash algorithm;
a searching unit 23, configured to search, from the fingerprints corresponding to all the text content, for all fingerprints similar to the paper unit fingerprints;
a loading unit 24, configured to load the corresponding similar text content according to all the similar fingerprints found;
The computing unit 22 is further configured to calculate the similarity between the paper and the text content according to the similar text content, the paper units and the total word count of the paper.
Further, in another embodiment, the fingerprints corresponding to the text content are N-bit fingerprints;
The determining module 20 further includes:
a dividing unit 25, configured to divide the fingerprint corresponding to each piece of text content into M blocks to form M fingerprint blocks, each fingerprint block having N/M bits;
an establishing unit 26, configured to establish an inverted index for each of the M fingerprint blocks, using the N/M bits of the block as the keyword;
The loading unit 24 includes:
a determining subunit 241, configured to determine the fingerprint block to which each similar fingerprint belongs;
a first lookup subunit 242, configured to use the similar fingerprint as a keyword and look up the similar text content corresponding to the similar fingerprint in the inverted index of the fingerprint block to which the keyword belongs.
Further, in another embodiment, the paper unit fingerprints are N-bit fingerprints;
The searching unit 23 includes:
a dividing subunit 231, configured to divide every paper unit fingerprint into M blocks to form M paper fingerprint blocks, each paper fingerprint block having N/M bits;
a second lookup subunit 232, configured to compare each paper fingerprint block with each fingerprint block in turn to find all similar fingerprints.
Further, in another embodiment, the computing unit 22 includes:
a third lookup subunit 221, configured to find, among all the paper units and according to the similar text content, the similar paper unit corresponding to each piece of similar text content;
a word segmentation subunit 222, configured to segment each piece of similar text content and its corresponding similar paper unit into words, obtaining a text word set for each piece of similar text content and a paper word set for each similar paper unit; a text word set consists of the words in one piece of similar text content, and a paper word set consists of the words in one similar paper unit;
an obtaining subunit 223, configured to obtain the text length of each piece of similar text content and the text length of the corresponding similar paper unit;
a computation subunit 224, configured to calculate the similar word count between the paper and the text content from the text length and text word set of each piece of similar text content and the text length and paper word set of each similar paper unit;
The computation subunit 224 is further configured to take the quotient of the similar word count divided by the total word count as the similarity between the paper and the text content.
Further, in another embodiment, the computation subunit 224 is further configured to calculate the similarity between each piece of similar text content and its corresponding similar paper unit by similar=factor*editSimilar+(1-factor)*jaccardSimilar; where similar is the similarity between each piece of similar text content and its corresponding similar paper unit; factor is a preset weight factor between each piece of similar text content and its corresponding similar paper unit, 0≤factor≤1; editSimilar is the edit similarity, editSimilar=1-editDistance(a, b)/max, a is the text length of the similar text content, b is the text length of the similar paper unit, and editDistance is the edit distance; jaccardSimilar is the Jaccard similarity, jaccardSimilar=|A∩B|/|A∪B|, A is the text word set, and B is the paper word set; and to calculate the similar word count between the paper and the text content by S=Σ_{i=1}^{n}(similar_i*b_i), where S is the similar word count, i denotes the i-th similar paper unit, n is the total number of similar paper units, similar_i is the similarity between the i-th similar paper unit and its corresponding similar text content, and b_i is the text length of the i-th similar paper unit.
The present embodiment also provides a terminal, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor; when executed by the processor, the computer program implements the steps of the terminal-based paper duplicate checking method described above.
The present embodiment also provides a computer-readable storage medium storing a computer program; when executed by a processor, the computer program implements the steps of the terminal-based paper duplicate checking method described above.
The technical principles of the present invention have been described above in connection with the embodiments. The description serves only to explain the principles of the invention and shall not be construed in any way as limiting its scope of protection. Based on the explanation herein, those skilled in the art can conceive of other specific implementations of the present invention without creative effort, and these implementations fall within the scope of protection of the present invention.

Claims (10)

1. A terminal-based paper duplicate checking method, characterized by comprising the steps of:
responding to a user's paper duplicate checking request;
performing fingerprint matching between the paper and text content in a preset database, to determine the similarity between the paper and the text content;
displaying at least three pages simultaneously on the display interface of the terminal; the at least three pages include: a first paper document page for receiving the user's edit and modification instructions, a second paper document page dedicated to displaying the sentences marked in the paper when the similarity is greater than a preset threshold, and a page for indexing and displaying the text content corresponding to those sentences.
2. The terminal-based paper duplicate checking method according to claim 1, characterized in that a duplicate check button is also provided on the display interface of the terminal;
after the step of displaying at least three pages simultaneously on the display interface of the terminal, the method further comprises:
detecting the user's click operation through the duplicate check button and, upon detecting the click operation, triggering re-determination of the similarity between the paper and the text content;
updating the second paper document page according to the re-determined similarity.
3. The terminal-based paper duplicate checking method according to claim 1, characterized in that the step of performing fingerprint matching between the paper and text content in a preset database, to determine the similarity between the paper and the text content, comprises:
obtaining the fingerprints corresponding to all the text content in the preset database and the total word count of the paper;
segmenting the paper to form paper units and, taking the paper units as input data, calculating the paper unit fingerprint corresponding to each paper unit by a similarity hash algorithm;
searching, from the fingerprints corresponding to all the text content, for all fingerprints similar to the paper unit fingerprints;
loading the corresponding similar text content according to all the similar fingerprints found;
calculating the similarity between the paper and the text content according to the similar text content, the paper units and the total word count of the paper.
4. The terminal-based paper duplicate checking method according to claim 3, characterized in that the fingerprints corresponding to the text content are N-bit fingerprints;
after the step of obtaining the fingerprints corresponding to all the text content in the preset database and the total word count of the paper, the method further comprises:
dividing the fingerprint corresponding to each piece of text content into M blocks to form M fingerprint blocks, each fingerprint block having N/M bits;
establishing an inverted index for each of the M fingerprint blocks, using the N/M bits of the block as the keyword;
the step of loading the corresponding similar text content according to all the similar fingerprints found comprises:
determining the fingerprint block to which each similar fingerprint belongs;
using the similar fingerprint as a keyword, looking up the similar text content corresponding to the similar fingerprint in the inverted index of the fingerprint block to which the keyword belongs.
5. The terminal-based paper duplicate checking method according to claim 4, characterized in that the paper unit fingerprints are N-bit fingerprints;
the step of searching, from the fingerprints corresponding to all the text content, for all fingerprints similar to the paper unit fingerprints comprises:
dividing every paper unit fingerprint into M blocks to form M paper fingerprint blocks, each paper fingerprint block having N/M bits;
comparing each paper fingerprint block with each fingerprint block in turn to find all similar fingerprints.
6. The terminal-based paper duplicate checking method according to claim 3, characterized in that the step of calculating the similarity between the paper and the text content according to the similar text content, the paper units and the total word count of the paper comprises:
finding, among all the paper units and according to the similar text content, the similar paper unit corresponding to each piece of similar text content;
segmenting each piece of similar text content and its corresponding similar paper unit into words, obtaining a text word set for each piece of similar text content and a paper word set for each similar paper unit; a text word set consists of the words in one piece of similar text content, and a paper word set consists of the words in one similar paper unit;
obtaining the text length of each piece of similar text content and the text length of the corresponding similar paper unit;
calculating the similar word count between the paper and the text content from the text length and text word set of each piece of similar text content and the text length and paper word set of each similar paper unit;
taking the quotient of the similar word count divided by the total word count as the similarity between the paper and the text content.
7. The terminal-based paper duplicate checking method according to claim 6, characterized in that the step of calculating the similar word count between the paper and the text content from the text length and text word set of each piece of similar text content and the text length and paper word set of each similar paper unit comprises:
calculating the similarity between each piece of similar text content and its corresponding similar paper unit by similar=factor*editSimilar+(1-factor)*jaccardSimilar; where similar is the similarity between each piece of similar text content and its corresponding similar paper unit; factor is a preset weight factor between each piece of similar text content and its corresponding similar paper unit, 0≤factor≤1; editSimilar is the edit similarity, editSimilar=1-editDistance(a, b)/max, a is the text length of the similar text content, b is the text length of the similar paper unit, and editDistance is the edit distance; jaccardSimilar is the Jaccard similarity, jaccardSimilar=|A∩B|/|A∪B|, A is the text word set, and B is the paper word set;
calculating the similar word count between the paper and the text content by S=Σ_{i=1}^{n}(similar_i*b_i); where S is the similar word count, i denotes the i-th similar paper unit, n is the total number of similar paper units, similar_i is the similarity between the i-th similar paper unit and its corresponding similar text content, and b_i is the text length of the i-th similar paper unit.
8. A terminal, characterized in that the terminal is used for paper duplicate checking and comprises:
a response module, configured to respond to a user's paper duplicate checking request;
a determining module, configured to perform fingerprint matching between the paper and text content in a preset database, to determine the similarity between the paper and the text content;
a display module, configured to display at least three pages simultaneously on the display interface; the at least three pages include: a first paper document page for receiving the user's edit and modification instructions, a second paper document page dedicated to displaying the sentences marked in the paper when the similarity is greater than a preset threshold, and a page for indexing and displaying the text content corresponding to those sentences.
9. A terminal, characterized in that the terminal comprises a memory, a processor, and a computer program stored in the memory and executable on the processor; when executed by the processor, the computer program implements the steps of the terminal-based paper duplicate checking method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium; when executed by a processor, the computer program implements the steps of the terminal-based paper duplicate checking method according to any one of claims 1 to 7.
CN201810534771.7A 2018-05-30 2018-05-30 Paper duplicate checking method, terminal and storage medium based on terminal Pending CN108763486A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810534771.7A CN108763486A (en) 2018-05-30 2018-05-30 Paper duplicate checking method, terminal and storage medium based on terminal

Publications (1)

Publication Number Publication Date
CN108763486A true CN108763486A (en) 2018-11-06

Family

ID=64003733

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810534771.7A Pending CN108763486A (en) 2018-05-30 2018-05-30 Paper duplicate checking method, terminal and storage medium based on terminal

Country Status (1)

Country Link
CN (1) CN108763486A (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109615001A (en) * 2018-12-05 2019-04-12 上海恺英网络科技有限公司 A kind of method and apparatus identifying similar article
CN109635084A (en) * 2018-11-30 2019-04-16 宁波深擎信息科技有限公司 A kind of real-time quick De-weight method of multi-source data document and system
CN109635254A (en) * 2018-12-03 2019-04-16 重庆大学 Paper duplicate checking method based on naive Bayesian, decision tree and SVM mixed model
CN110347782A (en) * 2019-07-18 2019-10-18 知者信息技术服务成都有限公司 Article duplicate checking method, apparatus and electronic equipment
CN111353031A (en) * 2020-02-27 2020-06-30 海南谊之脉科技有限公司 Thesis management method, server and system based on big data
CN111400446A (en) * 2020-03-11 2020-07-10 中国计量大学 Standard text duplicate checking method and system
CN111460180A (en) * 2020-03-30 2020-07-28 维沃移动通信有限公司 Information display method and device, electronic equipment and storage medium
CN111611787A (en) * 2019-02-25 2020-09-01 中国海洋大学 Plagiarism evaluation method, system and auxiliary writing system
CN111737966A (en) * 2020-06-11 2020-10-02 北京百度网讯科技有限公司 Document repetition degree detection method, device, equipment and readable storage medium
CN111753536A (en) * 2020-03-19 2020-10-09 北京信聚知识产权有限公司 Automatic patent application text writing method and device
WO2021037012A1 (en) * 2019-08-30 2021-03-04 智慧芽信息科技(苏州)有限公司 Text information navigation and browsing method, apparatus, server and storage medium
CN113139375A (en) * 2021-04-21 2021-07-20 洛阳墨潇网络科技有限公司 Paper similarity detection method and device based on big data
CN113255369A (en) * 2021-06-10 2021-08-13 平安国际智慧城市科技股份有限公司 Text similarity analysis method and device and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106021201A (en) * 2016-05-19 2016-10-12 珠海市魅族科技有限公司 File editing method and device
CN106126235A (en) * 2016-06-24 2016-11-16 中国科学院信息工程研究所 A kind of multiplexing code library construction method, the quick source tracing method of multiplexing code and system
CN106156154A (en) * 2015-04-14 2016-11-23 阿里巴巴集团控股有限公司 The search method of Similar Text and device thereof

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
WAYBACK MACHINE: "http://www.paperfree.cn/onlineCheck.html在线查重", 《HTTP://WEB.ARCHIVE.ORG/WEB/20180519071819/HTTP://WWW.PAPERFREE.CN/ONLINECHECK.HTML》 *
李宝莹: "面向科研文本的资料管理与查重子***的设计与实现", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20181106)