CN107370718A - The detection method and device of black chain in webpage - Google Patents
- Publication number
- CN107370718A (application CN201610319264.2A)
- Authority
- CN
- China
- Prior art keywords
- domain name
- webpage
- detected
- chain
- type
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1441—Countermeasures against malicious traffic
- H04L63/1483—Countermeasures against malicious traffic service impersonation, e.g. phishing, pharming or web spoofing
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2221/00—Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/21—Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/2119—Authenticating web pages, e.g. with suspicious links
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Transfer Between Computers (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method for detecting a black chain in a webpage, comprising: obtaining a first domain name type corresponding to the URL of an external link in a webpage to be detected, and a second domain name type corresponding to the URL of the webpage to be detected; obtaining a first similarity difference between the first domain name type and the second domain name type; and, when the first similarity difference is greater than a preset threshold, determining that a black chain exists in the webpage to be detected. The invention also discloses a device for detecting a black chain in a webpage. Detecting black chains in this way has a relatively low probability of misjudgment, which improves the accuracy of black chain detection.
Description
Technical field
The present invention relates to the field of communication technology, and in particular to a method and device for detecting a black chain in a webpage.
Background
As the crackdown on illegal content on the internet intensifies, it is increasingly rare for illegal and harmful websites to be reachable directly through normal channels (e.g., search engines). These websites therefore actively look for other ways to increase their chances of being visited indirectly. Planting black chains on government, education, and industry websites is one of the most effective of these: black chains can effectively bypass search engines' filtering of illegal and harmful content, and can trick search engines into indexing that content.
Traditional black chain detection methods usually check whether an external link has a hidden style in order to decide whether it is a black chain. However, an external link with a hidden style is not necessarily a black chain. Moreover, the styles that can produce a hidden-link effect are too numerous to list, so detecting every such style is nearly impossible. In addition, a black chain is not necessarily a hidden external link: creating a hidden directory on a website and placing webpages containing illegal and harmful content in it is also a common form of black chain, and this form is not easily caught by traditional detection methods. The detection accuracy of such methods is therefore relatively low.
Summary of the invention
The main object of the present invention is to propose a method and device for detecting a black chain in a webpage, aiming to solve the technical problem of low black chain detection accuracy in the prior art.
To achieve the above object, the present invention provides a method for detecting a black chain in a webpage, the method comprising the following steps:
Obtaining a first domain name type corresponding to the URL of an external link in a webpage to be detected, and a second domain name type corresponding to the URL of the webpage to be detected;
Obtaining a first similarity difference between the first domain name type and the second domain name type;
When the first similarity difference is greater than a preset threshold, determining that a black chain exists in the webpage to be detected.
Optionally, after the step of obtaining the first similarity difference between the first domain name type and the second domain name type, the method further comprises the steps of:
Judging whether text content exists in the external link of the webpage to be detected;
When text content exists in the external link, obtaining a first keyword type corresponding to the keywords in the text content;
Obtaining a second similarity difference between the first keyword type and a second keyword type corresponding to the webpage to be detected, and superimposing the second similarity difference on the first similarity difference to obtain a third similarity difference;
When the third similarity difference is greater than the preset threshold, determining that a black chain exists in the webpage to be detected;
When no text content exists in the external link and the first similarity difference is greater than the preset threshold, performing the step of determining that a black chain exists in the webpage to be detected.
Optionally, before the step of obtaining the second similarity difference between each first keyword type and the second keyword type corresponding to the webpage to be detected, the method further comprises the steps of:
Obtaining the search engine sensitive tags in the webpage to be detected;
Obtaining the keyword type corresponding to each search engine sensitive tag, and taking the obtained keyword type as the second keyword type corresponding to the webpage to be detected.
Optionally, before the step of obtaining the first domain name type corresponding to the URL of the external link in the webpage to be detected and the second domain name type corresponding to the URL of the webpage to be detected, the method further comprises the steps of:
Obtaining a first domain name in each URL inserted in the webpage to be detected, and a second domain name in the URL of the webpage to be detected;
Obtaining, among the first domain names, a third domain name that does not match the second domain name;
Taking the URL corresponding to the third domain name as an external link, wherein the first domain name matches the second domain name when the first domain name is identical to the second domain name, or when the first domain name is a subdomain of the second domain name.
Optionally, the step of obtaining the first domain name type corresponding to the URL of the external link in the webpage to be detected and the second domain name type corresponding to the URL of the webpage to be detected comprises:
Determining the first domain name type corresponding to the URL of the external link according to the domain name type to which the domain name of the external link belongs, and determining the second domain name type corresponding to the URL of the webpage to be detected according to the domain name type to which the second domain name belongs.
In addition, to achieve the above object, the present invention also proposes a device for detecting a black chain in a webpage, the device comprising:
An acquisition module, configured to obtain a first domain name type corresponding to the URL of an external link in a webpage to be detected, and a second domain name type corresponding to the URL of the webpage to be detected;
A similarity difference calculation module, configured to obtain a first similarity difference between the first domain name type and the second domain name type;
A black chain determination module, configured to determine, when the first similarity difference is greater than a preset threshold, that a black chain exists in the webpage to be detected.
Optionally, the device further comprises:
A judgment module, configured to judge whether text content exists in the external link of the webpage to be detected;
The black chain determination module is further configured to determine that a black chain exists in the webpage to be detected when no text content exists in the external link and the first similarity difference is greater than the preset threshold;
The acquisition module is further configured to obtain, when text content exists in the external link, a first keyword type corresponding to the keywords in the text content;
The similarity difference calculation module is further configured to obtain a second similarity difference between the first keyword type and a second keyword type corresponding to the webpage to be detected, and to superimpose the second similarity difference on the first similarity difference to obtain a third similarity difference;
The black chain determination module is further configured to determine that a black chain exists in the webpage to be detected when the third similarity difference is greater than the preset threshold.
Optionally, the acquisition module is further configured to:
Obtain the search engine sensitive tags in the webpage to be detected;
Obtain the keyword type corresponding to each search engine sensitive tag, and take the obtained keyword type as the second keyword type corresponding to the webpage to be detected.
Optionally, in the device:
The acquisition module is further configured to obtain a first domain name in each URL inserted in the webpage to be detected and a second domain name in the URL of the webpage to be detected, and to obtain, among the first domain names, a third domain name that does not match the second domain name;
The device further comprises a processing module, configured to take the URL corresponding to the third domain name as an external link, wherein the first domain name matches the second domain name when the first domain name is identical to the second domain name, or when the first domain name is a subdomain of the second domain name.
Optionally, the acquisition module is further configured to determine the first domain name type corresponding to the URL of the external link according to the domain name type to which the domain name of the external link belongs, and to determine the second domain name type corresponding to the URL of the webpage to be detected according to the domain name type to which the second domain name belongs.
The method and device proposed by the present invention rest on the observation that, after a black chain is inserted into a webpage, the type of the black chain and the type of the webpage itself are only weakly correlated. The first domain name type corresponding to the URL of an external link in the webpage to be detected and the second domain name type corresponding to the URL of the webpage to be detected are therefore obtained, together with the first similarity difference between the two types, in order to determine whether a black chain exists. A first similarity difference greater than the preset threshold indicates that the inserted external link and the webpage to be detected differ greatly in type, so it can be determined that a black chain exists in the webpage to be detected. Detecting black chains in this way has a relatively low probability of misjudgment, which improves the accuracy of black chain detection.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the first embodiment of the method for detecting a black chain in a webpage according to the present invention;
Fig. 2 is a schematic flowchart of the second embodiment of the method for detecting a black chain in a webpage according to the present invention;
Fig. 3 is a schematic flowchart of the third embodiment of the method for detecting a black chain in a webpage according to the present invention;
Fig. 4 is a functional block diagram of the first embodiment of the device for detecting a black chain in a webpage according to the present invention;
Fig. 5 is a functional block diagram of the second embodiment of the device for detecting a black chain in a webpage according to the present invention;
Fig. 6 is a functional block diagram of the third embodiment of the device for detecting a black chain in a webpage according to the present invention.
The realization of the object, the functional characteristics, and the advantages of the present invention will be further described in conjunction with the embodiments and with reference to the drawings.
Detailed description of the embodiments
It should be understood that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit it.
The present invention provides a method for detecting a black chain in a webpage.
Referring to Fig. 1, Fig. 1 is a schematic flowchart of the first embodiment of the method for detecting a black chain in a webpage according to the present invention.
This embodiment proposes a method for detecting a black chain in a webpage, the method comprising:
Step S10, obtaining a first domain name type corresponding to the URL of an external link in a webpage to be detected, and a second domain name type corresponding to the URL of the webpage to be detected;
A domain name type library may be preset, which pre-stores the mapping between each domain name and its domain name type. The domain name in the URL of each external link is obtained and compared with the domain names in the preset domain name type library to obtain the first domain name type. Similarly, the domain name in the URL of the webpage to be detected is obtained and compared with the domain names in the preset domain name type library to obtain the second domain name type. An external link of the webpage to be detected is a link to a webpage outside the website to which the webpage to be detected belongs.
It can be understood that, to improve the efficiency of obtaining the domain name type, when the domain name in the URL of an external link is compared with the domain names in the domain name type library, the domain name types may be sorted in descending order of the probability that a black chain of that type appears, and the obtained domain name compared with the domain names under each domain name type in that order. For example, domain names of the gambling, pornography, and game types are very likely to appear as black chains, so the domain name in the URL of the external link may first be compared with the domain names of those types. Similarly, when the domain name in the URL of the webpage to be detected is compared with the domain names in the domain name type library, the domain name types may be sorted in descending order of the probability of having a black chain inserted. For example, government and education domain names are very likely to have black chains inserted, so the domain name in the URL of the webpage to be detected may first be compared with the domain names of the government and education types.
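The lookup described for step S10 can be sketched as follows. This is a minimal illustration only: the library contents, the `*.example` domain names, and the function names `domain_of` and `domain_type` are hypothetical and do not come from the original text. The dictionary is written in the assumed descending order of black chain probability, so iteration visits the most likely types first.

```python
from urllib.parse import urlsplit

# Hypothetical domain name type library: domain name type -> known domains.
# Types are listed in an assumed descending order of black chain probability,
# so the most likely types are compared first (dicts keep insertion order).
DOMAIN_TYPE_LIBRARY = {
    "gambling": {"casino.example"},
    "pornography": {"adult.example"},
    "game": {"play.example"},
    "government": {"agency.example"},
    "education": {"school.example"},
}


def domain_of(url):
    """Return the lowercase host name of a URL, or '' if it has none."""
    return (urlsplit(url).hostname or "").lower()


def domain_type(url, library=DOMAIN_TYPE_LIBRARY):
    """Compare the URL's domain with each type in library order."""
    host = domain_of(url)
    for type_name, domains in library.items():
        if host in domains:
            return type_name
    return None  # domain not present in the library
```

For the webpage to be detected, the same function could be called with a library ordered by the probability of having a black chain inserted (government and education first).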
Step S20, obtaining a first similarity difference between each first domain name type and the second domain name type;
A black chain is usually inserted into a webpage by inserting multiple external links. A similarity value may therefore be preset for each domain name type, and the similarity difference between each first domain name type and the second domain name type obtained by subtraction. In this embodiment, each similarity difference is the absolute value of the difference between the similarity values corresponding to the first domain name type and the second domain name type, and the first similarity difference is the sum of the similarity differences so obtained.
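The calculation in step S20 can be sketched as below. The similarity values in `SIMILARITY_VALUE` and the function name are hypothetical; the original text only states that a value is preset per domain name type, that each difference is taken as an absolute value, and that the differences are summed.

```python
# Hypothetical preset similarity values, one per domain name type.
SIMILARITY_VALUE = {
    "government": 10,
    "education": 12,
    "game": 60,
    "gambling": 90,
}


def first_similarity_difference(link_types, page_type, values=SIMILARITY_VALUE):
    """Sum of |value(link type) - value(page type)| over all external links."""
    base = values[page_type]
    return sum(abs(values[t] - base) for t in link_types)
```

With these assumed values, a government page carrying one gambling link and one game link yields 80 + 50 = 130, while a single education link yields 2.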
Step S30, when the first similarity difference is greater than the preset threshold, determining that a black chain exists in the webpage to be detected.
A first similarity difference greater than the preset threshold indicates that the external links inserted in the webpage to be detected are only weakly correlated with the webpage itself, so it can be determined that a black chain exists in the webpage to be detected.
It can be understood that, when a black chain exists in the webpage to be detected, a prompt indicating that the webpage has low reliability may be output, the external links in the webpage may be removed, or the webpage may be intercepted, among other processing options. The specific processing may be set by the developer as needed and is not described further here.
The method proposed in this embodiment rests on the observation that, after a black chain is inserted into a webpage, the type of the black chain and the type of the webpage itself are only weakly correlated. The first domain name type corresponding to the URL of an external link in the webpage to be detected and the second domain name type corresponding to the URL of the webpage to be detected are therefore obtained, together with the first similarity difference between the two types, in order to determine whether a black chain exists. A first similarity difference greater than the preset threshold indicates that the inserted external link and the webpage to be detected differ greatly in type, so it can be determined that a black chain exists in the webpage to be detected. Detecting black chains in this way has a relatively low probability of misjudgment, which improves the accuracy of black chain detection.
Further, referring to Fig. 2, a second embodiment of the method for detecting a black chain in a webpage according to the present invention is proposed based on the first embodiment. In this embodiment, after step S20, the method further comprises the steps of:
Step S40, judging whether text content exists in the external link of the webpage to be detected;
Step S50, when text content exists in the external link, obtaining a first keyword type corresponding to the keywords in the text content;
Step S60, obtaining a second similarity difference between the first keyword type and a second keyword type corresponding to the webpage to be detected, and superimposing the second similarity difference on the first similarity difference to obtain a third similarity difference;
When the third similarity difference is greater than the preset threshold, performing step S30;
Step S70, when no text content exists in the external link, judging whether the first similarity difference is greater than the preset threshold;
When the first similarity difference is greater than the preset threshold, performing step S30.
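The branching in steps S40 to S70 can be sketched as follows. The function name `judge_black_chain` is hypothetical, `None` is used here to signal that the external link carries no text content, and superimposing is assumed to mean plain addition.

```python
def judge_black_chain(first_diff, second_diff, threshold):
    """Return True when a black chain is judged to exist.

    second_diff is None when the external link has no text content.
    """
    if second_diff is None:
        # Step S70: no text content, so use the first similarity difference.
        return first_diff > threshold
    # Step S60: superimpose (assumed: add) to get the third similarity difference.
    third_diff = first_diff + second_diff
    return third_diff > threshold
```

For example, with a threshold of 100, a first difference of 80 alone is not enough, but the same first difference combined with a keyword-based second difference of 30 exceeds the threshold.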
In this embodiment, the correlation of keyword types may additionally be used to determine whether a black chain exists. The link content of an external link may carry keywords, such as "Legend private server" and "swimsuit sales". In that case, the keywords in the external link are obtained and compared with the keywords in a preset keyword type library, which pre-stores the mapping between keyword types and keywords. The comparison process is similar to that for the domain name type library and is not repeated here. Multiple first keyword types may be obtained; after the similarity difference between each first keyword type and the second keyword type is calculated, these similarity differences are superimposed to obtain the second similarity difference.
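A minimal sketch of the keyword comparison follows. The keyword type library, the similarity values, and the function names are hypothetical, and simple substring matching stands in for whatever keyword extraction is actually used; the sketch also assumes a single second keyword type for the page.

```python
# Hypothetical keyword type library: keyword type -> keywords of that type.
KEYWORD_TYPE_LIBRARY = {
    "game": ("legend private server",),
    "shopping": ("swimsuit sales",),
    "government": ("public notice",),
}

# Hypothetical preset similarity values per keyword type.
KEYWORD_SIMILARITY_VALUE = {"government": 10, "shopping": 40, "game": 60}


def keyword_types(text, library=KEYWORD_TYPE_LIBRARY):
    """Return every keyword type whose keywords occur in the link text."""
    lowered = text.lower()
    return [t for t, kws in library.items() if any(k in lowered for k in kws)]


def second_similarity_difference(link_text, page_type,
                                 values=KEYWORD_SIMILARITY_VALUE):
    """Superimpose |value(first type) - value(second type)| over all matches."""
    base = values[page_type]
    return sum(abs(values[t] - base) for t in keyword_types(link_text))
```

With these assumed values, link text mentioning both example keywords on a government page gives 50 + 30 = 80.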
The keywords of the webpage to be detected can be obtained from the search engine sensitive tags in the webpage; that is, before step S60 the method further comprises the steps of:
Obtaining the search engine sensitive tags in the webpage to be detected;
Obtaining the keyword type corresponding to each search engine sensitive tag, and taking the obtained keyword type as the second keyword type corresponding to the webpage to be detected.
The search engine sensitive tags may be, for example, the title and keywords tags of the webpage. After the keywords are extracted from these tags, they are compared with the keywords in the keyword type library to obtain the second keyword type.
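One way to read the title and keywords tags is sketched below with Python's standard `html.parser` module. The class and function names are hypothetical, and treating `<title>` and `<meta name="keywords">` as the search engine sensitive tags is an assumption based on the examples given in the text.

```python
from html.parser import HTMLParser


class SensitiveTagParser(HTMLParser):
    """Collects the <title> text and the <meta name="keywords"> entries."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.keywords = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attributes.get("name") or "").lower() == "keywords":
            content = attributes.get("content") or ""
            # Keywords are assumed to be comma separated.
            self.keywords += [k.strip() for k in content.split(",") if k.strip()]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def sensitive_tags(html):
    """Return (title text, list of meta keywords) for an HTML document."""
    parser = SensitiveTagParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), parser.keywords
```

The extracted title and keywords would then be compared against the keyword type library to obtain the second keyword type.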
Further, referring to Fig. 3, a third embodiment of the method for detecting a black chain in a webpage according to the present invention is proposed based on the first or second embodiment. In this embodiment, before step S10 the method further comprises the steps of:
Step S80, obtaining a first domain name in each URL inserted in the webpage to be detected, and a second domain name in the URL of the webpage to be detected;
Step S90, obtaining, among the first domain names, a third domain name that does not match the second domain name;
Step S100, taking the URL corresponding to the third domain name as an external link, wherein the first domain name matches the second domain name when the first domain name is identical to the second domain name, or when the first domain name is a subdomain of the second domain name.
When the first domain name in an inserted URL is a subdomain of the second domain name, the inserted URL is a sub-link of the webpage to be detected.
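Steps S80 to S100 can be sketched as follows. The function names and example URLs are hypothetical. As simplifying assumptions, the second domain name is taken to be the full host of the page URL, and inserted URLs without a host (relative links) are treated as internal.

```python
from urllib.parse import urlsplit


def host(url):
    """Return the lowercase host name of a URL, or '' if it has none."""
    return (urlsplit(url).hostname or "").lower()


def matches(first_domain, second_domain):
    """True when the first domain equals, or is a subdomain of, the second."""
    return (first_domain == second_domain
            or first_domain.endswith("." + second_domain))


def external_links(inserted_urls, page_url):
    """Return the inserted URLs whose domain does not match the page's domain."""
    page_domain = host(page_url)
    result = []
    for url in inserted_urls:
        first_domain = host(url)
        # Relative URLs have no host and are assumed to be internal.
        if first_domain and not matches(first_domain, page_domain):
            result.append(url)
    return result
```

Checking for a leading dot before the suffix keeps a domain such as `notcity.example` from being mistaken for a subdomain of `city.example`.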
It can be understood that step S10 comprises: determining the first domain name type corresponding to the URL of the external link according to the domain name type to which the domain name of the external link belongs, and determining the second domain name type corresponding to the URL of the webpage to be detected according to the domain name type to which the second domain name belongs.
The present invention further provides a device for detecting a black chain in a webpage.
Referring to Fig. 4, Fig. 4 is a functional block diagram of a preferred embodiment of the device for detecting a black chain in a webpage according to the present invention.
It should be emphasized that, for those skilled in the art, the functional block diagram shown in Fig. 4 is merely an example of a preferred embodiment, and those skilled in the art can easily add new functional modules around the functional modules of the device shown in Fig. 4. The name of each functional module is a custom name that serves only to aid understanding of the program function blocks of the device and does not limit the technical solution of the present invention. The core of the technical solution lies in the functions that the named functional modules achieve.
This embodiment proposes a device for detecting a black chain in a webpage, the device comprising:
An acquisition module 10, configured to obtain a first domain name type corresponding to the URL of an external link in a webpage to be detected, and a second domain name type corresponding to the URL of the webpage to be detected;
A domain name type library may be preset, which pre-stores the mapping between each domain name and its domain name type. The domain name in the URL of each external link is obtained and compared with the domain names in the preset domain name type library to obtain the first domain name type. Similarly, the domain name in the URL of the webpage to be detected is obtained and compared with the domain names in the preset domain name type library to obtain the second domain name type. An external link of the webpage to be detected is a link to a webpage outside the website to which the webpage to be detected belongs.
It can be understood that, to improve the efficiency of obtaining the domain name type, when the domain name in the URL of an external link is compared with the domain names in the domain name type library, the domain name types may be sorted in descending order of the probability that a black chain of that type appears, and the obtained domain name compared with the domain names under each domain name type in that order. For example, domain names of the gambling, pornography, and game types are very likely to appear as black chains, so the domain name in the URL of the external link may first be compared with the domain names of those types. Similarly, when the domain name in the URL of the webpage to be detected is compared with the domain names in the domain name type library, the domain name types may be sorted in descending order of the probability of having a black chain inserted. For example, government and education domain names are very likely to have black chains inserted, so the domain name in the URL of the webpage to be detected may first be compared with the domain names of the government and education types.
A similarity difference calculation module 20, configured to obtain a first similarity difference between the first domain name type and the second domain name type;
A black chain is usually inserted into a webpage by inserting multiple external links. A similarity value may therefore be preset for each domain name type, and the similarity difference between each first domain name type and the second domain name type obtained by subtraction. In this embodiment, each similarity difference is the absolute value of the difference between the similarity values corresponding to the first domain name type and the second domain name type, and the first similarity difference is the sum of the similarity differences so obtained.
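The calculation performed by module 20 can be written compactly as below. This is only a notational sketch: the symbols D_1, s, t_i, t_0, and n are introduced here for illustration and do not appear in the original text.

```latex
D_1 = \sum_{i=1}^{n} \left| s(t_i) - s(t_0) \right|
```

Here t_i is the first domain name type of the i-th external link, t_0 is the second domain name type of the webpage to be detected, s(.) is the preset similarity value of a domain name type, and n is the number of external links. As a worked example with assumed values s(t_0) = 10, s(t_1) = 90, and s(t_2) = 60, the first similarity difference is D_1 = |90 - 10| + |60 - 10| = 80 + 50 = 130.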
A black chain determination module 30, configured to determine, when the first similarity difference is greater than the preset threshold, that a black chain exists in the webpage to be detected.
A first similarity difference greater than the preset threshold indicates that the external links inserted in the webpage to be detected are only weakly correlated with the webpage itself, so it can be determined that a black chain exists in the webpage to be detected.
It can be understood that, when a black chain exists in the webpage to be detected, a prompt indicating that the webpage has low reliability may be output, the external links in the webpage may be removed, or the webpage may be intercepted, among other processing options. The specific processing may be set by the developer as needed and is not described further here.
The device proposed in this embodiment rests on the observation that, after a black chain is inserted into a webpage, the type of the black chain and the type of the webpage itself are only weakly correlated. The first domain name type corresponding to the URL of an external link in the webpage to be detected and the second domain name type corresponding to the URL of the webpage to be detected are therefore obtained, together with the first similarity difference between the two types, in order to determine whether a black chain exists. A first similarity difference greater than the preset threshold indicates that the inserted external link and the webpage to be detected differ greatly in type, so it can be determined that a black chain exists in the webpage to be detected. Detecting black chains in this way has a relatively low probability of misjudgment, which improves the accuracy of black chain detection.
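The division of work among modules 10, 20, and 30 might be sketched as three small classes wired together. The class names, constructor arguments, and the `BlackChainDetector` wrapper are hypothetical, and the sketch assumes that the domain of the webpage to be detected is present in the type library.

```python
class AcquisitionModule:
    """Module 10: looks up the domain name type of a domain."""

    def __init__(self, type_library):
        self.type_library = type_library  # hypothetical mapping: domain -> type

    def domain_type(self, domain):
        return self.type_library.get(domain)


class SimilarityDifferenceModule:
    """Module 20: sums absolute differences of preset similarity values."""

    def __init__(self, similarity_values):
        self.values = similarity_values  # hypothetical mapping: type -> value

    def first_difference(self, link_types, page_type):
        base = self.values[page_type]
        return sum(abs(self.values[t] - base) for t in link_types)


class BlackChainDeterminationModule:
    """Module 30: compares a similarity difference with the preset threshold."""

    def __init__(self, threshold):
        self.threshold = threshold

    def judge(self, difference):
        return difference > self.threshold


class BlackChainDetector:
    """Wires the three modules together for the first embodiment."""

    def __init__(self, acquisition, calculator, determiner):
        self.acquisition = acquisition
        self.calculator = calculator
        self.determiner = determiner

    def detect(self, link_domains, page_domain):
        page_type = self.acquisition.domain_type(page_domain)
        # External links whose domain is not in the library are skipped here.
        link_types = [t for t in map(self.acquisition.domain_type, link_domains)
                      if t is not None]
        difference = self.calculator.first_difference(link_types, page_type)
        return self.determiner.judge(difference)
```

Keeping the threshold inside the determination module mirrors the text's separation between calculating a difference and judging it, so the later embodiments could reuse `judge` on the third similarity difference.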
Further, referring to Fig. 5, a second embodiment of the device for detecting a black chain in a webpage according to the present invention is proposed based on the first embodiment. In this embodiment, the device further comprises:
A judgment module 40, configured to judge whether text content exists in the external link of the webpage to be detected;
The black chain determination module 30 is further configured to determine that a black chain exists in the webpage to be detected when no text content exists in the external link and the first similarity difference is greater than the preset threshold;
The acquisition module 10 is further configured to obtain, when text content exists in the external link, a first keyword type corresponding to the keywords in the text content;
The similarity difference calculation module 20 is further configured to obtain a second similarity difference between the first keyword type and a second keyword type corresponding to the webpage to be detected, and to superimpose the second similarity difference on the first similarity difference to obtain a third similarity difference;
The black chain determination module 30 is further configured to determine that a black chain exists in the webpage to be detected when the third similarity difference is greater than the preset threshold.
In this embodiment, the correlation of keyword types may additionally be used to determine whether a black chain exists. The link content of an external link may carry keywords, such as "Legend private server" and "swimsuit sales". In that case, the keywords in the external link are obtained and compared with the keywords in a preset keyword type library, which pre-stores the mapping between keyword types and keywords. The comparison process is similar to that for the domain name type library and is not repeated here. Multiple first keyword types may be obtained; after the similarity difference between each first keyword type and the second keyword type is calculated, these similarity differences are superimposed to obtain the second similarity difference.
The keywords of the webpage to be detected can be obtained from the search engine sensitive tags in the webpage to be detected, i.e., the acquisition module 10 is further configured to:
Obtain the search engine sensitive tags in the webpage to be detected;
Obtain the keyword type corresponding to each search engine sensitive tag, and use the obtained keyword types as the second keyword type corresponding to the webpage to be detected.
The search engine sensitive tags may be, for example, the title and the keywords of the webpage, and can be obtained from the webpage link. After the keywords are extracted, they are compared with the keywords in the keyword type library to obtain the second keyword type.
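One possible way to read the title and keywords out of a page is sketched below with Python's standard `html.parser`. The class and function names are hypothetical, and treating the `<title>` element and the `<meta name="keywords">` element as the "search engine sensitive tags" is an assumption based on the examples given above.

```python
from html.parser import HTMLParser


class SensitiveTagParser(HTMLParser):
    """Collects the <title> text and <meta name="keywords"> entries."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attr.get("name") or "").lower() == "keywords":
            content = attr.get("content") or ""
            # Keywords are assumed to be comma-separated.
            self.keywords += [k.strip() for k in content.split(",") if k.strip()]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_sensitive_tags(html):
    """Return (title, keywords) extracted from an HTML string."""
    parser = SensitiveTagParser()
    parser.feed(html)
    return parser.title.strip(), parser.keywords
```

The extracted strings would then be looked up in the keyword type library to obtain the second keyword type.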
Further, referring to Fig. 3, a third embodiment is proposed on the basis of the first or second embodiment. In this embodiment, the device for detecting a black chain in a webpage further includes:
Acquisition module 10, further configured to obtain the first domain name in each URL embedded in the webpage to be detected and the second domain name in the URL of the webpage to be detected, and to obtain, among the first domain names, the third domain names that do not match the second domain name;
Processing module 50, configured to treat the URL corresponding to each third domain name as an external link, wherein the first domain name matches the second domain name when the first domain name is identical to the second domain name or when the first domain name is a subdomain of the second domain name.
When the first domain name in an embedded URL is a subdomain of the second domain name, the embedded URL is a sublink of the webpage to be detected.
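The matching rule above can be sketched with Python's standard `urllib.parse`. The function names are hypothetical, and URLs without a host name (such as relative links) are assumed to be internal and are skipped.

```python
from urllib.parse import urlparse


def domains_match(first_domain, second_domain):
    """A first domain matches if it equals, or is a subdomain of, the second."""
    first, second = first_domain.lower(), second_domain.lower()
    return first == second or first.endswith("." + second)


def find_external_links(embedded_urls, page_url):
    """Return the embedded URLs whose domain does not match the page's domain."""
    second_domain = urlparse(page_url).hostname or ""
    external = []
    for url in embedded_urls:
        first_domain = urlparse(url).hostname or ""
        # Relative URLs have no host name; this sketch treats them as internal.
        if first_domain and not domains_match(first_domain, second_domain):
            external.append(url)
    return external
```

Checking for a leading dot before the parent domain avoids treating a name such as `notexample.edu` as a subdomain of `example.edu`.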
It can be understood that the acquisition module 10 is further configured to determine the first domain name type corresponding to the URL of the external link according to the domain name category to which the domain name corresponding to the external link belongs, and to determine the second domain name type corresponding to the URL of the webpage to be detected according to the domain name category to which the second domain name belongs.
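As a rough illustration, the domain name type could be resolved by looking the host name up in a domain name type library. The library contents, the `"unknown"` fallback, and the choice to fall back from a subdomain to its parent domains are all assumptions of this sketch rather than details from the source.

```python
from urllib.parse import urlparse

# Hypothetical domain name type library: domain name -> domain name type.
DOMAIN_TYPE_LIBRARY = {
    "example.edu": "education",
    "casino.example.com": "gambling",
}


def domain_type(url):
    """Return the type of the URL's domain, or "unknown" if it is not listed."""
    host = urlparse(url).hostname or ""
    labels = host.split(".")
    # Try the full host name first, then each parent domain in turn.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in DOMAIN_TYPE_LIBRARY:
            return DOMAIN_TYPE_LIBRARY[candidate]
    return "unknown"
```

The first and second domain name types obtained this way would then feed the first similarity difference calculation.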
It should be noted that, in this document, the terms "comprise", "include", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of additional identical elements in the process, method, article, or device that includes that element.
The foregoing embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a cloud server, an air conditioner, a network device, or the like) to perform the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and do not limit the patent scope of the present invention. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the patent protection scope of the present invention.
Claims (10)
- 1. A method for detecting a black chain in a webpage, characterized in that the method comprises the following steps:
  obtaining a first domain name type corresponding to the URL of an external link in a webpage to be detected and a second domain name type corresponding to the URL of the webpage to be detected;
  obtaining a first similarity difference between the first domain name type and the second domain name type;
  when the first similarity difference is greater than a preset threshold, determining that a black chain exists in the webpage to be detected.
- 2. The method for detecting a black chain in a webpage according to claim 1, characterized in that, after the step of obtaining the first similarity difference between the first domain name type and the second domain name type, the method further comprises the steps of:
  determining whether text content exists in the external link of the webpage to be detected;
  when text content exists in the external link, obtaining a first keyword type corresponding to the keywords in the text content;
  obtaining a second similarity difference between the first keyword type and a second keyword type corresponding to the webpage to be detected, and superimposing the second similarity difference on the first similarity difference to obtain a third similarity difference;
  when the third similarity difference is greater than the preset threshold, determining that a black chain exists in the webpage to be detected;
  when no text content exists in the external link and the first similarity difference is greater than the preset threshold, performing the step of determining that a black chain exists in the webpage to be detected.
- 3. The method for detecting a black chain in a webpage according to claim 2, characterized in that, before the step of obtaining the second similarity difference between each first keyword type and the second keyword type corresponding to the URL of the webpage to be detected, the method further comprises the steps of:
  obtaining the search engine sensitive tags in the webpage to be detected;
  obtaining the keyword type corresponding to each search engine sensitive tag, and using the obtained keyword types as the second keyword type corresponding to the webpage to be detected.
- 4. The method for detecting a black chain in a webpage according to any one of claims 1-3, characterized in that, before the step of obtaining the first domain name type corresponding to the URL of the external link in the webpage to be detected and the second domain name type corresponding to the URL of the webpage to be detected, the method further comprises the steps of:
  obtaining the first domain name in each URL embedded in the webpage to be detected and the second domain name in the URL of the webpage to be detected;
  obtaining, among the first domain names, the third domain names that do not match the second domain name;
  treating the URL corresponding to each third domain name as an external link, wherein the first domain name matches the second domain name when the first domain name is identical to the second domain name or when the first domain name is a subdomain of the second domain name.
- 5. The method for detecting a black chain in a webpage according to claim 4, characterized in that the step of obtaining the first domain name type corresponding to the URL of the external link in the webpage to be detected and the second domain name type corresponding to the URL of the webpage to be detected comprises:
  determining the first domain name type corresponding to the URL of the external link according to the domain name category to which the domain name corresponding to the external link belongs, and determining the second domain name type corresponding to the URL of the webpage to be detected according to the domain name category to which the second domain name belongs.
- 6. A device for detecting a black chain in a webpage, characterized in that the device comprises:
  an acquisition module, configured to obtain a first domain name type corresponding to the URL of an external link in a webpage to be detected and a second domain name type corresponding to the URL of the webpage to be detected;
  a similarity difference calculation module, configured to obtain a first similarity difference between the first domain name type and the second domain name type;
  a black chain determination module, configured to determine that a black chain exists in the webpage to be detected when the first similarity difference is greater than a preset threshold.
- 7. The device for detecting a black chain in a webpage according to claim 6, characterized in that the device further comprises:
  a judgment module, configured to determine whether text content exists in the external link of the webpage to be detected;
  the black chain determination module is further configured to determine that a black chain exists in the webpage to be detected when no text content exists in the external link and the first similarity difference is greater than the preset threshold;
  the acquisition module is further configured to obtain, when text content exists in the external link, a first keyword type corresponding to the keywords in the text content;
  the similarity difference calculation module is further configured to obtain a second similarity difference between the first keyword type and a second keyword type corresponding to the webpage to be detected, and to superimpose the second similarity difference on the first similarity difference to obtain a third similarity difference;
  the black chain determination module is further configured to determine that a black chain exists in the webpage to be detected when the third similarity difference is greater than the preset threshold.
- 8. The device for detecting a black chain in a webpage according to claim 7, characterized in that the acquisition module is further configured to:
  obtain the search engine sensitive tags in the webpage to be detected;
  obtain the keyword type corresponding to each search engine sensitive tag, and use the obtained keyword types as the second keyword type corresponding to the webpage to be detected.
- 9. The device for detecting a black chain in a webpage according to any one of claims 6-8, characterized in that the device further comprises:
  the acquisition module is further configured to obtain the first domain name in each URL embedded in the webpage to be detected and the second domain name in the URL of the webpage to be detected, and to obtain, among the first domain names, the third domain names that do not match the second domain name;
  a processing module, configured to treat the URL corresponding to each third domain name as an external link, wherein the first domain name matches the second domain name when the first domain name is identical to the second domain name or when the first domain name is a subdomain of the second domain name.
- 10. The device for detecting a black chain in a webpage according to claim 9, characterized in that the acquisition module is further configured to determine the first domain name type corresponding to the URL of the external link according to the domain name category to which the domain name corresponding to the external link belongs, and to determine the second domain name type corresponding to the URL of the webpage to be detected according to the domain name category to which the second domain name belongs.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610319264.2A CN107370718B (en) | 2016-05-12 | 2016-05-12 | Method and device for detecting black chain in webpage |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610319264.2A CN107370718B (en) | 2016-05-12 | 2016-05-12 | Method and device for detecting black chain in webpage |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107370718A true CN107370718A (en) | 2017-11-21 |
CN107370718B CN107370718B (en) | 2020-12-18 |
Family
ID=60304395
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610319264.2A Active CN107370718B (en) | 2016-05-12 | 2016-05-12 | Method and device for detecting black chain in webpage |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107370718B (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107908764A (en) * | 2017-11-27 | 2018-04-13 | 杭州安恒信息技术有限公司 | A kind of exterior chain monitoring method of fixed issue content |
CN109067716A (en) * | 2018-07-18 | 2018-12-21 | 杭州安恒信息技术股份有限公司 | A kind of method and system identifying dark chain |
CN109522494A (en) * | 2018-11-08 | 2019-03-26 | 杭州安恒信息技术股份有限公司 | A kind of dark chain detection method, device, equipment and computer readable storage medium |
CN109561078A (en) * | 2018-11-09 | 2019-04-02 | 深圳万物云联科技有限公司 | A kind of exterior chain url resource transfer method and device |
CN109784038A (en) * | 2018-12-29 | 2019-05-21 | 北京奇安信科技有限公司 | Detecting black chain method, apparatus, system and computer readable storage medium |
CN110532784A (en) * | 2019-09-04 | 2019-12-03 | 杭州安恒信息技术股份有限公司 | A kind of dark chain detection method, device, equipment and computer readable storage medium |
CN111654472A (en) * | 2020-05-14 | 2020-09-11 | 亚信科技(成都)有限公司 | Domain name detection method and device |
WO2020211130A1 (en) * | 2019-04-16 | 2020-10-22 | 网宿科技股份有限公司 | Hidden link detection method and apparatus for website |
CN112532624A (en) * | 2020-11-27 | 2021-03-19 | 深信服科技股份有限公司 | Black chain detection method and device, electronic equipment and readable storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101295320A (en) * | 2008-06-30 | 2008-10-29 | 腾讯科技(深圳)有限公司 | Method and system for judging anchor text noise level |
CN102073730A (en) * | 2011-01-14 | 2011-05-25 | 哈尔滨工程大学 | Method for constructing topic web crawler system |
CN102236654A (en) * | 2010-04-26 | 2011-11-09 | 广东开普互联信息科技有限公司 | Web useless link filtering method based on content relevancy |
CN103516693A (en) * | 2012-06-28 | 2014-01-15 | 中国电信股份有限公司 | Method and device for identifying phishing website |
CN104077353A (en) * | 2011-12-30 | 2014-10-01 | 北京奇虎科技有限公司 | Method and device for detecting hacking links |
Non-Patent Citations (1)
Title |
---|
苏秀芝: "网页去噪与特征提取算法的研究及实现", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107908764B (en) * | 2017-11-27 | 2021-06-22 | 杭州安恒信息技术股份有限公司 | External link monitoring method for fixed release content |
CN107908764A (en) * | 2017-11-27 | 2018-04-13 | 杭州安恒信息技术有限公司 | A kind of exterior chain monitoring method of fixed issue content |
CN109067716A (en) * | 2018-07-18 | 2018-12-21 | 杭州安恒信息技术股份有限公司 | A kind of method and system identifying dark chain |
CN109522494A (en) * | 2018-11-08 | 2019-03-26 | 杭州安恒信息技术股份有限公司 | A kind of dark chain detection method, device, equipment and computer readable storage medium |
CN109522494B (en) * | 2018-11-08 | 2020-09-15 | 杭州安恒信息技术股份有限公司 | Dark chain detection method, device, equipment and computer readable storage medium |
CN109561078A (en) * | 2018-11-09 | 2019-04-02 | 深圳万物云联科技有限公司 | A kind of exterior chain url resource transfer method and device |
CN109784038A (en) * | 2018-12-29 | 2019-05-21 | 北京奇安信科技有限公司 | Detecting black chain method, apparatus, system and computer readable storage medium |
WO2020211130A1 (en) * | 2019-04-16 | 2020-10-22 | 网宿科技股份有限公司 | Hidden link detection method and apparatus for website |
CN110532784A (en) * | 2019-09-04 | 2019-12-03 | 杭州安恒信息技术股份有限公司 | A kind of dark chain detection method, device, equipment and computer readable storage medium |
CN111654472A (en) * | 2020-05-14 | 2020-09-11 | 亚信科技(成都)有限公司 | Domain name detection method and device |
CN111654472B (en) * | 2020-05-14 | 2022-05-24 | 亚信科技(成都)有限公司 | Domain name detection method and device |
CN112532624A (en) * | 2020-11-27 | 2021-03-19 | 深信服科技股份有限公司 | Black chain detection method and device, electronic equipment and readable storage medium |
CN112532624B (en) * | 2020-11-27 | 2023-09-05 | 深信服科技股份有限公司 | Black chain detection method and device, electronic equipment and readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN107370718B (en) | 2020-12-18 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information | Address after: Nanshan District Xueyuan Road in Shenzhen city of Guangdong province 518052 No. 1001 Nanshan Chi Park building A1 layer; Applicant after: SANGFOR TECHNOLOGIES Inc.; Address before: Nanshan District Xueyuan Road in Shenzhen city of Guangdong province 518052 No. 1001 Nanshan Chi Park building A1 layer; Applicant before: Sangfor Technologies Co.,Ltd. |
GR01 | Patent grant | ||
GR01 | Patent grant |