CN109525600A - An anti-web-crawler method based on encrypting paging parameters - Google Patents
An anti-web-crawler method based on encrypting paging parameters
- Publication number
- CN109525600A (application number CN201811617924.0A)
- Authority
- CN
- China
- Prior art keywords
- client
- parameter
- ciphertext
- encrypted
- paging
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/04—Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
- H04L63/0428—Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1441—Countermeasures against malicious traffic
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/02—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The present invention relates to an anti-web-crawler method based on encrypting paging parameters. The steps of the method are as follows: S1. the client and the server agree on an encryption protocol; S2. the client maps the paging parameter according to the internally agreed encryption protocol; S3. when the client sends a paging request, the parameter it carries is the encrypted ciphertext; S4. after receiving the ciphertext parameter sent by the client, the server decrypts the ciphertext according to the encryption protocol and returns the response to the client, which renders the page. Implementing the invention makes site resources harder to crawl maliciously, prevents crawler tools from consuming large amounts of site resources, and reduces the likelihood that ordinary users are misidentified as crawlers.
Description
Technical field
The present invention relates to an anti-web-crawler method, and more specifically to an anti-web-crawler method based on encrypting paging parameters.
Background art
With the rapid development of the Internet, more and more companies publish important information or display valuable content online. This content is exposed to the risk of being illegally crawled and downloaded in bulk, which harms the company to some degree and can even affect the normal operation of its website.
One existing countermeasure is for the server to collect access statistics and analyze the accesses of each individual IP address. For example, if an IP address accesses the site frequently and with a regular pattern over a short period, the server can take action against that IP address.
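The per-IP access statistics described above could look roughly like the following sliding-window counter. This is a minimal sketch, not something the patent specifies: the class name `IpRateMonitor`, the threshold, and the window length are all assumptions chosen for illustration.

```python
from collections import defaultdict, deque


class IpRateMonitor:
    """Flags an IP address that sends too many requests within a sliding window.

    Hypothetical helper for illustration; the thresholds are assumptions.
    """

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def record(self, ip: str, now: float) -> bool:
        """Record one request and return True if the IP exceeds the limit."""
        hits = self._hits[ip]
        hits.append(now)
        # Discard timestamps that have fallen out of the window.
        while hits and now - hits[0] > self.window_seconds:
            hits.popleft()
        return len(hits) > self.max_requests
```

A caller would invoke `record(ip, time.time())` on every request and throttle or block the IP address when it returns `True`. A crawler that rotates IP addresses stays under the threshold on each one, which is the weakness the background section goes on to describe.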
The server can also verify the userAgent carried in a client request. If the request carries no userAgent, or the userAgent it carries is abnormal, the server can take action against the request; only requests whose userAgent falls within the normal range are served normally. However, a crawler can use technical means to simulate a session and data such as the userAgent in order to access and crawl the site. It can also be deployed across many IP addresses and keep switching between them, so that the back end cannot tell whether a request comes from a crawler or an ordinary user.
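A userAgent check of the kind described above might be sketched as follows. The marker list and the function name `is_user_agent_acceptable` are hypothetical; the patent does not say what counts as a "normal" userAgent.

```python
from typing import Optional

# Hypothetical markers of automated clients; a real site would keep its own list.
SUSPICIOUS_MARKERS = ("python-requests", "curl", "scrapy", "wget")


def is_user_agent_acceptable(user_agent: Optional[str]) -> bool:
    """Return False for a missing, empty, or obviously automated User-Agent."""
    if not user_agent or not user_agent.strip():
        return False
    lowered = user_agent.lower()
    return not any(marker in lowered for marker in SUSPICIOUS_MARKERS)
```

Because the header is entirely under the client's control, a crawler only needs to send a browser-like string to pass this check, which is why the text treats it as insufficient on its own.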
Another approach maps text or numbers to icons or images, so that the text is shown on the page as pictures. Because current crawlers convert a page into character strings and extract content from those strings, they cannot recognize images directly, which makes crawling harder. However, every additional image adds another request to the back end. Too many image requests greatly increase page load time, sacrifice site performance, and degrade the user experience, and the functionality of this approach is rather limited.
On the client, when a large amount of data has to be displayed, it is usually paginated so that the first request does not fetch a large amount of data. This shortens waiting time and improves the user experience. However, the regular, incrementing pattern of the paging parameter is a great convenience for crawlers: by simply iterating over the paging parameter in a loop, a crawler can fetch all the data in the current category of the site in one pass, which can affect the normal operation of the site.
Summary of the invention
The technical problem to be solved by the present invention is, in view of the above defects in the prior art, to provide an anti-web-crawler method based on encrypting paging parameters that prevents website data from being illegally crawled in bulk without affecting site performance.
The technical solution adopted by the present invention to solve this technical problem is an anti-web-crawler method based on encrypting paging parameters, in which the client and the server agree on an encryption protocol, preventing crawler tools from consuming large amounts of site resources.
In the anti-web-crawler method based on encrypting paging parameters according to the present invention, the steps of the method are as follows:
S1. The client and the server agree on an encryption protocol;
S2. The client maps the paging parameter according to the internally agreed encryption protocol;
S3. When the client sends a paging request, the parameter it carries is the encrypted ciphertext;
S4. After receiving the ciphertext parameter sent by the client, the server decrypts the ciphertext according to the encryption protocol and returns the response to the client, which renders the page.
Implementing the anti-web-crawler method based on encrypting paging parameters of the present invention has the following beneficial effects. Compared with a conventional paging parameter, which is a number that increases in a regular pattern, paging operations implemented with the ciphertext cannot be recognized by bulk crawlers and used to extract and crawl site resources in bulk. Implementing the invention makes site resources harder to crawl maliciously, prevents crawler tools from consuming large amounts of site resources, and reduces the likelihood that ordinary users are misidentified as crawlers.
Brief description of the drawings
The present invention is further described below with reference to the accompanying drawings and embodiments. In the drawings:
Fig. 1 is a flowchart of the anti-web-crawler method based on encrypting paging parameters according to the present invention.
Detailed description of the embodiments
In order to make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here merely explain the invention and are not intended to limit it.
As shown in Fig. 1, the steps of the anti-web-crawler method based on encrypting paging parameters are as follows:
S1. The client and the server agree on an encryption protocol;
S2. The client maps the paging parameter according to the internally agreed encryption protocol;
S3. When the client sends a paging request, the parameter it carries is the encrypted ciphertext;
S4. After receiving the ciphertext parameter sent by the client, the server decrypts the ciphertext according to the encryption protocol and returns the response to the client, which renders the page.
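Steps S1 to S4 can be sketched end to end as follows. The patent does not name a cipher or a token format, so everything concrete here is an assumption: a pre-shared key stands in for the "agreed protocol", the page number is masked with an HMAC-derived keystream and protected by a truncated HMAC tag, and the names `encrypt_page`, `decrypt_page`, and `handle_paging_request` are hypothetical. Client and server are simulated in one process.

```python
import base64
import hashlib
import hmac
import secrets

NONCE_LEN = 8   # random bytes per token (assumption)
TAG_LEN = 16    # truncated HMAC-SHA256 tag length (assumption)


def _subkey(key: bytes, label: bytes) -> bytes:
    """Derive separate encryption and MAC keys from the shared key (S1)."""
    return hmac.new(key, label, hashlib.sha256).digest()


def _keystream(key: bytes, nonce: bytes) -> bytes:
    return hmac.new(_subkey(key, b"enc"), nonce, hashlib.sha256).digest()[:4]


def _tag(key: bytes, data: bytes) -> bytes:
    return hmac.new(_subkey(key, b"mac"), data, hashlib.sha256).digest()[:TAG_LEN]


def encrypt_page(key: bytes, page: int) -> str:
    """Client side (S2, S3): turn a page number into an opaque token."""
    nonce = secrets.token_bytes(NONCE_LEN)
    plain = page.to_bytes(4, "big")  # raises OverflowError outside 0..2**32-1
    cipher = bytes(p ^ s for p, s in zip(plain, _keystream(key, nonce)))
    token = nonce + cipher + _tag(key, nonce + cipher)
    return base64.urlsafe_b64encode(token).decode("ascii")


def decrypt_page(key: bytes, token: str) -> int:
    """Server side (S4): verify the token and recover the page number."""
    raw = base64.urlsafe_b64decode(token.encode("ascii"))  # ValueError if malformed
    if len(raw) != NONCE_LEN + 4 + TAG_LEN:
        raise ValueError("malformed paging token")
    nonce, cipher = raw[:NONCE_LEN], raw[NONCE_LEN:NONCE_LEN + 4]
    if not hmac.compare_digest(raw[NONCE_LEN + 4:], _tag(key, nonce + cipher)):
        raise ValueError("invalid paging token")
    plain = bytes(c ^ s for c, s in zip(cipher, _keystream(key, nonce)))
    return int.from_bytes(plain, "big")


def handle_paging_request(key: bytes, token: str, items: list, page_size: int = 10) -> list:
    """Server side (S4): decrypt, then page through the data conventionally."""
    page = decrypt_page(key, token)
    if page < 1:
        raise ValueError("page numbers start at 1")
    start = (page - 1) * page_size
    return items[start:start + page_size]
```

With a 32-byte shared key, `handle_paging_request(key, encrypt_page(key, 3), items)` returns the third slice of `items`, while a plain `"3"` or a modified token raises `ValueError`. In a browser client the key material would have to ship with the page's scripts, so a scheme like this raises the cost of bulk crawling rather than making it impossible, which matches the patent's stated goal of increasing difficulty.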
Further, the ciphertext is a non-incrementing value generated by the internally agreed encryption: characters or numbers with no evident pattern.
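To illustrate what a "non-incrementing" paging value might look like, the sketch below maps consecutive page numbers to scattered 32-bit integers with a small keyed Feistel network and maps them back again. The construction, the round count, and the names `scramble` and `unscramble` are assumptions for illustration; the patent leaves the actual mapping unspecified.

```python
import hashlib
import hmac

ROUNDS = 4        # number of Feistel rounds (assumption)
MASK16 = 0xFFFF   # each half of the 32-bit value is 16 bits wide


def _round(key: bytes, round_no: int, half: int) -> int:
    """Keyed round function: HMAC-SHA256 truncated to 16 bits."""
    msg = bytes([round_no]) + half.to_bytes(2, "big")
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest()[:2], "big")


def scramble(key: bytes, page: int) -> int:
    """Map a page number to a 32-bit value with no evident pattern."""
    if not 0 <= page < 2 ** 32:
        raise ValueError("page must fit in 32 bits")
    left, right = (page >> 16) & MASK16, page & MASK16
    for r in range(ROUNDS):
        left, right = right, left ^ _round(key, r, right)
    return (left << 16) | right


def unscramble(key: bytes, value: int) -> int:
    """Invert scramble() by running the rounds in reverse order."""
    if not 0 <= value < 2 ** 32:
        raise ValueError("value must fit in 32 bits")
    left, right = (value >> 16) & MASK16, value & MASK16
    for r in reversed(range(ROUNDS)):
        left, right = right ^ _round(key, r, left), left
    return (left << 16) | right
```

Because a Feistel network is a permutation, every page number maps to exactly one value and back, so the server can recover the page without storing a lookup table. Unlike a randomized token, this mapping is deterministic: the same page always yields the same value under a given key.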
Further, after receiving the ciphertext, the server first decrypts it according to the protocol, so that by the time the database is queried the paging parameter is already a conventional one. The data returned to the front end is likewise no different from conventional paging. The method therefore has no effect on the user experience but is a major obstacle for bulk crawlers, and ordinary users will not be misidentified as crawlers.
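Once the page number has been recovered, the database query is ordinary pagination. The sketch below shows this with an in-memory SQLite database; the `articles` table, its columns, and the helper names are hypothetical and exist only for the example.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with 25 hypothetical articles."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO articles (id, title) VALUES (?, ?)",
        [(i, f"Article {i}") for i in range(1, 26)],
    )
    return conn


def fetch_page(conn: sqlite3.Connection, page: int, page_size: int) -> list:
    """Query one page using a conventional, already decrypted page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    offset = (page - 1) * page_size
    return conn.execute(
        "SELECT id, title FROM articles ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()
```

Nothing in the query layer knows about the ciphertext, so existing pagination code and the response format stay unchanged; only the request parameter differs.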
Although the present invention has been disclosed through the above embodiments, its scope of protection is not limited to them. Modifications, substitutions, and the like made to the above components without departing from the inventive concept fall within the scope of the claims of the present invention.
Claims (2)
1. An anti-web-crawler method based on encrypting paging parameters, characterized in that the steps of the method are as follows:
S1. The client and the server agree on an encryption protocol;
S2. The client maps the paging parameter according to the internally agreed encryption protocol;
S3. When the client sends a paging request, the paging parameter it carries is the encrypted ciphertext;
S4. After receiving the ciphertext parameter sent by the client, the server decrypts the ciphertext according to the encryption protocol and returns the response to the client, which renders the page.
2. The anti-web-crawler method based on encrypting paging parameters according to claim 1, characterized in that the ciphertext is a non-incrementing value generated by the internally agreed encryption, consisting of characters or numbers with no evident pattern.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811617924.0A CN109525600A (en) | 2018-12-28 | 2018-12-28 | An anti-web-crawler method based on encrypting paging parameters |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109525600A (en) | 2019-03-26 |
Family
ID=65798414
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811617924.0A Pending CN109525600A (en) | An anti-web-crawler method based on encrypting paging parameters | 2018-12-28 | 2018-12-28 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109525600A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112653695A (en) * | 2020-12-21 | 2021-04-13 | 浪潮卓数大数据产业发展有限公司 | Method and system for realizing crawler resistance |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7472413B1 (en) * | 2003-08-11 | 2008-12-30 | F5 Networks, Inc. | Security for WAP servers |
US20100306249A1 (en) * | 2009-05-27 | 2010-12-02 | James Hill | Social network systems and methods |
CN105786415A (en) * | 2016-01-07 | 2016-07-20 | 浪潮通用软件有限公司 | File printing encryption method and device |
CN106547778A (en) * | 2015-09-21 | 2017-03-29 | 北京国双科技有限公司 | Web page crawling method and device |
CN106960158A (en) * | 2017-03-22 | 2017-07-18 | 福建中金在线信息科技有限公司 | Method and apparatus for preventing blogs from being retrieved by web crawlers |
CN107770171A (en) * | 2017-10-18 | 2018-03-06 | 厦门集微科技有限公司 | Verification method and system for server anti-crawler protection |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20210240825A1 (en) | Multi-representational learning models for static analysis of source code | |
US9418218B2 (en) | Dynamic rendering of a document object model | |
US11741185B1 (en) | Managing content uploads | |
JP2020030866A (en) | Sensitive information processing method, device and server, and security determination system | |
US9438625B1 (en) | Mitigating scripted attacks using dynamic polymorphism | |
CN103929440B (en) | Webpage tamper resistant device and its method based on web server cache match | |
US10848505B2 (en) | Cyberattack behavior detection method and apparatus | |
CN107251528B (en) | Method and apparatus for providing data originating within a service provider network | |
Miculan et al. | Formal analysis of Facebook Connect single sign-on authentication protocol | |
US20210240826A1 (en) | Building multi-representational learning models for static analysis of source code | |
KR20050058296A (en) | Method and system for monitoring user interaction with a computer | |
US20080275843A1 (en) | Identifying an application user as a source of database activity | |
CN105959313A (en) | Method and device for preventing HTTP proxy attack | |
EP2642718A2 (en) | Dynamic rendering of a document object model | |
US20210383023A1 (en) | System and method for dynamic management of private data | |
CN110232146A (en) | Data capture method and capture device | |
CN105357027A (en) | Lightweight data service bus system based on big data | |
CN106713318B (en) | WEB site safety protection method and system | |
CN111245838A (en) | Method for protecting key information by anti-crawler | |
CN110704816A (en) | Interface cracking recognition method, device, equipment and storage medium | |
CN112653671A (en) | Network communication method, device, equipment and medium for client and server | |
WO2021078062A1 (en) | Ssl certificate verification method, apparatus and device, and computer storage medium | |
US20220263871A1 (en) | Executing code injected into an intercepted application response message to eliminate accumulation of stale computing sessions | |
CN116324766A (en) | Optimizing crawling requests by browsing profiles | |
CN107196898B (en) | Account login method, page display method, client and server |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20190326 |