CN103870329B - Distributed crawler task scheduling method based on weighted round-robin algorithm - Google Patents
- Publication number
- CN103870329B (application CN201410073829.4A)
- Authority
- CN
- China
- Prior art keywords
- node
- crawler
- URL
- master node
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
A distributed crawler task scheduling method based on a weighted round-robin algorithm at least includes the following steps: (1) according to different scales, network crawlers are divided into five types of crawlers, i.e. a stand-alone multi-thread crawler, a homogeneous centralized crawler, a heterogeneous centralized crawler, a small-scale distributed crawler and a large-scale distributed crawler; (2) a master-slave architecture is deployed; (3) when a crawler node is connected to a master node for the first time, the master node gives an initial weight to the crawler node; (4) according to the scheduling algorithm based on weighted round-robin, the master node continuously chooses a crawler node and assigns a URL (Uniform Resource Locator) task to be crawled to the crawler node; (5) each time when a URL task is crawled by a crawler node, a result is returned to the master node, and the weight of the crawler node is updated by the master node, and the like. The distributed crawler scheduling policy based on the weighted round-robin algorithm, which is put forward by the invention, is designed for small-scale distributed crawlers and can ensure the load balance of each crawler node and ensure that crawler nodes have flexible scalability and fault tolerance.
Description
Technical field
The present invention relates to the technical field of web search.
Background
A search engine can be divided into several parts, such as the crawler, the index, the searcher and the user interface. The crawler is responsible for continuously finding and collecting information on the Internet, and it plays an important role in a search engine. With the rapid development of the network, the amount of information has grown explosively, and the crawling capacity of traditional stand-alone crawlers and centralized crawlers can no longer keep up with the growth rate of information on the Internet. Now that distributed concepts are mentioned more and more often, distributed crawlers have naturally become the approach to the large-data-volume problem. A distributed crawler consists of multiple nodes deployed across a wide area network that crawl in parallel, which meets the demand for crawling capacity. Because the crawling capacity of each node differs, a good scheduling strategy is indispensable. Crawlers of different scales call for different scheduling algorithms; the mainstream ones are:
(1) Hash scheduling
A hash function is a mapping that converts an original string, number or other piece of information into an index value. Most early crawler systems in fact used this approach: the URL is the input of the hash function, and the resulting hash value is the output of the scheduler. Such a scheduling strategy is easy to compute and has very little overhead; at the same time, the mathematical randomness of the hash function ensures that tasks are distributed uniformly among the crawler nodes.
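Hash scheduling as described above can be sketched in a few lines. This is a minimal illustration, not a method taken from the source: the function name `hash_schedule`, the choice of MD5 as the hash function, and the modulo mapping onto a fixed node count are all assumptions.

```python
import hashlib


def hash_schedule(url: str, node_count: int) -> int:
    """Map a URL to a crawler node index in the range [0, node_count)."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    # Use a stable digest so the same URL maps to the same node on every run
    # (Python's built-in hash() is randomized per process for strings).
    digest = hashlib.md5(url.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer and reduce modulo the node count.
    return int.from_bytes(digest[:8], "big") % node_count
```

The mapping depends only on the URL and the node count, so it needs no shared state, but it also cannot react to differences in node capacity.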
(2) Centralized load scheduling
Take the Peking University Tianwang crawler, after its large-scale improvement, as an example. It follows a centralized-control model: its overall framework is one master node working together with several crawler nodes. Its task scheduling works as follows: the master node is responsible for distributing URLs, and the crawler nodes are responsible for crawling them. Each website is handled by one crawler program, and all URLs on that website are crawled by that program. A crawler node can host multiple crawler programs, but each crawler program must run on exactly one crawler node. The master node starts distributing from the seed URLs. For each URL whose website has not yet been assigned a crawler program, the master node finds a crawler node according to some load-balancing principle, sends the URL to it, and asks it to start a new crawler program. From then on, all URLs of that website are distributed to that crawler node and crawled by that crawler program.
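The site-to-node binding described above can be sketched as follows. The source does not say which load-balancing principle is used to place a new website, so this sketch assumes "the node currently hosting the fewest websites"; the class name `SiteAffinityScheduler` and its methods are hypothetical.

```python
from typing import Dict, List
from urllib.parse import urlsplit


class SiteAffinityScheduler:
    """Bind each website to one crawler node; later URLs of that site follow it."""

    def __init__(self, node_ids: List[str]) -> None:
        self.site_to_node: Dict[str, str] = {}
        # Number of websites currently bound to each node (assumed load measure).
        self.sites_per_node: Dict[str, int] = {node: 0 for node in node_ids}

    def assign(self, url: str) -> str:
        site = urlsplit(url).netloc.lower()
        node = self.site_to_node.get(site)
        if node is None:
            # New website: pick the node hosting the fewest sites (ties go to the
            # node listed first) and bind the site to it.
            node = min(self.sites_per_node, key=self.sites_per_node.get)
            self.site_to_node[site] = node
            self.sites_per_node[node] += 1
        return node
```

With nodes `"a"` and `"b"`, the first URL of one site goes to `"a"`, the first URL of a second site goes to `"b"`, and every further URL of the first site goes to `"a"` again.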
(3) Scheduling by network location
In large-scale search engines the crawler nodes are deployed all over the world, so computing network location is very important. The basic idea of the scheduling strategy in such crawlers is to use an algorithm such as GNP: by measuring the network distance between a small, predetermined set of websites and the crawler nodes, the network distances between the large number of other nodes are estimated. The predicted network distance is then used to compute the time each crawler node needs to crawl the web page of a URL, and the crawler node with the smallest time overhead is set as the scheduling target for that URL. Such a scheme schedules crawler tasks effectively according to network distance and reduces the time overhead of large-scale network measurement.
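The final selection step of location-based scheduling can be illustrated as below. This sketch assumes that every website and crawler node has already been given coordinates in some metric space (the embedding step itself is not shown), that Euclidean distance between coordinates stands in for predicted network distance, and that crawl time grows with that distance. The function name `pick_nearest_node` is hypothetical.

```python
import math
from typing import Dict, Sequence


def pick_nearest_node(site_coord: Sequence[float],
                      node_coords: Dict[str, Sequence[float]]) -> str:
    """Return the id of the crawler node with the smallest predicted distance to the site."""
    if not node_coords:
        raise ValueError("no crawler nodes available")
    # math.dist computes the Euclidean distance between two points (Python 3.8+).
    return min(node_coords, key=lambda node: math.dist(node_coords[node], site_coord))
```

Only the distances from one site to the nodes are evaluated per URL; no live measurement is needed at scheduling time.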
Summary of the invention
The distributed crawler scheduling strategy based on a weighted round-robin algorithm proposed by the present invention is designed for small-scale distributed crawlers. Because its idea is close to that of the centralized load scheduling strategy, it is also suitable for heterogeneous centralized crawlers. It can balance the load across crawler nodes and gives the crawler nodes flexible scalability and fault tolerance.
The technical scheme of the method of the invention is characterized as follows:
A distributed crawler task scheduling method based on a weighted round-robin algorithm, characterized in that it is carried out in the following steps in order:
1) According to scale, the present invention divides web crawlers into five classes: stand-alone multi-threaded, homogeneous centralized, heterogeneous centralized, small-scale distributed and large-scale distributed. This crawler task scheduling method is intended for small-scale distributed crawlers. A small-scale distributed crawler is one whose nodes are deployed in a distributed manner but still within a small physical region, so the nodes differ little in their network latency to the Internet. However, transmission between the nodes does not necessarily take place in a LAN environment, so transmission may be unreliable and transmission delay must also be taken into account.
2) Master-slave deployment: one master node and several crawler nodes that are deployed in a distributed manner and can communicate with the master node, with every crawler node guaranteed to be able to connect to the Internet. The master node is responsible for scheduling crawler tasks, that is, deciding which crawler node each URL to be crawled should be assigned to, and for deduplication, that is, deduplicating the outlinks returned by a crawler node for a URL to obtain the new URLs to be crawled. The crawler nodes are responsible for the actual crawling: for each URL assigned by the master node, a crawler node fetches the complete HTML from the Internet, parses the outlinks contained in the page, and returns this information to the master node.
3) When a crawler node connects to the master node for the first time, the master node gives it an empirical value as its initial weight.
4) The master node, following the weighted round-robin scheduling algorithm proposed by the present invention, repeatedly selects a crawler node and assigns it one URL task to be crawled. The body of this scheduling algorithm is the traditional weighted round-robin algorithm: a current scheduling weight is maintained, and whenever it has been reduced to a non-positive number it is reinitialized to the maximum of the current weights of all nodes. Each node is then examined in turn to see whether its weight is not less than the current scheduling weight; if so, it is scheduled. After all nodes have been examined, the current scheduling weight is decreased by one step, and the nodes are examined in turn again, and so on. In the traditional weighted round-robin algorithm the step is the greatest common divisor of all weights, which can be regarded as 1 when there are many weights. In the scheduling algorithm proposed by the present invention, the step is instead set to 4, based on the weight calculation method defined in this method and on a large number of experiments.
5) When a crawler node has finished crawling a URL task, it returns the result to the master node, and the master node updates the weight of that crawler node using the weight calculation method proposed by the present invention, which is based on recent task completion times and the number of uncompleted tasks.
6) When the weight of a crawler node drops to zero as its number of tasks increases, the master node no longer assigns tasks to it. When its weight returns to a positive number, it can receive tasks again.
7) In this way the master node keeps distributing URLs to the crawler nodes, the crawler nodes keep crawling the URLs and returning their HTML and outlinks to the master node, and the master node deduplicates the outlinks and distributes them again. Given the actual conditions of the Internet, the whole system will run indefinitely, continuously crawling new web pages, until it is stopped manually according to the actual situation.
8) There is a fault recovery mechanism: the master node can detect abnormal conditions of a crawler node and set its weight to zero.
9) There is good scalability: new nodes can join the system at any time, and old nodes can be removed from the system at any time.
According to scale, the present invention divides web crawlers into five classes:
(1) Stand-alone multi-threaded crawler
The stand-alone multi-threaded crawler is the most traditional form of crawler. Its load balancing consists of distributing tasks to the threads as evenly as possible. All kinds of hash algorithms are suitable scheduling algorithms.
(2) Homogeneous centralized crawler
The homogeneous centralized crawler is similar to the stand-alone multi-threaded crawler: each node is equivalent to a thread in the stand-alone multi-threaded case, only the scale is somewhat larger and the capacity somewhat stronger. Therefore, all kinds of hash algorithms remain suitable scheduling algorithms for this class of crawler.
(3) Heterogeneous centralized crawler
The heterogeneous centralized crawler differs from the previous two classes in that indicators such as performance vary from node to node, so the crawling capacity of the nodes differs. A node with strong capacity should be assigned more tasks, and a node with weak capacity should be assigned fewer. Centralized load scheduling can schedule this class of crawler well.
(4) Small-scale distributed crawler
A small-scale distributed crawler is one whose nodes are deployed in a distributed manner but still within a small region, so the nodes differ little in their network latency to the Internet. It is similar to the heterogeneous centralized crawler, but transmission between the nodes does not necessarily take place in a LAN, so transmission must be regarded as unreliable and transmission delay must also be considered. There is currently no good scheduling algorithm aimed specifically at this class of crawler. Centralized load scheduling can schedule it to a certain extent, but a good scheduling strategy should make some changes on top of centralized load scheduling to fit this class of crawler better.
(5) Large-scale distributed crawler
The large-scale distributed crawler is the form adopted by today's large commercial search engines. Its nodes are distributed all over the world and differ greatly in network latency, so the strategy of scheduling by network location is tailor-made for this class of crawler.
The distributed crawler scheduling strategy based on a weighted round-robin algorithm proposed by the present invention is designed for the small-scale distributed crawler in the above classification.
The present invention designs a weight formula based on the current crawling efficiency of each crawler node, whose main function is to ensure the load balance of the system. The distributed crawler task scheduling algorithm based on weighted round-robin then uses this weight formula to carry out the actual scheduling of URL tasks. In addition, the fault recovery mechanism designed by the present invention ensures the stability of the system.
Brief description of the drawings
Fig. 1 is the scheduling flow chart.
Fig. 2 is the flow chart of the algorithm based on weighted round-robin.
Specific embodiment
The present invention adopts a master-slave crawler architecture. The master node contains a node table, three URL queues, a scheduling module and a crawler feedback module. The node table records the information of each crawler node, including node number, weight and so on. It must be updated dynamically to stay consistent with the actual state of the crawler nodes. The update can be triggered each time a crawler node reports the completion of a URL task, or once every fixed interval, as the situation requires. The scheduling module first takes a URL from the to-be-crawled URL queue, then reads the information of each node from the node table, selects one crawler node, assigns the URL to it, and stores the URL in the assigned URL queue. When a crawler node finishes crawling a URL, the crawler feedback module looks the URL up in the assigned URL queue; if it is there, the URL is removed and stored in the crawled URL queue. Finally, the outlinks crawled from this URL are sent to a deduplication module, whose output is placed in the to-be-crawled URL queue; only the scheduling process is considered here, and this step is ignored. The scheduling flow chart is shown in Fig. 1.
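The interplay of the three URL queues, the scheduling module and the feedback module can be sketched as follows. This is an illustrative sketch under stated assumptions: the class name `MasterQueues`, the `pick_node` callback and the use of a single `seen` set for deduplication are not from the source, which leaves deduplication to a separate module.

```python
from collections import deque
from typing import Callable, Deque, Dict, Iterable, Optional, Set, Tuple


class MasterQueues:
    """To-be-crawled, assigned and crawled URL collections kept on the master node."""

    def __init__(self, seeds: Iterable[str]) -> None:
        self.pending: Deque[str] = deque()   # to-be-crawled URL queue
        self.assigned: Dict[str, int] = {}   # assigned URL queue: url -> node id
        self.crawled: Set[str] = set()       # crawled URL queue
        self.seen: Set[str] = set()          # every URL ever queued (assumed dedup state)
        for url in seeds:
            self._enqueue(url)

    def _enqueue(self, url: str) -> None:
        if url not in self.seen:
            self.seen.add(url)
            self.pending.append(url)

    def dispatch(self, pick_node: Callable[[], Optional[int]]) -> Optional[Tuple[str, int]]:
        """Scheduling module: move one URL from pending to assigned."""
        if not self.pending:
            return None
        node = pick_node()
        if node is None:
            return None  # no node can take work right now; the URL stays pending
        url = self.pending.popleft()
        self.assigned[url] = node
        return url, node

    def feedback(self, url: str, outlinks: Iterable[str] = ()) -> bool:
        """Feedback module: move a finished URL to crawled and queue its new outlinks."""
        if url not in self.assigned:
            return False
        del self.assigned[url]
        self.crawled.add(url)
        for link in outlinks:
            self._enqueue(link)
        return True
```

`pick_node` stands for whatever node selection policy the master uses; keeping it as a callback keeps the queue bookkeeping independent of the scheduling algorithm.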
In general, the load-balancing factors that can be considered include CPU performance, CPU utilization, memory usage and transmission delay, but ultimately they all show up in time. We therefore use time as the measure of load balancing, that is, as the factor that determines the weight. We judge the likely future state of a crawler node from how it has run so far, and determine its weight accordingly.
Specifically, for a crawler node, assume that the number of tasks it has completed is n and that the total time spent is t milliseconds (the time here runs from the moment the master node dispatches a task to the node until the node reports back; the weight is computed on the master node rather than on the crawler node precisely so that transmission delay is included). The average time $\bar{t}$ this crawler node needs to complete one task is then:
$$\bar{t} = \frac{t}{n} \qquad (1)$$
Assume that the number of tasks assigned to this crawler node but not yet completed is m. The time T this crawler node needs to complete the remaining tasks is then $m\bar{t}$, namely:
$$T = m\,\bar{t} = \frac{m\,t}{n} \qquad (2)$$
The larger T is, the more time the node needs to complete its remaining tasks, so the master node should assign it fewer tasks, that is, the weight w should be smaller. Taking the reciprocal of T therefore gives:
$$w = \frac{1}{T} \qquad (3)$$
Substituting (2) into (3) gives:
$$w = \frac{n}{m\,t} \qquad (4)$$
As crawling proceeds, n and t both keep growing, but when the node is idle m is zero. To keep the denominator from being zero, m is replaced by m+1 in (4), which gives:
$$w = \frac{n}{(m+1)\,t} \qquad (5)$$
Note that t and n record the totals accumulated since the first task, so as t and n keep growing, $\bar{t}$ tends to become stable. This is not what we want, because then whatever problem the node runs into while crawling cannot show up in the formula, whereas we want the weight to reflect the current state of the node. The system therefore borrows the concept of a sliding window to correct the weight: only the most recent k tasks are taken into account. Let $t_i$ be the completion time of the i-th most recent task; the weight w is then:
$$w = \frac{k}{(m+1)\sum_{i=1}^{k} t_i} \qquad (6)$$
How large k should be depends on the crawl; the present invention sets it to 100 in its implementation, and a suitable value can be chosen for different crawlers in actual use.
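The sliding-window weight can be sketched as follows, assuming it takes the form k / ((m+1) · Σ t_i), which is how the surrounding derivation reads. The class name `NodeStats` and its method names are hypothetical, and using the number of tasks seen so far as k before the window has filled is an assumption of this sketch.

```python
from collections import deque
from typing import Deque, Optional


class NodeStats:
    """Per-node bookkeeping kept on the master node."""

    def __init__(self, window: int = 100) -> None:
        # t_i: dispatch-to-feedback times (ms) of the most recent k tasks;
        # deque(maxlen=...) drops the oldest entry automatically.
        self.recent_ms: Deque[float] = deque(maxlen=window)
        self.pending = 0  # m: tasks dispatched but not yet reported back

    def on_dispatch(self) -> None:
        self.pending += 1

    def on_feedback(self, elapsed_ms: float) -> None:
        self.pending = max(0, self.pending - 1)
        self.recent_ms.append(elapsed_ms)

    def weight(self) -> Optional[float]:
        """Return k / ((m + 1) * sum(t_i)), or None when there is no history yet."""
        total_ms = sum(self.recent_ms)
        if total_ms <= 0:
            return None  # the initial empirical weight would apply here
        k = len(self.recent_ms)
        return k / ((self.pending + 1) * total_ms)
```

Because elapsed time is measured on the master from dispatch to feedback, a slow link lowers the weight in the same way as a slow node does.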
The present invention proposes a scheduling strategy based on the weighted round-robin algorithm. The weighted round-robin algorithm was chosen as the basis of the scheduling strategy mainly for the following reasons:
(1) Simple and efficient
Apart from the weight of each crawler node, the algorithm only needs to store two simple variables (j and c), and it can complete one scheduling decision in O(x) time, where x is the number of crawler nodes.
(2) Support for dynamically changing weights
All weights used in the algorithm are read directly from the node table, which means that every weight read is the latest current value. Therefore, however asynchronously the feedback module changes the weights of the crawler nodes in the node table, the algorithm can always schedule according to the current node state of the system.
(3) The trend of weight changes can be anticipated
In the algorithm, the trend of the threshold weight matches the trend of the weight of each crawler node. This means that even if a crawler node has a temporary problem and does not report back in time to update its weight, the algorithm can still anticipate the change of its weight fairly accurately and distribute tasks reasonably.
(4) Crawler nodes with low weights are not starved
Each time the threshold weight goes from the current maximum weight of all nodes down to zero (or below zero), every node, whatever its weight, gets a chance to be scheduled during that round of allocation. As a result, even when weights are not updated in time, crawler nodes with low weights are not starved.
Assume the node table is N = {n_0, n_1, ..., n_{x-1}}, where w(n_j) denotes the weight of node n_j. The variable j denotes the most recently selected node, the variable c denotes the current scheduling weight, max(N) denotes the maximum weight among all nodes in N, and s denotes the amount by which c is reduced each time. The value of s has to be changed from the original algorithm to match the weights designed in the present invention; after experiments, we finally set s to 4. The initial values of the variables j and c are both 0. The flow chart of the algorithm based on weighted round-robin is shown in Fig. 2.
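Since Fig. 2 is not reproduced in the text, the following is only a sketch of a weighted round-robin selector with a fixed step, written from the prose description of j, c, s and the maximum weight. The class name `WeightedRoundRobin` is hypothetical. One deliberate deviation is labeled in the code: j starts at -1 rather than 0 so that the first call examines node 0 after initializing c; this is an assumption of the sketch, not something the source states.

```python
from typing import List, Optional


class WeightedRoundRobin:
    """Weighted round-robin node selection with a fixed decrement step s."""

    def __init__(self, weights: List[int], step: int = 4) -> None:
        self.weights = weights  # node table weights; may be updated between calls
        self.step = step        # s: amount subtracted from c after each full pass
        self.j = -1             # last examined node (assumption: -1, not 0, see lead-in)
        self.c = 0              # current scheduling weight

    def select(self) -> Optional[int]:
        """Return the index of the next node to schedule, or None if all weights are zero."""
        x = len(self.weights)
        if x == 0:
            return None
        while True:
            self.j = (self.j + 1) % x
            if self.j == 0:
                # A full pass over the node table has finished: lower the threshold.
                self.c -= self.step
                if self.c <= 0:
                    # Threshold exhausted: restart from the current maximum weight.
                    self.c = max(self.weights)
                    if self.c <= 0:
                        return None
            # c is always positive here, so a node with weight zero is never chosen.
            if self.weights[self.j] >= self.c:
                return self.j
```

With weights `[8, 4, 0]` and the default step of 4, six consecutive calls to `select()` return 0, 0, 1, 0, 0, 1: the node with weight 8 receives twice as many tasks as the node with weight 4, and the node with weight 0 receives none.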
Note that w(n_j) here is obtained from formula (6) and is a decimal in the interval (0, 1), which cannot be used directly in the weighted round-robin algorithm, so it must be converted. It is first multiplied by a coefficient a and then rounded to an integer, so that the weight w falls in the range from 0 to some positive integer. The weight w is thus:
$$w = \left[ \frac{a\,k}{(m+1)\sum_{i=1}^{k} t_i} \right] \qquad (7)$$
where [·] denotes rounding to an integer and a can be set to a suitable value according to the crawler. Here we set a to 300,000; when m is zero, w fluctuates around 120.
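As a worked check of the scaling step, the arithmetic below assumes a hypothetical node whose recent tasks average 2500 ms each. This figure is not given in the source; it is chosen because it reproduces the stated weight of about 120 at m = 0. The example also assumes that rounding means truncation toward zero.

```latex
\begin{align*}
m = 0:\quad   & w = \left\lfloor \frac{300000}{(0+1)\cdot 2500} \right\rfloor = \lfloor 120 \rfloor = 120 \\
m = 3:\quad   & w = \left\lfloor \frac{300000}{(3+1)\cdot 2500} \right\rfloor = \lfloor 30 \rfloor = 30 \\
m = 120:\quad & w = \left\lfloor \frac{300000}{(120+1)\cdot 2500} \right\rfloor = \lfloor 0.99\ldots \rfloor = 0
\end{align*}
```

Under these assumptions the node stops receiving tasks once 120 tasks are outstanding, which matches the behaviour described for a weight that falls to zero as the number of tasks grows.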
The fault recovery mechanism of the present invention has two parts: a fault recovery mechanism for nodes and a fault recovery mechanism for URLs.
When a node suddenly goes down, the master node should be able to capture this in real time and recover from the error. A commonly considered scheme is a heartbeat mechanism. In our implementation, however, we used sockets: by directly catching the I/O exception thrown by the socket, the master node detects that a node has disconnected. We then find all URLs in the assigned URL queue that were assigned to this crawler node and redistribute them. This completes the fault recovery mechanism for nodes.
In addition, we monitor the assigned URL queue. When a URL has received no feedback for a long time, we consider that something has gone wrong with it, for example that it was lost in transmission, and it should be assigned again. This is the error recovery mechanism for URLs.
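Both recovery paths amount to pulling URLs back out of the assigned URL queue, which can be sketched as follows. The class name `AssignedUrls`, its methods and the 60-second default timeout are assumptions of this sketch; the source only says "a long time". Detecting the disconnect itself (catching the socket I/O exception) is left to the caller.

```python
import time
from typing import Dict, List, Optional, Tuple


class AssignedUrls:
    """Assigned URL queue with the two recovery operations described above."""

    def __init__(self, timeout_s: float = 60.0) -> None:
        self.timeout_s = timeout_s  # assumed threshold for "no feedback for a long time"
        self.entries: Dict[str, Tuple[int, float]] = {}  # url -> (node id, dispatch time)

    def add(self, url: str, node: int, now: Optional[float] = None) -> None:
        self.entries[url] = (node, time.monotonic() if now is None else now)

    def reclaim_node(self, node: int) -> List[str]:
        """Node recovery: return and drop every URL assigned to a disconnected node."""
        lost = [url for url, (owner, _) in self.entries.items() if owner == node]
        for url in lost:
            del self.entries[url]
        return lost

    def reclaim_stale(self, now: Optional[float] = None) -> List[str]:
        """URL recovery: return and drop every URL with no feedback within the timeout."""
        current = time.monotonic() if now is None else now
        stale = [url for url, (_, sent) in self.entries.items()
                 if current - sent > self.timeout_s]
        for url in stale:
            del self.entries[url]
        return stale
```

The caller would put the returned URLs back into the to-be-crawled queue and, in the node case, also set the failed node's weight to zero so that it receives no further tasks.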
Innovations and advantages of the technical scheme of the present invention:
1. Crawlers are classified in fine detail according to scale, which makes it easier for crawlers of different scales to select different scheduling strategies.
2. A weight formula based on the current crawling efficiency of each crawler node is designed, which better reflects the current state of each crawler node, so that a scheduling strategy based on this weight can balance the load.
3. A scheduling algorithm based on the weighted round-robin algorithm is designed; mainly, the step is modified according to experimental results so that the scheduling algorithm works together with the crawler node weight formula proposed by the present invention, allowing the whole crawler system to achieve good load balancing.
4. The fault recovery mechanism gives the crawler system considerable fault tolerance, so that the whole system is stable.
Claims (1)
1. A distributed crawler task scheduling method based on a weighted round-robin algorithm, characterized in that it is carried out in the following steps in order:
1) according to scale, web crawlers are divided into five classes: stand-alone multi-threaded, homogeneous centralized, heterogeneous centralized, small-scale distributed and large-scale distributed; the method is for task scheduling of small-scale distributed crawlers, a small-scale distributed crawler being one whose nodes are deployed in a distributed manner within a small physical region;
2) master-slave deployment, that is, one master node and several crawler nodes that are deployed in a distributed manner and can communicate with the master node, with every crawler node guaranteed to be able to connect to the Internet; the master node is responsible for scheduling crawler tasks, that is, deciding which crawler node each URL to be crawled should be assigned to, and for deduplication, that is, deduplicating the outlinks returned by a crawler node for a URL to obtain the new URLs to be crawled; the crawler nodes are responsible for the actual crawling: for each URL assigned by the master node, a crawler node fetches the complete HTML from the Internet, parses the outlinks contained in the page, and returns this information to the master node;
3) when a crawler node connects to the master node for the first time, the master node gives it an empirical value as its initial weight;
4) the master node, following the scheduling algorithm based on weighted round-robin, repeatedly selects a crawler node and assigns it one URL task to be crawled; in this scheduling algorithm a current scheduling weight is maintained, and whenever it has been reduced to a non-positive number it is reinitialized to the maximum of the current weights of all nodes; each node is then examined in turn to see whether its weight is not less than the current scheduling weight, and if so it is scheduled; after all nodes have been examined, the current scheduling weight is decreased by one step and the nodes are examined in turn again, and so on; the step of said scheduling algorithm is set to 4 based on the weight calculation method defined in this method and on a large number of experiments;
5) when a crawler node has finished crawling a URL task, it returns the result to the master node, and the master node updates the weight of that crawler node using the weight calculation method based on recent task completion times and the number of uncompleted tasks;
6) when the weight of a crawler node drops to zero as its number of tasks increases, the master node no longer assigns tasks to it; when its weight returns to a positive number, it can receive tasks again;
7) in this way the master node keeps distributing URLs to the crawler nodes, the crawler nodes keep crawling the URLs and returning their HTML and outlinks to the master node, and the master node deduplicates the outlinks and distributes them again; given the actual conditions of the Internet, the whole system will run indefinitely, continuously crawling new web pages, until it is stopped manually according to the actual situation;
8) there is a fault recovery mechanism: the master node can detect abnormal conditions of a crawler node and set its weight to zero.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410073829.4A CN103870329B (en) | 2014-03-03 | 2014-03-03 | Distributed crawler task scheduling method based on weighted round-robin algorithm |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103870329A CN103870329A (en) | 2014-06-18 |
CN103870329B true CN103870329B (en) | 2017-01-18 |
Family
ID=50908893
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410073829.4A Active CN103870329B (en) | 2014-03-03 | 2014-03-03 | Distributed crawler task scheduling method based on weighted round-robin algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103870329B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN103870329A (en) | 2014-06-18 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |