CN108885561B - Resource allocation for computer processing - Google Patents


Info

Publication number
CN108885561B
CN108885561B (application CN201680078758.4A)
Authority
CN
China
Prior art keywords
amount
backlog
job
determined
processing resources
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201680078758.4A
Other languages
Chinese (zh)
Other versions
CN108885561A (en)
Inventor
罗伯特·布拉德肖
拉斐尔·德·杰西·费尔南德斯·蒙特祖玛
丹尼尔·米尔斯
塞缪尔·格林·米克维提
塞缪尔·卡尔·惠特尔
安德烈·马克西姆恩科
科思明·约内尔·阿拉德
马克·布莱恩·希尔兹
哈里斯·塞缪尔·诺弗
曼纽尔·阿尔弗雷德·范德里奇
杰弗里·保罗·加德纳
米哈伊尔·斯马里恩
鲁文·拉克斯
艾哈迈德·阿尔泰
克雷格·D·钱伯斯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Google LLC
Original Assignee
Google LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Google LLC filed Critical Google LLC
Priority claimed by CN202210267059.1A, published as CN114756341A
Publication of CN108885561A
Application granted
Publication of CN108885561B
Legal status: Active

Classifications

    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F9/5011 Allocation of resources to service a request, the resources being hardware resources other than CPUs, servers, and terminals
    • G06F9/5016 Allocation of resources to service a request, the resource being the memory
    • G06F9/505 Allocation of resources to service a request, the resource being a machine (e.g. CPUs, servers, terminals), considering the load
    • G06F9/5061 Partitioning or combining of resources
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network, for distributed storage of data in networks, e.g. network file system [NFS], storage area networks [SAN], or network attached storage [NAS]
    • G06F2209/508 Monitor (indexing scheme relating to G06F9/50)
    • H04L67/10 Protocols in which an application is distributed across nodes in the network

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Debugging And Monitoring (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Hardware Redundancy (AREA)
  • Multi Processors (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A job that receives a data stream as input is executed. For the job, the following are iteratively determined over a first period of time: the backlog growth; the backlog amount; and whether to adjust the amount of processing resources allocated to the job. For each iteration in which it is determined to adjust the amount of processing resources allocated to the job, that amount is adjusted. For each iteration in which it is determined not to adjust the amount of processing resources allocated to the job, that amount is maintained.

Description

Resource allocation for computer processing
Cross Reference to Related Applications
This application claims priority to U.S. provisional application serial No. 62/303,827, filed on March 4, 2016, which is hereby incorporated by reference in its entirety.
Background
A computer network is a collection of computers and other hardware interconnected by communication channels that allow resources and information to be shared. Communication protocols define the data formats and rules used to exchange information in a computer network.
Disclosure of Invention
This document relates to computer processing of inputs in a parallel processing environment.
In one aspect, a method is performed by a computer system. The method comprises the following steps: a job is run in a computer system that includes a plurality of processing resources, the job receiving as input a data stream, wherein the amount of data in the data stream is unbounded. The method includes iteratively determining, for the job: a backlog growth over a first period of time, wherein the backlog growth is a measure of the increase in unprocessed data received in the data stream to be input to the job; a backlog amount, which is a measure of the unprocessed data in the received data stream to be input to the job; and whether to adjust the amount of processing resources allocated to the job, based on the backlog growth and the backlog amount. The method comprises: for each iteration in which it is determined to adjust the amount of processing resources allocated to the job, adjusting the amount of processing resources allocated to the job. The method comprises: for each iteration in which it is determined not to adjust the amount of processing resources allocated to the job, maintaining the amount of processing resources allocated to the job.
Implementations may include any, all, or none of the following features. For a given iteration: the backlog growth is determined to be zero or negative; the backlog amount is determined to be at the target; and, in response, it is determined not to adjust the amount of processing resources allocated to the job. For a given iteration: the backlog growth is determined to be zero or negative; the backlog amount is determined to be below the target; in response, it is determined to adjust the amount of processing resources allocated to the job; and adjusting the amount of processing resources allocated to the job comprises reducing the amount of processing resources allocated to the job in response to the backlog growth being zero or negative and the backlog amount being below the target. For a given iteration: the backlog growth is determined to be zero or negative; the backlog amount is determined to be above the target; in response, it is determined to adjust the amount of processing resources allocated to the job; and adjusting the amount of processing resources allocated to the job comprises increasing the amount of processing resources allocated to the job in response to the backlog growth being zero or negative and the backlog amount being above the target.
For a determined iteration: the backlog increase is determined to be positive; the backlog amount is determined to be below a target, and the amount of processing resources committed to the job is determined to be unadjusted in response to the backlog increase being determined to be positive and the backlog amount being determined to be below the target. For a determined iteration: the backlog increase is determined to be positive; determining that the amount of processing resources allocated to the job is determined to be adjusted in response to the backlog increase being determined to be positive and the backlog amount being determined to be below the target; and wherein adjusting the amount of processing resources allocated to the job comprises: the amount of processing resources committed to the job is increased in response to the backlog increase being determined to be positive and the backlog amount being determined to be below the target. The backlog growth is a measure of the size of the data. The unit of the data size is at least one of the group consisting of a bit, a byte, a megabyte, a gigabyte, a record, and a cardinality. The backlog growth is a measure of the processing time. The unit of the treatment time is at least one of the group consisting of microseconds, seconds, minutes, hours, and days. The backlog is a measure of the size of the data. The unit of the data size is at least one of the group consisting of a bit, a byte, a megabyte, a gigabyte, a record, and a cardinality. The backlog is a measure of the processing time. The unit of the treatment time is at least one of the group consisting of microseconds, seconds, minutes, hours, and days. The method further comprises the following steps: iteratively determining a processor utilization for the job; wherein iteratively determining whether to adjust the amount of processing resources allocated to the job is further based on processor utilization. 
For a given iteration: the processor utilization is determined to be below a value; in response, it is determined to adjust the amount of processing resources allocated to the job; and adjusting the amount of processing resources allocated to the job comprises reducing the amount of processing resources allocated to the job in response to determining that the processor utilization is below the value. Reducing the amount of processing resources allocated to the job in response to determining that the processor utilization is below the value may comprise reducing the resources allocated to the job by a discrete amount, the discrete amount based on the processor utilization. The discrete amount may be a number of memory disks in the computer system. Determining whether to adjust the amount of processing resources allocated to the job based on the backlog growth and the backlog amount may comprise smoothing determinations that would otherwise cause the amount of processing resources allocated to the job to oscillate. Smoothing a determination may include waiting a second period of time, or averaging multiple determinations of whether to adjust the amount of processing resources allocated to the job.
In one aspect, a system includes one or more processors configured to execute computer program instructions; and a computer storage medium encoded with computer program instructions that, when executed by the one or more processors, cause a computing device to perform operations comprising: running a job in a computer system that includes a plurality of processing resources, the job receiving as input a data stream, wherein the amount of data in the data stream is unbounded. The operations include iteratively determining, for the job: a backlog growth over a first period of time, wherein the backlog growth is a measure of the increase in unprocessed data received in the data stream to be input to the job; a backlog amount, which is a measure of the unprocessed data in the received data stream to be input to the job; and whether to adjust the amount of processing resources allocated to the job, based on the backlog growth and the backlog amount. The operations include adjusting the amount of processing resources allocated to the job for each iteration in which it is determined to adjust that amount. The operations include maintaining the amount of processing resources allocated to the job for each iteration in which it is determined not to adjust that amount.
The systems and processes described herein may provide a number of potential advantages. By using a backlog measure to determine resource deployment levels, the disadvantages of over-provisioning and under-provisioning may be reduced or eliminated. Over-provisioning results in unused or idle network resources, increasing costs and reducing the number of jobs that can be processed; by reducing the resource allocation for a light workload, unused resources may be released for other jobs. Under-provisioning results in a growing backlog, which may lead to data loss and increased latency; by increasing the resource allocation for a heavy workload, jobs can be processed faster as input increases. In many cases, data is input to a job continuously as a stream, and an unbounded stream of data may grow or shrink unpredictably. Dynamically responding to such input, rather than treating it as a batch or planning for the worst case, allows flexible response to large spikes or dips in the input.
Other features, aspects, and potential advantages will become apparent from the following description and the accompanying drawings.
Drawings
FIG. 1 is a block diagram of a highly distributed computing environment that automatically tunes resource deployment for a set of jobs.
FIG. 2 contains charts showing resource deployment for a job with and without auto-tuning.
Fig. 3 contains a graph showing flow signals over time.
FIG. 4 is a flow diagram of an example process for automatically tuning a resource deployment.
FIG. 5 is a flow diagram of an example process for determining whether a resource allocation should be increased or decreased.
FIGS. 6A, 6B, and 6C are flow diagrams of example processes for setting up, resizing, and suspending tasks.
Fig. 7 is a schematic diagram showing an example of a computing device.
Like reference symbols in the various drawings indicate like elements.
Detailed Description
In a shared parallel computing environment, computing resources may be allocated to different tasks, both to different tasks running in parallel within a single job and to different jobs that may be processed simultaneously. To determine how these resources should be allocated, a backlog size may be assigned to a job, defining how much input backlog should be allowed. Resources are then dynamically added and removed based on the actual size of the backlog, to keep the actual backlog at or below the assigned backlog level. This may allow, for example, dynamic deployment of resources to match the needs of a particular job over time.
Such dynamic adaptation may be referred to as auto-scaling, because the resources available to a particular job are automatically scaled to match the processing requirements of the input provided to the job over time. As the amount of input changes over time, more resources may be required to prevent an unacceptable buildup of unprocessed input. Such unprocessed input creates a backlog: input that could be processed but has not yet been.
Backlogs may be measured in data size (e.g., bytes, megabytes), processing time (e.g., duration that data may but has not been processed, expected time to use a current or fixed number of resources until all current backlog data is processed, throughput value of data over time), counts (e.g., number of files or shards), or along other suitable dimensions.
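As a concrete illustration of the processing-time dimension, a byte-denominated backlog can be converted into a backlog time at the current throughput. This is a minimal sketch under assumed names; the patent does not prescribe any particular implementation:

```python
def backlog_time_seconds(backlog_bytes: float,
                         throughput_bytes_per_sec: float) -> float:
    """Estimate how long it would take to drain the current backlog at the
    current throughput, assuming no additional input arrives.
    (Hypothetical helper; names and signature are illustrative.)"""
    if throughput_bytes_per_sec <= 0:
        return float("inf")  # no progress is being made
    return backlog_bytes / throughput_bytes_per_sec
```

For example, 10 MB of backlog drained at 1 MB/s yields a backlog time of 10 seconds.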
Backlog size is a useful metric for auto-tuning because it also affects other computing resources, affects the usefulness of a computer-based product, and is easily understood by a human user. For example, as the backlog grows, the backlog stored in computer memory requires more and more memory; by keeping the backlog at or below a target size, the required memory can be known and planned for. Likewise, by keeping the backlog below a target size, a quality of service may be provided. A job that parses social media posts is useful only if posts are parsed near the time they are posted; by keeping the backlog of unparsed posts within, for example, two minutes, the job can be ensured to be useful for its intended purpose. And by recording and reporting backlogs in a dimension that human users understand, those users can decide how to use their current jobs and how future jobs will be used and paid for.
FIG. 1 is a block diagram of a highly distributed computing environment 100 that automatically tunes resource deployment for a set of jobs. A computer network 102, such as a Local Area Network (LAN), Wide Area Network (WAN), the internet, or a combination of these, connects a distributed processing system 104, a streaming data provider 106, and one or more network storage devices 108. For example, the distributed processing system 104 may include a distributed processing system manager 110, a worker manager 112, and a plurality of data processors 114. Although depicted as being separate from the distributed processing system 104, the network storage devices 108 may also be included in the distributed processing system 104.
The streaming data provider 106 provides the data processed by the distributed processing system 104. The distributed processing system 104 may obtain data records and instructions for executing one or more jobs 111 on the data records. As an example, the streaming data provider 106 may provide search query records to the distributed processing system 104 for filtering, classification, and other analysis. For example, a search query record may include a search query submitted to a search system and related information, such as a timestamp indicating the time of receipt of the query.
For example, data from a streaming data provider may be unbounded, unpredictable, highly variable, and/or bursty. Unbounded data includes streaming data that forms a continuously updated data set of no determined size. Examples of continuously updated data sets are server logs as they are generated, or all new commands as they are processed. Unpredictable data includes data whose behavior is difficult or impossible to predict in at least some dimensions. For example, the rate at which a user receives email may be unpredictable, as it may depend on factors that are not or cannot be accessed by the highly distributed computing environment 100. Highly variable data includes data that varies significantly in at least one dimension; it may be highly seasonal or periodic. Logs from a retailer's site may be updated faster or slower depending on the time of year and the shopping season. Bursty data includes data whose instantaneous rate of generation does not approximate its average rate of generation. In other words, bursty data is often received by the highly distributed computing environment 100 in bursts of high data reception followed by lulls of low data reception.
The distributed processing system manager 110 may perform processing scheduling and resource management services for the distributed processing system 104, for example by allocating jobs 111 to be run by workers 113, allocating one or more data processors 114 to the workers 113, allocating one or more disks of the network storage devices 108 to the workers 113, identifying and resolving faults and backlogs, and managing temporary and long-term storage. Although the worker manager 112 is described as being separate from the distributed processing system manager 110, in some embodiments the worker manager 112 may be part of the distributed processing system manager 110. The worker manager 112 monitors the workers 113, data processors 114, and network storage devices 108 to determine if and when a workload backlog is building. If a workload backlog is building, or is not building while resources (e.g., data processors 114 and/or network storage devices 108) are idle, the worker manager 112 may automatically tune the resources allocated to the jobs 111.
The jobs 111 process data from the streaming data provider 106 and produce output that may be stored to the network storage devices 108 or used for other purposes. In some implementations, a job 111 receives input data from multiple data streams. For example, a job 111 may receive a stream of social media posts from the streaming data provider 106 and a stream of weather data from a different streaming data provider (not shown). To process the received input data streams, a job 111 may be assigned to workers 113. For example, each worker 113 may be allocated one or more data processors 114 and one or more physical or virtual disks of the network storage devices 108. The workers 113 assigned to a particular job 111 may then process the streaming input data according to the processes defined by the job 111 and store the resulting output in the network storage devices 108 or send it elsewhere.
While this example describes resources (e.g., data processors 114 and network storage devices 108) being allocated to workers 113, and workers 113 being allocated to a job 111, other schemes for provisioning resources to a job 111 are possible. For example, data processors 114 and/or disks of the network storage devices 108 may be allocated directly to a job 111 by the distributed processing system manager 110.
In some cases, a job 111 may be broken up into processes, with each process assigned to one worker 113. This may be advantageous, for example, when the job 111 is parallelizable, i.e., when the job 111 can be decomposed into multiple activities that can run independently of one another.
FIG. 2 contains graphs 200, 208, and 216 illustrating resource allocations for jobs with and without auto-tuning. Without auto-tuning, resources may be over-provisioned or under-provisioned. Over-provisioning wastes resources, while under-provisioning may cause lag during workload peaks. In contrast, with auto-tuning, workers are provisioned as the workload increases and the number of workers is reduced as the workload decreases, so resources may be dynamically adjusted as needed.
In graph 200, over-provisioning of a resource is shown. Line 202 represents a constant level of provisioning that is greater than the workload shown by line 204. Area 206 represents allocated resources that are unused and therefore wasted.
In graph 208, under-provisioning of a resource is shown. Line 210 represents a constant level of provisioning that falls below the workload shown by line 212 during peak periods. Area 214 represents the unhandled workload that creates a backlog.
In graph 216, auto-tuning of the resource allocation is shown. Line 218 represents a dynamic level of provisioning that responds to changes in the workload shown by line 220. As shown, the provisioned resources are nearly always equal to or greater than the workload. Although the workload sometimes exceeds the allocated resources, causing the backlog to grow, the resource allocation level is raised quickly, so more resources are allocated within a short period. The increased backlog can thus be handled by the increased resource allocation and processed back down to the target level.
For example, auto-tuning may use the following signals to make its decisions. CPU utilization: the average CPU utilization of all worker virtual machines in a job. Stage backlog growth: the growth in the size of unprocessed data. Stage backlog amount: a measure of the unprocessed data in a received data stream to be input to the job. In some embodiments, the backlog amount may be a backlog time: a measure of how long it would take to clear the backlog at the current throughput if no additional input arrived. Other signals may also be used.
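The signals just described can be grouped into a simple per-iteration record. The field names below are illustrative assumptions, not identifiers from the patent:

```python
from dataclasses import dataclass

@dataclass
class StageSignals:
    """Signals an autoscaler might sample each iteration for one stage
    (hypothetical names; the patent only describes the quantities)."""
    cpu_utilization: float   # mean CPU utilization across worker VMs, 0.0-1.0
    backlog_growth: float    # bytes/sec at which unprocessed input accumulates
    backlog_time: float      # seconds to drain backlog at current throughput
```

A negative `backlog_growth` indicates the stage is catching up; a positive one indicates it is falling behind.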
Auto-tuning seeks a balance across all stages of the job subject to the following constraints on backlog growth, backlog amount, and CPU utilization. A subset of these constraints may be used to make the adjustments; for example, in some embodiments, only backlog growth and backlog amount are used. The backlog-growth constraint is that average backlog growth ≤ 0: if backlog growth is positive, the backlog accumulates and the job falls behind. The backlog-amount constraint is that backlog amount ≤ target backlog amount. For example, when the backlog amount is a backlog time, a relatively short backlog time is required to avoid a steady state in which the backlog does not grow but there is a large processing delay. A shorter acceptable backlog achieves lower latency but requires more processing resources. In some cases, the acceptable backlog delay time is zero or close to zero; in other cases, it is longer. The CPU constraint is that CPU (e.g., data processor 114) utilization is above a threshold: a steady state with low CPU utilization indicates that the streaming job may keep up with fewer workers. In some embodiments, the backlog constraints are used to determine whether to adjust the amount of resources allocated to a job, and, if the system determines to adjust, the CPU constraint is used to determine the amount by which to adjust.
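A sketch of how the constraints above might combine into a per-iteration decision follows. The threshold values, the rule ordering, and the function name are assumptions; the patent states the constraints themselves but not a specific combination procedure:

```python
def scaling_decision(backlog_growth: float, backlog_time: float,
                     target_backlog_time: float, cpu_utilization: float,
                     cpu_threshold: float = 0.8) -> str:
    """Return 'up', 'down', or 'hold' from the three constraints.
    Thresholds and ordering are illustrative assumptions."""
    if backlog_growth > 0 and backlog_time >= target_backlog_time:
        return "up"    # falling behind with backlog at or above target
    if backlog_growth <= 0 and backlog_time > target_backlog_time:
        return "up"    # keeping up, but stored backlog exceeds the target
    if (backlog_growth <= 0 and backlog_time < target_backlog_time
            and cpu_utilization < cpu_threshold):
        return "down"  # steady state with idle CPU: fewer workers may suffice
    return "hold"
```

Note that a positive growth with backlog still below target yields "hold" here, matching the embodiment in which no adjustment is made in that case.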
The overall job size depends on the maximum number of workers required to meet the backlog constraints across all stages. If the backlog constraints are satisfied for all stages, the CPU constraint is used to determine how much to reduce the allocation.
In some implementations, persistent disks are used to store state. A job may have a fixed number of persistent disks to store the job state, which in some cases equals the maximum number of workers for the job. Each disk corresponds to a key range that the job is processing; thus, workers with more disks are busier than workers with fewer disks (assuming the data within each range is evenly distributed). In some cases, scaling to a number of workers that gives an uneven disk distribution results in performance bounded by the most heavily loaded workers. Thus, auto-tuning selects a number of workers that provides a substantially uniform disk distribution (e.g., if the system targets d disks per worker, the number of workers with d ± 1 disks is minimized).
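With a fixed disk pool, the worker counts at which every worker holds exactly the same number of disks are the divisors of the disk count. This toy helper (an assumption for illustration, not the patent's method) enumerates those uniform "levels" the autoscaler could snap to:

```python
def uniform_worker_counts(total_disks: int) -> list[int]:
    """Worker counts at which a fixed pool of persistent disks spreads
    perfectly evenly (every worker holds the same number of disks).
    These are simply the divisors of the disk count."""
    return [w for w in range(1, total_disks + 1) if total_disks % w == 0]
```

For a job with 12 persistent disks, the perfectly even pool sizes are 1, 2, 3, 4, 6, and 12 workers; in practice, near-even distributions (d or d − 1 disks per worker) widen this set.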
In some embodiments, the desired number of workers for a particular stage is calculated by multiplying the current number of workers by the ratio of the current input rate (throughput + backlog growth) to the current throughput. For example, if the throughput of 10 workers is 1 MB/s and the backlog grows at 200 KB/s, auto-tuning will request 1.2 × 10 = 12 workers. In addition, further workers may be added when the desired backlog time is exceeded.
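The calculation in this paragraph can be sketched as follows; rounding up to a whole worker is an added assumption, since the patent's example divides evenly:

```python
import math

def desired_workers(current_workers: int, throughput: float,
                    backlog_growth: float) -> int:
    """Scale the current pool by (input rate / throughput), where
    input rate = throughput + backlog growth. Rounding up is an
    assumption not stated in the text."""
    if throughput <= 0:
        return current_workers  # no throughput signal: leave pool unchanged
    input_rate = throughput + backlog_growth
    return math.ceil(current_workers * input_rate / throughput)
```

With 10 workers, 1 MB/s throughput, and 0.2 MB/s backlog growth, this reproduces the example's 12 workers.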
In some embodiments, each stage independently calculates its desired number of workers. The job pool is then scaled to the maximum desired number of workers across all stages.
To reduce the amount of resources allocated to a job, the job may need to be in a steady state with an average backlog time below some threshold and a backlog growth of 0 (e.g., on average, over time, within a threshold distance). In some embodiments, the system determines how far to reduce the allocation as follows. Given the desire for even disk distribution, auto-tuning uses the CPU as a proxy for whether the job can run at the next lower worker level, as described below. Assume the current number of workers is w1, and each worker has d1 (or d1 − 1) disks. Find the minimum d2 > d1 disks per worker and the corresponding minimum number of workers w2, such that w2 < w1 and each worker gets d2 (or d2 − 1) disks. Assuming perfect scaling and no overhead, the new workers would run at the current CPU rate multiplied by w1/w2. If the new CPU rate is below a reasonable threshold (e.g., below 1.0), auto-tuning attempts to reduce the deployment to w2.
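A sketch of this downscaling probe, under the simplifying assumption that the acceptable lower levels are exactly the worker counts that divide the disk pool evenly (the text allows d2 or d2 − 1 disks per worker, which this sketch omits for brevity):

```python
def next_lower_pool(w1: int, total_disks: int, current_cpu: float,
                    cpu_ceiling: float = 1.0):
    """Find the next smaller worker pool w2 < w1 at which the disks still
    spread evenly, and project its CPU as current_cpu * w1 / w2 (perfect
    scaling assumed, as in the text). Returns w2 if the projection stays
    under cpu_ceiling, otherwise None. Illustrative sketch only."""
    for w2 in range(w1 - 1, 0, -1):
        if total_disks % w2 == 0:          # even d2 = total_disks // w2 per worker
            projected_cpu = current_cpu * w1 / w2
            return w2 if projected_cpu < cpu_ceiling else None
    return None
```

For example, 12 workers holding 24 disks at 40% CPU could shrink to 8 workers (3 disks each) with a projected CPU of 0.6; at 90% CPU, the projection of 1.35 blocks the downscale.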
Signals just after a recent increase or decrease in workers typically do not represent a steady state (e.g., due to frequent cache misses, etc.). Thus, auto-tuning may smooth its decisions to avoid unnecessary oscillation while still reacting to real changes in workload. Upscaling decisions are smoothed by: 1) waiting for the input signals to stabilize for a period of time after a worker change; and 2) smoothing the output signal by selecting the average desired worker count over the upscaling time window, provided the minimum number of workers requested in the window is higher than the current number. Waiting for the minimum to rise over the entire window avoids reacting to short-term noise in the input rate. For example, if the current number of workers is less than 10 and the window contains the desired worker counts [11, 12, 10], the deployment increases to 11 workers (before normalizing the number to obtain an even disk distribution).
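The upscale smoothing described above might be sketched as follows (a hypothetical helper, assuming the window holds recent desired-worker estimates):

```python
def smoothed_upscale(current_workers, window):
    """Upscale only when every estimate in the window exceeds the current
    worker count (i.e., its minimum has risen above it); then select the
    window's average, before normalizing to an even disk distribution."""
    if window and min(window) > current_workers:
        return round(sum(window) / len(window))
    return current_workers

# e.g. with 9 current workers and window [11, 12, 10] -> upscale to 11;
# with 10 current workers the minimum (10) has not risen, so stay at 10.
```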
For downscaling, in some embodiments, auto-tuning waits until every value in the window over a certain period of time falls below the current number of workers, and then downscales to the maximum of those values (which, currently, will always be the next lower worker level with an even disk distribution).
The following signals may be associated with a worker and may be received from a worker manager. Per-stage signals: input_size: the increasing total number of bytes processed; its derivative gives the throughput, see below. backlog_size: the current size of the backlog in bytes; its derivative gives the backlog growth, see below. active_size: the bytes currently being processed at this stage, used to determine the inherent latency of the stage. system_watermark: the system watermark at this stage; in some embodiments, it should advance at approximately 1/s.
Per-job signals: active_workers: the number of currently active workers; in a streaming case, a worker may be a worker process and/or a worker thread. In some embodiments, this is computed as the number of workers that have checked in after receiving a configuration update. The count may reflect the number of workers that are ready for work (e.g., after all disks are attached and the scan is complete). attached_disks: used to determine whether all disks are attached; if not, no scaling decision should be made. Average CPU utilization: the average CPU utilization across all worker virtual machines in a job.
Derived signals, e.g., per stage of the job: throughput, from input_size (since the last update, exponentially averaged). backlog_growth, from backlog_size (since the last update, exponentially averaged). backlog_time = (backlog_size - active_size) / throughput: the time required to drain the backlog if no additional input arrives (discounting what is currently being processed). In some embodiments, efficiency may decrease as the system approaches real time, so the effective backlog time may be longer. min_throughput = throughput + max(0, backlog_growth): the minimum throughput needed to not fall behind.
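The derived signals can be restated as formulas (a sketch with illustrative names):

```python
def backlog_time(backlog_size, active_size, throughput):
    """Seconds needed to drain the backlog if no new input arrived,
    discounting bytes already being processed (active_size)."""
    return (backlog_size - active_size) / throughput

def min_throughput(throughput, backlog_growth):
    """Minimum throughput required to stop falling behind: current
    throughput plus any positive backlog growth."""
    return throughput + max(0.0, backlog_growth)
```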
The goal for each stage is to scale the workers to achieve equilibrium under the following constraints, on average:
backlog_growth == 0, i.e., the backlog length is stable and not growing.
backlog_time <= acceptable_backlog_time (a configuration parameter).
backlog_time >= max_backlog_to_downscale (a configuration parameter). The backlog_time constraints attempt to ensure that latency stays within an acceptable range, avoiding a stable state in which the backlog is extremely large.
The max_backlog_to_downscale constraint is the signal that the system should try to downscale: if the backlog time is below this threshold, the system may be wasting resources and could do the same work with fewer workers.
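Taken together, the three constraints amount to an equilibrium test such as the following sketch (parameter names mirror the configuration parameters above; the growth tolerance is an assumption):

```python
def in_equilibrium(backlog_growth, backlog_time,
                   acceptable_backlog_time, max_backlog_to_downscale,
                   growth_tolerance=0.0):
    """A stage is balanced when its backlog is stable and its backlog time
    sits between the downscale threshold and the acceptable maximum."""
    stable = abs(backlog_growth) <= growth_tolerance
    bounded = max_backlog_to_downscale <= backlog_time <= acceptable_backlog_time
    return stable and bounded
```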
In some embodiments, upscaling is based on adding workers to 1) keep up with the input rate, and 2) reduce the backlog to the minimum length. In some cases, it may be assumed that throughput ultimately scales linearly with the number of workers, that is, as the number of workers increases or decreases by a factor of x, throughput increases or decreases by a factor of x. In some embodiments, the number of workers required to keep up may be calculated as the minimum required throughput divided by the per-worker throughput.
The remaining backlog is the backlog above the acceptable backlog that the current backlog growth, when negative, does not already appear to be reducing. It can be calculated as the current backlog size, minus the acceptable backlog size, plus the product of the negative backlog growth and the backlog recovery time.
Based on the current per-worker throughput, additional workers are selected to resolve the remaining backlog within the backlog recovery time. The number of additional workers may be calculated as the remaining backlog divided by the product of the per-worker throughput and the backlog recovery time. The desired new number of workers is then the number of workers needed for throughput plus the number of additional workers needed for the backlog.
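A sketch of the additional-worker calculation (illustrative names; a negative backlog growth is credited against the excess backlog):

```python
import math

def extra_workers_for_backlog(backlog_size, acceptable_backlog,
                              backlog_growth, recovery_time,
                              per_worker_throughput):
    """Workers to add so the backlog above the acceptable level drains
    within recovery_time at the current per-worker throughput."""
    remaining = (backlog_size - acceptable_backlog
                 + min(0.0, backlog_growth) * recovery_time)
    if remaining <= 0:
        return 0
    return math.ceil(remaining / (per_worker_throughput * recovery_time))
```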
Since the total number of disks per job may be fixed, the system may redistribute the existing disks among the workers as it scales. Disks correspond to ranges of the source data, so workers with more disks are busier than workers with fewer disks (assuming that data is evenly distributed across ranges). In some cases, scaling to a number of workers with an uneven disk distribution results in performance limited by the most heavily loaded worker. Thus, the system may attempt to scale to the nearest number of workers that provides an even disk distribution.
To downscale, in some embodiments, the system may be required to be in a steady state in which the average backlog time is below max_backlog_to_downscale and backlog growth averages 0. To determine how far to downscale, the system uses CPU utilization as a proxy for whether it can run at the next lower worker level. An example of this proxy-based determination is as follows. Assume the current number of workers w1 implies a maximum of d1 disks per worker. The system calculates the maximum number of workers w2 < w1 such that each worker has at most d2 > d1 disks. In some cases, the new workers would run at the current CPU rate multiplied by w1/w2. If that new CPU rate is below the acceptable maximum (e.g., below 1.0), the system attempts to downscale the deployment to w2.
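The CPU-as-proxy downscale check can be sketched like this (assuming the even-distribution worker levels are already known; names are illustrative):

```python
def try_downscale(current_workers, cpu_rate, levels, cpu_threshold=1.0):
    """Project the CPU rate at the next lower even-distribution worker
    level (cpu_rate * w1 / w2, assuming perfect scaling) and downscale
    only if the projection stays under the acceptable maximum."""
    lower = [w for w in levels if w < current_workers]
    if not lower:
        return current_workers
    next_level = max(lower)
    projected = cpu_rate * current_workers / next_level
    return next_level if projected < cpu_threshold else current_workers

# e.g. 15 workers at 40% CPU project to 0.75 at the 8-worker level -> go to 8;
# at 70% CPU they would project to ~1.31, so the job stays at 15.
```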
For job scaling, each stage may independently compute its desired number of workers. The job pool is then scaled to the maximum desired number of workers across all stages.
Smoothing may occur by: 1) waiting for the input signals to stabilize for a period of time after a worker change, and 2) smoothing the output signal by selecting the average desired worker count over the upscaling window, provided the minimum value in the window is higher than the current amount. Waiting for the minimum to rise over the entire window avoids reacting to a short-term peak in min_throughput.
For downscaling, in some embodiments, the system may wait until every value in the window is below the current number of workers, and then downscale to the maximum identified value (which, currently, will always be the next lower worker level with an even disk distribution).
Fig. 3 contains graphs 400, 402, 404, 406, and 408 showing streaming signals over time. Graph 400 shows the number of active workers. In this series, the workers gradually downscale from 15 to 8, as can be seen in the bottom series showing the requested workers. The difference between requested and active workers indicates that during resizing some workers may briefly be inactive, may fail and come back, and so on. Note that since all workers currently work on all stages, the series of active and requested workers is the same for every stage.
Graph 402 shows the backlog time for the stage: the estimated time required to reduce the backlog to 0 given the current average processing speed.
Graph 404 shows streaming CPU utilization, here the average utilization across all virtual machines. However, other metrics may be used, including an average, median, mode, minimum, or maximum that excludes outliers, or a combination of these.
Graph 406 shows the input rate, which starts at about 5 MB/s, increases to 6.6 MB/s, and then decreases to below 4 MB/s. The humps in the rate are due to the downscaling resize points shown in graph 408. Resizing can cause a pipeline stall, which in turn causes a backup and then catch-up work. These transients are therefore present in the derived signals.
Fig. 4 is a flow diagram of an example process 500 for auto-tuning a resource deployment. The process 500 may be performed by, for example, elements of the highly distributed computing environment 100, and will be described with reference to that environment. However, other elements may be used to perform process 500 or other similar processes.
Process 500 includes running a job 502 in a computer system containing a plurality of processing resources, the job receiving a data stream as input, wherein the amount of data in the data stream is unbounded. For example, distributed processing system manager 110 may receive jobs from external users. A job may include instructions, for example, to use one or more input sources of the streaming data provider 106 as input data. The highly distributed computing environment 100 may run a job by allocating disks of the network storage device 108 and/or data processors 114 to one or more workers 113 (or instructing the worker manager 112 to do so) and assigning the one or more workers 113 to the job. The initial deployment of a job may be set based on any technically appropriate factors, including but not limited to past runs of the job 111, available system resources, or other factors.
Process 500 includes iteratively determining backlog growth 504 and a backlog amount 506 for the job, and whether to adjust the deployment 508 for the job.
Iteratively determining backlog growth for the job 504 includes determining the backlog growth over a first time period, where backlog growth is a measure of the growth of unprocessed data in the received data stream to be input into the job. For example, as the job 111 runs, it receives input from the stream data provider 106. This input may be stored in a suitable storage structure, such as a queue or buffer, until the job 111 is able to accept and process more data. If the job 111 cannot process the data as quickly as it is received, this creates or increases the backlog over time. If the job 111 is able to process the stored data more quickly than it arrives, backlog growth is negative. In either case, the change (or absence of change) in the backlog can be monitored. Such backlog growth may be measured, for example, in terms of data size (e.g., bits, bytes, megabytes, gigabytes, records, cardinality), rate or processing time (e.g., microseconds, seconds, minutes, hours, days, MB/s), or otherwise.
Iteratively determining a backlog amount for the job 506 includes determining the backlog amount, a measure of the unprocessed data in the received data stream to be input into the job. As described, data waiting to be processed by the job 111 may be referred to as the backlog. The backlog may be measured, for example, in terms of data size (e.g., bits, bytes, megabytes, gigabytes, records, cardinality), processing time (e.g., microseconds, seconds, minutes, hours, days), or otherwise.
Iteratively determining whether to adjust the deployment 508 for the job includes determining whether to adjust the amount of processing resources allocated to the job based on the backlog growth and the backlog amount. As further described in the example process 600 below, the backlog growth and backlog amount may be used as indicators of over-provisioning or under-provisioning, i.e., whether it is possible or desirable to provision fewer or more resources.
For each iteration in which it is determined to adjust the amount of processing resources allocated to the job, the amount of processing resources allocated to the job is adjusted 510. For example, if network resources are over-provisioned to the job 111, they may be reduced; if they are under-provisioned, they may be increased.
For each iteration in which it is determined not to adjust the amount of processing resources allocated to the job, the amount of processing resources allocated to the job is maintained 512. For example, if network resources are neither over-provisioned nor under-provisioned, the same or a similar level of network resources may be provisioned to the job 111. This may mean continuing to provision exactly the same resources, or provisioning different but equivalent resources.
Other example processes may include different orders, numbers, and types of elements. For example, in addition to backlog growth and backlog amount, average processor utilization may be determined. In this example, whether to adjust the amount of processing resources allocated to the job is also determined iteratively based on processor utilization. Adjusting the amount of processing resources allocated to the job may then include reducing that amount in response to the processor utilization being determined to be below a threshold value.
FIG. 5 is a flow diagram of an example process 600 for determining whether a resource allocation should be increased or decreased. The process 600 may be performed by, for example, elements of the highly distributed computing environment 100. Process 600 may be performed, for example, as part of performing process 500. However, process 600 or other similar processes may be performed using other elements, either as part of process 500 or not as part of process 500.
The amount of processing resources allocated to the job may be maintained 606. For example, if the backlog growth is determined to be negative or zero (e.g., zero, or within a threshold of zero) and the backlog amount is determined to be at a target value (e.g., equal to the target, or within a threshold of it), this may indicate that the provisioned resources are sufficient to allow the job to process the input data without over-provisioning.
The amount of processing resources allocated to the job may be reduced 608. For example, if the backlog growth is determined to be negative or zero and the backlog amount is below the target, resources may be reduced, allowing the backlog to grow, perhaps until the backlog approaches or reaches the target.
The amount of processing resources allocated to the job may be increased 610. For example, if the backlog growth is determined to be zero and the backlog amount is above the target, additional resources may be deployed to reduce the backlog, perhaps until the backlog approaches or reaches the target.
The amount of processing resources allocated to the job may be maintained 612. For example, if the backlog growth is determined to be positive and the backlog amount is determined to be below the target, the backlog may be allowed to grow, perhaps until the backlog approaches or reaches the target.
The amount of processing resources allocated to the job may be increased 614. For example, if the backlog increase is determined to be positive and the backlog amount is not below the target, additional resources may be allocated to the job so that the backlog increase may be stopped or reversed.
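The five outcomes of process 600 can be collected into one decision function (a sketch; the threshold comparison and tolerance are assumptions):

```python
def scaling_decision(backlog_growth, backlog_seconds, target_seconds, tolerance=1.0):
    """Combine the sign of backlog growth with backlog size vs. target:
    maintain at the target, free capacity when below it with no growth,
    and add capacity when above it or when growing at or above it."""
    below = backlog_seconds < target_seconds - tolerance
    above = backlog_seconds > target_seconds + tolerance
    if backlog_growth <= 0:
        if below:
            return "decrease"   # stable and below target: free capacity
        return "increase" if above else "maintain"
    # growing backlog: tolerate it only while below the target
    return "maintain" if below else "increase"
```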
Other example processes may include different numbers, types, and orders of elements.
Figs. 6A, 6B, and 6C are flow diagrams of example processes for setting up, resizing, and pausing tasks.
During initial startup 700, the stream load balancer may be responsible for telling the workers which ranges they own and which disks to mount, and for coordinating the attachment of data disks to the correct workers. During resizing, the load balancer may tell workers to stop working on certain ranges, unmount the corresponding disks, detach the disks, reattach the disks, create a new topology setup task for all workers, and create new start-computation tasks.
The stream load balancer interacts with the workers using small running mini-workflow step instances. These instances are triggered whenever the stream load balancer needs to interact with the workers. These steps may end by generating a particular disk migration update that the stream load balancer consumes in order to know that the steps it initiated have completed. After the stream load balancer receives the disk migration update, it can issue a new setup (topology) task to all workers using the new disk assignment.
Initial startup 702. The stream load balancer initially sends each worker a setup task message containing the topology and disk assignments. The initial setup task could contain the disks for each worker, but the system may decide not to send disk assignments on the initial setup task. This prevents workers from attempting to mount disks that have not yet been attached to them. Workers claim this setup task and, because it may be their first, they would otherwise attempt to mount the disks; but the disks would not yet be attached, which could result in errors in the system.
The stream load balancer may then initiate disk attachment 704. A first step may perform disk attachment for all workers of the topology given in the step inputs. For example, once all disks are attached, a subsequent start-computation step may trigger the generation of streaming computation tasks for all workers telling them which disks are attached. This step may wait until all of these tasks have entered a completed state. The workers claim these computation tasks, mount the disks, and begin working on the associated ranges. At that point, the workers may begin working 706.
When the system detects that all streaming computation tasks have completed, a disk migration update 708 may be generated to inform the stream load balancer that everything is done. The stream load balancer observes the disk migration update, cancels all existing setup tasks, and issues new setup tasks containing the final disk assignment to all workers.
A worker abandons its old setup task and seeks a new one at the next renewal of its setup task lease (1 s). Because the worker already has its disks properly mounted, it does not need to perform disk operations for the new setup task, as this is not its first.
Resizing (e.g., moving disks) 710. There are at least two cases, growing and shrinking. Both may be handled using the same or similar protocol steps, an example of which is as follows. Issue new empty setup tasks 712 for the new workers (not needed when scaling down); these initially contain no disks. Wait for the new workers to claim these empty setup tasks (to avoid prematurely stopping existing flows); when shrinking, there may be no need to wait. Send a stop-computation task 712 to all workers that lose disks and wait for them all to complete. Perform a disk detachment step 714. Perform a disk attachment step 716. Send start-computation tasks to all workers that gain disks 718 and wait for them all to complete. Cancel all existing setup tasks and issue new final setup tasks to all workers 720. Coordination between the stream load balancer and the mini-workflow steps described above can be done in the same manner as in the initial setup process; i.e., after the last step completes, a disk migration update is generated and consumed by the stream load balancer before it issues the final new setup tasks.
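Under the assumption that each step is a mini-workflow triggered by the load balancer, the resize protocol's ordering might be sketched as follows (step names are illustrative):

```python
def resize_step_order(scaling_up):
    """Ordered resize steps; the empty-setup/wait step is only needed when
    scaling up, so new workers claim tasks before existing flows stop.
    The final disk-migration update is what the load balancer consumes."""
    steps = []
    if scaling_up:
        steps.append("issue_empty_setup_tasks_and_wait_for_new_workers")
    steps += [
        "send_stop_computation_to_workers_losing_disks",
        "detach_disks",
        "attach_disks",
        "send_start_computation_to_workers_gaining_disks",
        "cancel_existing_setup_tasks_and_issue_final_setup",
        "emit_disk_migration_update",
    ]
    return steps
```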
Pausing 740. To pause the workflow, a pause parameter 742 may be initialized. A stop of the workers' computation 704 may be initiated. The disks 706 may be detached from some or all of the workers. The disk migration update 708 may then be completed.
Although processes 700, 710, and 740 are described with a particular number, order, and type of operations, other processes are possible. For example, another process is described below. In this process, the system can issue new setup tasks at the same time it issues start- and stop-computation tasks, making the workers' behavior idempotent with respect to failures. In each disk migration, there may be at least three kinds of workers: workers that lose a disk, workers that gain a disk, and workers that keep the same disks. For testing, the system also supports a fourth combination: workers that both gain and lose disks.
Workers that lose any disk have their setup tasks canceled and need to claim new setup tasks that no longer contain the lost disks. In addition, if a worker will gain a disk in a later task, the setup task issued at this point does not contain disks it does not already own. Thus, if such workers crash, they do not attempt to mount disks they have just given up, and they do not attempt to mount disks they have not yet received. If they successfully unmount their disks, they reach the desired state. Otherwise, they re-claim the task and run the unmount again, which is a no-op (noop) if it has already completed. If they crash and restart, they only mount the disks they have retained in this round (neither lost nor gained).
A worker that gains a disk receives a new setup task at the same time as its start-computation task. If such workers crash, they mount the new disks upon restart, whether or not the start-computation task has completed. Again, if they have already mounted some disks, remounting is idempotent and causes no problems. Workers that keep the same disks during a migration may be given their new setup tasks at any time. The system may create the new setup tasks for these workers at the stop-computation point, i.e., the first point at which some workers must stop working with the existing topology.
Initial startup. At initial startup, the system no longer issues initial setup tasks; instead, the system attaches the disks and then issues all final setup tasks as part of the start-computation steps. This avoids the race and shortens startup time, because no setup tasks need to be abandoned and re-claimed.
Resizing (migration). When scaling up, the system should wait for the new virtual machines to be ready before removing disks from productive workers. Thus, the system still issues initial setup tasks for the new workers, with a flag set (since there are no unassigned disks, these tasks contain no disks at that time). New workers initially get an empty setup task, but this makes it possible to wait until they are ready.
Thereafter, stop-computation tasks and new setup tasks (with cancellation of the existing setup tasks) are issued for the existing workers that lose disks. At the same time, the system may also issue the new setup tasks for workers that neither gain nor lose disks. Once all stop-computation tasks have completed, the disks are detached and then attached. Finally, new start-computation tasks are issued, along with new setup tasks for the workers that gain disks (and cancellation of their existing setup tasks).
The new workers may then see either their newly created setup task with the final disk assignment first, or the start-computation task first. Either enables them to mount the disks they hold at that time. If they crash during this period, they still reach the new target state.
Pausing. On pausing, the system issues stop-computation tasks to all virtual machines, along with new setup tasks that do not include disk assignments and cancellation of the existing setup tasks (so that in the event of a crash, workers do not attempt to mount disks they may have given up). The disks are then detached.
It may be useful to compute a mapping from a worker to its setup tasks. Because this mapping is modified in the stop- and start-computation steps, which have their own nested transactions, it cannot be maintained consistently in the stream load balancer as before. Instead, the result cache in these steps is used to compute the mapping as needed. This also provides an opportunity to delete all old call results when aborting or completing tasks.
This mapping can be computed and queried whenever a new setup task is issued for a worker, in order to cancel the existing setup task (if any). This occurs after the stop-computation or start-computation steps, and on scaling down or pausing.
Additionally, some, but not all, of the information available to the system may be used to communicate the status of job processing to the user. The following are some example details that may be used.
Fig. 7 illustrates an example of a mobile computing device 850 and a computing device 800 that may be used to implement the techniques described herein. Computing device 800 is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Mobile computing device 850 is intended to represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smart phones, and other similar computing devices. The components shown here, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to be limiting.
Computing device 800 includes a processor 802, memory 804, a storage device 806, a high-speed interface 808 connecting to memory 804 and to a plurality of high-speed expansion ports 810, and a low-speed interface 812 connecting to low-speed expansion ports 814 and storage device 806. Each of the processor 802, memory 804, storage 806, high-speed interface 808, high-speed expansion port 810, and low-speed interface 812, are interconnected using various buses, and may be mounted on a common motherboard or in other manners as appropriate. The processor 802 may process instructions for execution within the computing device 800, including instructions stored in the memory 804 or on the storage device 806 to display graphical information for a GUI on an external input/output device, such as display 816 coupled to high speed interface 808. In other embodiments, multiple processors and/or multiple buses may be used, as appropriate, along with multiple memories and types of memory. Also, multiple computing devices may be connected, with each device providing portions of the necessary operations (e.g., as a server bank, a group of blade servers, or a multi-processor system).
The memory 804 stores information within the computing device 800. In some embodiments, memory 804 is a volatile memory unit or units. In some implementations, the memory 804 is one or more non-volatile memory units. The memory 804 may also be another form of computer-readable medium, such as a magnetic or optical disk.
The storage device 806 can provide mass storage for the computing device 800. In some implementations, the storage device 806 may be or contain a computer-readable medium, such as a floppy disk device, a hard disk device, an optical disk device, a tape device, a flash memory or other similar solid-state memory device, or an array of devices, including devices in a storage area network or other configurations. Instructions can be stored in an information carrier. The instructions, when executed by one or more processing devices (e.g., processor 802), perform one or more of the methods described above. The instructions can also be stored by one or more storage devices, such as computer- or machine-readable media (e.g., memory 804, storage device 806, or memory on processor 802).
The high-speed interface 808 manages bandwidth-intensive operations for the computing device 800, while the low-speed interface 812 manages lower-bandwidth operations. Such an allocation of functions is exemplary only. In some embodiments, the high-speed interface 808 is coupled to memory 804, display 816 (e.g., through a graphics processor or accelerator), and the high-speed expansion ports 810, which may accept various expansion cards (not shown). In this embodiment, low-speed interface 812 is coupled to storage device 806 and low-speed expansion port 814. The low-speed expansion port 814, which may include various communication ports (e.g., USB, Bluetooth, Ethernet, wireless Ethernet), may be coupled, for example through a network adapter, to one or more input/output devices such as a keyboard, pointing device, scanner, or a networking device such as a switch or router.
As shown, the computing device 800 may be implemented in a number of different forms. For example, it may be implemented as a standard server 820, or multiple times in a group of such servers. Additionally, it may be implemented in a personal computer such as a laptop computer 822. It may also be implemented as part of a rack server system 824. In addition, components from computing device 800 may be combined with other components in a mobile device (not shown), such as mobile computing device 850. Each of these devices may contain one or more of computing device 800 and mobile computing device 850, and an entire system may be made up of multiple computing devices in communication with each other.
Mobile computing device 850 includes a processor 852, memory 864, an input/output device such as a display 854, a communication interface 866, and a transceiver 868, among other components. The mobile computing device 850 may also be provided with a storage device, such as a microdrive or other device, to provide additional storage. Each of the processor 852, memory 864, display 854, communication interface 866, and transceiver 868 are interconnected using various buses, and several components may be mounted on a common motherboard or in other manners as appropriate.
The processor 852 can execute instructions within the mobile computing device 850, including instructions stored in the memory 864. Processor 852 may be implemented as a chipset of chips that include separate pluralities of analog and digital processors. For example, processor 852 may provide coordination of the other components of mobile computing device 850, such as control of user interfaces, applications run by mobile computing device 850, and wireless communication through mobile computing device 850.
Processor 852 may communicate with a user through a display interface 856 and a control interface 858 coupled to a display 854. For example, the display 854 may be a TFT (thin film transistor liquid crystal display) display or an OLED (organic light emitting diode) display or other suitable display technology. The display interface 856 may comprise appropriate circuitry for driving the display 854 to present graphical and other information to a user. The control interface 858 may receive commands from a user and convert them for submission to the processor 852. In addition, an external interface 862 may provide communication with the processor 852, so as to enable near field communication of the mobile computing device 850 with other devices. For example, external interface 862 may provide for wired communication in some embodiments, or for wireless communication in other embodiments, and multiple interfaces may also be used.
The memory 864 stores information within the mobile computing device 850. The memory 864 may be implemented as one or more of a computer-readable medium or media, a volatile memory unit or units, or a non-volatile memory unit or units. Expansion memory 874 may also be provided and connected to mobile computing device 850 through expansion interface 872, which may include, for example, a SIMM (Single In-Line Memory Module) card interface. Expansion memory 874 may provide additional storage space for mobile computing device 850, or may also store applications or other information for mobile computing device 850. Specifically, expansion memory 874 may include instructions to carry out or supplement the processes described above, and may also include secure information. Thus, for example, expansion memory 874 may be provided as a security module for mobile computing device 850, and may be programmed with instructions that permit secure use of mobile computing device 850. In addition, secure applications may be provided via the SIMM cards, along with additional information, such as placing identifying information on the SIMM card in a non-hackable manner.
As discussed below, the memory may include, for example, flash memory and/or NVRAM memory (non-volatile random access memory). In some embodiments, the instructions are stored in an information carrier. The instructions, when executed by one or more processing devices (e.g., processor 852), perform one or more methods as described above. The instructions may also be stored by one or more storage devices, e.g., by one or more computer- or machine-readable media (e.g., memory 864, expansion memory 874, or memory on processor 852). In some implementations, the instructions may be received in a propagated signal, such as over the transceiver 868 or the external interface 862.
Mobile computing device 850 may communicate wirelessly through a communication interface 866, which may include digital signal processing circuitry where necessary. Communication interface 866 may provide for communications under various modes or protocols, such as through GSM voice calls (global system for mobile communications), SMS (short message service), EMS (enhanced messaging service), or MMS messages (multimedia messaging service), CDMA (code division multiple access), TDMA (time division multiple access), PDC (personal digital cellular), WCDMA (wideband code division multiple access), CDMA2000, or GPRS (general packet radio service), among others. Such communication may occur, for example, through the transceiver 868 using radio frequencies. Additionally, short-range communication may occur, for example, using a Bluetooth, WiFi, or other such transceiver (not shown). In addition, GPS (Global Positioning System) receiver module 870 may provide additional navigation- and location-related wireless data to mobile computing device 850, which may be used as appropriate by applications running on mobile computing device 850.
Mobile computing device 850 may also communicate audibly using audio codec 860, where audio codec 860 may receive conversational information from the user and convert it into usable digital information. Audio codec 860 may likewise generate audible sound for a user, e.g., through a speaker, such as in a headset of mobile computing device 850. Such sound may include sound from voice telephone calls, may include recorded sound (e.g., voice messages, music files, etc.), and may also include sound generated by applications operating on the mobile computing device 850.
As shown, the mobile computing device 850 may be implemented in a number of different forms. For example, it may be implemented as a cellular telephone 880. It may also be implemented as part of a smart phone 882, personal digital assistant, or other similar mobile device.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, specially designed ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations can include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications or code) include machine instructions for a programmable processor, and can be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms machine-readable medium and computer-readable medium refer to any computer program product, apparatus and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term machine-readable signal refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor), a keyboard and a pointing device (e.g., a mouse or a trackball), the display device being for displaying information to the user and by which the user can provide input to the computer. Other types of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that can include a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication, e.g., a communication network. Examples of communication networks include a Local Area Network (LAN), a Wide Area Network (WAN), and the Internet.
The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
The following examples summarize other embodiments:
example 1: a method implemented in a computer system, the method comprising: running a job in a computer system comprising a plurality of processing resources, the job receiving as input a data stream, wherein an amount of data in the data stream is unlimited; iteratively determining for the job: a backlog increase over a first period of time, wherein the backlog increase is a measure of the increase in raw data received in a data stream to be input into a job; a backlog amount, which is a measure of unprocessed data in the received data stream to be input into the job; determining whether to adjust an amount of processing resources allocated to the job based on the backlog growth and the backlog amount; for each iteration determined to adjust the amount of processing resources committed to the job, adjusting the amount of processing resources committed to the job; and for each iteration determined not to adjust the amount of processing resources allocated to the job, maintaining the amount of processing resources allocated to the job.
Example 2: the method of example 1, wherein, for a determined iteration: the backlog increase is determined to be zero or negative; the backlog amount is determined to be at the target; determining that the amount of processing resources committed to the job is determined to be unadjusted in response to the backlog increase being determined to be zero or negative and the backlog amount being determined to be at the target.
Example 3: the method of example 1 or 2, wherein, for a determined iteration: the backlog increase is determined to be zero or negative; the backlog amount is determined to be below the target; determining that the amount of processing resources committed to the job is determined to be adjusted in response to the backlog increase being determined to be zero or negative and the backlog amount being determined to be below the target; and wherein adjusting the amount of processing resources allocated to the job comprises: the amount of processing resources committed to the job is reduced in response to the backlog increase being determined to be zero or negative and the backlog amount being determined to be below the target.
Example 4: the method according to one of examples 1 to 3, wherein, for a determined iteration: the backlog increase is determined to be zero or negative; the backlog amount is determined to be above the target; determining that the amount of processing resources committed to the job is determined to be adjusted in response to the backlog increase being determined to be zero or negative and the backlog amount being determined to be above the target; and wherein adjusting the amount of processing resources allocated to the job comprises: the amount of processing resources committed to the job is increased in response to the backlog increase being determined to be zero or negative and the backlog amount being determined to be above the target.
Example 5: the method according to one of examples 1 to 4, wherein, for a determined iteration: the backlog increase is determined to be positive; the backlog amount is determined to be below the target; in response to the backlog increase being determined to be positive and the backlog amount being determined to be below the target, determining that the amount of processing resources committed to the job is determined to be unadjusted.
Example 6: the method according to one of examples 1 to 5, wherein, for a determined iteration: the backlog increase is determined to be positive; the backlog amount is determined to be not lower than the target; determining that the amount of processing resources committed to the job is determined to be adjusted in response to the backlog increase being determined to be positive and the backlog amount being determined to be below the target; and wherein adjusting the amount of processing resources allocated to the job comprises: the amount of processing resources committed to the job is increased in response to the backlog increase being determined to be positive and the backlog amount being determined to be below the target.
Example 7: the method according to one of examples 1 to 6, wherein the backlog growth is a measure of data size.
Example 8: the method of example 7, wherein the unit of data size is at least one of the group consisting of a bit, a byte, a megabyte, a gigabyte, a record, and a cardinality.
Example 9: the method according to one of examples 1 to 6, wherein the backlog growth is a measure of processing time.
Example 10: the method of example 9, wherein the unit of processing time is at least one of the group consisting of microseconds, seconds, minutes, hours, and days.
Example 11: the method according to one of examples 1 to 10, wherein the backlog amount is a measure of data size.
Example 12: the method of example 11, wherein the unit of data size is at least one of the group consisting of a bit, a byte, a megabyte, a gigabyte, a record, and a cardinality.
Example 13: the method according to one of examples 1 to 10, wherein the backlog amount is a measure of processing time.
Example 14: the method of example 13, wherein the unit of processing time is at least one of the group consisting of microseconds, seconds, minutes, hours, and days.
Example 15: the method according to one of examples 1 to 14, wherein the method further comprises: iteratively determining a processor utilization for the job; wherein iteratively determining whether to adjust the amount of processing resources allocated to the job is further based on processor utilization.
Example 16: the method of example 15, wherein: wherein, for a determined iteration: processor utilization is below a value; determining that an amount of processing resources allocated to a job is determined to be adjusted in response to determining that processor utilization is below a value; and wherein adjusting the amount of processing resources allocated to the job comprises: the amount of processing resources allocated to the job is reduced in response to determining that the processor utilization is below a value.
Example 17: the method of example 16, wherein reducing the amount of processing resources committed to the job in response to determining that the processor utilization is below a value comprises: reducing the discrete amount of resources allocated to the job, the discrete amount based on processor utilization.
Example 18: the method of example 17, wherein the discrete number is a number of computer memory disks.
Example 19: the method of one of examples 1 to 18, wherein determining whether to adjust the amount of processing resources allocated to the job based on the backlog growth and the backlog amount comprises: the determination of the oscillation of the amount of processing resources that results in a commit to the job is smoothed.
Example 20: the method of example 19, wherein smoothing the determination comprises: waiting a second period of time.
Example 21: the method of example 19, wherein smoothing the determination comprises: the multiple determinations of whether to adjust the amount of processing resources allocated to the job are averaged.
Example 22: a system, comprising: one or more processors configured to execute computer program instructions; and a computer storage medium encoded with computer program instructions that, when executed by one or more processors, cause a computing device to perform operations. The operation comprises the following steps: running a job in a computer system comprising a plurality of processing resources, the job receiving as input a data stream, wherein an amount of data in the data stream is unlimited; iteratively determining for the job: a backlog increase over a first period of time, wherein the backlog increase is a measure of the increase in raw data received in a data stream to be input into a job; a backlog amount, which is a measure of unprocessed data in the received data stream to be input into the job; determining whether to adjust an amount of processing resources allocated to the job based on the backlog growth and the backlog amount; for each iteration determined to adjust the amount of processing resources committed to the job, adjusting the amount of processing resources committed to the job; and for each iteration determined not to adjust the amount of processing resources allocated to the job, maintaining the amount of processing resources allocated to the job.
Example 23: the system of example 22, wherein, for the determined iteration: the backlog increase is determined to be zero or negative; the backlog amount is determined to be at the target; determining that the amount of processing resources committed to the job is determined to be unadjusted in response to the backlog increase being determined to be zero or negative and the backlog amount being determined to be at the target.
Example 24: the system of example 22 or 23, wherein, for the determined iteration: the backlog increase is determined to be zero or negative; the backlog amount is determined to be below the target; determining that the amount of processing resources committed to the job is determined to be adjusted in response to the backlog increase being determined to be zero or negative and the backlog amount being determined to be below the target; and wherein adjusting the amount of processing resources allocated to the job comprises: the amount of processing resources committed to the job is reduced in response to the backlog increase being determined to be zero or negative and the backlog amount being determined to be below the target.
Example 25: the system of one of examples 22 to 24, wherein, for a determined iteration: the backlog increase is determined to be zero or negative; the backlog amount is determined to be above the target; determining that the amount of processing resources committed to the job is determined to be adjusted in response to the backlog increase being determined to be zero or negative and the backlog amount being determined to be above the target; and wherein adjusting the amount of processing resources allocated to the job comprises: the amount of processing resources committed to the job is increased in response to the backlog increase being determined to be zero or negative and the backlog amount being determined to be above the target.
Example 26: the system of one of examples 22 to 25, wherein, for a determined iteration: the backlog increase is determined to be positive; the backlog amount is determined to be below the target; in response to the backlog increase being determined to be positive and the backlog amount being determined to be below the target, determining that the amount of processing resources committed to the job is determined to be unadjusted.
Example 27: the system of one of examples 22 to 26, wherein, for a determined iteration: the backlog increase is determined to be positive; the backlog amount is determined to be not lower than the target; determining that the amount of processing resources committed to the job is determined to be adjusted in response to the backlog increase being determined to be positive and the backlog amount being determined to be below the target; and wherein adjusting the amount of processing resources allocated to the job comprises: the amount of processing resources committed to the job is increased in response to the backlog increase being determined to be positive and the backlog amount being determined to be below the target.
Example 28: the system of one of examples 22 to 27, wherein the backlog growth is a measure of data size.
Example 29: the system of example 28, wherein the unit of data size is at least one of the group consisting of a bit, a byte, a megabyte, a gigabyte, a record, and a cardinality.
Example 30: the system of one of examples 22 to 27, wherein the backlog increase is a measure of processing time.
Example 31: the system of example 30, wherein the unit of processing time is at least one of the group consisting of microseconds, seconds, minutes, hours, and days.
Example 32: the system of one of examples 22 to 31, wherein the backlog amount is a measure of data size.
Example 33: the system of example 32, wherein the unit of data size is at least one of the group consisting of a bit, a byte, a megabyte, a gigabyte, a record, and a cardinality.
Example 34: the system of one of examples 22 to 31, wherein the backlog amount is a measure of processing time.
Example 35: the system of example 34, wherein the unit of processing time is at least one of the group consisting of microseconds, seconds, minutes, hours, and days.
Example 36: the system of one of examples 22 to 35, wherein the operations further comprise: iteratively determining a processor utilization for the job; wherein iteratively determining whether to adjust the amount of processing resources allocated to the job is further based on processor utilization.
Example 37: the system of example 36, wherein: wherein, for a determined iteration: processor utilization is below a value; determining that an amount of processing resources allocated to a job is determined to be adjusted in response to determining that processor utilization is below a value; and wherein adjusting the amount of processing resources allocated to the job comprises: the amount of processing resources allocated to the job is reduced in response to determining that the processor utilization is below a value.
Example 38: the system of example 37, wherein reducing the amount of processing resources committed to the job in response to determining that the processor utilization is below a value comprises: reducing the discrete amount of resources allocated to the job, the discrete amount based on processor utilization.
Example 39: the system of example 38, wherein the discrete number is a number of computer memory disks.
Example 40: the system of one of examples 22 to 39, wherein determining whether to adjust the amount of processing resources allocated to the job based on the backlog growth and the backlog amount comprises: the determination of the oscillation of the amount of processing resources that results in a commit to the job is smoothed.
Example 41: the system of example 40, wherein smoothing the determination comprises: waiting a second period of time.
Example 42: the system of example 40 or 41, wherein smoothing the determination comprises: the multiple determinations of whether to adjust the amount of processing resources allocated to the job are averaged.
Although some embodiments have been described in detail above, other modifications are possible. For example, while a client application is described as accessing a delegate, in other embodiments the delegate may be employed by other applications implemented by one or more processors, such as applications running on one or more servers. In addition, the logic flows depicted in the figures do not require the particular order shown, or sequential order, to achieve desirable results. Additionally, other acts may be provided, or acts may be deleted, from the described flows, and other components may be added to, or removed from, the described systems. Accordingly, other implementations are within the scope of the following claims.

Claims (26)

1. A method implemented in a computer system, the method comprising:
running a job in a computer system comprising a plurality of processing resources, the job receiving as input a data stream, wherein an amount of data in the data stream is unlimited;
iteratively determining for the job:
a backlog increase over a first period of time, wherein the backlog increase is a measure of increase in raw data received in a data stream to be input into a job, wherein the backlog increase is measured in terms of a change in data size;
a backlog amount that is a measure of unprocessed data in a received data stream to be input into a job, wherein the backlog amount is measured in terms of data size;
determining whether to adjust an amount of processing resources allocated to the job based on the backlog growth and the backlog amount, wherein, for an iteration of the determining:
the backlog increase is determined to be zero or negative;
the backlog amount is determined to be at a target;
determining that the amount of processing resources committed to the job is determined to be unadjusted in response to the backlog increase being determined to be zero or negative and the backlog amount being determined to be at a target;
for each iteration determined to adjust the amount of processing resources allocated to the job, adjusting the amount of processing resources allocated to the job, wherein adjusting the amount of processing resources further comprises determining whether to reduce the processing resources and, if it is determined to reduce the processing resources, to what extent to reduce them, using processor utilization as a proxy for whether the system can handle the next lower level of allocated processing resources; and
for each iteration that is determined not to adjust the amount of processing resources allocated to the job, maintaining the amount of processing resources allocated to the job.
2. The method of claim 1, wherein for an iteration of the determining:
the backlog increase is determined to be zero or negative;
the backlog amount is determined to be below a target;
determining that the amount of processing resources committed to the job is determined to be adjusted in response to the backlog increase being determined to be zero or negative and the backlog amount being determined to be below a target; and
wherein adjusting the amount of processing resources allocated to the job comprises: reducing an amount of processing resources allocated to the job in response to the backlog increase being determined to be zero or negative and the backlog amount being determined to be below a target.
3. The method of claim 1, wherein for an iteration of the determining:
the backlog increase is determined to be zero or negative;
the backlog amount is determined to be above a target;
determining that the amount of processing resources committed to the job is determined to be adjusted in response to the backlog increase being determined to be zero or negative and the backlog amount being determined to be above a target; and
wherein adjusting the amount of processing resources allocated to the job comprises: increasing the amount of processing resources committed to the job in response to the backlog increase being determined to be zero or negative and the backlog amount being determined to be above a target.
4. The method of claim 1, wherein for an iteration of the determining:
the backlog increase is determined to be positive;
the backlog amount is determined to be below a target;
determining that the amount of processing resources committed to the job is determined to be unadjusted in response to the backlog increase being determined to be positive and the backlog amount being determined to be below a target.
5. The method of claim 1, wherein for an iteration of the determining:
the backlog increase is determined to be positive;
the backlog amount is determined to be not lower than a target;
determining that the amount of processing resources committed to the job is determined to be adjusted in response to the backlog increase being determined to be positive and the backlog amount being determined to be not below a target; and
wherein adjusting the amount of processing resources allocated to the job comprises: increasing the amount of processing resources committed to the job in response to the backlog increase being determined to be positive and the backlog amount being determined to be not below a target.
6. The method of claim 1, wherein the unit of data size is at least one of the group consisting of bits, bytes, megabytes, gigabytes, records, and cardinality.
7. The method of claim 1, further comprising:
iteratively determining a processor utilization for the job;
wherein iteratively determining whether to adjust the amount of processing resources committed to the job is further based on the processor utilization.
8. The method of claim 7, wherein:
wherein, for an iteration of the determining:
the processor utilization is below a value;
determining that an amount of processing resources allocated to the job is determined to be adjusted in response to determining that the processor utilization is below a value; and
wherein adjusting the amount of processing resources allocated to the job comprises: reducing an amount of processing resources allocated to the job in response to determining that the processor utilization is below a value.
9. The method of claim 8, wherein reducing the amount of processing resources allocated to the job in response to determining that the processor utilization is below a value comprises: reducing a discrete number of resources allocated to the job, the discrete number based on the processor utilization.
10. The method of claim 9, wherein the discrete number is a number of computer memory disks.
11. The method of claim 1, wherein determining whether to adjust the amount of processing resources allocated to the job based on the backlog growth and the backlog amount comprises: smoothing the determination that causes an oscillation in an amount of processing resources allocated to the job.
12. The method of claim 11, wherein smoothing the determination comprises: waiting a second period of time.
13. The method of claim 11, wherein smoothing the determination comprises: averaging the plurality of determinations of whether to adjust the amount of processing resources allocated to the job.
14. A system, comprising:
one or more processors configured to execute computer program instructions; and
one or more computer storage media encoded with computer program instructions that, when executed by the one or more processors, cause a computing device to perform operations comprising:
running a job in a computer system comprising a plurality of processing resources, the job receiving as input a data stream, wherein an amount of data in the data stream is unlimited;
iteratively determining for the job:
a backlog increase over a first period of time, wherein the backlog increase is a measure of increase in raw data received in a data stream to be input into the job, wherein the backlog increase is measured in terms of a change in data size;
a backlog amount that is a measure of unprocessed data in a received data stream to be input into the job, wherein the backlog amount is measured in terms of data size;
determining whether to adjust an amount of processing resources allocated to the job based on the backlog growth and the backlog amount, wherein, for an iteration of the determining:
the backlog increase is determined to be zero or negative;
the backlog amount is determined to be at a target;
determining that the amount of processing resources committed to the job is determined to be unadjusted in response to the backlog increase being determined to be zero or negative and the backlog amount being determined to be at a target;
for each iteration determined to adjust the amount of processing resources allocated to the job, adjusting the amount of processing resources allocated to the job, wherein adjusting the amount of processing resources further comprises determining whether to reduce the processing resources and, if it is determined to reduce the processing resources, to what extent to reduce them, using processor utilization as a proxy for whether the system can handle the next lower level of allocated processing resources; and
for each iteration that is determined not to adjust the amount of processing resources allocated to the job, maintaining the amount of processing resources allocated to the job.
15. The system of claim 14, wherein for an iteration of the determining:
the backlog increase is determined to be zero or negative;
the backlog amount is determined to be below a target;
determining that the amount of processing resources committed to the job is determined to be adjusted in response to the backlog increase being determined to be zero or negative and the backlog amount being determined to be below a target; and
wherein adjusting the amount of processing resources allocated to the job comprises: reducing an amount of processing resources allocated to the job in response to the backlog increase being determined to be zero or negative and the backlog amount being determined to be below a target.
16. The system of claim 14, wherein for an iteration of the determining:
the backlog increase is determined to be zero or negative;
the backlog amount is determined to be above a target;
determining that the amount of processing resources committed to the job is determined to be adjusted in response to the backlog increase being determined to be zero or negative and the backlog amount being determined to be above a target; and
wherein adjusting the amount of processing resources allocated to the job comprises: increasing the amount of processing resources committed to the job in response to the backlog increase being determined to be zero or negative and the backlog amount being determined to be above a target.
17. The system of claim 14, wherein for an iteration of the determining:
the backlog increase is determined to be positive;
the backlog amount is determined to be below a target;
determining that the amount of processing resources committed to the job is determined to be unadjusted in response to the backlog increase being determined to be positive and the backlog amount being determined to be below a target.
18. The system of claim 14, wherein, for an iteration of the determining:
the backlog growth is determined to be positive;
the backlog amount is determined to be not below a target;
the amount of processing resources allocated to the job is determined to be adjusted in response to the backlog growth being determined to be positive and the backlog amount being determined to be not below the target; and
wherein adjusting the amount of processing resources allocated to the job comprises increasing the amount of processing resources allocated to the job in response to the backlog growth being determined to be positive and the backlog amount being determined to be not below the target.
19. The system of claim 14, wherein the unit of data size is at least one of the group consisting of bits, bytes, megabytes, gigabytes, records, and cardinality.
20. The system of claim 14, wherein the operations further comprise:
iteratively determining a processor utilization for the job;
wherein iteratively determining whether to adjust the amount of processing resources allocated to the job is further based on the processor utilization.
21. The system of claim 20, wherein, for an iteration of the determining:
the processor utilization is below a value;
the amount of processing resources allocated to the job is determined to be adjusted in response to determining that the processor utilization is below the value; and
wherein adjusting the amount of processing resources allocated to the job comprises reducing the amount of processing resources allocated to the job in response to determining that the processor utilization is below the value.
22. The system of claim 21, wherein reducing the amount of processing resources allocated to the job in response to determining that the processor utilization is below a value comprises: reducing a discrete number of resources allocated to the job, the discrete number based on the processor utilization.
23. The system of claim 22, wherein the discrete number is a number of computer memory disks.
24. The system of claim 14, wherein determining whether to adjust the amount of processing resources allocated to the job based on the backlog growth and the backlog amount comprises: smoothing determinations that cause oscillations in the amount of processing resources allocated to the job.
25. The system of claim 24, wherein smoothing the determination comprises: waiting a second period of time.
26. The system of claim 24, wherein smoothing the determination comprises: averaging a plurality of determinations of whether to adjust the amount of processing resources allocated to the job.
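The decision table of claims 15–18 and the averaging-based smoothing of claims 24–26 can be sketched as follows. This is a minimal illustration, not the claimed implementation: all names (`scaling_signal`, `SmoothedScaler`), the ±1/0 signal encoding, and the window and threshold values are assumptions chosen for clarity.

```python
from collections import deque


def scaling_signal(backlog_growth: float, backlog_amount: float,
                   target: float) -> int:
    """Return +1 (add resources), -1 (remove), or 0 (hold),
    following the decision table in claims 15-18.
    (Signal encoding is illustrative, not from the patent.)"""
    if backlog_growth <= 0:
        if backlog_amount < target:
            return -1   # claim 15: backlog stable/shrinking and small -> reduce
        if backlog_amount > target:
            return +1   # claim 16: backlog stable/shrinking but large -> increase
        return 0        # exactly at target: hold
    # backlog_growth > 0
    if backlog_amount < target:
        return 0        # claim 17: growing but still below target -> hold
    return +1           # claim 18: growing and not below target -> increase


class SmoothedScaler:
    """Smooths per-iteration decisions (claims 24 and 26) by averaging the
    last `window` signals and acting only when the average clears a
    threshold, damping oscillation in the allocated resources."""

    def __init__(self, window: int = 3, threshold: float = 0.5):
        self.signals: deque = deque(maxlen=window)
        self.threshold = threshold

    def update(self, signal: int) -> int:
        self.signals.append(signal)
        avg = sum(self.signals) / len(self.signals)
        if avg >= self.threshold:
            return +1
        if avg <= -self.threshold:
            return -1
        return 0
```

For example, a single +1 followed by a run of -1 signals does not flip the smoothed decision immediately; the scaler holds at 0 until the averaging window fills with -1 values.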
CN201680078758.4A 2016-03-04 2016-12-19 Resource allocation for computer processing Active CN108885561B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210267059.1A CN114756341A (en) 2016-03-04 2016-12-19 Resource allocation for computer processing

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201662303827P 2016-03-04 2016-03-04
US62/303,827 2016-03-04
PCT/US2016/067538 WO2017151209A1 (en) 2016-03-04 2016-12-19 Resource allocation for computer processing

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202210267059.1A Division CN114756341A (en) 2016-03-04 2016-12-19 Resource allocation for computer processing

Publications (2)

Publication Number Publication Date
CN108885561A CN108885561A (en) 2018-11-23
CN108885561B true CN108885561B (en) 2022-04-08

Family

ID=57799812

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202210267059.1A Pending CN114756341A (en) 2016-03-04 2016-12-19 Resource allocation for computer processing
CN201680078758.4A Active CN108885561B (en) 2016-03-04 2016-12-19 Resource allocation for computer processing


Country Status (8)

Country Link
US (2) US10558501B2 (en)
EP (2) EP3971719A1 (en)
JP (2) JP6637186B2 (en)
KR (1) KR102003872B1 (en)
CN (2) CN114756341A (en)
AU (3) AU2016396079B2 (en)
SG (1) SG11201805281YA (en)
WO (1) WO2017151209A1 (en)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10331481B2 (en) * 2017-03-08 2019-06-25 International Business Machines Corporation Automatic reconfiguration of high performance computing job schedulers based on user behavior, user feedback, and job performance monitoring
US11334391B2 (en) * 2017-04-17 2022-05-17 Red Hat, Inc. Self-programmable and self-tunable resource scheduler for jobs in cloud computing
US11064001B2 (en) * 2017-05-09 2021-07-13 EMC IP Holding Company LLC Atomically committing related streaming data across multiple distributed resources
CN110019944A (en) * 2017-12-21 2019-07-16 飞狐信息技术(天津)有限公司 A kind of recommended method and system of video
FR3087556B1 (en) * 2018-10-17 2022-04-29 Bull Sas METHOD FOR IMPROVING THE EFFICIENCY OF USING THE RESOURCES OF AN INFRASTRUCTURE INTENDED TO EXECUTE A SCHEDULING PLAN
EP3640800A1 (en) * 2018-10-17 2020-04-22 Bull Sas Method for improving the efficiency of use of the resources of an infrastructure designed to execute a scheduling plan
JP7145094B2 (en) * 2019-02-05 2022-09-30 Kddi株式会社 Control device, computer program and information processing method
US11366697B2 (en) * 2019-05-01 2022-06-21 EMC IP Holding Company LLC Adaptive controller for online adaptation of resource allocation policies for iterative workloads using reinforcement learning
US11025711B2 (en) 2019-05-02 2021-06-01 EMC IP Holding Company LLC Data centric resource management for edge cloud systems
US11586474B2 (en) 2019-06-28 2023-02-21 EMC IP Holding Company LLC Adaptation of resource allocation for multiple workloads using interference effect of resource allocation of additional workloads on performance
US11113171B2 (en) 2019-08-29 2021-09-07 EMC IP Holding Company LLC Early-convergence detection for online resource allocation policies for iterative workloads
US11327801B2 (en) 2019-08-29 2022-05-10 EMC IP Holding Company LLC Initialization of resource allocation for a workload characterized using a regression model
US20210096927A1 (en) * 2019-09-27 2021-04-01 Citrix Systems, Inc. Auto-scaling a pool of virtual delivery agents
US11868810B2 (en) 2019-11-15 2024-01-09 EMC IP Holding Company LLC Resource adaptation using nonlinear relationship between system performance metric and resource usage
TWI777334B (en) * 2019-12-20 2022-09-11 美商尼安蒂克公司 Sharded storage of geolocated data with predictable query response times
JP2021192189A (en) * 2020-06-05 2021-12-16 富士通株式会社 Pipeline split position deciding method and pipeline split position deciding program
US11650858B2 (en) 2020-09-24 2023-05-16 International Business Machines Corporation Maintaining stream processing resource type versions in stream processing
CN113821336B (en) * 2021-03-08 2024-04-05 北京京东乾石科技有限公司 Resource allocation method and device, storage medium and electronic equipment
US11665106B2 (en) * 2021-09-07 2023-05-30 Hewlett Packard Enterprise Development Lp Network-aware resource allocation
US11934673B2 (en) 2022-08-11 2024-03-19 Seagate Technology Llc Workload amplification metering and management

Family Cites Families (24)

Publication number Priority date Publication date Assignee Title
US6591287B1 (en) * 1999-09-08 2003-07-08 Lucent Technologies Inc. Method to increase the efficiency of job sequencing from sequential storage
JP3884427B2 (en) * 2003-12-10 2007-02-21 東芝ソリューション株式会社 Computer system and resource allocation program
US20050213507A1 (en) * 2004-03-25 2005-09-29 International Business Machines Corporation Dynamically provisioning computer system resources
US7660955B2 (en) * 2005-12-01 2010-02-09 International Business Machines Corporation Node polling in consistency group formation
US8458720B2 (en) * 2007-08-17 2013-06-04 International Business Machines Corporation Methods and systems for assigning non-continual jobs to candidate processing nodes in a stream-oriented computer system
KR20100035394A (en) * 2008-09-26 2010-04-05 삼성전자주식회사 Memory managing apparatus and method in parallel processing
CN101533362A (en) * 2009-04-15 2009-09-16 南京联创科技股份有限公司 Inter-process CPU resource balance scheduling method
US8639862B2 (en) * 2009-07-21 2014-01-28 Applied Micro Circuits Corporation System-on-chip queue status power management
JP2011118525A (en) * 2009-12-01 2011-06-16 Hitachi Ltd Server management apparatus, server management method, and server management program
US8201820B2 (en) * 2010-07-27 2012-06-19 Foxlink Image Technology Co., Ltd. Document feeding mechanism
US8799916B2 (en) * 2011-02-02 2014-08-05 Hewlett-Packard Development Company, L. P. Determining an allocation of resources for a job
JP5843459B2 (en) * 2011-03-30 2016-01-13 インターナショナル・ビジネス・マシーンズ・コーポレーションInternational Business Machines Corporation Information processing system, information processing apparatus, scaling method, program, and recording medium
KR20120122136A (en) * 2011-04-28 2012-11-07 삼성전자주식회사 A method of controlling a load shedding for data stream management system and an apparatus therefor
US20120296696A1 (en) * 2011-05-17 2012-11-22 International Business Machines Corporation Sustaining engineering and maintenance using sem patterns and the seminal dashboard
US20130060555A1 (en) * 2011-06-10 2013-03-07 Qualcomm Incorporated System and Apparatus Modeling Processor Workloads Using Virtual Pulse Chains
US9069606B2 (en) * 2012-05-08 2015-06-30 Adobe Systems Incorporated Autonomous application-level auto-scaling in a cloud
CN104023042B (en) * 2013-03-01 2017-05-24 清华大学 Cloud platform resource scheduling method
KR20150062634A (en) * 2013-11-29 2015-06-08 고려대학교 산학협력단 Auto scaling system and method in cloud computing environment
CN104951368B (en) * 2014-03-28 2019-02-22 中国电信股份有限公司 Resource dynamic allocation device and method
US20170185456A1 (en) * 2014-05-01 2017-06-29 Longsand Limited Dynamically scaled web service deployments
US9542107B2 (en) * 2014-06-25 2017-01-10 International Business Machines Corporation Flash copy relationship management
US20160306416A1 (en) * 2015-04-16 2016-10-20 Intel Corporation Apparatus and Method for Adjusting Processor Power Usage Based On Network Load
US20160378545A1 (en) * 2015-05-10 2016-12-29 Apl Software Inc. Methods and architecture for enhanced computer performance
US10044632B2 (en) * 2016-10-20 2018-08-07 Dell Products Lp Systems and methods for adaptive credit-based flow

Also Published As

Publication number Publication date
JP2020074101A (en) 2020-05-14
SG11201805281YA (en) 2018-07-30
AU2022200716A1 (en) 2022-02-24
EP3971719A1 (en) 2022-03-23
AU2020201056B2 (en) 2021-11-04
CN114756341A (en) 2022-07-15
AU2022200716B2 (en) 2023-06-01
KR102003872B1 (en) 2019-10-17
JP6637186B2 (en) 2020-01-29
US20170255491A1 (en) 2017-09-07
AU2016396079A1 (en) 2018-07-26
JP6971294B2 (en) 2021-11-24
JP2019508795A (en) 2019-03-28
CN108885561A (en) 2018-11-23
WO2017151209A1 (en) 2017-09-08
US20200225991A1 (en) 2020-07-16
AU2016396079B2 (en) 2019-11-21
AU2020201056A1 (en) 2020-03-05
KR20180085806A (en) 2018-07-27
US10558501B2 (en) 2020-02-11
EP3394753A1 (en) 2018-10-31

Similar Documents

Publication Publication Date Title
CN108885561B (en) Resource allocation for computer processing
KR102207050B1 (en) Methods and apparatus to manage jobs that can and cannot be suspended when there is a change in power allocation to a distributed computer system
US11734073B2 (en) Systems and methods for automatically scaling compute resources based on demand
US20230179538A1 (en) Systems and methods for provision of a guaranteed batch
JP5939740B2 (en) Method, system and program for dynamically allocating resources
US9372627B2 (en) Dynamic feedback-based throughput control for black-box storage systems
US8813082B2 (en) Thread priority based on object creation rates
CN109478147B (en) Adaptive resource management in distributed computing systems
US11150951B2 (en) Releasable resource based preemptive scheduling
JP2004538573A (en) Server resource management for hosted applications
US9367439B2 (en) Physical memory usage prediction
US10673937B2 (en) Dynamic record-level sharing (RLS) provisioning inside a data-sharing subsystem
US7107423B2 (en) Methods and apparatus for data retrieval
US20230136226A1 (en) Techniques for auto-tuning compute load resources

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant