CN111726303B - Flow control method and device and computing equipment - Google Patents


Info

Publication number
CN111726303B
Authority
CN
China
Prior art keywords
cluster
service
target
tokens
token
Prior art date
Legal status
Active
Application number
CN201910221588.6A
Other languages
Chinese (zh)
Other versions
CN111726303A (en)
Inventor
韩华伟
吕建文
张祥勇
Current Assignee
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201910221588.6A
Publication of CN111726303A
Application granted
Publication of CN111726303B
Legal status: Active

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 47/00: Traffic control in data switching networks
    • H04L 47/10: Flow control; Congestion control
    • H04L 47/215: Flow control; Congestion control using token-bucket
    • H04L 47/20: Traffic policing
    • H04L 47/24: Traffic characterised by specific attributes, e.g. priority or QoS
    • H04L 47/2425: Supporting services specification, e.g. SLA
    • H04L 47/2433: Allocation of priorities to traffic types
    • H04L 47/2475: Supporting traffic characterised by the type of applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer And Data Communications (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a flow control method, a flow control apparatus, and a computing device. The method comprises the following steps: determining the requested service scenario according to a service request sent by a device side; matching a target restrictor from a plurality of restrictors according to the requested service scenario, wherein the requested service scenario is contained in the service scenario associated with the target restrictor; and performing flow-restriction processing on the service request through the target restrictor.

Description

Flow control method and device and computing equipment
Technical Field
The invention relates to the field of the Internet of Things, and in particular to a flow control method, a flow control apparatus, and a computing device.
Background
Throttling means controlling the rate of traffic with which device sides access the server-side portal of the Internet of Things (IoT); the traffic may be bandwidth, requests per second, transactions per second, and so on. IoT devices come in many forms and their behavior is uncontrollable, so when the server side faces scenarios such as device connection flood peaks or malicious traffic attacks, throttling plays a key role in keeping the system running stably.
Disclosure of Invention
The present invention has been made in view of the above problems, and its object is to provide a flow control method, apparatus, and computing device that overcome, or at least partially solve, the above problems.
According to one aspect of the present invention, there is provided a flow control method performed at a server side, the server side being configured with a plurality of restrictors, each restrictor being associated with a traffic scenario, the method comprising:
determining the requested service scenario according to a service request sent by a device side;
matching a target restrictor from the plurality of restrictors according to the requested service scenario, wherein the requested service scenario is contained in the service scenario associated with the target restrictor;
and performing flow-restriction processing on the service request through the target restrictor.
Optionally, in the flow control method according to the invention, the service scenario associated with a restrictor is represented by a service parameter set comprising at least one of the following parameters: the current-limited object, the attribute of the object, the type of the attribute, the upper flow limit, the statistical time unit of the restriction, and the value range of the object.
Optionally, in the flow control method according to the present invention, the requested traffic scenario is represented by a request parameter set, the request parameter set including at least one of the following parameters: the object from which the request originates, the attributes of the object, and the type of attribute.
Optionally, in the flow control method according to the present invention, the object includes a device, a product, and a tenant; the attributes of the object include an IP address, a connection and a message; the types of attributes include bandwidth, number of times, and frequency.
Optionally, in the flow control method according to the present invention, the service parameter set and the request parameter set further include an extension tag, and the extension tag is an identifier for further defining the object and/or an attribute of the object.
Optionally, in the flow control method according to the present invention, the plurality of restrictors are prioritized with the current-limited object as the primary key and the value range of the object as the secondary key; the matching of the target restrictor from the plurality of restrictors according to the requested service scenario comprises: matching the target restrictor from the plurality of restrictors in descending order of priority.
Optionally, in the flow control method according to the present invention, the performing, by the target restrictor, flow restriction processing on the service request includes: when the number of locally available tokens is less than or equal to 0, a throttling action associated with the target restrictor is performed.
Optionally, in the flow control method according to the present invention, the flow restricting action includes: reject, delay, and silence.
Optionally, in the flow control method according to the present invention, the target restrictor determines the number of locally available tokens as follows: generating tokens in a token bucket based on a token generation algorithm associated with a target restrictor, wherein the number of tokens in the token bucket is a cluster global token number; acquiring the number of consumed tokens of the cluster and the number of local consumed tokens; the number of locally available tokens is calculated based on the cluster global token number, the cluster consumed token number, and the local consumed token number.
Optionally, in the flow control method according to the present invention, the calculating the locally available token number based on the cluster global token number, the cluster consumed token number, and the locally consumed token number includes: determining the available token number of the cluster according to the global token number of the cluster and the consumed token number of the cluster; determining the number of tokens which can be distributed to the server side in the available token numbers of the clusters according to the health degree of the server side and the cluster health degree; calculating the number of locally available tokens according to the number of tokens which can be allocated to the local server and the number of locally consumed tokens; the cluster health degree is an average value of health degrees of all the service terminals in the cluster, and the health degree of each service terminal is determined based on the resource load condition of the service terminal.
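The token arithmetic described above can be sketched as follows. This is a minimal illustration, not the patented algorithm itself: the exact rule for dividing the cluster's available tokens among servers is not given in the text, so the sketch assumes each server's share is weighted by its single-machine importance (health divided by cluster health) and split evenly across the N servers. All names are illustrative.

```python
def locally_available_tokens(cluster_global, cluster_consumed,
                             local_consumed, local_health,
                             cluster_health, num_servers):
    # Tokens not yet consumed anywhere in the cluster.
    cluster_available = cluster_global - cluster_consumed
    # Single-machine importance V = H / Hc: healthier servers get a
    # larger share (assumed allocation rule, not given in the text).
    importance = local_health / cluster_health
    assignable = cluster_available * importance / num_servers
    # Locally available = this server's assignable share minus what
    # it has already consumed.
    return assignable - local_consumed
```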
According to another aspect of the present invention, there is provided a flow control device for use at a server, the server being configured with a plurality of restrictors, each restrictor being associated with a traffic scenario, the device comprising:
the request scene determining unit is suitable for determining a requested service scene according to the service request sent by the equipment terminal;
a matching unit adapted to match a target restrictor from the plurality of restrictors according to the requested traffic scenario, wherein the requested traffic scenario is included in the traffic scenario associated with the target restrictor;
and the flow limiting processing unit is suitable for carrying out flow limiting processing on the service request through the target flow limiter.
According to yet another aspect of the present invention, there is provided a computing device comprising:
one or more processors;
a memory;
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing any of the methods described above.
According to yet another aspect of the present invention, there is provided a computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a computing device, cause the computing device to perform any of the methods described above.
According to the flow control scheme provided by the invention, a plurality of restrictors are configured at the server side, each corresponding to a restricted service scenario, and the target restrictor is selected by matching the requested service scenario against the restricted service scenarios, so that multi-scenario adaptive throttling can be realized in a cluster environment. Further, the algorithm parameters of a restrictor can be dynamically adjusted according to monitoring data obtained from the health center, so that the actual traffic curve of the cluster better matches expectations.
The above description is only an overview of the technical solution of the present invention. In order that the technical means of the invention may be understood more clearly and implemented in accordance with the contents of the specification, and in order to make the above and other objects, features, and advantages of the invention more readily apparent, specific embodiments of the invention are set forth below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
FIG. 1 illustrates a schematic diagram of a flow control system 100 according to one embodiment of the present invention;
FIG. 2 shows a schematic diagram of a computing device 200 according to one embodiment of the invention;
FIG. 3 illustrates a flow chart of a flow control method 300 according to one embodiment of the invention;
fig. 4 shows a schematic diagram of a flow control device 400 according to one embodiment of the invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
FIG. 1 shows a schematic diagram of a flow control system 100 according to one embodiment of the invention. As shown in FIG. 1, the system 100 includes a device side and a service-side cluster. The device side may be, for example, any of various types of Internet of Things devices; the service-side cluster includes a plurality of service ends, which may be service servers, and by integrating the plurality of service servers, various services may be provided to the device side. The device side sends a service request to the service-side cluster, the service-side cluster performs the corresponding service processing, and, when the throttling condition is met, the service-side cluster also performs throttling on the service request.
In the Internet of Things, the number of device ends is generally large, so the system 100 may further include a load balancing server: a service request initiated by a device end first reaches the load balancing server, which distributes it to the corresponding service server according to a predetermined load balancing algorithm.
The system 100 may also include a health center comprising one or more servers. The health center is connected to each service server in the service-side cluster and can collect (for example, at a predetermined period) resource-load data from each service server. The resource-load data is, for example, one or more of the following: CPU occupancy, memory occupancy, request response time (RT), bandwidth occupancy, queries per second (QPS), and transactions per second (TPS).
After collecting the resource-load data, the health center can calculate the single-machine health degree (the health degree of each service server) and the cluster health degree (for example, the average of the health degrees of all service servers in the cluster). The higher a service server's single-machine health degree, the stronger its processing capacity and the more service requests it can handle; the higher the cluster health degree, the stronger the processing capacity of all service servers in the cluster as a whole, and the more service requests the cluster can handle.
Taking the resource-load 6-tuple (CPU, memory, RT, bandwidth, QPS, TPS) as an example, the single-machine health H of a service server is computed as a weighted sum, over the elements of the 6-tuple, of each element's inverted current value, specifically:
H=Sum(Max(i)/Cur(i)*Weight(i));
where Max(i) denotes the theoretical maximum of the i-th element of the 6-tuple; assuming the RT unit is ms and the bandwidth unit is M, the corresponding Max 6-tuple may take the values (1, 1, 1000, 100, 1000, 1000). Cur(i) denotes the current value of the i-th element; Weight(i) denotes the weight of the i-th element and may take, for example, the values (0.3, 0.2, 0.2, 0.1, 0.1, 0.1). Sum denotes summing the expression over all six elements.
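The H formula above can be sketched directly, using the example Max and Weight 6-tuples from the text; the function and parameter names are our own.

```python
def single_machine_health(cur,
                          max_vals=(1, 1, 1000, 100, 1000, 1000),
                          weights=(0.3, 0.2, 0.2, 0.1, 0.1, 0.1)):
    # H = Sum(Max(i)/Cur(i) * Weight(i)): the lower the current load
    # relative to its theoretical maximum, the higher the health.
    return sum(m / c * w for m, c, w in zip(max_vals, cur, weights))
```

For instance, a server running exactly at its theoretical maxima has H equal to the sum of the weights (here 1.0), and one running at half of every maximum has twice that.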
The cluster health Hc is calculated by averaging the single-machine health degrees of all service servers in the cluster:
Hc=Sum(H(j))/N;
where N is the number of service servers in the cluster, H(j) is the single-machine health of the j-th service server, and Sum denotes summing the single-machine health degrees of all N servers.
In addition, the health center may evaluate the single-machine importance of each service server from its single-machine health and the cluster health; for example, the single-machine importance of the j-th service server is V(j) = H(j)/Hc. The greater the single-machine health, the higher the corresponding single-machine importance.
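The Hc and V(j) formulas above can be expressed directly; the function names are illustrative.

```python
def cluster_health(single_machine_healths):
    # Hc = Sum(H(j)) / N: average over all servers in the cluster.
    return sum(single_machine_healths) / len(single_machine_healths)

def single_machine_importance(h, hc):
    # V(j) = H(j) / Hc: healthier servers are more important.
    return h / hc
```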
Each service server may obtain the above health data, including the single-machine health, the cluster health, and the single-machine importance, from the health center periodically or on demand.
As described above, prior-art throttling schemes are mostly single-scenario, single-machine, single-strategy throttling, and cannot dynamically and adaptively adjust the throttling strategy for multiple scenarios in a cluster environment. According to an embodiment of the present invention, the system 100 further includes a configuration center comprising one or more servers, connected to each service server in the service-side cluster. Through the configuration center, a plurality of restriction rules may be configured for each service server in the cluster, each restriction rule comprising the service scenario associated with a restrictor (abbreviated as the restriction scenario), a throttling algorithm, and a throttling action. Detailed descriptions of the restriction scenario, the throttling algorithm, and the throttling action follow below.
The service server may obtain the throttling configuration data from the configuration center, parse it into restriction rules, and construct the restrictors according to the rules. Each restrictor corresponds to one restriction rule, that is, to one restriction scenario together with the throttling algorithm and throttling action associated with that scenario. In FIG. 1, N restriction rules are configured in total, so each service server in the cluster contains N restrictors.
When the service server receives a service request sent by the device side, the requested service scenario (request scenario for short) may be determined from the service request, for example by constructing the request scenario from service parameters in the request. The request scenario is then matched against the restriction scenarios so that the best-fitting restrictor is found, and the matched restrictor executes the corresponding throttling algorithm and throttling action. Moreover, the algorithm parameters of the throttling algorithm can be dynamically adjusted according to monitoring data obtained from the health center.
It should be noted that, in the embodiment of the present invention, the throttling algorithm includes a token generation algorithm and a method for calculating the number of locally available tokens. Each restrictor in each service server uses its token generation algorithm to generate tokens in the token bucket corresponding to that restrictor; the number of tokens in the bucket is the cluster-global token number, i.e., the total number of tokens usable by all service servers in the cluster for the restrictor's service scenario. The number of locally available tokens is then calculated from the cluster-global token number, the cluster-consumed token number (the sum of the tokens consumed by all service servers in the cluster), and the locally consumed token number (the tokens consumed by this service server). The restrictor of each service server decides whether to execute the corresponding throttling action on a service request based on whether its number of locally available tokens is less than or equal to 0.
Accordingly, the system 100 may also include a distributed cache; each service server synchronizes its stand-alone token data into the distributed cache and obtains the cluster token data from the distributed cache for calculating the number of locally available tokens.
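The synchronization just described can be illustrated with a toy in-memory stand-in for the distributed cache; a real deployment would use a shared store such as Redis, and the class and method names here are invented for illustration.

```python
class TokenCache:
    """Toy stand-in for the distributed cache of token data."""

    def __init__(self):
        self._consumed = {}  # server id -> tokens consumed on that server

    def sync_local(self, server_id, consumed):
        # Each server periodically pushes its own consumption count.
        self._consumed[server_id] = consumed

    def cluster_consumed(self):
        # Aggregated consumption across all servers in the cluster.
        return sum(self._consumed.values())
```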
The flow control method of the embodiment of the present invention may be executed at a server in a cluster, and each server may be specifically implemented as the computing device 200 shown in fig. 2. As shown in FIG. 2, in a basic configuration 202, computing device 200 typically includes a system memory 206 and one or more processors 204. A memory bus 208 may be used for communication between the processor 204 and the system memory 206.
Depending on the desired configuration, the processor 204 may be any type of processing including, but not limited to: a microprocessor (μp), a microcontroller (μc), a digital information processor (DSP), or any combination thereof. Processor 204 may include one or more levels of cache, such as a first level cache 210 and a second level cache 212, a processor core 214, and registers 216. The example processor core 214 may include an Arithmetic Logic Unit (ALU), a Floating Point Unit (FPU), a digital signal processing core (DSP core), or any combination thereof. The example memory controller 218 may be used with the processor 204, or in some implementations, the memory controller 218 may be an internal part of the processor 204.
Depending on the desired configuration, system memory 206 may be any type of memory, including but not limited to: volatile memory (such as RAM), non-volatile memory (such as ROM, flash memory, etc.), or any combination thereof. The system memory 206 may include an operating system 220, one or more applications 222, and program data 224. The application 222 is in effect a plurality of program instructions that instruct the processor 204 to perform corresponding operations. In some implementations, the application 222 may be arranged to cause the processor 204 to operate with the program data 224 on the operating system.
Computing device 200 may also include an interface bus 240 that facilitates communication from various interface devices (e.g., output devices 242, peripheral interfaces 244, and communication devices 246) to basic configuration 202 via bus/interface controller 230. The example output device 242 includes a graphics processing unit 248 and an audio processing unit 250. They may be configured to facilitate communication with various external devices, such as a display or speakers, via one or more a/V ports 252. The example peripheral interface 244 may include a serial interface controller 254 and a parallel interface controller 256, which may be configured to facilitate communication via one or more I/O ports 258 and external devices such as input devices (e.g., keyboard, mouse, pen, voice input device, touch input device) or other peripherals (e.g., printer, scanner, etc.). The example communication device 246 may include a network controller 260 that may be arranged to facilitate communication with one or more other computing devices 262 over a network communication link via one or more communication ports 264.
The network communication link may be one example of a communication medium. Communication media may typically be embodied by computer readable instructions, data structures, program modules, and may include any information delivery media in a modulated data signal, such as a carrier wave or other transport mechanism. A "modulated data signal" may be a signal that has one or more of its data set or changed in such a manner as to encode information in the signal. By way of non-limiting example, communication media may include wired media such as a wired network or special purpose network, and wireless media such as acoustic, radio Frequency (RF), microwave, infrared (IR) or other wireless media. The term computer readable media as used herein may include both storage media and communication media.
In a computing device 200 according to the present invention, the application 222 includes a flow control apparatus 400, the apparatus 400 comprising a plurality of program instructions that may instruct the processor 204 to perform the flow control method 300.
Fig. 3 shows a flow chart of a flow control method 300 according to one embodiment of the invention. The method 300 is suitable for execution in a server (e.g., the aforementioned computing device 200), which may obtain configuration data regarding the flow restriction from a configuration center, parse the configuration data into flow restriction rules, and construct a flow restrictor according to the flow restriction rules. Typically, the configuration data includes a plurality of flow restriction rules, so that a plurality of flow restrictors are generated at the server, each flow restrictor corresponding to a service scenario (flow restriction scenario), and a flow restriction algorithm and flow restriction action associated with the flow restriction scenario.
In one implementation, a restriction rule is a 9-tuple (Target, Property, Type, Amount, Unit, Scope, Tag, Refer, Action). Target represents the current-limited object, which may be a device, a product, or a tenant. In the field of the Internet of Things, a device is a hardware terminal connected to the IoT server cluster and belongs to a specific product; a product is a collection of devices, typically a group of devices having the same function, e.g., all having the same model identifier; a tenant is a user of the IoT server cluster. Thus a tenant may own many devices across multiple product models, and multiple devices may belong to one product model.
Property represents properties of an object, which may include IP address, connection, message, etc.
Type represents the Type of attribute (i.e., the Type of current limit), which includes bandwidth, number of times, frequency, etc.
Amount represents the upper flow limit (i.e., the upper limit of the restriction).
Unit represents a Unit of time for limiting a flow, for example, every second, every minute, every hour, etc.
Scope represents the value range of the object (i.e., the range of objects that are current-limited); for example, if Target is tenant, Scope may be a specific tenant id, or a wildcard representing all tenants.
Tag is an extension tag, an identifier that further defines the above Target and/or Property. In one implementation, the extension tag further defines the geographic area of the Target and/or the traffic direction of the Property. For example, when the Target value is a device ID and the Property value is a message from the device, a Tag of <uplink in Hangzhou, Zhejiang> further defines the geographic area of the Target as "Hangzhou, Zhejiang Province" and the traffic direction of the Property as "uplink traffic". For another example, a Tag of <smart> further defines the device in the Target as a smart-home device.
Refer represents the throttling algorithm, which includes a token generation algorithm and a method for calculating the number of locally available tokens.
Action represents the throttling action, which includes rejection, delay, and silence. Silence means neither responding to nor rejecting the service request, leaving the device side to wait until timeout.
In the 9-tuple of a restriction rule, the first seven elements (Target, Property, Type, Amount, Unit, Scope, Tag) define a restriction scenario, whose specific meaning is: for the Target objects designated by Scope, the Type-measured value of the Property attribute carrying the Tag label must not, within the statistical time unit Unit, exceed the quota defined by Amount (enforced via the remaining local tokens generated according to Refer); if it is exceeded, the Action throttling action is executed.
For example, suppose the Target value is a device ID, the Property value is a message from the device, the Type value is the number of messages, the Amount value is 1000, the Unit value is 1 minute, the Scope value is [1, 10000], and the Tag value is <uplink in Hangzhou, Zhejiang Province>. The restriction scenario can then be expressed as: for each device in Hangzhou, Zhejiang Province numbered within [1, 10000], the number of uplink messages sent per minute is limited to no more than 1000.
That is, a restriction scenario may be represented by a service parameter set comprising at least one of the following parameters: the current-limited object, the attribute of the object, the type of the attribute, the upper flow limit, the statistical time unit of the restriction, the value range of the object, and the extension tag.
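As an illustration, the 9-tuple rule and the example scenario above can be modelled as follows; the class and field names are our own, and the patent does not prescribe any particular data structure.

```python
from typing import NamedTuple

class LimitRule(NamedTuple):
    target: str   # current-limited object: device / product / tenant
    prop: str     # attribute of the object (Property)
    type: str     # type of the attribute: bandwidth / count / frequency
    amount: int   # upper flow limit (Amount)
    unit: str     # statistical time unit (Unit)
    scope: tuple  # value range of the object (Scope)
    tag: str      # extension tag (Tag)
    refer: str    # throttling algorithm (Refer)
    action: str   # throttling action: reject / delay / silence

# The example scenario from the text: each device numbered within
# [1, 10000] may send at most 1000 uplink messages per minute.
rule = LimitRule("device", "message", "count", 1000, "1min",
                 (1, 10000), "<uplink in Hangzhou, Zhejiang>",
                 "token_bucket", "reject")
```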
After the server side generates the plurality of restrictors, the method 300 proceeds to step S310. In step S310, the server side receives a service request sent by the device side and determines the requested service scenario (request scenario) from the service request. In an embodiment of the invention, the request scenario is represented by a request parameter set, for example a 4-tuple (Target, Property, Type, Tag), whose elements have the same or similar meanings as the corresponding elements of the 9-tuple: Target represents the object initiating the request, Property the attribute of the object, Type the type of the attribute, and Tag the extension tag.
After the requested service scenario is acquired, the method proceeds to step S320. In step S320, a target restrictor is matched from the plurality of restrictors in the server according to the requested service scenario. In one implementation, the request scenario is matched against the restriction scenario of each restrictor in turn; when the request scenario is contained in a restriction scenario, the restrictor corresponding to that scenario is determined to be the target restrictor.
Specifically, for each restrictor with a restriction-scenario 7-tuple (Target, Property, Type, Amount, Unit, Scope, Tag), if the Target, Property, Type, and Tag of the request-scenario 4-tuple (Target, Property, Type, Tag) are the same as, or compatible with, the corresponding elements of the 7-tuple, and the requesting object falls within the Scope of the 7-tuple, then the request scenario is contained in the restriction scenario. Here, compatibility is a partial-order relation specified in the configuration center, such as device < product < tenant, i.e., device is compatible with product and device is compatible with tenant.
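A simplified sketch of this containment check follows. It assumes exact equality on the four shared fields and represents Scope as either a numeric range or a wildcard string; the device < product < tenant compatibility relation is omitted, and all names are illustrative.

```python
def matches(request, rule):
    # The four shared scenario fields must agree exactly (the
    # partial-order compatibility relation from the configuration
    # center is not modelled here).
    for key in ("target", "prop", "type", "tag"):
        if request[key] != rule[key]:
            return False
    if rule["scope"] == "*":      # wildcard: matches every object
        return True
    lo, hi = rule["scope"]        # otherwise Scope is a numeric range
    return lo <= request["object_id"] <= hi
```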
It should be noted that when a service request arrives at a service end, the service request may be matched with a plurality of flow restriction rules, that is, may be matched with a plurality of flow restrictors. Thus, in one implementation of the invention, the throttling rules (or restrictors) are also prioritized in advance.
The plurality of restrictors are ordered as follows: priority sorting is carried out with the restricted object (Target) as the primary key and the object's value range (Scope) as the secondary key. In the embodiment of the invention, the priority is also a partial order relation, which may be specified in the configuration center. For example, for Target, the partial order relation is device > product > tenant, so the priority order of Target is device > product > tenant. Likewise, for Scope, the partial order relation is device > product > tenant, so the priority order of Scope is device > product > tenant. Target or Scope elements with no partial order relation between them are considered incomparable.
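The two-level ordering can be sketched as follows (an illustrative fragment; the numeric ranks, dict keys, and rule names are assumptions, not from the patent):

```python
# Assumed numeric ranks encoding the partial order device > product > tenant.
RANK = {"device": 3, "product": 2, "tenant": 1}

def priority_key(rule: dict):
    # Primary key: the restricted object (Target); secondary key: its range (Scope).
    return (RANK[rule["target_kind"]], RANK[rule["scope_kind"]])

rules = [
    {"name": "rule1", "target_kind": "product", "scope_kind": "tenant"},
    {"name": "rule2", "target_kind": "product", "scope_kind": "product"},
]
# Highest priority first: Targets are equal, so Scope decides the order.
ordered = sorted(rules, key=priority_key, reverse=True)
```

With equal Targets, the product-level Scope outranks the tenant-level Scope, so the product-scoped rule is tried first.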
Correspondingly, matching the target restrictor from the plurality of restrictors according to the requested service scene is implemented as follows: the restrictors are tried in order of priority from high to low, and the first restrictor that matches is taken as the target restrictor.
After the target restrictor is matched, the method proceeds to step S330. In step S330, the service request is subjected to flow-limiting processing by the target restrictor. Specifically, when the number of locally available tokens is less than or equal to 0, the limiting action associated with the target restrictor is performed; the limiting actions include reject, delay, and silence. When the number of locally available tokens is greater than 0, the service request is released. In addition, after the target restrictor processes the service request, the local token consumption data may be synchronized into the distributed cache. The distributed cache aggregates the token consumption data of all servers in the cluster, thereby generating the cluster token data, i.e., the number of consumed tokens of the whole cluster.
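The per-request decision in step S330 can be sketched as follows (illustrative code; the function and key names are assumptions):

```python
def throttle(limiter: dict, locally_available_tokens: float) -> str:
    """Release the request while local tokens remain; otherwise perform the
    limiting action configured on the target restrictor
    (one of "reject", "delay", "silence")."""
    if locally_available_tokens > 0:
        return "release"
    return limiter["action"]

# After processing, the local consumption counter would be synchronized to the
# distributed cache, which aggregates it into the cluster consumed-token count.
```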
In one implementation, the target restrictor determines the number of locally available tokens as follows:
firstly, generating tokens in a token bucket based on a token generation algorithm associated with a target restrictor, wherein the number of tokens in the token bucket is a cluster global token number;
then, obtaining the number of consumed tokens of the cluster and the number of local consumed tokens;
finally, the number of locally available tokens is calculated based on the number of cluster global tokens, the number of cluster consumed tokens, and the number of locally consumed tokens.
In the embodiment of the invention, the target restrictor generates global tokens according to the token generation algorithm indicated by the Algorithm element in the 9-tuple of the flow-limiting rule. The cluster global token count at the current time can be expressed as g = f(a, u, dt), where a is the flow upper limit Amount in the 9-tuple; u is the time period Unit in the 9-tuple; and dt is the current relative time, i.e., the difference between the current time and the start of the current limiting period Unit. For example, if Unit is 1 minute (60 seconds), dt takes the values 1, 2, 3, ..., 60.
The embodiment of the invention also provides the following token generation algorithms:
Algorithm 1
For any dt within the u interval, g = a. This indicates that at the start of each time period u, all a global tokens are issued at once; g is the constant value a. Under this algorithm, the number of tokens issued in the token bucket follows a step (impulse) curve over time dt, so the algorithm is suited to scenarios where bursty traffic is expected.
Algorithm 2
For any dt within the u interval, g = a·dt/u. This indicates that within each time period u, the number of tokens issued in the token bucket increases linearly with time dt, reaching a at the end of the period. The algorithm is suited to scenarios where uniform traffic is expected.
Algorithm 3
The number of tokens issued in the token bucket increases exponentially with time dt: g = e^x, where e is Euler's number 2.71828 and x = f(a, u)·dt is a mapping of dt chosen so that g = a when dt reaches the end of each time period u (for example, x = (dt/u)·ln a). The algorithm is suited to scenarios where the request traffic is expected to warm up slowly or quickly.
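The three token generation algorithms can be sketched as functions g = f(a, u, dt) (illustrative code; for algorithm 3 the mapping x = (dt/u)·ln a is one assumed choice satisfying g = a at dt = u):

```python
import math

def algorithm1(a: float, u: float, dt: float) -> float:
    """Burst: all a tokens issued at the start of each period (step curve)."""
    return a

def algorithm2(a: float, u: float, dt: float) -> float:
    """Uniform: tokens grow linearly with dt, reaching a at dt = u."""
    return a * dt / u

def algorithm3(a: float, u: float, dt: float) -> float:
    """Warm-up: g = e^x grows exponentially; x = (dt/u) * ln(a) is one
    mapping for which g = a when dt reaches the end of the period u."""
    return math.exp((dt / u) * math.log(a))
```

For a = 1000 and u = 60 seconds, algorithm 1 makes all 1000 tokens available immediately, algorithm 2 reaches 500 at dt = 30, and algorithm 3 stays low early in the period and climbs steeply toward 1000 as dt approaches 60.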
In addition, the token generation algorithm may be extended; for example, corresponding algorithms may be designed according to the distribution characteristics of the request traffic, such as sine, cosine, or normal distribution curves.
In the embodiment of the invention, the health of each server in the cluster can be taken into account when calculating the local token count. Specifically, the number of available tokens of the cluster is determined according to the cluster global token count and the cluster consumed token count; then, the number of tokens that can be allocated to the local server out of the cluster's available tokens is determined according to the health of the local server and the cluster health; finally, the number of locally available tokens is calculated from the number of tokens allocable to the local server and the number of locally consumed tokens.
For example, the number of locally available tokens ln of the server (at time point dt) can be calculated as follows:
ln=(g-base)*li-acc;
wherein g is the cluster global token count, base is the cluster consumed token count, li is a coefficient determined from the health of each server in the cluster, and acc is the locally consumed token count. The coefficient li distributes the cluster's remaining tokens (g - base) as fairly as possible according to the load and health of the current server in the cluster: the healthier a server, the more of the remaining tokens it is allotted. In one implementation, li = v/N, where v = H/Hc, H is the health of the local server, Hc is the cluster health, and N is the number of servers in the cluster (which may be obtained from the load balancing server). As described above, the cluster health is the average of the healths of the servers in the cluster, and each server's health is determined based on its resource load.
When the distributed cache is unavailable, the cluster consumed token count cannot be obtained, so the number of locally available tokens ln can instead be calculated according to the following formula:
ln=g/N-acc;
wherein g is the cluster global token count, N is the number of servers in the cluster, and acc is the locally consumed token count.
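Both formulas can be combined into one sketch (illustrative code; the parameter names mirror the symbols above and are not from any real API):

```python
def local_available_tokens(g: float, base: float, acc: float,
                           h: float, hc: float, n: int,
                           cache_available: bool = True) -> float:
    """g: cluster global token count, base: cluster consumed tokens,
    acc: locally consumed tokens, h: this server's health,
    hc: cluster health (mean over servers), n: number of servers."""
    if cache_available:
        li = (h / hc) / n             # li = v / N with v = H / Hc
        return (g - base) * li - acc  # ln = (g - base) * li - acc
    # Distributed cache unreachable: fall back to an even split of g.
    return g / n - acc                # ln = g / N - acc
```

A server that is exactly at the cluster-average health (h = hc) receives a 1/N share of the remaining tokens; healthier servers receive proportionally more.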
It can be seen that in the embodiment of the present invention, the token generation algorithm computes the cluster global token count locally on a single machine, so the single machine's available token count depends only weakly on the distributed cache system. When the distributed cache system is available, the cluster consumed token count base grows with dt within the time period u, which increases the proportion of the deterministic part of the single machine's available token count ln; ln is continuously corrected by feedback from base and becomes more and more accurate, so that the final traffic curve converges to the Amount.
In addition, the coefficient li uses the health data from the health center to make the traffic curve more uniform and smooth across the single machines in the cluster, and compensates for the ln error caused by untimely synchronization of the base value between a single machine and the distributed cache. When the distributed cache system is unavailable, the error in ln grows because base is no longer compensated, but most coarse-grained flow-limiting requirements can still be met.
Fig. 4 shows a schematic diagram of a flow control apparatus 400 according to one embodiment of the invention, the apparatus 400 residing in a server (e.g., the aforementioned computing device 200) to cause the computing device to perform the flow control method of the invention (e.g., the aforementioned method 300). As shown in fig. 4, the apparatus 400 includes:
A request scene determining unit 410, adapted to determine a requested service scene according to a service request sent by the device side;
a matching unit 420, adapted to match a target restrictor from the plurality of restrictors according to the requested service scene, wherein the requested service scene is contained in the service scene associated with the target restrictor;
and a current limiting processing unit 430, adapted to perform current limiting processing on the service request through the target current limiter.
Specific processes performed by the request scene determining unit 410, the matching unit 420, and the current limit processing unit 430 may refer to steps S310, S320, and S330, respectively, and are not described herein.
An example of application of the present invention is given below.
The partial order relations configured in the configuration center are as follows:
Target: deviceId > productId > userId (tenant Id)
Scope: deviceId > productId > userId
The server is provided with a plurality of restrictors; the flow-limiting rules corresponding to two of them are rule 1 and rule 2, respectively.
Rule 1 (productId, connection, frequency, [userId: a|b|c], smartHome, algorithm1, silence) means that for each product <productId> of tenant <userId> a, b, or c, the number <frequency> of connections <connection> of its smart home <smartHome> devices is limited: the number of connections of all devices under each product within 1 minute <minute> may not exceed 1000, and the distribution of connections within the minute follows the predefined token generation algorithm 1 <algorithm1>. If, when a device's connection request arrives, the number of remaining tokens generated by algorithm 1 is 0, silence processing <silence> is performed for that connection.
Rule 2 (productId, connection, frequency, [productId: a.x|b.y|c.z], smartHome, algorithm2, reject) means that the number <frequency> of connections <connection> of smart home <smartHome> devices of products <productId> a.x, b.y, or c.z is limited: the number of connections of all devices under each product <[a.x|b.y|c.z]> within 1 minute <minute> is limited, and the distribution of connections within the minute follows the predefined token generation algorithm 2 <algorithm2>. If, when a device's connection request arrives, the number of remaining tokens generated by algorithm 2 is 0, reject processing <reject> is performed for that connection.
In particular, for tenant a's products other than x, tenant b's products other than y, and tenant c's products other than z, rule 1 still applies; no restriction is placed on tenants other than a, b, and c.
The server provides Message Queue Telemetry Transport (MQTT) protocol services. The service request received by the server is: MQTT.connect(deviceId=111, password=222), indicating that a device connects to the server through the MQTT protocol with service parameters (deviceId=111, password=222). After verifying the password, the server looks up, from deviceId=111, the product productId=a.x corresponding to the device and the tenant userId=a corresponding to that product. A request scene 4-tuple is then constructed as (productId, connection, frequency, smartHome). This 4-tuple can potentially match both rules. However, since there is a partial order relation productId > userId, the Scope of rule 1 is the tenant userId while the Scope of rule 2 is productId, and the Targets of rule 1 and rule 2 are the same, rule 2 has higher priority than rule 1 under the ordering rule. The request scene 4-tuple (productId, connection, frequency, smartHome) therefore preferentially matches rule 2; its productId=a.x also falls within rule 2's Scope [productId: a.x|b.y|c.z], so the flow-limiting algorithm 2 of rule 2 is executed.
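The worked example can be condensed into a sketch (illustrative code; the dict keys and rank values are assumptions):

```python
# Assumed ranks for the configured partial order deviceId > productId > userId.
RANK = {"deviceId": 3, "productId": 2, "userId": 1}

rules = [
    {"name": "rule1", "target": "productId", "scope": "userId",
     "scope_values": {"a", "b", "c"}},
    {"name": "rule2", "target": "productId", "scope": "productId",
     "scope_values": {"a.x", "b.y", "c.z"}},
]

# Request scene derived from MQTT.connect(deviceId=111): productId=a.x, userId=a.
request = {"target": "productId", "productId": "a.x", "userId": "a"}

# Both rules share the same Target, so priority is decided by Scope rank,
# and the product-scoped rule 2 is tried first.
for rule in sorted(rules,
                   key=lambda r: (RANK[r["target"]], RANK[r["scope"]]),
                   reverse=True):
    if request[rule["scope"]] in rule["scope_values"]:
        matched = rule  # rule 2: a.x is inside its Scope, so algorithm 2 runs
        break
```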
The algorithms and displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. The structure required to construct such a system is apparent from the description above. In addition, the present invention is not directed to any particular programming language. It will be appreciated that the teachings of the invention described herein may be implemented in a variety of programming languages; the above description of specific languages is provided to disclose the enablement and best mode of the present invention.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.

Claims (12)

1. A flow control method executed at a server, wherein the server is configured with a plurality of restrictors, each restrictor is associated with a service scene, and the plurality of restrictors are prioritized with the restricted object as a primary key and the value range of the object as a secondary key, the method comprising:
determining a requested service scene according to a service request sent by a device side;
according to the requested service scene, matching a target restrictor from the plurality of restrictors according to the order of priority from high to low, wherein the requested service scene is contained in the service scene associated with the target restrictor;
and carrying out flow limiting processing on the service request through the target flow limiter.
2. The method of claim 1, wherein the service scene associated with the restrictor is represented by a service parameter set comprising at least one of the following parameters: the restricted object, the attribute of the object, the type of the attribute, the flow upper limit, the time unit of flow limiting, and the value range of the object.
3. The method of claim 2, wherein the requested traffic scenario is represented by a request parameter set comprising at least one of the following parameters: the object from which the request originates, the attributes of the object, and the type of attribute.
4. The method of claim 3, wherein the objects comprise devices, products, and tenants; the attributes of the object include an IP address, a connection and a message; the types of attributes include bandwidth, number of times, and frequency.
5. The method of claim 4, wherein the set of business parameters and request parameters further comprise an extension tag, the extension tag being an identification for further defining the object and/or an attribute of the object.
6. The method of claim 1, wherein the throttling the service request by the target restrictor comprises: when the number of locally available tokens is less than or equal to 0, a throttling action associated with the target restrictor is performed.
7. The method of claim 6, wherein the current limiting action comprises: reject, delay, and silence.
8. The method of claim 6, wherein the target restrictor determines the number of locally available tokens as follows:
generating tokens in a token bucket based on a token generation algorithm associated with a target restrictor, wherein the number of tokens in the token bucket is a cluster global token number;
acquiring the number of consumed tokens of the cluster and the number of local consumed tokens;
the number of locally available tokens is calculated based on the cluster global token number, the cluster consumed token number, and the local consumed token number.
9. The method of claim 8, wherein the calculating the local available token count based on the cluster global token count, the cluster consumed token count, and the local consumed token count comprises:
determining the available token number of the cluster according to the global token number of the cluster and the consumed token number of the cluster;
determining the number of tokens which can be distributed to the server side in the available token numbers of the clusters according to the health degree of the server side and the cluster health degree;
calculating the number of locally available tokens according to the number of tokens which can be allocated to the local server and the number of locally consumed tokens;
the cluster health degree is an average value of health degrees of all the service terminals in the cluster, and the health degree of each service terminal is determined based on the resource load condition of the service terminal.
10. A flow control device applied to a server, wherein the server is configured with a plurality of restrictors, each restrictor is associated with a service scene, the restrictors are prioritized by taking a restricted object as a primary key and taking a value range of the object as a secondary key, and the device comprises:
the request scene determining unit is suitable for determining a requested service scene according to the service request sent by the equipment terminal;
the matching unit is suitable for matching the target current limiter from the plurality of current limiters according to the requested service scenes and the order of priority from high to low, wherein the requested service scenes are contained in the service scenes associated with the target current limiter;
and the flow limiting processing unit is suitable for carrying out flow limiting processing on the service request through the target flow limiter.
11. A computing device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing any of the methods of claims 1-9.
12. A computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a computing device, cause the computing device to perform any of the methods of claims 1-9.
CN201910221588.6A 2019-03-22 2019-03-22 Flow control method and device and computing equipment Active CN111726303B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910221588.6A CN111726303B (en) 2019-03-22 2019-03-22 Flow control method and device and computing equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910221588.6A CN111726303B (en) 2019-03-22 2019-03-22 Flow control method and device and computing equipment

Publications (2)

Publication Number Publication Date
CN111726303A CN111726303A (en) 2020-09-29
CN111726303B true CN111726303B (en) 2024-01-02

Family

ID=72563504

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910221588.6A Active CN111726303B (en) 2019-03-22 2019-03-22 Flow control method and device and computing equipment

Country Status (1)

Country Link
CN (1) CN111726303B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112367266A (en) * 2020-10-29 2021-02-12 北京字节跳动网络技术有限公司 Current limiting method, current limiting device, electronic equipment and computer readable medium
CN114726789A (en) * 2021-01-05 2022-07-08 华为云计算技术有限公司 Method, device, equipment and medium for traffic management and traffic management policy configuration
CN113765692B (en) * 2021-01-28 2024-05-21 北京京东拓先科技有限公司 Current limiting method, device, electronic equipment and computer readable medium
CN113328906B (en) * 2021-04-22 2023-01-06 成都欧珀通信科技有限公司 Flow real-time monitoring method and device, storage medium and electronic equipment
CN114079635A (en) * 2021-11-17 2022-02-22 中国工商银行股份有限公司 Service flow current limiting method and device based on polynomial fitting
CN114285799B (en) * 2021-12-22 2024-01-19 深圳市优必选科技股份有限公司 Method, device, terminal and storage medium for processing service
CN113992587B (en) * 2021-12-27 2022-03-22 广东睿江云计算股份有限公司 Flow control method and device, computer equipment and storage medium
CN115037799B (en) * 2022-06-01 2024-01-05 阿里巴巴(中国)有限公司 Current limiting method, device, equipment and medium
CN114844835B (en) * 2022-07-04 2022-09-20 眉山环天智慧科技有限公司 Self-adaptive dynamic current limiting method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107222426A (en) * 2016-03-21 2017-09-29 阿里巴巴集团控股有限公司 The method of flow control, apparatus and system
CN107508860A (en) * 2017-07-21 2017-12-22 深圳市金立通信设备有限公司 One kind service current-limiting method, server and terminal
CN108418764A (en) * 2018-02-07 2018-08-17 深圳壹账通智能科技有限公司 Current-limiting method, device, computer equipment and storage medium
CN109005125A (en) * 2018-08-24 2018-12-14 阿里巴巴集团控股有限公司 Dynamic current limiting method, apparatus and system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7760641B2 (en) * 2006-07-10 2010-07-20 International Business Machines Corporation Distributed traffic shaping across a cluster


Also Published As

Publication number Publication date
CN111726303A (en) 2020-09-29


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40036883

Country of ref document: HK

GR01 Patent grant