CN112395091A

CN112395091A - Cloud service request response method and device, electronic equipment and storage medium

Info

Publication number: CN112395091A
Application number: CN202011331362.0A
Authority: CN
Inventors: 韩秋明; 李建; 符柱; 陈家园
Original assignee: Shanghai Sensetime Intelligent Technology Co Ltd
Current assignee: Shanghai Sensetime Intelligent Technology Co Ltd
Priority date: 2020-11-24
Filing date: 2020-11-24
Publication date: 2021-02-23
Also published as: WO2022110796A1

Abstract

The application provides a cloud service request response method and device, an electronic device, and a storage medium. The method is applied to a cloud service system that comprises a system constructed based on a distributed architecture. The method may comprise: obtaining the cloud service total quota applied for by a tenant from the cloud service system; and, based on the cloud service total quota, allocating a work quota to the working node that corresponds to the tenant and is included in the distributed architecture, so that the working node responds to cloud service requests initiated by the tenant according to its own work quota.

Description

Cloud service request response method and device, electronic equipment and storage medium

Technical Field

The application relates to a computer technology, in particular to a cloud service request response method and device, an electronic device and a storage medium.

Background

As the internet develops, more and more tenants choose cloud services. When choosing a cloud service, a tenant usually applies to the cloud service system for a certain cloud service total quota and initiates cloud service requests within that quota.

After receiving a cloud service request initiated by a tenant, the cloud service system must first determine that the request falls within the tenant's cloud service total quota before responding to it.

Disclosure of Invention

In view of the above, the present application at least discloses a cloud service request response method, which is applied to a cloud service system; the method comprises the following steps:

acquiring a cloud service total amount applied by a tenant to the cloud service system; the cloud service system comprises a system constructed based on a distributed architecture;

and distributing a working amount to a working node corresponding to the tenant and included in the distributed architecture based on the cloud service total amount, so that the working node responds to a cloud service request initiated by the tenant according to the working amount corresponding to the working node.

In some embodiments, the allocating work quota to each work node included in the distributed architecture based on the cloud service total quota includes:

and distributing work quota to the work node corresponding to the tenant and included in the distributed architecture based on partial quota in the cloud service total quota.

In some embodiments shown, the allocating work quota to each work node corresponding to the tenant in the distributed architecture based on a part of quota in the cloud service total quota includes:

determining the cloud service request response amount which can be reached by the working node within a preset time length according to the processing capacity corresponding to the working node; the processing capacity indicates the achievable cloud service request response amount in unit time length;

and allocating a working amount to the working node according to the cloud service request response amount corresponding to the working node.
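As a minimal sketch of this embodiment, assume that processing capacity is expressed in requests per second and that a node's allocation is capped by the quota still unallocated. Both assumptions, and all function and parameter names below, are hypothetical and do not come from the application:

```python
def reachable_response_amount(capacity_per_second: float, preset_seconds: float) -> int:
    """Requests the node could answer within the preset duration."""
    return int(capacity_per_second * preset_seconds)


def allocate_by_capacity(unallocated: int, capacity_per_second: float,
                         preset_seconds: float) -> int:
    """Work quota for one node: its reachable response amount,
    capped by the quota that has not been allocated yet."""
    reachable = reachable_response_amount(capacity_per_second, preset_seconds)
    return min(unallocated, reachable)
```

Under these assumptions, a node that handles 5 requests per second would receive 300 units for a 60-second preset duration, provided at least 300 units are still unallocated.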

In some embodiments shown, the allocating work quota to a work node corresponding to the tenant included in the distributed architecture based on a partial quota in the cloud service total quota includes:

determining the quota weight corresponding to the working node included in the distributed architecture;

and distributing the working quota matched with the quota weight corresponding to the working node for the working node based on part of the quota in the cloud service total quota.

In some embodiments shown, the determining the quota weight corresponding to each working node included in the distributed architecture includes:

determining the quota weight corresponding to each working node according to a preset quota weight determination rule based on the configuration information of each working node; or

and determining the quota weight corresponding to each working node based on the processing capacity corresponding to each working node.

In some illustrative embodiments, the method further comprises:

if a quota application request is received from any working node, allocating a work quota to that working node based on the surplus quota; wherein the surplus quota is the quota remaining after the allocated work quotas are deducted from the cloud service total quota.

In some embodiments, the allocating work quota to the work node based on the remaining quota includes:

and distributing the working amount matched with the cloud service request response amount to the working node according to the cloud service request response amount which can be reached by the working node within the preset time length based on the surplus amount.

In some illustrated embodiments, the responding, by the worker node, to the cloud service request initiated by the tenant according to the work amount corresponding to the worker node includes:

after receiving a cloud service request initiated by a tenant, the working node, when its work quota has not been used up, responds to the cloud service request to provide the cloud service computation, and adjusts its remaining work quota according to the quota consumed by that computation.

In some illustrated embodiments, the work node responds to the cloud service request initiated by the tenant according to its corresponding work limit, and further includes:

after receiving a cloud service request initiated by a tenant, the working node, if its work quota has been used up, submits a quota application request to the cloud service system and, when the cloud service total quota has not been used up, receives the work quota that the cloud service system allocates to it from the surplus quota, so as to respond to the cloud service request.

In some illustrative embodiments, the method further comprises:

after the working node submits the quota application request to the cloud service system, if the cloud service total quota has been used up, the cloud service request is forwarded to another working node that still has work quota remaining for processing.

In some illustrative embodiments, the method further comprises:

the working node charges for the cloud service request initiated by the tenant.

In some embodiments shown, the cloud services include AI cloud services; the acquiring of the cloud service total amount applied by the tenant to the cloud service system includes:

acquiring an AI cloud service total amount applied by the tenant to the cloud service system;

the allocating a work quota to a work node corresponding to the tenant included in the distributed architecture based on the cloud service total quota so that the work node responds to a cloud service request initiated by the tenant according to the work quota corresponding to the work node, includes:

and distributing a working amount to a working node corresponding to the tenant and included in the distributed architecture based on the AI cloud service total amount, so that the working node responds to an AI cloud service request initiated by the tenant according to the working amount corresponding to the working node.

The application also provides a cloud service request response device which is applied to the cloud service system; the above-mentioned device includes:

the acquisition module is used for acquiring a cloud service total amount applied by a tenant to the cloud service system; the cloud service system comprises a system constructed based on a distributed architecture;

and the distribution module is used for distributing a working amount to a working node corresponding to the tenant and included in the distributed architecture based on the cloud service total amount so that the working node responds to a cloud service request initiated by the tenant according to the working amount corresponding to the working node.

In some illustrated embodiments, the allocation module is specifically configured to:

In some embodiments shown, the above-described allocation module comprises:

a first determining module, configured to determine, according to a processing capability corresponding to the working node, a cloud service request response amount that can be reached by the working node within a preset time period; the processing capacity indicates the achievable cloud service request response amount in unit time length;

and the distribution submodule is used for distributing a working amount to the working node according to the cloud service request response amount corresponding to the working node.

In some embodiments shown, the above-described allocation module comprises:

the second determining module is used for determining the quota weight corresponding to the working node included in the distributed architecture;

and the distribution submodule is used for distributing the working quota matched with the quota weight corresponding to the working node for the working node based on part of the quota in the cloud service total quota.

In some illustrated embodiments, the second determining module is specifically configured to:

In some illustrated embodiments, the allocation module is further configured to:

based on the cloud service total amount, distributing a work amount to a work node corresponding to the tenant and included in the distributed architecture, so that after the work node receives a cloud service request initiated by the tenant, when the work amount corresponding to the work node is surplus, the work node responds to the cloud service request to provide cloud service calculation, and the surplus work amount of the work node is adjusted according to the consumption amount corresponding to the calculation.

based on the cloud service total amount, distributing a work amount to a working node corresponding to the tenant, wherein the working node is included in the distributed architecture, so that after the working node receives a cloud service request initiated by the tenant, if the work amount corresponding to the working node is not left, an amount application request is provided for the cloud service system, and when the cloud service total amount is left, the working amount distributed to the working node by the cloud service system based on the remaining amount is received to respond to the cloud service request.

and distributing a working amount to a working node corresponding to the tenant and included in the distributed architecture based on the cloud service total amount, so that after the working node provides an amount application request to the cloud service system, if the cloud service total amount is not left, the cloud service request is forwarded to other working nodes with the left working amounts for processing.

In some illustrative embodiments, the apparatus further comprises:

and the charging module is used for charging, by the working node, for the cloud service request initiated by the tenant.

In some embodiments shown, the cloud services include AI cloud services; the acquisition module is specifically configured to:

the distribution module is specifically configured to:

The present application further provides an electronic device, the above device including:

a processor;

a memory for storing the processor-executable instructions;

the processor is configured to call the executable instructions stored in the memory to implement the cloud service request response method as shown in any one of the foregoing embodiments.

The present application further provides a computer-readable storage medium, where the storage medium stores a computer program, and the computer program is used to execute the cloud service request response method shown in any one of the foregoing embodiments.

In the above technical solution, the cloud service system constructed on a distributed architecture allocates work quotas to the working nodes of the distributed architecture based on the cloud service total quota that the tenant has applied for, so that each working node autonomously responds to cloud service requests initiated by the tenant according to its own work quota. The cloud service system therefore does not need to communicate frequently with each working node to read and write the tenant's cloud service request amount. This avoids frequent network I/O operations and the locking operations involved in reading and writing public storage, which safeguards the system's response speed for cloud service requests and, in turn, the tenant's experience.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.

Drawings

In order to illustrate more clearly the technical solutions in one or more embodiments of the present application or in the related art, the drawings needed for describing the embodiments or the related art are briefly introduced below. Obviously, the drawings described below show only some of the embodiments of the present application, and those skilled in the art can obtain other drawings from them without inventive effort.

Fig. 1 is a flowchart illustrating a method for responding to a cloud service request according to the present application;

Fig. 2 is a schematic interaction diagram of an AI cloud service system and a tenant shown in the present application;

Fig. 3 is a schematic diagram illustrating a cloud service total allocation shown in the present application;

Fig. 4 is a schematic diagram illustrating a cloud service total allocation shown in the present application;

Fig. 5 is a schematic structural diagram of a cloud service request responding apparatus shown in the present application;

Fig. 6 is a schematic diagram of a hardware structure of an electronic device shown in the present application.

Detailed Description

Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It should also be understood that the word "if" as used herein may be interpreted as "when …" or "upon …" or "in response to a determination," depending on the context.

In the related art, in order to determine whether a cloud service request initiated by a tenant is within the cloud service total quota applied for by the tenant, the cloud service system may count the tenant's cloud service request amount. A tenant may include a plurality of users, each of whom can apply for the cloud service using the tenant account allocated to them.

In practical applications, if the cloud service request is a cloud service invocation, the cloud service system may count the tenant's cloud service request amount in terms of the number of invocations. If the cloud service request is stream data processing, the cloud service system may count the tenant's cloud service request amount in terms of the number of bytes of traffic processed.

The following description takes as an example the case where the cloud service request initiated by the tenant is a cloud service call request.

For example, when a tenant initiates a cloud service call request, the cloud service system may determine whether the currently counted number of cloud service call requests of the tenant reaches a total cloud service quota (total cloud service call request number) applied by the tenant, and if not, respond to the request; otherwise, the request is restricted.

It is understood that, when the cloud service system is a single-node system (that is, a system that provides the cloud service through only one node), the number of service calls made by the tenant or the number of bytes of traffic processed is easy to obtain, so counting the tenant's cloud service request amount is not complicated and does not affect the speed at which the cloud service system responds to requests. When the cloud service system is constructed based on a distributed architecture, however, counting the tenant's cloud service request amount becomes complicated, and the speed at which the cloud service system responds to requests is affected.

For example, when the cloud service system is a system constructed based on a distributed architecture, the system may allocate to the tenant a shared space (e.g., a shared cache or a shared database) for storing the cloud service total quota applied for by the tenant and the usage amount, which indicates the number of calls the tenant has initiated.

When the cloud service system receives a cloud service call request initiated by a tenant, the request may be distributed to any node A under the distributed architecture. After receiving the request, node A reads, through I/O, the cloud service total quota stored in the shared space and the tenant's usage amount (the number of calls the tenant has initiated). Node A then determines whether the cloud service total quota is greater than the usage amount. If so, node A responds to the call request, increases the usage amount, and writes the increased usage amount back to the shared space through I/O.

It is easy to see that, when the cloud service system is a system constructed based on a distributed architecture, a cloud service call request or traffic processing request initiated by a tenant may be distributed to any node under the distributed architecture, so the cloud service system must communicate frequently with each node to read and write the tenant's cloud service request amount. Frequent network I/O operations and the locking operations involved in reading and writing public storage may lower the system's response efficiency for cloud service requests and cause delays, which affects the tenant's experience.
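The related-art flow described above can be sketched as follows. This is only an illustration: the in-process `SharedSpace` class stands in for a shared cache or shared database, its lock stands in for the locking of public storage, and all names are hypothetical.

```python
import threading


class SharedSpace:
    """Stand-in for the shared cache/database holding one tenant's counters."""

    def __init__(self, total_quota: int) -> None:
        self.total_quota = total_quota
        self.used = 0
        self.lock = threading.Lock()  # models the lock on public storage


def handle_call(shared: SharedSpace) -> bool:
    """Every request on every node locks, reads, compares, and writes back."""
    with shared.lock:
        if shared.total_quota > shared.used:
            shared.used += 1  # write the increased usage amount back
            return True
        return False  # quota exhausted: the request is restricted
```

In a real deployment each of these reads and writes would be a network round trip to the shared store, so this path is taken once per request on every node.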

In view of this, the present application provides a method for responding to a cloud service request, which is applied to a cloud service system. The cloud service system comprises a system constructed based on a distributed architecture.

According to the method, the cloud service total quota applied for by the tenant is distributed among the working nodes of the distributed architecture, so that each working node autonomously determines whether to respond to a cloud service request initiated by the tenant. The cloud service system therefore does not need to communicate frequently with each working node to read and write the tenant's cloud service request amount, which avoids frequent network I/O operations and the locking operations involved in reading and writing public storage. This safeguards the system's response speed for cloud service requests and, in turn, the tenant's experience.

Referring to fig. 1, fig. 1 is a flowchart illustrating a method for responding to a cloud service request according to the present application.

As shown in fig. 1, a method for responding to a cloud service request, which is shown in the present application, may include:

S102, acquiring a cloud service total amount applied by a tenant to the cloud service system; the cloud service system comprises a system constructed based on a distributed architecture;

S104, distributing a working amount to a working node corresponding to the tenant and included in the distributed architecture based on the cloud service total amount, so that the working node responds to a cloud service request initiated by the tenant according to the working amount corresponding to the working node.

The cloud service system (hereinafter referred to as "system") is specifically a system for providing cloud services to tenants. The system may include a certain number of hardware devices or software devices to provide cloud services, and the application does not limit the types of the hardware devices and the software devices included in the system.

In practical application, a tenant may apply for a certain cloud service total amount from the cloud service system. In some examples, the cloud service total amount can be counted by taking the number of cloud service calls that the tenant can initiate as a dimension. The tenant can initiate a cloud service call request to the cloud service system within the cloud service total limit range so as to enjoy the service provided by the cloud service system.

The cloud service system comprises a system constructed based on a distributed architecture. The distributed architecture may specifically be an architecture including a plurality of working nodes. The working node (hereinafter referred to as "node") may be a terminal or a server (the terminal or the server may be a laptop, a desktop, a PAD terminal, etc., and the device type and model of the terminal or the server are not limited in the present application).

The distributed architecture provides computing power through the work nodes, so that the cloud service system can provide cloud services for tenants. It should be noted that the cloud service type may be cloud service invocation, traffic storage, or the like, and the cloud service type is not limited in the present application.

In some embodiments, the cloud service system may include an AI cloud service system.

Referring to fig. 2, fig. 2 is a schematic view illustrating an interaction between an AI cloud service system and a tenant according to the present application. As shown in fig. 2, the AI cloud service system is a system constructed based on a distributed architecture. Wherein the distributed architecture includes worker nodes A, B, C. The cloud service system shown in fig. 2 is only an exemplary illustration and is not particularly limited.

In the AI cloud service scenario illustrated in fig. 2, a tenant may apply to the AI cloud service system for a cloud service total quota of a certain number of calls. The tenant may then initiate a service call request, such as model training, to the AI cloud service system through a call interface (e.g., an HTTP call). After receiving the call request, the AI cloud service system may distribute the call request to a target working node A under the distributed architecture according to a pre-stored distribution rule (e.g., a load balancing distribution rule), so that node A responds to the cloud service request initiated by the tenant according to its own work quota and returns the response result to the tenant.

The cloud service total amount is specifically a total service amount provided by a cloud service system that a tenant can enjoy.

In practical application, if the cloud service type applied by the tenant is cloud service invocation, the cloud service system may count the total service amount of the tenant with the cloud service invocation frequency as a dimension. If the cloud service type applied by the tenant is stream data processing, the cloud service system can count the total service quantity of the tenant by taking the byte number of the processed flow as the dimension.
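The two counting dimensions can be illustrated with a small sketch. The request-type strings and the function name are hypothetical and chosen only for illustration:

```python
def quota_consumed(request_type: str, payload: bytes = b"") -> int:
    """Units of quota one request consumes under the chosen statistical dimension."""
    if request_type == "call":
        return 1                 # dimension: number of invocations
    if request_type == "stream":
        return len(payload)      # dimension: bytes of traffic processed
    raise ValueError(f"unknown request type: {request_type}")
```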

It should be noted that the present application does not limit the statistical dimension of the cloud service total amount; the following description takes the case where the cloud service type is a cloud service call as an example. Nor does the present application limit the manner in which the tenant applies for the cloud service total amount: in some examples, the tenant may apply for the total amount by paying for a purchase; in other examples, the tenant may apply for it by applying for a trial.

The working node can respond to the cloud service request initiated by the tenant according to the working limit corresponding to the working node.

In some examples, the working node may charge for the cloud service requests initiated by the tenant. For example, a working node may maintain a quota summary table corresponding to the tenant, which records the remaining quota and the used quota. After a working node responds to a cloud service request initiated by the tenant, it can increase the used quota, thereby completing the charging for that request.

The working amount specifically refers to a cloud service request amount that the working node can respond to. When the working node receives a cloud service request initiated by a tenant, whether the working node responds to the cloud service request can be determined by judging whether the working limit is remained.

Each time the working node responds to a cloud service request, a corresponding amount of work quota is consumed. For example, when the cloud service request amount is counted by the number of calls, 1 unit of work quota may be consumed each time the working node responds to a call request initiated by the tenant.

In some examples, the work quota may come from two sources. First, it may be the work quota initially allocated to each node by the system after the tenant applies for the cloud service total quota, so that each node can start operating. Second, it may be the work quota that a node applies for from the system after it has used up its allocated work quota during operation, so that the node can replenish its work quota and continue operating.

The cloud service request is specifically a cloud service request initiated by a tenant to the system. The cloud service request may include a cloud service call request and/or a stream data processing request.

It should be noted that, in general, the type of the cloud service request initiated by the tenant is related to the type of the cloud service requested by the tenant.

For example, when the cloud service type applied by the tenant is a cloud service call, the tenant may initiate a cloud service call request. When the cloud service type applied by the tenant includes both cloud service invocation and stream data processing, the tenant may initiate a cloud service invocation request or a traffic processing request.

In some embodiments, after receiving a cloud service request initiated by a tenant, the working node may respond to the cloud service request to provide cloud service computation when a working amount corresponding to the working node is still surplus, and adjust a remaining working amount according to a consumption amount corresponding to the computation.

For example, referring to fig. 2, after receiving a cloud service request initiated by a tenant, working node A may determine whether any of its own work quota remains. If so, node A can respond to this cloud service request and consume 1 unit of work quota. If no work quota remains, node A can restrict the cloud service request.

It should be noted that the present application does not limit the way in which the working node determines whether any work quota remains. In some embodiments, the working node may store the work quota allocated by the system together with the cloud service request amount that the node has already responded to. In this case, to determine whether any work quota remains, the amount already responded to is subtracted from the work quota; if the result is greater than 0, work quota remains, and otherwise none remains. In other embodiments, the working node may store a remaining quota, whose initial value is the work quota allocated by the system and which is adjusted each time the working node responds to a cloud service request. In this case, to determine whether any work quota remains, the node checks whether the remaining quota is greater than 0; if so, work quota remains, and otherwise none remains.
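The two bookkeeping variants just described can be sketched side by side. This is a minimal illustration, assuming requests are counted by number of calls (1 unit per response); the class and function names are hypothetical:

```python
from dataclasses import dataclass


@dataclass
class AllocatedAndResponded:
    """Variant 1: keep the allocated work quota and the amount already responded to."""
    allocated: int
    responded: int = 0

    def has_remaining(self) -> bool:
        return self.allocated - self.responded > 0

    def consume(self, units: int = 1) -> None:
        self.responded += units


@dataclass
class RemainingOnly:
    """Variant 2: keep only the remaining quota, initialised to the allocated work quota."""
    remaining: int

    def has_remaining(self) -> bool:
        return self.remaining > 0

    def consume(self, units: int = 1) -> None:
        self.remaining -= units


def respond(book) -> bool:
    """Respond only if work quota remains, consuming 1 unit per call request."""
    if not book.has_remaining():
        return False  # the request is restricted
    book.consume(1)
    return True
```

Both variants make the decision purely from node-local state, which is what lets the node answer without touching shared storage.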

In some embodiments, after receiving a cloud service request initiated by a tenant, the working node may, if its own work quota has been used up, submit a quota application request to the system and, when the cloud service total quota has not been used up, receive the work quota that the cloud service system allocates to it from the surplus quota, so as to respond to the cloud service request.

For example, referring to fig. 2, after receiving a cloud service request initiated by a tenant, working node A may determine whether any of its own work quota remains. If none remains, node A can first submit a quota application request to the system. After receiving the quota application request, the system can determine whether any of the cloud service total quota corresponding to the tenant remains and, if so, allocate further work quota to node A. After receiving the work quota, node A goes on to respond to this cloud service request.

In some embodiments, after the working node submits a quota application request to the cloud service system, if the cloud service total quota has been used up, the cloud service request is forwarded to another working node that still has work quota remaining for processing.
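A minimal sketch of this replenishment exchange follows. It assumes the node applies for a fixed batch of units each time and that the system grants at most what remains of the surplus; the batch size and all names are hypothetical, since the application does not fix them at this point:

```python
class QuotaSystem:
    """Hypothetical central side: tracks the tenant's unallocated (surplus) quota."""

    def __init__(self, total_quota: int) -> None:
        self.surplus = total_quota

    def apply(self, requested: int) -> int:
        """Grant up to `requested` units from the surplus; 0 means nothing is left."""
        granted = min(requested, self.surplus)
        self.surplus -= granted
        return granted


class WorkNode:
    def __init__(self, system: QuotaSystem, batch: int) -> None:
        self.system = system
        self.batch = batch      # assumed size of each quota application
        self.remaining = 0

    def handle(self) -> bool:
        if self.remaining == 0:
            # Work quota used up: apply to the system for more.
            self.remaining += self.system.apply(self.batch)
        if self.remaining == 0:
            return False        # total quota is also used up
        self.remaining -= 1     # respond and consume 1 unit
        return True
```

The node contacts the system only when its local quota runs out, rather than once per request.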

For example, please refer to fig. 2, it is assumed that the system stores the working status of each working node (the working status refers to whether the node can respond to the request, i.e. whether there is still a working amount). After a certain working node A receives a cloud service request initiated by a tenant, if the working quota corresponding to the working node A is not left and the cloud service total quota is not left, the working node A can inquire the working states of other working nodes through the system. If node B is queried, which still can perform cloud service response, node a may route the request to node B, so that node B responds to the request.

In this embodiment, after receiving a cloud service request initiated by a tenant, each working node forwards the request to another working node that still has work quota remaining if neither its own work quota nor the cloud service total quota has any remainder. In this way the cloud service system can, as far as possible, keep serving the tenant within the total quota the tenant applied for, thereby improving the tenant's experience.

Of course, after a working node receives a cloud service request initiated by a tenant, if none of its own work quota remains, none of the cloud service total quota remains, and no other working node has any work quota remaining, the tenant's cloud service request is limited.
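The request-handling order described above (use the local work quota, apply to the system, forward to a peer, otherwise limit) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class names `QuotaSystem` and `WorkerNode`, the fixed `grant_size`, and the rule that each request consumes one quota unit are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class QuotaSystem:
    """Hypothetical central quota service for a single tenant."""
    remaining: int                      # total quota not yet handed to any node
    grant_size: int = 10                # assumed size of each on-demand grant
    nodes: Dict[str, "WorkerNode"] = field(default_factory=dict)

    def apply_for_quota(self) -> int:
        """Grant up to grant_size units from the remaining quota (0 if none is left)."""
        grant = min(self.grant_size, self.remaining)
        self.remaining -= grant
        return grant

    def find_node_with_quota(self, exclude: str) -> Optional["WorkerNode"]:
        """Return another node that still has work quota, if any."""
        for name, node in self.nodes.items():
            if name != exclude and node.quota > 0:
                return node
        return None


@dataclass
class WorkerNode:
    name: str
    system: QuotaSystem
    quota: int = 0                      # local work quota; one unit per request (assumption)

    def handle(self) -> str:
        # 1. If the local work quota is used up, apply to the system for more.
        if self.quota == 0:
            self.quota += self.system.apply_for_quota()
        # 2. Serve the request when local quota is available.
        if self.quota > 0:
            self.quota -= 1
            return f"served by {self.name}"
        # 3. Total quota exhausted: forward to a node that still has quota.
        peer = self.system.find_node_with_quota(exclude=self.name)
        if peer is not None:
            return peer.handle()
        # 4. No quota anywhere: the request is limited.
        return "limited"
```

In this sketch the peer lookup goes through the central system, mirroring the example in which node A queries the working states of other nodes through the system.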

In some embodiments, when the system performs S104 to allocate a work quota to the working node corresponding to the tenant included in the distributed architecture based on the cloud service total quota, the system may allocate the work quota based on only a part of the cloud service total quota.

Here, since only a part of the total quota is used when allocating work quotas to the working nodes, work quotas can be allocated to the working nodes multiple times, which avoids the unreasonable allocation that a single one-time allocation may cause.

In some embodiments, when work quotas are allocated to the working nodes corresponding to the tenant included in the distributed architecture based on a part of the cloud service total quota, that part of the quota is distributed among the working nodes according to the quota weight corresponding to each working node.

In practical application, the system may first obtain the quota weight corresponding to the working node included in the distributed architecture. After determining the quota weight corresponding to the working node, the system may assign the working quota matching the quota weight corresponding to the working node based on a part of the quota in the cloud service total quota.

The quota weight corresponding to each working node may be a preset fixed value. For example, the quota weight corresponding to each working node may be set to the same value. At this time, when allocating the working quota, the total quota may be equally distributed to each working node.
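As a minimal sketch of weight-proportional distribution, the function below splits a partial quota among nodes in proportion to their quota weights. The name `allocate_by_weight` and the choice to round each share down (leaving any leftover in the remaining quota) are assumptions made for illustration; the application does not specify a rounding rule.

```python
from typing import Dict


def allocate_by_weight(partial_quota: int, weights: Dict[str, float]) -> Dict[str, int]:
    """Split partial_quota among nodes in proportion to their quota weights."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("quota weights must sum to a positive value")
    # Round each share down so the allocation never exceeds the partial quota.
    return {
        node: int(partial_quota * weight / total_weight)
        for node, weight in weights.items()
    }
```

With identical weights the split is even, for example `allocate_by_weight(300, {"A": 1, "B": 1, "C": 1})` gives each node 100 units.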

In some embodiments, when determining the quota weight, the quota weight corresponding to each working node is determined according to a preset quota weight determination rule based on configuration information of each working node.

For example, when the system is constructed, a configuration information table may be maintained for each working node, recording, for example, the node's CPU and GPU processing capability, hard disk model, and so on. When determining the quota weight corresponding to each working node included in the distributed architecture, the configuration information table corresponding to each working node may be queried to determine the configuration information of each working node.

After determining the configuration information corresponding to each working node, the system can determine the quota weight corresponding to each working node according to a preset quota weight determination rule.

In some embodiments, the quota weight determination rule may be as follows: each item of configuration information of a working node is scored; the scores are then combined by weighted summation to obtain a total score for the working node; finally, the quota weight of each working node is determined from its total score.
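The scoring rule above can be sketched as follows. The coefficient table `ITEM_COEFFICIENTS`, the item names, and the final step of normalizing total scores so that the weights sum to 1 are hypothetical choices; the application leaves the concrete scoring and mapping open.

```python
from typing import Dict

# Hypothetical importance coefficients for each configuration item.
ITEM_COEFFICIENTS = {"cpu": 0.5, "gpu": 0.3, "disk": 0.2}


def total_score(item_scores: Dict[str, float]) -> float:
    """Weighted sum of the per-item scores of one working node."""
    return sum(ITEM_COEFFICIENTS[item] * score for item, score in item_scores.items())


def quota_weights(node_scores: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Map each node's total score to a quota weight (here: its share of all scores)."""
    totals = {node: total_score(scores) for node, scores in node_scores.items()}
    grand_total = sum(totals.values())
    if grand_total <= 0:
        raise ValueError("total scores must sum to a positive value")
    return {node: score / grand_total for node, score in totals.items()}
```

A node whose items all score twice as high as another node's therefore receives twice the quota weight.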

When the quota weight corresponding to each working node included in the distributed architecture is determined, the system can determine the quota weight corresponding to each working node according to a preset quota weight determination rule based on the configuration information of each working node, so that the working quota can be reasonably distributed for each working node, more working quotas can be distributed for nodes with high configuration, the response speed of the cloud service system is increased, and the tenant experience is improved.

In some embodiments, when determining the quota weights corresponding to the working nodes included in the distributed architecture, the system may determine the quota weight of each working node based on the processing capacity of that working node, wherein the processing capacity indicates the cloud service request response amount achievable per unit time.

For example, the system can determine, by testing, the cloud service request response amount that each working node can achieve per unit time (its processing capacity). After the processing capacity of each working node is determined, the system can determine the quota weight of each working node according to its processing capacity.

Because work quotas are allocated according to the processing capacity of each working node, the work quotas can be distributed reasonably: nodes with stronger processing capacity are allocated more work quota, which improves the response speed of the cloud service system and the experience of tenants.
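One way to obtain capacity-based weights is sketched below: time a batch of trial requests against a node to estimate its requests per second, then normalize the measured capacities into weights. The functions `measure_capacity` and `weights_from_capacity`, the trial size, and the use of a callable standing in for a node's request handler are assumptions; the application only states that capacity can be determined by testing.

```python
import time
from typing import Callable, Dict


def measure_capacity(handle_request: Callable[[], None], trial_requests: int = 1000) -> float:
    """Estimate requests handled per second by timing a batch of trial requests."""
    start = time.perf_counter()
    for _ in range(trial_requests):
        handle_request()
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval
    return trial_requests / elapsed


def weights_from_capacity(capacities: Dict[str, float]) -> Dict[str, float]:
    """Quota weight of each node = its share of the summed processing capacity."""
    total = sum(capacities.values())
    if total <= 0:
        raise ValueError("capacities must sum to a positive value")
    return {node: capacity / total for node, capacity in capacities.items()}
```

For instance, capacities of 300 and 100 requests per second yield weights of 0.75 and 0.25.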

In some examples, when allocating work quotas to each working node corresponding to the tenant in the distributed architecture based on a part of the cloud service total quota, the system may determine the cloud service request response amount that the working node can reach within a preset time period according to the processing capacity of the working node, wherein the processing capacity indicates the cloud service request response amount achievable per unit time. After determining the cloud service request response amount corresponding to the working node, the system may allocate a work quota to the working node according to that response amount.

In some examples, the system may determine the value of the portion of the quota participating in the initial allocation according to a sum of response amounts of the cloud service request, which can be achieved by each working node within a preset time period. When allocating a work amount to each work node according to the cloud service request response amount corresponding to each work node, the system may determine the cloud service request response amount corresponding to each work node as a work amount corresponding to each work node, and allocate the work amount to each work node.

The preset time period may be a value set empirically. For example, 1 minute.
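The initial allocation described above can be sketched as a short calculation: each node's work quota equals its processing capacity multiplied by the preset time period, and the partial quota used for the initial allocation is the sum of those amounts. The function name, the per-second capacity unit, and the decision to raise an error when the sum exceeds the total quota are assumptions; the application does not say how that case is handled.

```python
from typing import Dict, Tuple

PRESET_SECONDS = 60  # the 1-minute example from the text


def initial_allocation(
    total_quota: int,
    rates_per_second: Dict[str, int],
    preset_seconds: int = PRESET_SECONDS,
) -> Tuple[Dict[str, int], int]:
    """Return (work quota per node, remaining quota) for the first allocation."""
    # Response amount each node can reach within the preset time period.
    work_quotas = {node: rate * preset_seconds for node, rate in rates_per_second.items()}
    # The partial quota taking part in the initial allocation is their sum.
    partial_quota = sum(work_quotas.values())
    if partial_quota > total_quota:
        raise ValueError("total quota is smaller than the initial allocation")
    return work_quotas, total_quota - partial_quota
```

With a total quota of 10000 and capacities of 50, 30 and 20 requests per second, the nodes receive 3000, 1800 and 1200 units and 4000 units remain for later applications.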

Referring to fig. 3, fig. 3 is a schematic diagram illustrating allocation of a cloud service total amount according to the present application.

As shown in fig. 3, the cloud service request response amount that can be achieved within 1 minute corresponding to the working node a included in the distributed architecture is a dark gray square. The cloud service request response amount which can be reached within 1 minute corresponding to the working node B is a light gray square. The response amount of the cloud service request which can be achieved within 1 minute corresponding to the working node C is a black square.

When the system first distributes the work quota, the work quota indicated by the dark grey square can be distributed to the working node A from the total quota, the work quota indicated by the light grey square can be distributed to the working node B, and the work quota indicated by the black square can be distributed to the working node C.

When distributing the working quota to each working node included in the distributed architecture based on a part of quota in the total quota of the cloud service, the system may determine the cloud service request response amount that can be reached by each working node within a preset time period according to the processing capability corresponding to each working node. Wherein the processing capacity indicates an achievable cloud service request response per unit time length. After the cloud service request response amount corresponding to each working node is determined, the system can determine the cloud service request response amount corresponding to each working node as a working amount corresponding to each working node and distribute the working amount to each working node, so that a reasonable part amount participating in initial distribution can be determined, an initial working amount is reasonably distributed to each node, and the working efficiency of cloud service is further improved.

In practice, because different nodes consume their work quotas at different rates, allocating the entire cloud service total quota at one time may leave some working nodes with their work quota used up while other working nodes still have work quota remaining. Some working nodes are then idle, which reduces the working efficiency of the cloud service system.

To improve this situation, in some embodiments, the total quota is not allocated to the working nodes all at once. Instead, quota is allocated whenever a working node applies for it, so that a working node that consumes its work quota quickly can receive work quota multiple times, which improves the response speed of the cloud service system and the experience of the tenant.

In practical application, when the system allocates the work quota to each work node included in the distributed architecture based on the cloud service total quota, the system may allocate the work quota to each work node included in the distributed architecture based on a part of the quota in the cloud service total quota. And when the system receives the quota application request from any one of the working nodes, distributing the working quota to the working node based on the remaining quota.

Wherein, the surplus quota includes the quota left after the allocated work quota is removed from the cloud service total quota.

In some embodiments, the system may determine the initial allocated value of the partial quota and the allocation rule when performing the total quota allocation. For example, a rule may be specified that one-third of the total amount is initially allocated, and that an even allocation is employed. At this time, the system can equally distribute one third of the total amount to each working node.

Then, when the system receives a quota application request from any working node, it can check whether any quota remains, and if so, it can allocate a work quota to that working node.
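The two-phase scheme above (an initial even split of one third of the total quota, followed by allocation on application) might look like the following sketch. The class `TenantQuota`, its attribute names, and the rule that an application is granted at most what remains are hypothetical details added for illustration.

```python
from fractions import Fraction
from typing import Dict, List


class TenantQuota:
    """Tracks one tenant's total quota and what has been allocated to each node."""

    def __init__(self, total: int, nodes: List[str],
                 initial_fraction: Fraction = Fraction(1, 3)) -> None:
        self.remaining = total
        self.allocated: Dict[str, int] = {node: 0 for node in nodes}
        # Initial allocation: split the configured fraction evenly among nodes.
        share = int(total * initial_fraction) // len(nodes)
        for node in nodes:
            self._grant(node, share)

    def _grant(self, node: str, amount: int) -> int:
        amount = min(amount, self.remaining)   # never hand out more than remains
        self.remaining -= amount
        self.allocated[node] += amount
        return amount

    def apply(self, node: str, amount: int) -> int:
        """Handle a quota application request; returns the amount granted (0 if none remains)."""
        return self._grant(node, amount)
```

A node that drains its share quickly can call `apply` repeatedly until the remaining quota reaches zero.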

When the total amount is distributed to each working node, the total amount does not need to be distributed at one time, and when any working node applies for the amount, the total amount is distributed again, so that the working node which consumes the working amount at a high rate can receive the distribution of the working amount for many times, the response speed of a cloud service system is improved, and the experience of tenants is improved.

In some embodiments, when allocating a work amount to the work node based on a remaining amount, the system may allocate a work amount matching the cloud service request response to the work node according to the cloud service request response that can be achieved by the work node within a preset time period.

The preset time period may be a value set empirically. For example, 1 minute.

Referring to Fig. 4, Fig. 4 is a schematic diagram illustrating allocation of the cloud service total quota according to the present application.

As shown in Fig. 4, a diagonally hatched box indicates that the corresponding work quota of a node has been consumed. When node A has consumed its allocated work quota and sends a quota application to the system, the system can set aside, from the remaining quota, an amount equal to the cloud service request response amount that node A can reach within 1 minute. The system may then allocate the work quota corresponding to that response amount (the dark gray box in Fig. 4) to node A.

When the working amount is distributed to the working nodes, the system can distribute the working amount matched with the cloud service request response amount to the working nodes according to the cloud service request response amount which can be reached by the working nodes within the preset time length, so that the system can distribute the working amount which accords with the processing capacity of the nodes to the working nodes, the nodes with strong processing capacity can distribute more working amounts, the response speed of the cloud service system is improved, and the experience of tenants is improved.
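As a rough sketch of sizing an on-demand grant, the function below computes the response amount a node can reach within the preset period and takes that amount from the remaining quota. Capping the grant at whatever remains is an assumption, as are the function name and the per-second capacity unit.

```python
from typing import Tuple


def grant_on_application(
    remaining_quota: int,
    rate_per_second: int,
    preset_seconds: int = 60,
) -> Tuple[int, int]:
    """Return (granted work quota, new remaining quota) for one application."""
    # Amount matching what the node can respond to within the preset period.
    wanted = rate_per_second * preset_seconds
    # Assumed rule: grant only what is still left when the remainder is smaller.
    granted = min(wanted, remaining_quota)
    return granted, remaining_quota - granted
```

For a node handling 50 requests per second, an application against 5000 remaining units grants 3000 and leaves 2000.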

It can be understood that once the remaining quota has been fully allocated and every work quota has been consumed, the cloud service total quota applied for by the tenant is used up.

Corresponding to any one of the above embodiments, the present application further provides a cloud service request responding apparatus.

Referring to fig. 5, fig. 5 is a schematic structural diagram of a cloud service request responding apparatus shown in the present application.

As shown in fig. 5, the above apparatus 50 may include:

an obtaining module 51, configured to obtain a cloud service total amount that a tenant applies to the cloud service system; the cloud service system comprises a system constructed based on a distributed architecture;

an allocation module 52, configured to allocate a work quota to a working node corresponding to the tenant included in the distributed architecture based on the cloud service total quota, so that the working node responds to a cloud service request initiated by the tenant according to the work quota corresponding to the working node.

In some illustrated embodiments, the allocation module 52 is specifically configured to:

In some of the illustrated embodiments, the above-described allocation module 52 includes:

In some illustrated embodiments, the allocation module 52 is further configured to:

In some of the illustrated embodiments, the apparatus 50 further comprises:

In some embodiments shown, the cloud services include AI cloud services; the obtaining module 51 is specifically configured to:

the allocation module 52 is specifically configured to:

The embodiment of the cloud service request responding apparatus shown in the application can be applied to an electronic device. Accordingly, the present application discloses an electronic device, which may comprise:

a processor;

a memory for storing processor-executable instructions;

wherein the processor is configured to call the executable instructions stored in the memory to implement the cloud service request response method shown in any one of the embodiments.

Referring to fig. 6, fig. 6 is a schematic diagram of a hardware structure of an electronic device shown in the present application.

As shown in fig. 6, the electronic device may include a processor for executing instructions, a network interface for making a network connection, a memory for storing operation data for the processor, and a non-volatile memory for storing instructions corresponding to the cloud service request responding apparatus.

The embodiment of the cloud service request responding apparatus may be implemented by software, or may be implemented by hardware or a combination of hardware and software. Taking a software implementation as an example, as a logical device, the device is formed by reading, by a processor of the electronic device where the device is located, a corresponding computer program instruction in the nonvolatile memory into the memory for operation. In terms of hardware, in addition to the processor, the memory, the network interface, and the nonvolatile memory shown in fig. 6, the electronic device in which the apparatus is located in the embodiment may also include other hardware according to an actual function of the electronic device, which is not described again.

It is to be understood that, in order to increase the processing speed, the cloud service request responding apparatus may also directly store the corresponding instruction in the memory, which is not limited herein.

The present application provides a computer-readable storage medium, where a computer program is stored, and the computer program is used to execute the cloud service request response method shown in any of the above embodiments.

One skilled in the art will recognize that one or more embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, one or more embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, one or more embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (which may include, but are not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

"And/or" in this application means at least one of the two; for example, "A and/or B" may include three schemes: A, B, and "A and B".

The embodiments in the present application are described in a progressive manner, and the same and similar parts among the embodiments can be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the data processing apparatus embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to part of the description of the method embodiment.

The foregoing description of specific embodiments of the present application has been presented. Other embodiments are within the scope of the following claims. In some cases, the acts or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.

Embodiments of the subject matter and functional operations described in this application may be implemented in the following: digital electronic circuitry, tangibly embodied computer software or firmware, computer hardware that may include the structures disclosed in this application and their structural equivalents, or combinations of one or more of them. Embodiments of the subject matter described in this application can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions, encoded on a tangible, non-transitory program carrier for execution by, or to control the operation of, data processing apparatus. Alternatively or additionally, the program instructions may be encoded on an artificially generated propagated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode and transmit information to suitable receiver apparatus for execution by the data processing apparatus. The computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.

The processes and logic flows described in this application can be performed by one or more programmable computers executing one or more computer programs to perform corresponding functions by operating on input data and generating output. The processes and logic flows described above can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).

Computers suitable for executing computer programs may include, for example, general and/or special purpose microprocessors, or any other type of central processing unit. Generally, a central processing unit will receive instructions and data from a read-only memory and/or a random access memory. The basic components of a computer may include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks. However, a computer does not necessarily have such a device. Moreover, a computer may be embedded in another device, e.g., a mobile telephone, a Personal Digital Assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device such as a Universal Serial Bus (USB) flash drive, to name a few.

Computer-readable media suitable for storing computer program instructions and data can include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices (e.g., EPROM, EEPROM, and flash memory devices), magnetic disks (e.g., internal hard disk or removable disks), magneto-optical disks, and CD ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.

Although this application contains many specific implementation details, these should not be construed as limiting the scope of any disclosure or of what may be claimed, but rather as merely describing features of particular disclosed embodiments. Certain features that are described in this application in the context of separate embodiments can also be implemented in combination in a single embodiment. In other instances, features described in connection with one embodiment may be implemented as discrete components or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.

Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In some cases, multitasking and parallel processing may be advantageous. Moreover, the separation of various system modules and components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.

Thus, particular embodiments of the subject matter have been described. Other embodiments are within the scope of the following claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. Further, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some implementations, multitasking and parallel processing may be advantageous.

The above description is only for the purpose of illustrating the preferred embodiments of the present application and is not intended to limit the present application to the particular embodiments of the present application, and any modifications, equivalents, improvements, etc. made within the spirit and principles of the present application should be included within the scope of the present application.

Claims

1. A cloud service request response method, applied to a cloud service system, characterized in that the method comprises:

acquiring a cloud service total amount applied by a tenant to the cloud service system; wherein the cloud service system comprises a system constructed based on a distributed architecture;

based on the cloud service total amount, distributing a work amount to a work node corresponding to the tenant and included in the distributed architecture, so that the work node responds to a cloud service request initiated by the tenant according to the work amount corresponding to the work node.

2. The method of claim 1, wherein the allocating work quota to each work node included in the distributed architecture based on the cloud service total quota comprises:

and distributing work quota to the work nodes corresponding to the tenants and included in the distributed architecture based on partial quota in the cloud service total quota.

3. The method of claim 2, wherein the allocating work quota to each work node corresponding to the tenant in the distributed architecture based on a partial quota of the cloud service total quota comprises:

determining the cloud service request response amount which can be reached by the working node within a preset time length according to the processing capacity corresponding to the working node; wherein the processing capacity indicates an achievable cloud service request response per unit time length;

and distributing a working amount to the working node according to the cloud service request response amount corresponding to the working node.

4. The method of claim 2, wherein the allocating work quota to a work node corresponding to the tenant included in the distributed architecture based on a partial quota of the cloud service total quota comprises:

and distributing a working amount matched with the amount weight corresponding to the working node for the working node based on part of amounts in the cloud service total amount.

5. The method of claim 4, wherein the determining the quota weight corresponding to each working node included in the distributed architecture comprises:

determining an amount weight corresponding to each working node according to a preset amount weight determination rule based on configuration information of each working node; or the like, or, alternatively,

6. The method according to any one of claims 2-5, further comprising:

if a quota application request provided by any working node is received, distributing a working quota to the working node based on a surplus quota; the surplus quota comprises a quota remaining after the allocated work quota is removed from the cloud service total quota.

7. The method of claim 6, wherein the distributing a working quota to the working node based on a surplus quota comprises:

and distributing a working amount matched with the cloud service request response amount to the working node according to the cloud service request response amount which can be reached by the working node within a preset time length based on the surplus amount.

8. The method according to any one of claims 1 to 7, wherein the step of the working node responding to the cloud service request initiated by the tenant according to the working quota corresponding to the working node comprises the following steps:

9. The method according to any one of claims 1 to 8, wherein the working node responds to the cloud service request initiated by the tenant according to its corresponding working quota, further comprising:

after the working node receives a cloud service request initiated by a tenant, if none of the work quota corresponding to the working node remains, the working node sends a quota application request to the cloud service system, and when part of the cloud service total quota still remains, receives the work quota allocated to the working node by the cloud service system based on the remaining quota, so as to respond to the cloud service request.

10. The method of claim 9, further comprising:

after the working node provides an amount application request to the cloud service system, if the cloud service total amount is not remained, the cloud service request is forwarded to other working nodes with remained working amounts for processing.

11. The method according to any one of claims 1-10, further comprising:

the working node performs billing for the cloud service request initiated by the tenant.

12. The method of any of claims 1-11, wherein the cloud service comprises an AI cloud service; the acquiring of the cloud service total amount applied by the tenant to the cloud service system comprises the following steps:

acquiring an AI cloud service total amount applied by a tenant to the cloud service system;

the allocating work quota to the work node corresponding to the tenant and included in the distributed architecture based on the cloud service total quota, so that the work node responds to the cloud service request initiated by the tenant according to the work quota corresponding to the work node, includes:

13. A cloud service request response device, applied to a cloud service system, characterized in that the device comprises:

the acquisition module is used for acquiring a cloud service total amount applied by a tenant to the cloud service system; wherein the cloud service system comprises a system constructed based on a distributed architecture;

14. An electronic device, characterized in that the device comprises:

a processor;

a memory for storing the processor-executable instructions;

wherein the processor is configured to invoke executable instructions stored in the memory to implement the cloud service request response method of any of claims 1-12.

15. A computer-readable storage medium, characterized in that the storage medium stores a computer program for executing the cloud service request response method according to any one of claims 1 to 12.