CN117648981A - Reasoning method and related device - Google Patents

Reasoning method and related device

Info

Publication number
CN117648981A
Authority
CN
China
Prior art keywords
feature
local
features
client
network data
Prior art date
Legal status
Pending
Application number
CN202210962642.4A
Other languages
Chinese (zh)
Inventor
邵云峰
吴骏
郑青
卢嘉勋
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202210962642.4A priority Critical patent/CN117648981A/en
Priority to PCT/CN2023/103784 priority patent/WO2024032214A1/en
Publication of CN117648981A publication Critical patent/CN117648981A/en


Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 — Computing arrangements using knowledge-based models
    • G06N5/04 — Inference or reasoning models

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Information Transfer Between Computers (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses an inference method and a related device. The method is applied to an inference system comprising a server side and a plurality of clients, and includes the following steps: a first federal learning client collects first network data related to a target network resource in a first time slot, extracts a first local feature of the first network data, and sends the first local feature to the federal learning server; the federal learning server aggregates the local features uploaded by the plurality of clients to calculate a global priori feature and sends it to each client; the first federal learning client can then perform inference on second network data of a second time slot according to the global priori feature. In the reasoning process, information from the data of other clients is used in addition to the local data, which improves the accuracy of the reasoning result.

Description

Reasoning method and related device
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to an inference method and a related apparatus.
Background
Federal learning is a distributed machine learning paradigm in which multiple parties (different organizations or users) cooperatively train an artificial intelligence model using all of their respective data, without aggregating the parties' original data. Conventional machine learning paradigms require a large amount of raw data to be aggregated for model training, and the raw data used for training is likely to come from many different organizations or users. Aggregating raw data from multiple organizations or users poses a serious risk of data leakage: it exposes the information assets of organizations and the personal privacy of individual users. These problems present a serious challenge to the training of artificial intelligence models, and federal learning techniques were developed to address them. Federal learning keeps each party's raw data local; instead of aggregating the data, the parties jointly train the artificial intelligence model by (securely) exchanging intermediate computation results through collaborative computing. The federal learning technique thus protects the data of the participating users while still making full use of the parties' data to cooperatively train the model, yielding a more powerful model.
According to the scenario, typical federal learning can be divided into three paradigms: horizontal federation, vertical federation, and federation migration (federated domain adaptation), each addressing a typical scenario.
In a horizontal federal learning architecture, there is typically one server and a number of clients that participate in the horizontal federation. In the training stage, each client trains a model on its local data and uploads the trained model to the server; the server computes a weighted average of the models uploaded by all clients to obtain a global model. The global model is then issued to the clients for inference at the client side.
In the reasoning stage, the client performs reasoning based on the global model issued by the server and manages the resources of the device according to the reasoning result. However, because only local data is used for reasoning, the client's reasoning result is not accurate enough, and the resource management of the device may therefore be misguided.
Disclosure of Invention
The application provides a reasoning method and a related device. In the reasoning process, the method uses information from the data of other clients in addition to the local data, thereby improving the accuracy of the reasoning result.
The first aspect of the present application provides an inference method, which includes: the method comprises the steps that a first federal learning client sends first local features to a federal learning server, wherein the first local features are extracted from first network data, the first network data are data which are acquired by the first federal learning client in a first time slot and related to target network resources, and the target network resources are network resources managed by the first federal learning client; the first federation learning client receives global priori features from the federation learning server, the global priori features are obtained according to the first local features and second local features, and the second local features are provided by a second federation learning client; the first federal learning client performs reasoning according to the global priori feature and second network data to obtain a reasoning result, wherein the second network data is data related to the target network resource, which is acquired by the first federal learning client in a second time slot, and the reasoning result is used for managing the target network resource, and the second time slot is the same as or after the first time slot.
In the above aspect, the first federal learning client collects first network data related to the target network resource in the first time slot, extracts the first local feature of the first network data, and sends the first local feature to the federal learning server; the federal learning server aggregates the local features uploaded by the plurality of clients to calculate the global priori feature and sends the global priori feature to each client. Because each client's reasoning then uses information from the other clients' data in addition to its own local data, the accuracy of the reasoning result is improved.
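To make the interplay between the client and server roles concrete, the following is a minimal sketch (in Python, with hypothetical names and untrained stand-in models) of the message flow described in this aspect: each client extracts a local feature from its first-slot data and uploads it, the server fuses the uploads into a global priori feature, and each client then infers on its second-slot data using that feature. It is an illustration of the flow only, not the claimed implementation.

```python
# Hypothetical end-to-end flow: clients upload local features of the first
# time slot, the server fuses them, and clients infer on the second time slot.
import numpy as np

rng = np.random.default_rng(0)

def extract_local_feature(first_network_data: np.ndarray) -> np.ndarray:
    """Stand-in feature extractor (e.g., an encoder); here just mean pooling."""
    return first_network_data.mean(axis=0)

def fuse_global_prior(local_features: list[np.ndarray]) -> np.ndarray:
    """Stand-in for the server's global priori feature extraction model;
    here a splice (concatenation) followed by an untrained linear projection."""
    stacked = np.concatenate(local_features)
    projection = rng.standard_normal((8, stacked.size))
    return projection @ stacked

def client_inference(global_prior: np.ndarray, second_network_data: np.ndarray) -> float:
    """Stand-in local inference combining the global prior with local data."""
    local_feat = second_network_data.mean(axis=0)
    return float(global_prior.mean() + local_feat.mean())

# Three clients, each with first-slot and second-slot data about the managed resource.
first_slot = [rng.standard_normal((20, 4)) for _ in range(3)]
second_slot = [rng.standard_normal((20, 4)) for _ in range(3)]

uploads = [extract_local_feature(x) for x in first_slot]   # clients -> server
global_prior = fuse_global_prior(uploads)                  # server -> clients
results = [client_inference(global_prior, x) for x in second_slot]
print(results)
```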
In a possible embodiment, the first network data is a sampling value of data related to the target network resource in the first time slot or a statistical value from the third time slot to the first time slot, the second network data is a sampling value of data related to the target network resource in the second time slot, and the third time slot is before the first time slot.
In a possible implementation, the global a priori feature is a feature vector or a first machine learning model, which is used for reasoning about the second network data.
In a possible implementation manner, in a case that the global prior feature is a feature vector, the first federal learning client performs reasoning according to the global prior feature and the second network data to obtain a reasoning result, where the reasoning result includes: the first federal learning client performs reasoning according to the global priori features, the second network data and a local second machine learning model to obtain a reasoning result, wherein the second machine learning model is used for reasoning of the second network data.
In this possible implementation, the first federal learning client inputs the global priori feature and the second network data into the local second machine learning model to perform reasoning, and uses the output of the second machine learning model as the reasoning result. Performing reasoning with a trainable second machine learning model can improve the accuracy of the scheme.
In one possible implementation, the first federal learning client performing reasoning according to the global priori feature, the second network data, and the local second machine learning model to obtain a reasoning result includes: the first federal learning client inputs the second network data into a third machine learning model to obtain a plurality of features of the second network data output by the third machine learning model; the first federal learning client inputs the global priori feature into the second machine learning model to obtain a weight for each of the plurality of features of the second network data; and the first federal learning client determines the reasoning result according to the plurality of features of the second network data and their weights.
In the foregoing possible implementation, before anything is input into the second machine learning model, the first federal learning client may first input the second network data into the third machine learning model to obtain a plurality of features that characterize the second network data, which saves computing resources. Reasoning with the global priori feature and the second network data in the local second machine learning model may then proceed as follows: the first federal learning client inputs the global priori feature into the second machine learning model to obtain a weight for each of the plurality of features of the second network data (one weight per feature), and then determines the reasoning result according to the plurality of features and their weights. Assigning different weights to different features improves the accuracy of the reasoning.
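As an illustration of this weighting scheme, the sketch below assumes the "second machine learning model" is a single linear layer followed by a softmax that turns the global priori feature into one weight per feature of the second network data; the reasoning result is taken as the weighted combination of those features. All names, dimensions, and the use of a softmax are assumptions.

```python
# Hypothetical weighting of second-network-data features by the global prior.
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(1)
num_features, feat_dim, prior_dim = 3, 16, 8

global_prior = rng.standard_normal(prior_dim)
second_data_features = rng.standard_normal((num_features, feat_dim))  # from the third model

# Stand-in second machine learning model: one logit per feature of the second network data.
W = rng.standard_normal((num_features, prior_dim))
weights = softmax(W @ global_prior)                   # one weight per feature

inference_feature = weights @ second_data_features    # weighted combination of the features
print(weights, inference_feature.shape)
```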
In one possible implementation, the second machine learning model includes a plurality of first task models; the first federal learning client performs reasoning according to the global priori features, the second network data and the local second machine learning model to obtain a reasoning result, wherein the reasoning result comprises: the first federal learning client calculates the weight of each of the plurality of first task models according to the global priori features; the first federal learning client inputs the characteristics of the second network data into a plurality of first task models to obtain the reasoning characteristics output by the plurality of first task models; and the first federal learning client obtains an reasoning result according to the weights of the first task models and the reasoning characteristics output by the first task models.
In the foregoing possible implementation, the second machine learning model includes a plurality of first task models that process the plurality of features of the second network data separately. Specifically, after obtaining the global priori feature, the first federal learning client determines the weight of each of the plurality of first task models from the global priori feature. The first federal learning client inputs the features of the second network data into the plurality of first task models; each first task model accepts features of a default type, i.e. the features of the second network data are classified by category and input into the first task model of the corresponding category. The first federal learning client then computes a weighted average of the inference features output by the first task models, using the weights corresponding to those task models, to obtain the reasoning result. Processing different types of features in different first task models and averaging with the weights improves the accuracy of the reasoning.
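The following sketch shows one way such a mixture of first task models could be wired up (an assumed architecture; the patent does not fix the model types): the weight of each task model comes from the global priori feature, each task model handles one feature type, and the result is the weighted average of the task-model outputs.

```python
# Hypothetical mixture of first task models weighted by the global prior feature.
import torch
import torch.nn as nn

prior_dim, feat_dim, out_dim, num_tasks = 8, 16, 4, 3

weight_head = nn.Sequential(nn.Linear(prior_dim, num_tasks), nn.Softmax(dim=-1))
task_models = nn.ModuleList(nn.Linear(feat_dim, out_dim) for _ in range(num_tasks))

global_prior = torch.randn(prior_dim)
features = torch.randn(num_tasks, feat_dim)   # one feature per (assumed) feature type

task_weights = weight_head(global_prior)                               # weight of each task model
task_outputs = torch.stack([m(features[i]) for i, m in enumerate(task_models)])
inference = (task_weights.unsqueeze(-1) * task_outputs).sum(dim=0)     # weighted average
print(task_weights, inference)
```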
In one possible implementation, the computing, by the first federal learning client, the weights for each of the plurality of first task models based on the global prior feature includes: the first federal learning client calculates the weights of the first task models according to the global priori features and the second network data.
In this possible implementation, the first federal learning client calculates the weights of the first task models by combining the client's local data with the global priori feature; using the real-time second network data improves the accuracy of the weights.
In a possible embodiment, the method further comprises: the first federal learning client extracts features of the second network data through a third machine learning model.
In the possible implementation manner, the features of the second network data are adopted for reasoning, so that the reasoning calculation amount is reduced.
In a possible implementation, the third machine learning model includes a plurality of second task models; the first federal learning client extracting features of the second network data through the third machine learning model includes: the first federal learning client determines the weight of each of the plurality of second task models according to the second network data; the first federation learning client inputs the second network data into a plurality of second task models to obtain sub-features of the second network data output by the plurality of second task models; and obtaining the characteristics of the second network data according to the weights of the second task models and the sub-characteristics of the second network data output by the second task models.
In the above possible implementation, the first federal learning client determines the weights corresponding to the plurality of second task models according to the local data or the second network data, then inputs the second network data into the second task models to obtain the sub-features of the second network data output by the second task models, and computes a weighted average of these sub-features using the corresponding weights to obtain the features of the second network data. Processing different types of data in different second task models and averaging with the weights improves the accuracy of the reasoning.
In a possible implementation, each second task model is a one-layer self-encoder, and the reconstruction target of the r-th task model among the plurality of second task models is the residual of the (r-1)-th task model, where r is an integer greater than 1 indexing the second task models.
In a possible implementation manner, in a case that the global prior feature is the first machine learning model, the first federal learning client performs reasoning according to the global prior feature and the second network data to obtain a reasoning result, where the reasoning result includes: the first federal learning client extracts the characteristics of the second network data; the first federal learning client inputs the features of the second network data into the first machine learning model to obtain an inference result output by the first machine learning model.
In the possible implementation manner, the federal learning server may directly issue the first machine learning model as the inference model, so as to improve the flexibility of the scheme.
In a possible implementation manner, in a case that the global prior feature is the first machine learning model, the first federal learning client performs reasoning according to the global prior feature and the second network data to obtain a reasoning result, where the reasoning result includes: the first federal learning client trains the first machine learning model by using sample data; the first federal learning client extracts the characteristics of the second network data; the first federal learning client inputs the features of the second network data into the trained first machine learning model to obtain an inference result output by the trained first machine learning model.
In this possible implementation, when the federal learning server directly issues the first machine learning model as the inference model, the client may further train it on local sample data to improve the accuracy of the inference model.
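A minimal sketch of this optional local fine-tuning follows, assuming the issued first machine learning model is a simple classifier and the client holds a small labelled sample set; the training setup (optimizer, loss, number of rounds) is illustrative only.

```python
# Hypothetical local fine-tuning of the issued first machine learning model.
import torch
import torch.nn as nn

feat_dim, out_dim = 16, 4
first_model = nn.Linear(feat_dim, out_dim)        # stand-in for the issued model

sample_features = torch.randn(64, feat_dim)       # local sample data (features)
sample_labels = torch.randint(0, out_dim, (64,))  # local labels

optimizer = torch.optim.SGD(first_model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
for _ in range(5):                                # a few local fine-tuning rounds
    optimizer.zero_grad()
    loss = loss_fn(first_model(sample_features), sample_labels)
    loss.backward()
    optimizer.step()

second_data_features = torch.randn(feat_dim)      # features of the second network data
with torch.no_grad():
    inference_result = first_model(second_data_features)
print(inference_result)
```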
In a possible embodiment, the method further comprises: the first federal learning client sends grouping information to the federal learning server, wherein the grouping information indicates a grouping where the first local feature is located, so that the federal learning server obtains a global priori feature according to the first local feature, the grouping where the first local feature is located, the second local feature and the grouping where the second local feature is located.
In the foregoing possible implementation, the first federal learning client may additionally send grouping information to the server, the grouping information indicating the group in which its local feature is located, so that the server obtains the global priori feature according to the local features from the multiple clients and the groups in which those local features are located. This avoids having the data of all clients jointly determine the global priori feature, which could let one group affect the output of another group.
In a possible embodiment, the method further comprises: the first federation learning client receives task synchronization information from the federation learning server, wherein the task synchronization information is used for indicating a first time slot; the first federal learning client selects first network data from the local data according to the task synchronization information.
In the above possible implementation, the federal learning server indicates the time slot for which the clients should upload their local features, which improves the synchronization of the data uploaded by the clients.
A second aspect of the present application provides an inference method, the method comprising: the federation learning server receives a first local feature from a first federation learning client, the first local feature is extracted from first network data, the first network data is data related to a target network resource acquired by the first federation learning client in a first time slot, and the target network resource is a network resource managed by the first federation learning client; the federation learning server obtains global priori features according to the first local features and the second local features, wherein the second local features are provided by a second federation learning client; the federal learning server side sends global priori features to the first federal learning client side, so that the first federal learning client side performs reasoning according to the global priori features and second network data to obtain a reasoning result, the second network data is data related to target network resources, which is acquired by the first federal learning client side in a second time slot, and the reasoning result is used for managing the target network resources, wherein the second time slot is identical to or after the first time slot.
In a possible embodiment, the first network data is a sampling value of data related to the target network resource in the first time slot or a statistical value from the third time slot to the first time slot, the second network data is a sampling value of data related to the target network resource in the second time slot, and the third time slot is before the first time slot.
In a possible implementation, the global a priori feature is a feature vector or a first machine learning model, which is used for reasoning about the second network data.
In a possible embodiment, the method further comprises: the federation learning server receives grouping information from a first federation learning client, wherein the grouping information from the first federation learning client indicates a grouping in which a first local feature is located; the federal learning server obtains global prior features according to the first local features and the second local features, wherein the obtaining the global prior features comprises: the federal learning server obtains global prior features according to the first local features, the group in which the first local features are located, the second local features and the group in which the second local features are located, wherein the group in which the second local features are located is indicated by group information from the second federal learning client.
In a possible implementation manner, the first local feature includes a first sub-feature and a second sub-feature, the second local feature includes a third sub-feature and a fourth sub-feature, the packet information from the first federal learning client indicates a packet in which the first sub-feature is located and a packet in which the second sub-feature is located, the packet information from the second federal learning client indicates a packet in which the third sub-feature is located and a packet in which the fourth sub-feature is located, and the packet in which the first sub-feature is located and the packet in which the third sub-feature is located are the same; the obtaining, by the federal learning server, the global priori feature according to the first local feature, the group in which the first local feature is located, the second local feature, and the group in which the second local feature is located includes: based on the fact that the group in which the first sub-feature is located is the same as the group in which the third sub-feature is located, the federal learning server side processes the first sub-feature and the third sub-feature to obtain an intermediate feature; the federal learning server obtains a global priori feature according to the intermediate feature, the second sub-feature, the fourth sub-feature, the group in which the second sub-feature is located, and the group in which the fourth sub-feature is located.
In the foregoing possible implementation, the first local feature uploaded by the first federal learning client includes a first sub-feature and a second sub-feature, and the second local feature uploaded by the second federal learning client includes a third sub-feature and a fourth sub-feature. The grouping information from the first federal learning client indicates the groups in which the first and second sub-features are located, and the grouping information from the second federal learning client indicates the groups in which the third and fourth sub-features are located. If the first sub-feature and the third sub-feature are in the same group, the federal learning server first processes the first sub-feature and the third sub-feature to obtain an intermediate feature, and then processes, according to the groups, the intermediate feature together with the second sub-feature and the fourth sub-feature to obtain the global priori feature. Features uploaded by different types of clients can thus be processed separately, reducing the influence between different types of clients.
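The grouped, two-stage aggregation can be pictured as below. The sketch assumes simple averaging as the fusion operation on the server, standing in for whatever models the server actually applies; sub-features sharing a group are fused into an intermediate feature first, and the per-group results are then fused into the global priori feature.

```python
# Hypothetical grouped aggregation of uploaded sub-features on the server.
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(2)

# (client, sub-feature, group) tuples; the first and third sub-features share group "A".
uploads = [
    ("client1", rng.standard_normal(8), "A"),   # first sub-feature
    ("client1", rng.standard_normal(8), "B"),   # second sub-feature
    ("client2", rng.standard_normal(8), "A"),   # third sub-feature
    ("client2", rng.standard_normal(8), "C"),   # fourth sub-feature
]

by_group: dict[str, list[np.ndarray]] = defaultdict(list)
for _, feat, group in uploads:
    by_group[group].append(feat)

# Same-group sub-features are fused first (one intermediate feature per group) ...
intermediate = {g: np.mean(feats, axis=0) for g, feats in by_group.items()}
# ... and the per-group intermediates are then fused into the global prior feature.
global_prior = np.mean(list(intermediate.values()), axis=0)
print(global_prior.shape)
```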
In one possible implementation, the obtaining, by the federal learning server, the global prior feature according to the first local feature and the second local feature includes: the federal learning server obtains global priori features according to the first local features, the second local features, the historical local features from the first federal learning client and the historical local features from the second federal learning client.
In the possible implementation manner, the federal learning server determines the global priori feature, and besides the first local feature and the second local feature, the global priori feature may be determined by combining the historical local feature from the first federal learning client and the historical local feature from the second federal learning client, so as to improve the accuracy of the global priori feature.
In one possible implementation, the obtaining, by the federal learning server, the global prior feature according to the first local feature, the second local feature, the historical local feature from the first federal learning client, and the historical local feature from the second federal learning client includes: the federation learning server calculates the similarity between the local features of the current reasoning process and a plurality of groups of historical local features, wherein the local features of the current reasoning process comprise a first local feature and a second local feature, and each group of historical local features comprise the historical local features from the first federation learning client and the historical local features from the second federation learning client in one historical reasoning process; and the federal learning server performs weighted summation on the historical prior features corresponding to the plurality of groups of historical local features according to the similarity between the local features of the current reasoning process and the plurality of groups of historical local features so as to obtain global prior features.
In the above possible implementation, the federal learning server calculates the similarity between the local features of the current reasoning process and multiple sets of historical local features: it determines the similarity between the first local feature and the historical local feature from the first federal learning client in each set, and the similarity between the second local feature and the historical local feature from the second federal learning client in each set. It then performs a weighted summation of the historical prior features corresponding to the sets of historical local features, based on these similarities, to obtain the required global priori feature, improving the accuracy of the global priori feature.
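A sketch of this history-based weighting is given below. Cosine similarity and a softmax over the similarities are assumptions made for illustration; the description only requires some similarity measure and a weighted summation of the stored historical prior features.

```python
# Hypothetical similarity-weighted combination of historical prior features.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(3)
num_history, feat_dim, prior_dim = 4, 8, 6

current = {"client1": rng.standard_normal(feat_dim), "client2": rng.standard_normal(feat_dim)}
history = [
    {"features": {"client1": rng.standard_normal(feat_dim), "client2": rng.standard_normal(feat_dim)},
     "prior": rng.standard_normal(prior_dim)}
    for _ in range(num_history)
]

# Similarity of the current feature set against each historical set, averaged over clients.
sims = np.array([
    np.mean([cosine(current[c], h["features"][c]) for c in current])
    for h in history
])
weights = softmax(sims)
global_prior = sum(w * h["prior"] for w, h in zip(weights, history))
print(weights, global_prior.shape)
```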
In one possible implementation, the plurality of sets of historical local features all have labels, which are the manually annotated actual results of each set of historical local features; the method further includes: the federal learning server receives a reasoning result from the first federal learning client; the federal learning server determines a target reasoning result according to the reasoning result of the first federal learning client and the reasoning result of the second federal learning client; in a case where the similarity between the local features of the current reasoning process and the target-group historical local features is greater than or equal to a threshold, the federal learning server updates the target-group historical local features according to that similarity, where the target-group historical local features are the set, among the plurality of sets of historical local features, whose label is the target reasoning result; in a case where the similarity between the local features of the current reasoning process and the target-group historical local features is smaller than the threshold, the federal learning server adds a set of historical local features to the plurality of sets of historical local features, the added set being the local features of the current reasoning process.
In this possible implementation, the sets of historical local features may also carry labels, which are the manually annotated actual results of each set and can be compared with the reasoning results to determine their accuracy. After the clients obtain their reasoning results, they can upload them to the federal learning server. The federal learning server can determine the target reasoning result corresponding to the current global priori feature from the reasoning results of the plurality of clients, and then select, from the sets of historical local features, the target-group historical local features whose label is that target reasoning result. If the similarity between the local features of the current reasoning process and the target-group historical local features is greater than or equal to the threshold, the target group is updated based on that similarity; if the similarity is smaller than the threshold, the local features of the current reasoning process are stored as a new set of historical local features. The sample library can thus be updated and supplemented at any time, which improves its effectiveness.
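The sample-library maintenance described above might look roughly as follows. The threshold value, the similarity-weighted mixing rule, and the dictionary layout are all assumptions for illustration.

```python
# Hypothetical update of the historical-feature sample library on the server.
import numpy as np

def update_history(history: list[dict], current_feats: np.ndarray, current_prior: np.ndarray,
                   target_result: int, threshold: float = 0.8) -> None:
    candidates = [h for h in history if h["label"] == target_result]
    if candidates:
        sims = [float(current_feats @ h["features"]) /
                (np.linalg.norm(current_feats) * np.linalg.norm(h["features"]) + 1e-12)
                for h in candidates]
        best = candidates[int(np.argmax(sims))]
        best_sim = max(sims)
        if best_sim >= threshold:
            # Similar enough: update the matched set, weighted by the similarity.
            best["features"] = best_sim * best["features"] + (1 - best_sim) * current_feats
            return
    # No sufficiently similar set with this label: store the current process as a new set.
    history.append({"features": current_feats.copy(), "prior": current_prior.copy(),
                    "label": target_result})

rng = np.random.default_rng(4)
history = [{"features": rng.standard_normal(8), "prior": rng.standard_normal(6), "label": 1}]
update_history(history, rng.standard_normal(8), rng.standard_normal(6), target_result=1)
print(len(history))
```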
In a possible embodiment, the method further comprises: the federation learning server side sends task synchronization information to the first federation learning client side, wherein the task synchronization information is used for indicating a first time slot, so that the first federation learning client side selects first network data from local data according to the task synchronization information.
A third aspect of the present application provides an inference apparatus, which can implement the method of the first aspect or any of the possible implementation manners of the first aspect. The apparatus comprises corresponding units or modules for performing the above-described methods. The units or modules included in the apparatus may be implemented in a software and/or hardware manner. The device may be, for example, a network device, a chip system, or a processor that supports the network device to implement the method, or a logic module or software that can implement all or part of the functions of the network device.
A fourth aspect of the present application provides an inference apparatus, which can implement the method of the second aspect or any of the possible embodiments of the second aspect. The apparatus comprises corresponding units or modules for performing the above-described methods. The units or modules included in the apparatus may be implemented in a software and/or hardware manner. The device may be, for example, a network device, a chip system, or a processor that supports the network device to implement the method, or a logic module or software that can implement all or part of the functions of the network device.
A fifth aspect of the present application provides a computer device comprising: a processor coupled to a memory for storing instructions that when executed by the processor cause the computer device to implement the method of the first aspect or any of the possible implementations of the first aspect. The computer device may be, for example, a network device, or a chip system supporting the network device to implement the above method.
A sixth aspect of the present application provides a computer device comprising: a processor coupled to a memory for storing instructions that when executed by the processor cause the computer device to implement the method of the second aspect or any of the possible implementations of the second aspect. The computer device may be, for example, a network device, or a chip system supporting the network device to implement the above method.
A seventh aspect of the present application provides a computer readable storage medium having instructions stored therein which, when executed by a processor, implement a method as provided by the foregoing first aspect or any one of the possible implementations of the first aspect, the second aspect or any one of the possible implementations of the second aspect.
An eighth aspect of the present application provides a chip system comprising at least one processor for executing a computer program or instructions stored in a memory, which when executed on the at least one processor implements the method provided by the foregoing first aspect or any one of the possible implementations of the first aspect, the second aspect or any one of the possible implementations of the second aspect.
A ninth aspect of the present application provides a computer program product comprising computer program code for implementing the method of the first aspect or any one of the possible implementation manners of the first aspect, the second aspect or any one of the possible implementation manners of the second aspect when the computer program code is executed on a computer.
Drawings
FIG. 1 is a schematic diagram of a computer system according to an embodiment of the present disclosure;
fig. 2 is a schematic flow chart of an inference method according to an embodiment of the present application;
fig. 3 is a schematic diagram of an inference structure provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of another inference structure provided in an embodiment of the present application;
FIG. 5 is a schematic diagram of another inference structure provided in an embodiment of the present application;
FIG. 6 is a schematic diagram of a calculation chart according to an embodiment of the present application;
fig. 7 is a schematic diagram of a knowledge graph according to an embodiment of the present application;
FIG. 8 is a schematic diagram of another inference structure provided in an embodiment of the present application;
fig. 9 is a schematic diagram of an acquisition mode of local data of a federal learning server according to an embodiment of the present application;
FIG. 10 is a schematic flow chart of a fixed parameter task selector generation according to an embodiment of the present application;
FIG. 11 is a schematic flow chart of task model selection according to an embodiment of the present application;
FIG. 12 is a schematic flow chart of model pre-training according to an embodiment of the present application;
FIG. 13 is a schematic flow chart of model training according to an embodiment of the present disclosure;
FIG. 14 is a schematic flow chart of model strategically training provided in an embodiment of the present application;
fig. 15 is a schematic structural diagram of an inference apparatus according to an embodiment of the present application;
fig. 16 is a schematic structural diagram of another reasoning apparatus according to an embodiment of the present application;
fig. 17 is a schematic structural diagram of a computer device according to an embodiment of the present application;
fig. 18 is a schematic structural diagram of another computer device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
The terms first, second and the like in the description and claims of the present application and in the above-described figures, and the like, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances and are merely illustrative of the manner in which the embodiments of the application described herein have been described for objects of the same nature. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of elements is not necessarily limited to those elements, but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
In the description of the present application, "/" means or, unless otherwise indicated, for example, a/B may represent a or B; the term "and/or" in this application is merely an association relation describing an association object, and means that three kinds of relations may exist, for example, a and/or B may mean: a exists alone, A and B exist together, and B exists alone. In addition, in the description of the present application, "at least one" means one or more items, and "multiple" means two or more items. "at least one of" or the like means any combination of these items, including any combination of single item(s) or plural items(s). For example, at least one (one) of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, wherein a, b, c may be single or plural.
FIG. 1 is a schematic diagram of a computer system suitable for use in embodiments of the present application, as shown in FIG. 1, including a federal learning server and a plurality of clients. The federal learning server cooperates with a plurality of clients for model training and model-based reasoning. The number of the clients can be adjusted according to actual requirements.
It should be appreciated that, in the inference phase, a given client performs inference using only its local data, not the data of other clients; although this protects the privacy of the other clients, the inference result may be inaccurate.
Therefore, the embodiment of the application provides an reasoning method, in which a server participates in a reasoning process, specifically, a federal learning server obtains global priori features according to local features of a plurality of clients, and sends the global priori features to each client, so that each client uses the global priori features and local data to perform reasoning. Because the global priori features are obtained according to the local features of the clients, and the local features are extracted from the first network data, for each client, the information of the data of other clients is utilized in the reasoning process, and the accuracy of the reasoning result can be improved.
The application scenarios of the reasoning method provided by the embodiment of the application can be various, for example, the application scenarios can be applied to voice recognition scenarios and image recognition scenarios.
Taking a speech recognition scenario as an example: a person at a concert uses a speech assistant for speech-to-text. A single device cannot determine whether the received speech comes from the background (the lyrics being sung at the concert) or from the person who is dictating; speech information collected by the devices of other users at the same concert is needed to make this distinction. In this scenario, the reasoning method provided by the embodiments of the present application can infer from the speech information collected by multiple users' devices the source of the speech (background or user input).
Taking an image recognition scene as an example, a plurality of vehicles run on a road, and a camera on each vehicle shoots an image of the road. By using the reasoning method provided by the embodiment of the application, the traffic condition (no congestion, normal congestion or congestion caused by traffic accidents) of the current road can be deduced through the reasoning of the images shot by the cameras of the vehicles.
The data (including the local data, the first network data, the second network data, and the like) used by the reasoning method provided by the embodiment of the application may be various forms of data such as images, voices, characters, and the like.
It should be noted that some models are needed in the reasoning process, and these models can be obtained through pre-training. The reasoning method provided in the embodiments of the present application is therefore described first, followed by the training method of the models used in it.
The reasoning method provided by the embodiment of the application is described first.
Fig. 2 is a schematic flow chart of an inference method provided in an embodiment of the present application, which is applied to an inference system, where the inference system includes a server and a plurality of clients, and may specifically be the inference system shown in fig. 1.
In this embodiment, taking one of a plurality of clients as an example, this embodiment specifically includes:
step 201, the server sends task synchronization information to the plurality of clients, so that the plurality of clients select first network data from the local data according to the task synchronization information, and the task synchronization information represents a time requirement on the first network data.
The task synchronization information is used to enable a plurality of clients to select first network data related to target network resources from the local data so as to infer on the premise of meeting the requirements of users. Wherein the target network resource may be a current state of the target network, e.g. the target network resource may be a fault state of the vehicle, the first network data may comprise at least one of: the location data, the driving speed data, the driving mileage data, the battery power data, the battery voltage data, the brake data, the accelerator data, the voltage and current data of the motor, the insulation resistance data, the temperature data and the like of the first vehicle are not limited.
The content of the task synchronization information is not specifically limited. For example, the task synchronization information may be a specific time, so that the first network data selected by the clients from their local data correspond to the same time, or to times that are close to each other, or to a statistic computed over a period of time before that time, such as an average value, a median value, a principal component value obtained by dimensionality reduction (e.g., principal component analysis) of the multiple pieces of data, or a cluster-center value obtained by clustering.
The first federal learning client is one of a plurality of clients, and the first federal learning client can receive task synchronization information from the server, wherein the task synchronization information represents a time requirement for first network data in local data of the first federal learning client.
Step 202, the first federal learning client selects first network data from the local data according to the task synchronization information.
In this embodiment, the task synchronization information indicates a time point, and the first federal learning client may select data at the time point from the local data as the first network data.
The first network data is a sampling value of data related to the target network resource in a first time slot or a statistical value from a third time slot to the first time slot, and the third time slot is before the first time slot.
For example, if the task synchronization information includes the time point of minute 10, the first federal learning client may select the local data at minute 10 as the first network data, or it may select local data close to that time point, or data obtained by reducing the dimension of the local data, as the first network data; for example, it may select the local data between minute 9 and minute 10 as the first network data.
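A small sketch of this selection step follows, under the assumption that local data is indexed by time slot: the client returns either the sample of the indicated slot or a statistic (here the mean) over a window ending at that slot.

```python
# Hypothetical selection of the first network data according to the task synchronization information.
import numpy as np

def select_first_network_data(local_data: dict[int, np.ndarray],
                              indicated_slot: int,
                              window: int = 0) -> np.ndarray:
    """local_data maps a time slot (e.g., minute index) to the measurements of that slot."""
    if window == 0:
        return local_data[indicated_slot]                       # sampling value of the slot
    slots = [t for t in local_data if indicated_slot - window <= t <= indicated_slot]
    return np.mean([local_data[t] for t in slots], axis=0)      # statistic over the window

rng = np.random.default_rng(5)
local_data = {t: rng.standard_normal(4) for t in range(0, 11)}  # minutes 0..10
print(select_first_network_data(local_data, indicated_slot=10))             # exact slot
print(select_first_network_data(local_data, indicated_slot=10, window=1))   # average over minutes 9-10
```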
Step 203, the first federal learning client extracts the features of the first network data from the first network data, where the features of the first network data include the first local features.
The characteristics of the first network data may include one characteristic or may include a plurality of characteristics; when the characteristic of the first network data comprises a plurality of characteristics, the local characteristic may be one or more of the plurality of characteristics.
The features of the first network data are generally represented by feature vectors. Taking the case where the first network data has three features as an example, the features can be written as h_n^1, h_n^2 and h_n^3, where n is an identification of the first federal learning client (different values of n, e.g. 1, 2, 3, refer to different clients). The local feature may be one or more of these three features. Which feature or features to transmit may be agreed in advance, or may be determined according to the characteristics of the current first network data, the characteristics of the features of the first network data, the client's computing resources, and the network resources of the client and/or server; the decision policy may be hand-written logic or a model obtained by machine learning. In the embodiments of this application, the transmitted local feature is denoted h_n as an example. The features of the first network data are also referred to below as implicit features. The characteristic of the first network data here refers to a characteristic of the original data: for example, if the original data carries little information, a single feature is transmitted, and if it carries much information, several features are transmitted. The information content of the original data can be calculated by entropy, or by a neural network or another machine learning model.
The first federal learning client may extract the features of the first network data through a local feature extraction model, where the types of the local feature extraction model include a plurality of types, and the embodiment of the present application is not specifically limited thereto.
As an implementation manner, when the first network data feature is of a plurality of types, the local feature extraction model may include a plurality of second task models, and accordingly, step 203 may specifically be: the first federation learning client inputs the first network data into the plurality of second task models respectively to obtain a plurality of characteristics of the first network data output by the plurality of second task models. The first federation learning client inputs the first network data into different second task models according to types, and each second task model outputs the corresponding characteristics of the types.
The kind of the second task model is not specifically limited in the embodiment of the present application.
As one implementation, each second task model is a one-layer self-encoder, and the reconstruction target of the r-th task model is the residual of the (r-1)-th task model, where r is an integer greater than 1 indexing the second task models.
Specifically, the self-encoder is a neural network with the same input and learning targets, and the structure of the self-encoder is divided into two parts of an encoder and a decoder. After the first network data is input, an implicit feature output by the encoder, i.e. "encoded feature", may be regarded as a representation of the first network data.
The self-encoder may be implemented using a deep neural network (DNN), a convolutional neural network (CNN), a Transformer neural network, or the like.
Specifically, taking 3 second task models as an example, the local feature extraction model contains a 3-layer self-encoder.

The input of the layer-1 self-encoder is the original feature X_n. The encoder E_1 (which may be a deep neural network) produces the implicit feature h_n^1, whose dimension is lower than that of X_n. The decoder D_1 corresponding to E_1 (which may also be a deep neural network) maps h_n^1 to a reconstruction X̂_n^1 of the original feature X_n.

The input of the layer-2 self-encoder is again the original feature X_n. The encoder E_2 produces the implicit feature h_n^2, whose dimension is lower than that of X_n. The decoder D_2 corresponding to E_2 maps h_n^2 to a reconstruction X̂_n^2 of the residual X_n − X̂_n^1 left by the layer-1 model.

The input of the layer-3 self-encoder is again the original feature X_n. The encoder E_3 produces the implicit feature h_n^3, whose dimension is lower than that of X_n. The decoder D_3 corresponding to E_3 maps h_n^3 to a reconstruction X̂_n^3 of the residual X_n − X̂_n^1 − X̂_n^2 left by the layer-1 and layer-2 models.
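The three layers above can be summarized in a short sketch: every layer encodes the original feature X_n, and its decoder is trained to reconstruct whatever residual the earlier layers have not yet explained. Module types, dimensions and the use of a ReLU are assumptions; the description only requires that each layer be a self-encoder whose reconstruction target is the previous layers' residual.

```python
# Hypothetical stacked residual self-encoder matching the 3-layer description above.
import torch
import torch.nn as nn

class ResidualAutoencoderStack(nn.Module):
    """Each layer encodes the original feature X_n; its decoder's reconstruction
    target is the residual left over by the previous layers."""
    def __init__(self, in_dim: int = 32, hidden_dim: int = 8, layers: int = 3):
        super().__init__()
        self.encoders = nn.ModuleList(nn.Linear(in_dim, hidden_dim) for _ in range(layers))
        self.decoders = nn.ModuleList(nn.Linear(hidden_dim, in_dim) for _ in range(layers))

    def forward(self, x: torch.Tensor):
        implicit, losses = [], []
        residual = x                                    # remaining reconstruction target
        for enc, dec in zip(self.encoders, self.decoders):
            h = torch.relu(enc(x))                      # implicit feature of this layer
            x_hat = dec(h)                              # reconstruction of the current residual
            losses.append(((x_hat - residual) ** 2).mean())
            residual = residual - x_hat
            implicit.append(h)
        return implicit, losses

model = ResidualAutoencoderStack()
X_n = torch.randn(4, 32)                                # original features of one client
implicit_feats, layer_losses = model(X_n)
print([h.shape for h in implicit_feats], [float(l) for l in layer_losses])
```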
The pre-training process of the self-encoder can be as follows:
(1) The federal learning server issues the current-layer self-encoder model; (2) each client receives the current-layer self-encoder model, trains it on local data for several rounds, and uploads it to the federal learning server; (3) steps (1) to (2) are repeated until the current-layer self-encoder converges; (4) steps (1) to (3) are repeated to train the self-encoders layer by layer.
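A rough sketch of this layer-wise federated pre-training loop is shown below. FedAvg-style parameter averaging on the server, the fixed number of communication rounds, and the local training setup are assumptions for illustration.

```python
# Hypothetical layer-wise federated pre-training of one self-encoder layer.
import copy
import torch
import torch.nn as nn

def local_train(layer: nn.Module, data: torch.Tensor, target: torch.Tensor, epochs: int = 3) -> nn.Module:
    layer = copy.deepcopy(layer)
    opt = torch.optim.SGD(layer.parameters(), lr=0.01)
    for _ in range(epochs):
        opt.zero_grad()
        loss = ((layer(data) - target) ** 2).mean()    # reconstruct this layer's target
        loss.backward()
        opt.step()
    return layer

def federated_average(models: list[nn.Module]) -> nn.Module:
    avg = copy.deepcopy(models[0])
    with torch.no_grad():
        for name, param in avg.named_parameters():
            param.copy_(torch.stack([dict(m.named_parameters())[name] for m in models]).mean(dim=0))
    return avg

# Each client holds (input, reconstruction target) pairs for the current layer.
clients = [(torch.randn(16, 32), torch.randn(16, 32)) for _ in range(3)]
layer = nn.Sequential(nn.Linear(32, 8), nn.ReLU(), nn.Linear(8, 32))   # one self-encoder layer

for communication_round in range(5):                           # until convergence (fixed here)
    uploads = [local_train(layer, x, t) for x, t in clients]   # server -> clients -> server
    layer = federated_average(uploads)
```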
Step 204, the first federal learning client sends a first local feature to the server, the first local feature is extracted from the first network data, and the first federal learning client is any one of a plurality of clients. Accordingly, the server receives local features from the plurality of clients, the local features being extracted from the first network data.
In this embodiment, the first federal learning client may select the first local feature from the features of the first network data and send it to the server; the other clients correspondingly extract local features from their own first network data and send them to the server.
Step 205, the server obtains global priori features according to local features from a plurality of clients.
In this embodiment, the federal learning server collects local features reported by multiple clients, and then processes the local features, and determines global priori features based on the features of the multiple clients, where the global priori features are obtained by multi-client information fusion, so as to improve accuracy of the global priori features.
Specifically, the federal learning server receives the local features h_n uploaded by the plurality of clients, splices the uploaded local features into the input of the global priori feature extraction model, inputs them into the global priori feature extraction model, and obtains as output the global priori feature corresponding to each client.
The global priori feature extraction model may be a DNN, CNN, Transformer or RNN model; besides the spliced local features, its input may also include the model's previous output value. The global priori feature extraction model may also be implemented as a look-up table, for example using the spliced local features as the key and the corresponding global priori feature as the value.
Splicing refers to concatenating data together along one or more dimensions of the features. For example, if each h_n is a 3-dimensional tensor of size D × W × H and the N tensors are spliced along the W dimension, the result is a D × NW × H tensor; if they are spliced along both the W and H dimensions, the result is a D × N_1W × N_2H tensor, where N_1N_2 = N.
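The dimension bookkeeping can be checked with a few lines of numpy, assuming each local feature is a D × W × H tensor:

```python
# Hypothetical check of the splicing dimensions described above.
import numpy as np

D, W, H, N = 2, 3, 4, 6
features = [np.zeros((D, W, H)) for _ in range(N)]

along_w = np.concatenate(features, axis=1)
print(along_w.shape)                        # (2, 18, 4) = D x N*W x H

N1, N2 = 2, 3                               # N1 * N2 = N
rows = [np.concatenate(features[i * N2:(i + 1) * N2], axis=2) for i in range(N1)]
along_wh = np.concatenate(rows, axis=1)
print(along_wh.shape)                       # (2, 6, 12) = D x N1*W x N2*H
```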
The data of a client only has influence within its own group and should have no influence between groups. If the global priori feature were determined from the data of all clients together, one group could affect the output of another group; grouped input avoids this. As an achievable manner, the first federal learning client may further send grouping information to the server, the grouping information indicating the group in which its local feature is located, so that the server obtains the global priori feature according to the local features from the plurality of clients and the groups in which the local features are located.
The first local feature uploaded by the first federal learning client includes a first sub-feature and a second sub-feature, and the second local feature uploaded by the second federal learning client includes a third sub-feature and a fourth sub-feature. The grouping information from the first federal learning client may indicate the groups in which the first and second sub-features are located, and the grouping information from the second federal learning client may indicate the groups in which the third and fourth sub-features are located. If the first sub-feature and the third sub-feature are in the same group, the federal learning server may first process the first sub-feature and the third sub-feature to obtain an intermediate feature, and then process, according to the groups, the intermediate feature together with the second sub-feature and the fourth sub-feature to obtain the global priori feature. Features uploaded by different types of clients can thus be processed separately, reducing the influence between different types of clients.
Specifically, the grouping information is information for grouping the clients: clients with the same land type (farmland, road, town) belong to the same group, and clients with different land types belong to different groups. Different land types affect signal transmission differently, so grouping reduces the influence between clients on different land types. For example, on an expressway, the data of vehicles (1, 2, 3) travelling in the same direction or in the same lane form one group, and the data of vehicles (4, 5, 6) travelling in a different direction form another group; the grouping information uploaded by vehicles (1, 2, 3) then indicates one group and the grouping information uploaded by vehicles (4, 5, 6) indicates another group.
The federal learning server receives the local features uploaded by the plurality of clients and the corresponding grouping information. The grouping information can be obtained by separately clustering the local features obtained in step 103.
Local features with identical grouping information are spliced together. Depending on the model adopted by the global prior feature extraction model, such as DNN, CNN, RNN or Transformer, different splicing modes are used:
For models such as DNN and CNN that require all input data to be fed simultaneously, the local features are concatenated by global position, and positions not selected by the group are replaced with default values, as shown in the tables below. For example, the global position of each client's local feature is shown in Table 1 below:
TABLE 1
If it isAnd->Belongs to the same group; />A packet; />One packet is shown in table 2, table 3 and table 4 after the data is spliced:
TABLE 2
TABLE 3
TABLE 4
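The default-value splicing by global position described above can be sketched as follows; the number of global positions, the feature dimension and the zero default are illustrative assumptions:

```python
import torch

# Hypothetical setup: 4 global positions, feature dimension 5.
# The group contains clients at global positions 0 and 2; positions 1 and 3
# are not selected by the group and are replaced by a default value.
num_positions, dim = 4, 5
default = torch.zeros(dim)

group_features = {0: torch.randn(dim), 2: torch.randn(dim)}  # position -> local feature

# Concatenate by global position, filling unselected positions with the default.
spliced = torch.cat(
    [group_features.get(p, default) for p in range(num_positions)], dim=0
)
print(spliced.shape)  # torch.Size([20]) -- ready to feed a DNN/CNN input layer
```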
If an RNN or Transformer model is used, the input data may be fed into the model one by one. Local features belonging to the same group have global position codes added and are then input into the model; the global position codes are shown in the following table.
The local features are then input into the model in the corresponding sequence.
The spliced local features are input into the first-layer model of the global prior feature extraction model to obtain the first-layer implicit features, where each group m has a corresponding implicit feature of the first layer.
The implicit features output by the first-layer model of the global prior feature extraction model are spliced with the local features, as shown in Table 5 below:
TABLE 5
The spliced features are grouped and input into the second-layer model of the global prior feature extraction model in the same manner as the first layer, to obtain the implicit features of each group of the second layer. The grouping of the second layer may follow the grouping of the first layer or may be a regrouping of all local features.
The implicit features output by the second-layer model of the global prior feature extraction model are spliced with the local features, and the spliced result is input into the third-layer model of the global prior feature extraction model to obtain the global prior feature corresponding to each client.
Step 206, the federal learning server side sends global priori features to the plurality of clients respectively, and accordingly, the first federal learning client side receives the global priori features.
In this embodiment, after determining the global priori feature, the federal learning server may issue the global priori feature to the connected multiple clients.
Step 207, the first federal learning client performs reasoning according to the global priori features and the second network data to obtain a reasoning result.
In this embodiment, the first federal learning client may infer the local second network data based on the global prior feature, to obtain an inference result for the second network data. The second time slot is the same as or after the first time slot, that is, the second network data may be at the same time as the first network data or at a later time; for example, assuming the first network data is the data at 9:00, the second network data may be the data at any time between 9:00 and 9:15, which is not limited in the embodiments of the present application.
In the reasoning process, the second network data may be processed directly, inference may be performed on implicit features extracted from the second network data, or one or more of those implicit features may be selected for inference, which is not limited in the embodiments of the present application.
The global priori features are feature vectors or inference models, and the inference models are used for outputting inference results of the second network data according to the second network data.
For the scenario in which the global prior feature is a feature vector, the first federal learning client performs reasoning according to the global prior feature, the second network data and a local second machine learning model to obtain the inference result, where the second machine learning model is used for reasoning on the second network data. Specifically, the first federal learning client inputs the global prior feature and the second network data into the local second machine learning model, and the output of the second machine learning model is used as the inference result.
Before inputting the second network data into the second machine learning model, the first federal learning client may also input the second network data into a third machine learning model (for example, a local feature extraction model) to obtain a plurality of features that reflect the characteristics of the second network data, which saves computing resources. In that case, inputting the global prior feature and the second network data into the local second machine learning model may mean that the first federal learning client inputs the global prior feature into the second machine learning model, obtains from it a weight for each of the plurality of features of the second network data (one weight per feature), and then determines the inference result from the plurality of features of the second network data and their weights.
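A rough sketch of this weighting scheme follows; the shapes, the softmax normalization and the small DNN are illustrative assumptions rather than the patent's exact design:

```python
import torch
import torch.nn as nn

# Hypothetical sizes: the global prior feature is a vector of size 16,
# the local feature extractor yields M = 8 implicit features of dim 10.
M, prior_dim, feat_dim = 8, 16, 10

# Second machine learning model: maps the global prior feature to one
# weight per implicit feature of the second network data.
weight_model = nn.Sequential(nn.Linear(prior_dim, 32), nn.ReLU(), nn.Linear(32, M))

global_prior = torch.randn(prior_dim)
features = torch.randn(M, feat_dim)      # features extracted by the third model

weights = torch.softmax(weight_model(global_prior), dim=0)   # one weight per feature
weighted_feature = (weights.unsqueeze(1) * features).sum(dim=0)
# The per-feature weights (or the weighted feature) then enter the downstream
# inference step, for example the anchor-based classification described below.
```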
Fig. 3 is a schematic diagram of an inference structure provided in an embodiment of the present application. As shown in Fig. 3, the client calculates the weight values corresponding to the different implicit features according to the global prior feature; specifically, the local feature extraction model adopts a multi-layer self-encoder.
Client C_n receives the global prior feature issued by the federal learning server. The inference model (the second machine learning model) adopts a DNN model; client C_n inputs the global prior feature into the inference model, the inference model outputs the weight value corresponding to each of the plurality of implicit features, and the weight values are saved.
Client C_n acquires the data X(t)_n to be inferred at local time t through a collector, and obtains the multiple implicit features of X(t)_n output by the local feature extraction model. According to the implicit features output by the local feature extraction model, the weight values output by the inference model, and the local anchor data set, it calculates the distance r(t)_{n,m} between the features and each anchor point of each class.
The label Y_{n,m} corresponding to the smallest r(t)_{n,m} over all anchor data sets is selected as the inference result of X(t)_n.
The classification result corresponding to each of the implicit features output by the local feature extraction model is also computed, and these per-feature results are used to form an evaluation index of output stability, namely:
For each implicit feature of X(t)_n (the first, second, third implicit feature, and so on), the label Y_{n,m} corresponding to the smallest distance over all anchor data sets is selected as the category of that implicit feature.
Evaluation index of output stability = maximum number of same class/total number of implicit feature vectors.
For example, with three implicit features, the stability evaluation index is 1 if all three categories agree, 2/3 if two categories agree, and 1/3 if none of the categories agree.
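A minimal sketch of the anchor-based classification and the stability index, assuming Euclidean distances and illustrative data structures (the anchor layout and weighting are assumptions):

```python
import numpy as np

def classify_with_anchors(implicit_feats, weights, anchors):
    """implicit_feats: list of M feature vectors for the data to be inferred.
    weights: per-feature weights output by the inference model.
    anchors: list over features, each a list of (anchor_vector, label) pairs."""
    per_feature_labels = []
    best_overall = (np.inf, None)
    for h, w, anchor_set in zip(implicit_feats, weights, anchors):
        best_local = (np.inf, None)
        for a, label in anchor_set:
            r = w * np.linalg.norm(h - a)          # weighted distance to the anchor
            if r < best_local[0]:
                best_local = (r, label)
            if r < best_overall[0]:
                best_overall = (r, label)
        per_feature_labels.append(best_local[1])   # class of the nearest anchor
    # Stability index: largest share of implicit features agreeing on one class.
    counts = {}
    for y in per_feature_labels:
        counts[y] = counts.get(y, 0) + 1
    stability = max(counts.values()) / len(per_feature_labels)
    return best_overall[1], stability              # inference result, stability index
```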
The second machine learning model described above is a task model that does not itself process the second network data; in the description below, the second machine learning model does process the second network data. In that case the second machine learning model includes a plurality of first task models that can respectively process the features of the second network data. Specifically, after obtaining the global prior feature, the first federal learning client may determine the weight of each of the plurality of first task models from the global prior feature. The first federal learning client inputs the features of the second network data into the plurality of first task models, where each first task model receives features of a default type, that is, the features of the second network data are classified by category and input into the first task model of the corresponding category. The first federal learning client then combines the features output by the plurality of first task models with the weights corresponding to those task models to obtain the inference result.
When determining the weights of the first task models from the global prior feature, the first federal learning client may combine the global prior feature with the second network data, that is, calculate the weights using the features of the data to be inferred as well, so as to improve the accuracy of the weights.
Here too, before inputting the second network data into the second machine learning model, the first federal learning client may first input the second network data into a third machine learning model (for example, a local feature extraction model) to obtain a plurality of features that reflect the characteristics of the second network data, which saves computing resources.
Fig. 4 is a schematic diagram of another inference structure provided in an embodiment of the present application. Referring to Fig. 4, client C_n receives the global prior feature issued by the federal learning server and saves it. The inference model employs a mixed expert model, which comprises a model selector and a plurality of first task models, such as task model 1 through task model N in the figure.
Client C_n acquires the data X(t)_n to be inferred at local time t through a collector. The inference model (the second machine learning model) adopts a mixed expert model: the data X(t)_n to be inferred and the global prior feature are input into the mixed expert model to calculate the inference result.
The mixed expert model consists of a model selector, a task model group made up of a plurality of task models, and a classifier. The model selector takes the global prior feature (the second network data may also be added) as input and outputs a weight for each of the task models in the task model group (the selection may be a 0-1 weight); the task model group consists of N task models, each of which outputs an implicit feature vector from the input data; the classifier takes as input the implicit feature vectors weighted and summed by the weights output by the model selector, and outputs the classification result.
The method comprises the following specific steps:
The first step: the global prior feature (to which the second network data may also be added) is passed through the "model selector" to calculate the weight values of the multiple task models.
The global prior feature is first input into the model selector model (which can be realized by a DNN, CNN or Transformer); its model parameters are denoted by a parameter set of its own. The input-output relationship of the model selector maps the global prior feature (optionally together with the second network data) to an output g(t)'.
The distance from g(t)' to each anchor point is then calculated; the distance may be a Euclidean distance, a cosine distance or a custom distance. The anchor set of the model selector model is made up of the anchor sets corresponding to the task models, where a_{m,k} denotes the k-th anchor point corresponding to the m-th task model and A_m denotes the anchor set corresponding to the m-th task model.
Then, for each task model, the optimal distance from g(t)' to the anchor points in that task model's anchor set is selected. The optimal distance depends on how the distance is calculated: if the Euclidean distance is used, the optimal distance is the reciprocal of the minimum Euclidean distance; if the cosine distance is used, the optimal distance is the maximum cosine distance.
Finally, the weight vector output by the model selector is calculated as g = [g_{n,1} g_{n,2} … g_{n,m} … g_{n,M}], where g_{n,m} represents the weight assigned to the m-th task network.
The second step: the implicit feature extraction values are computed by passing the local original features through the task models selected by the model selector. The parameters of the m-th task network define its input-output relationship, mapping the input data to the m-th implicit feature vector.
The third step: the "weighted summer" performs a weighted summation of the M implicit feature vectors of the "task model group" using the M weights output by the "model selector" (the implicit features of the unselected task models may be excluded from the computation).
The fourth step: the classifier is realized by a DNN with its own model parameters; its input-output relationship maps the weighted-summed implicit feature vector to the classification result.
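The four steps can be sketched as follows; the dimensions, the use of linear layers and the Euclidean-distance weighting are illustrative assumptions:

```python
import torch
import torch.nn as nn

feat_dim, prior_dim, hidden, M, num_classes = 12, 16, 32, 4, 3

selector = nn.Linear(prior_dim, hidden)                 # "model selector"
task_models = nn.ModuleList([nn.Linear(feat_dim, hidden) for _ in range(M)])
classifier = nn.Linear(hidden, num_classes)
anchors = [torch.randn(5, hidden) for _ in range(M)]    # anchor set per task model

def infer(x, global_prior):
    g_prime = selector(global_prior)                    # step 1: selector output
    # weight of each task model = reciprocal of the minimum Euclidean distance
    # from g' to that model's anchors (the "optimal distance")
    g = torch.stack([1.0 / torch.cdist(g_prime[None], a).min() for a in anchors])
    g = g / g.sum()
    hidden_feats = torch.stack([m(x) for m in task_models])    # step 2: task outputs
    fused = (g[:, None] * hidden_feats).sum(dim=0)              # step 3: weighted sum
    return classifier(fused)                                    # step 4: classification

x = torch.randn(feat_dim)
global_prior = torch.randn(prior_dim)
print(infer(x, global_prior).shape)  # torch.Size([3])
```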
in this embodiment, the global priori feature issued by the federal learning server may also be directly a model selector in the hybrid expert model, which is not limited herein, and the local feature extraction model may also be a hybrid expert model. The local feature extraction model (third machine learning model) may include a plurality of second task models, and the mode that the first federal learning client extracts the features of the second network data through the third machine learning model may be that the first federal learning client determines weights corresponding to the plurality of second task models according to the local data or the second network data, then inputs the second network data into the second task models, so as to obtain the sub-features of the second network data output by the second task models, and the first federal learning client may perform weighted average according to the weights corresponding to the plurality of second task models and the sub-features output by the second task models, so as to obtain the features of the second network data.
Fig. 5 is a schematic diagram of another reasoning structure provided in the embodiment of the present application, referring to fig. 5, the federal learning server generates task synchronization information, and issues the task synchronization information to a plurality of clients.
Client C_n receives the task synchronization information issued by the federal learning server and finds, from its local data, the data X_n that meets the task synchronization information. Meeting the task synchronization information means that the time of the data X_n is closest to the time indicated by the task synchronization information.
Client C_n obtains implicit features through the local feature extraction model, and these implicit features serve as the local features of client C_n. The local feature extraction model is realized as a mixed expert model, that is, it is composed of a plurality of task models i and a model selector. The data passes through the model selector, which selects a task model to extract the features, yielding the local features.
Each task model i is obtained by decoupled representation learning, that is, the local features output by the task model can be divided into K groups, where features within a group are similar and features between groups are specific. Different neural network structures produce features of different types, so a feature h_m may be a scalar, a vector, a matrix or a tensor, as detailed below:
If the local feature extractor employs a fully connected neural network or an RNN, the output feature h_m may be a scalar, with the model parameters W1 and W2 being matrices; it may also be a vector, where W1 and W2 are again model parameter matrices. When tensor calculation is introduced, W2 is a three-dimensional tensor and ∘ denotes the tensor operation; W2 can also be a higher-order tensor, in which case the resulting h_m may be a matrix or a tensor.
If the local feature extractor adopts a convolutional neural network or a Transformer neural network, the output feature is a matrix. Taking an image as an example, after the image passes through the convolutional layers of the CNN, the feature for channel m is the feature map corresponding to convolution kernel m, which is a matrix. If the data within a channel is averaged, the feature becomes a scalar.
Features within a group are similar and features between groups are specific; the similarity and specificity can be measured by the distance between features h_{m1} and h_{m2}: if m1 and m2 belong to the same feature group their distance is small, and if they belong to different feature groups their distance is large.
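A small sketch of such a check (the Euclidean distance and the grouping layout are assumptions):

```python
import numpy as np

def group_distance_check(features, groups):
    """features: list of M vectors h_m; groups: group id per feature."""
    intra, inter = [], []
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            d = np.linalg.norm(features[i] - features[j])
            (intra if groups[i] == groups[j] else inter).append(d)
    # Decoupled representation learning should make mean(intra) << mean(inter).
    return np.mean(intra), np.mean(inter)
```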
Client C_n uploads its local features to the federal learning server, and the federal learning server correspondingly receives the local features uploaded by the plurality of clients. The federal learning server splices the uploaded local features into the input features of the global prior feature extraction model, inputs them into the global prior feature extraction model, and outputs the global prior feature, which here is the task selector of each client's inference model. The global prior feature (the task selector of the inference model) corresponding to each client is then sent to the corresponding client C_n.
Client C_n receives the global prior feature (the task selector of the inference model) issued by the federal learning server, acquires the data X(t)_n to be inferred at local time t through a collector, obtains the local feature h(t)_n from the data X(t)_n, inputs the local feature h(t)_n into the inference model, and obtains the inference result.
The inference model adopts a mixed expert model, that is, a model selector plus task models. The model selector is the global prior feature (the task selector of the inference model). Each task model corresponds to one group of the local feature h(t)_n, and the features within each group are similar: the input to task model 1 is group 1 of h(t)_n, the input to task model 2 is group 2 of h(t)_n, and so on, the input to task model N being the N-th group of features of h(t)_n.
The global prior feature can also directly be the inference model, for example a classifier or regressor with its own model parameters, where z represents the input of the classifier or regressor and the parameters are the classifier or regressor model parameters. The classifier or regressor of each client may be the same or different, and its input z can be the local feature extracted by the client.
Fig. 6 is a schematic diagram of a computation graph provided in an embodiment of the present application. As shown in Fig. 6, the global prior feature may be a computation graph consisting of a set of feature values and the computation logic relationships between them, where the k-th feature vector of the n-th client can be learned from data during training as part of the model parameters or can be a manually set value, and C_n denotes the computational flow graph of the n-th client. The computation graphs of the clients may be the same or different. The computation graph can be obtained through data training or from a knowledge graph.
In the left graph, the distance between the input feature vector v and each feature value in the computation graph is calculated, that is, the cosine distance between v and each stored feature value is computed.
The logic operation is then calculated on these distances.
In the right graph, after the cosine distance is calculated it is compared with a threshold: values larger than the threshold become 1 and values smaller than the threshold become 0. The logical operation is then calculated, where addition corresponds to a logical OR and multiplication corresponds to a logical AND.
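A minimal sketch of the thresholded computation graph, where the stored feature values, the threshold and the specific AND/OR structure are illustrative assumptions:

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def run_graph(v, feature_values, threshold=0.8):
    # Compare the input feature vector with each stored feature value.
    s = [1.0 if cosine(v, c) > threshold else 0.0 for c in feature_values]
    # Addition acts as a logical OR, multiplication as a logical AND,
    # here evaluating "feature 0 AND (feature 1 OR feature 2)".
    or_part = min(1.0, s[1] + s[2])
    return s[0] * or_part

v = np.random.randn(8)
feature_values = [np.random.randn(8) for _ in range(3)]
print(run_graph(v, feature_values))
```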
the global a priori feature may be a knowledge graph, for example image recognition. The local feature extractor is obtained by decoupling characterization learning training, that is, the implicit features obtained by the local feature extractor can be grouped into learning with different self concepts, such as wheels, vehicle doors, vehicle windows, vehicle lamps, eyes of a person, nose of a person, mouth of a person and the like, that is, the implicit features h= { h (1) =wheels, h (2) =vehicle doors, h (3) =vehicle windows, h (4) =vehicle lamps, h (5) =eyes of a person, h (6) =nose of a person, h (7) =mouth of a person }.
Fig. 7 is a schematic diagram of a knowledge graph provided in an embodiment of the present application. As shown in Fig. 7, when participating in reasoning, the local feature extractor calculates not only the values of the features but also their positions. "Activation" means that the sum of the values of a certain channel of the local feature extractor, of a combination of channels, or of a certain pixel value within a channel is larger than a threshold.
When the client directly uses the trained local feature extraction model and the inference model:
client C n Global priori features issued by a federal learning server are received, and a graph is calculatedOr classifier/regressor->Or knowledge graph, and storing the calculation graph locally +.>Or classifier/regressor->Or a knowledge graph.
Client C_n acquires the data X(t)_n to be inferred at local time t through a collector, obtains the multiple implicit features of X(t)_n output by the local feature extraction model, and then inputs these implicit features into the computation graph, the classifier/regressor or the knowledge graph to obtain the output result.
When the client also needs to retrain the inference model: the first federal learning client trains the first machine learning model by using the sample data, extracts the characteristics of the second network data, and then inputs the characteristics of the second network data into the trained first machine learning model to obtain an inference result output by the trained first machine learning model.
Specifically, client C_n receives the global prior feature issued by the federal learning server, that is, the computation graph, classifier/regressor or knowledge graph, initializes the parameters of its local feature extraction model for inference, uses the computation graph, classifier/regressor or knowledge graph as the classifier or regressor to form the inference model, then trains the inference model with the local data, and saves the trained inference model. Here the local data are historical data and the corresponding historical inference results.
Client C_n acquires the data X(t)_n to be inferred at local time t through a collector, and passes the data X(t)_n through the inference model to obtain the output result.
Determining the global prior feature requires the first local feature and the second local feature, but the federal learning server may also use historical local features from the first federal learning client and from the second federal learning client. The federal learning server can calculate the similarity between the local features of the current reasoning process and multiple groups of historical local features, that is, determine the similarity between the first local feature and the historical local feature from the first federal learning client in each group of historical local features, and the similarity between the second local feature and the historical local feature from the second federal learning client in each group. It then weights and sums the historical prior features corresponding to the groups of historical local features based on these similarities to obtain the required global prior feature.
In one possible embodiment, the multiple groups of historical local features may also carry labels, which are the manually annotated actual results of each group of historical local features and can be compared with inference results to determine their accuracy. After obtaining the inference result, a client can upload it to the federal learning server; the federal learning server can determine a target inference result corresponding to the current global prior feature from the inference results of the plurality of clients, and then select, from the multiple groups of historical local features, the target group of historical local features whose label equals the target inference result. If the similarity between the local features of the current reasoning process and the target group of historical local features is greater than or equal to a threshold, the target group of historical local features can be updated according to that similarity; if the similarity is smaller than the threshold, the local features of the current reasoning process can be stored as a new group of historical local features.
Fig. 8 is a schematic diagram of another inference structure provided in an embodiment of the present application. Referring to Fig. 8, the federal learning server receives the local features h uploaded by the plurality of clients, computes through the global prior model the similarity between h and each historical local feature set H(m), and, based on these similarities, performs a weighted summation of the corresponding historical priors hg(m) to obtain the global prior feature for h.
The sample library of the federal learning server consists of pairs {H(m), hg(m)}, where H(m) represents the historical local feature set corresponding to the m-th sample and hg(m) represents the historical global prior feature corresponding to the m-th sample.
The similarity can be measured using the Euclidean distance, the cosine distance, or a trained neural network; taking the cosine distance as an example, the similarity r(m) is the cosine similarity between h and H(m).
The weighted summation is hg = Σ_m r(m)·hg(m) / Σ_m r(m).
the federal learning server side transmits global priori features hg to each client side, each client side obtains data to be inferred, and a local inference result is obtained by combining the global priori features hg. And uploading the reasoning result to the federal learning server, and uploading a label marked afterwards to the federal learning server if the client has a post-observation marking function.
The federal learning server receives the inference results or labels uploaded by the clients and counts them, taking the class y with the largest count; it then finds in the sample library all samples of the same class, i.e. those with y(m) equal to y. It calculates the similarity between h and those same-class samples, for example using the Euclidean distance, the cosine distance or a neural network. For sample data whose similarity is greater than the threshold, the corresponding hg(m) is updated according to the similarity; if none of the selected samples meets the threshold, h is stored as a new sample.
Taking the cosine distance as an example, the similarity is r(m) = cos(h, H(m)), and hg(m) is updated as hg(m) = r(m)·hg + (1 − r(m))·hg(m).
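A compact sketch of the sample-library logic above; representing each historical local feature set as a single vector and the cosine-similarity threshold are illustrative assumptions:

```python
import numpy as np

def cos_sim(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def global_prior(h, library):
    """library: list of dicts {'H': feature vector, 'hg': prior, 'y': label}."""
    r = np.array([cos_sim(h, s['H']) for s in library])
    w = r / r.sum()
    return sum(wi * s['hg'] for wi, s in zip(w, library))    # weighted sum of priors

def update_library(h, hg, y, library, threshold=0.9):
    same_class = [s for s in library if s['y'] == y]          # samples labelled y
    updated = False
    for s in same_class:
        r = cos_sim(h, s['H'])
        if r >= threshold:
            s['hg'] = r * hg + (1 - r) * s['hg']              # hg(m)=r(m)hg+(1-r(m))hg(m)
            updated = True
    if not updated:
        library.append({'H': h, 'hg': hg, 'y': y})            # store as a new sample
```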
After obtaining the inference results and the manually annotated labels, the clients can further train the global prior model, the local feature extraction model and the inference model in Fig. 3. Specifically, each client calculates the error between its inference result and the label, computes the error of the global prior feature by error back-propagation, and uploads the error of the global prior feature to the federal learning server. The federal learning server receives the errors of the global prior features uploaded by the clients, computes the errors of the local features by error back-propagation, and sends them back to the clients. Each client and the federal learning server then compute model gradients from their respective errors and update their models: the client updates the inference model according to the error between the inference result and the label and updates the local feature extraction model according to the error of the local features, while the federal learning server updates the global prior feature extraction model according to the error of the global prior features.
The federal learning server may further store part of the data of the plurality of clients as its own local data, and train an initial inference model based on that local data, which is not limited herein.
Fig. 9 is a schematic diagram of how the local data of the federal learning server is acquired according to an embodiment of the present application. As shown in Fig. 9, the federal learning server trains a "difference judgment model" using its local data and issues it to the client side. The client side calculates a difference index using the difference judgment model issued by the federal learning server, samples the difference index, and uploads the sampled difference index to the federal learning server. The federal learning server collects the difference indexes and clusters them to obtain global difference index class centers, which are then issued to the clients.
The client side receives the difference index class centers issued by the cloud side, assigns its local data to the various centers, and counts the classification results: the amount of data at each center, the amount of data far from the centers, and the amount of important data. It then uploads the statistics to the federal learning server. The client-side local data includes general data and important data; the important data may be intrusion data, error-case data, voting-inconsistency data, or data far from an anchor point, which is not limited herein.
The federal learning server receives the statistics, generates an acquisition strategy (which may differ from client to client) by combining them with its local data, and issues the acquisition strategy to each client.
The client side receives the acquisition strategy issued by the federal learning server, acquires data accordingly, samples the acquired data, and uploads the sampled data to the federal learning server. The federal learning server labels the uploaded data in an active learning manner and issues the labelled data to each client side.
The federal learning server trains a labelling model using its local data and issues it to the client side. The client side uses the labelling model to label its local data, and then trains the local inference model on the labelled local data together with the labelled data issued by the federal learning server.
The federal learning server may also compress its local data, for example by training a data generator. It can also train an inference model on its local data to serve as the client's initial inference model, which is not described in detail here.
Training of the mixed expert model in Fig. 4 proceeds through fixed-parameter task selector generation, task model selection and delivery, model pre-training, model training, and model training according to a policy.
Fig. 10 is a schematic flow chart of a fixed parameter task selector generation according to an embodiment of the present application, and specifically please refer to fig. 10:
Step 1001. The federal learning server generates K1 pieces of cluster center information according to the classes.
Step 1002, the federal learning server side issues K1 class center information to the client side.
Step 1003. The client receives the K1 pieces of cluster center information, assigns its K classes of data to the K1 cluster centers according to the cluster center information, and counts the mean value and the number of the data at each cluster center, for uploading to the federal learning server.
Step 1004, the client uploads the mean value and the number of the data of each clustering center to the federation learning server.
Step 1005. The federation learning server calculates the mean value of the data of each global clustering center to obtain new clustering center information.
Step 1006. The federal learning server determines whether the new cluster center information has converged; if not, it returns to step 1002, and if so, it proceeds to step 1007. The convergence condition may be that the change of the new cluster center information relative to the previous cluster center information is smaller than a threshold, or that a set number of iterations has been reached.
Step 1007. The federal learning server issues new K1 cluster center information.
Step 1008, the client counts the number of the K classes of data corresponding to the K1 clustering center information respectively.
Step 1009. The client uploads to the federal learning server the counts of its K classes of data with respect to the K1 cluster centers.
Step 1010. The federal learning server receives the number of K classes of data uploaded by each client and corresponding to the K1 clustering center information respectively, and generates a fixed parameter task selector rule.
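Steps 1001 to 1009 amount to a federated clustering loop; the following sketch simulates it with plain k-means statistics (the data, dimensions and number of rounds are illustrative):

```python
import numpy as np

def client_step(local_data, centers):
    # Assign each local sample to its nearest cluster center and report
    # the per-center sum and count (step 1003).
    assign = np.argmin(np.linalg.norm(local_data[:, None] - centers[None], axis=2), axis=1)
    sums = np.zeros_like(centers)
    counts = np.zeros(len(centers))
    for x, c in zip(local_data, assign):
        sums[c] += x
        counts[c] += 1
    return sums, counts

def server_round(client_stats, centers):
    # Aggregate the per-client sums/counts into new global centers (step 1005).
    total_sum = sum(s for s, _ in client_stats)
    total_cnt = sum(c for _, c in client_stats)
    return np.where(total_cnt[:, None] > 0,
                    total_sum / np.maximum(total_cnt, 1)[:, None], centers)

K1, dim = 3, 2
centers = np.random.randn(K1, dim)
clients = [np.random.randn(50, dim) for _ in range(4)]
for _ in range(10):                                  # iterate until convergence (step 1006)
    stats = [client_step(d, centers) for d in clients]
    centers = server_round(stats, centers)
```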
By way of example, scheme 1: for scenarios that require privacy protection:
The counts uploaded by client n, that is, the amount of data of each of the K classes assigned to each of the K1 cluster centers, form a matrix A_n, whose entry represents the amount of the n-th client's class-k data assigned to the k1-th cluster center.
The first step: calculating A of the global K category data corresponding to K1 clustering center information:
wherein the method comprises the steps ofa k =[a k,1 …a k,k1 ]Representing the global data volume of the kth class corresponding to the K1 cluster center information.
The second step: for each class, select the cluster numbers of the topK1 cluster centers with the largest data volume as the experts selected for that class.
Here b_k = [b_{k,1} … b_{k,K1}] contains topK1 non-zero values, whose positions are the positions of the topK1 largest entries of a_k. The non-zero entries of b_k may be 1, or may be values weighted by the data amount.
To balance the tasks among the experts on this basis, each expert may be selected at most topK2 times; once an expert has been selected topK2 times, it is no longer considered when selecting the topK1 experts. The order of expert selection may follow the data amounts, starting with the expert with the highest data amount; for example, if a_{k,k1} is the largest value, the expert selection starts from the k-th class.
The matrix B composed of the rows b_k is used as the fixed-parameter task selector rule.
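A sketch of scheme 1; the weighting of the non-zero entries and the class processing order are assumptions consistent with the description above:

```python
import numpy as np

def build_selector_rule(A_list, topK1=2, topK2=3):
    A = sum(A_list)                              # global K x K1 count matrix
    K, K1 = A.shape
    B = np.zeros_like(A, dtype=float)
    expert_usage = np.zeros(K1, dtype=int)
    # Process classes in decreasing order of their largest expert count.
    for k in sorted(range(K), key=lambda k: A[k].max(), reverse=True):
        order = np.argsort(A[k])[::-1]           # experts ranked by data amount
        chosen = [e for e in order if expert_usage[e] < topK2][:topK1]
        for e in chosen:
            expert_usage[e] += 1
            B[k, e] = A[k, e] / A[k, chosen].sum()   # assumed data-amount weighting
    return B                                      # fixed-parameter selector rule

A_clients = [np.random.randint(0, 20, size=(5, 4)) for _ in range(3)]
print(build_selector_rule(A_clients))
```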
Scheme 2: for scenarios where the optimization target is to select as few experts as possible for each user:
The first step: normalize A_n within each class over the different cluster centers, obtaining the probability that each class of data belongs to each cluster center.
The second step: cluster the normalized probability vectors into K3 cluster centers, and divide the classes belonging to the same cluster center into one group.
The third step: proceed as in scheme 1, except that the number of distinct experts within each group is limited to topK4; if the number of experts would exceed topK4, only the experts with the largest data amounts are selected as the experts of the corresponding classes.
The fourth step: where the matching between classes and experts within a group is inconsistent, the most frequent matching is used; for example, if expert 2 corresponds to the largest number of groups, expert 2 is selected.
Fig. 11 is a schematic flow chart of task model selection provided in the embodiment of the present application, please refer to fig. 11:
step 1101. The federal learning server issues fixed parameter task selector rules.
Step 1102. The client receives the fixed parameter task selector rule, and counts the task model ID number corresponding to the local data.
Step 1103. The client uploads the task model ID to the federation learning server.
Step 1104. The federation learning server matches each client task model ID and matches the corresponding task model and parameter trainable task model selector.
Fig. 12 is a schematic flow chart of model pre-training provided in the embodiment of the present application, please refer to fig. 12:
step 1201. The federal learning server issues a task model and a task model selector with trainable parameters.
Step 1202. The client receives a task model selector with trainable parameters, and counts task model ID numbers corresponding to the local data.
Step 1203, the client uploads the task model ID to the federation learning server.
Step 1204. The federation learning server matches each client task model ID and matches the corresponding task model.
Step 1205. The federal learning server issues a task model.
Step 1206, the client uses the local data, uses the task model of the federation learning server as an initial value of the task model of the client, selects the task model through the fixed parameter task selector, and trains a plurality of rounds of task models of the client.
Step 1207, the client uses the local data, uses the task model selector with trainable parameters of the federal learning server as an initial value of the task model selector with trainable parameters of the client, marks the data with pseudo labels through the task selector with fixed parameters, and trains the task model selector with trainable parameters for a plurality of rounds.
Step 1208. The client uploads the updated task model of the client and the task model selector with trainable parameters.
Step 1209, the federation learning server receives the task model and the parameter trainable task model selector of the client updated by each client, and obtains the task model and the parameter trainable task model selector of the federation learning server by weighted average.
Step 1210. Repeat steps 1201 to 1209 several times.
Fig. 13 is a schematic flow chart of model training provided in the embodiment of the present application, please refer to fig. 13:
Step 1301. The federal learning server transmits the parameter-trainable task model selector to the clients.
Step 1302. The client receives a task model selector with trainable parameters of the federation learning server, and counts task model ID numbers corresponding to the local data.
Step 1303, the client uploads the task model ID number to the federal learning server.
Step 1304, the federation learning server receives the task model IDs of the clients and matches the task models of the corresponding federation learning server.
Step 1305, the federation learning server issues a task model of the corresponding federation learning server.
Step 1306, the client uses the local data, takes the task model of the federal learning server as an initial model, selects the task model through a task model selector which can be trained by parameters of the federal learning server, and trains a task model N round of the client.
Step 1307, the client side uploads the updated task model of the client side to the federation learning server side.
Step 1308. The federal learning server receives the updated task model of the client from each client, and obtains the federal learning server's task model by weighted average (a sketch of this averaging follows step 1313 below).
Step 1309. The federation learning server side issues the updated task model of the federation learning server side to the client side.
Step 1310, the client uses the local data, takes the task model of the federal learning server as the task model of the client, takes the task model selector with trainable parameters of the federal learning server as the initial value of the task model selector with trainable parameters of the client, selects the task model through the task model selector with trainable parameters of the client, and trains the task model selector with trainable parameters of the client for N rounds.
Step 1311. The client uploads its updated parameter-trainable task model selector.
Step 1312. The federation learning server receives the task model selectors trainable by the updated client parameters, and performs weighted average to obtain the task model selectors trainable by the federation learning server parameters.
Step 1313. Repeat steps 1301 through 1312 several times.
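The weighted averaging in steps 1308 and 1312 can be sketched as follows, assuming the weights are proportional to each client's local data volume (the weighting basis is not specified above):

```python
import torch

def weighted_average(state_dicts, data_sizes):
    """Average the clients' model parameters, weighted by data volume."""
    total = float(sum(data_sizes))
    avg = {}
    for key in state_dicts[0]:
        avg[key] = sum(sd[key] * (n / total) for sd, n in zip(state_dicts, data_sizes))
    return avg
```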
Fig. 14 is a schematic flow chart of model training according to a policy provided in an embodiment of the present application, please refer to fig. 14:
Step 1401. The federal learning server transmits the parameter-trainable task model selector to the clients.
Step 1402. The client receives the parameter-trainable task model selector of the federal learning server, and counts, for its local data, the amount of each kind of data processed by each task model.
Step 1403. The client side uploads the counting results to the federal learning server.
Step 1404. The federal learning server receives from each client the counts of the various kinds of data processed by each task model, and aggregates them into global counts of the various kinds of data processed by each task model.
Step 1405. The federal learning server generates the correspondence between task models and the various kinds of data.
Step 1406. The federal learning server issues the task models and the correspondence between task models and the various kinds of data.
Step 1407. The client uses the local data, uses the task model of the federal learning server as the initial value of its own task model, selects its task model as the student according to the correspondence between task models and the various kinds of data, uses the task model of the federal learning server as the teacher model, and trains its task model for N rounds through knowledge distillation.
Step 1408. The client uses the local data, uses the task model selector with trainable parameters of the federal learning server as an initial value of the task model selector with trainable parameters of the client, and trains the task model selector with trainable parameters by labeling the data with pseudo labels according to the corresponding relation between the task model and various data.
Step 1409, the client uploads the updated task model of the client and the task model selector with trainable parameters.
Step 1410. The federal learning server receives the updated parameter-trainable task model selector from each client and performs a weighted average to obtain the federal learning server's parameter-trainable task model selector.
Step 1411. Repeat steps 1401 to 1410 several times.
In the embodiment of the application, the first federal learning client collects first network data of a first time slot and related to a target network resource, extracts first local features of the first network data and sends the first local features to the federal learning server, the federal learning server collects local features uploaded by a plurality of clients to calculate global priori features and sends the global priori features to each client, and the first federal learning client can infer second network data of a second time slot according to the global priori features, so that information of data of other clients except the local data is utilized in an inference process, and accuracy of an inference result is improved.
The above teaches an inference method and the following describes the means for performing the method.
Fig. 15 is a schematic structural diagram of an inference apparatus provided in an embodiment of the present application, referring to fig. 15, the apparatus 150 includes:
the transceiver unit 1501 is configured to send a first local feature to the federal learning server, where the first local feature is extracted from first network data, the first network data is data related to a target network resource acquired by the first federal learning client in a first time slot, and the target network resource is a network resource managed by the first federal learning client; and to receive a global prior feature from the federal learning server, where the global prior feature is obtained according to the first local feature and a second local feature, and the second local feature is provided by the second federal learning client;
the processing unit 1502 is configured to perform reasoning according to the global priori feature and second network data to obtain a reasoning result, where the second network data is data related to the target network resource acquired by the first federal learning client in a second time slot, and the reasoning result is used to manage the target network resource, where the second time slot is the same as or after the first time slot.
The transceiver unit 1501 is configured to perform steps 204 and 206 in the method embodiment of fig. 2, and the processing unit 1502 is configured to perform step 207 in the method embodiment of fig. 2.
Optionally, the first network data is a sampling value of data related to the target network resource in the first time slot or a statistical value from the third time slot to the first time slot, the second network data is a sampling value of data related to the target network resource in the second time slot, and the third time slot is before the first time slot.
Optionally, the global a priori feature is a feature vector or a first machine learning model, the first machine learning model being used for reasoning about the second network data.
Optionally, in the case that the global a priori feature is a feature vector, the processing unit 1502 is specifically configured to:
and carrying out reasoning according to the global priori features, the second network data and a local second machine learning model to obtain a reasoning result, wherein the second machine learning model is used for reasoning the second network data.
Optionally, the processing unit 1502 is specifically configured to: inputting the second network data into a third learning model to obtain a plurality of characteristics of the second network data output by the third learning model; inputting the global prior feature into a second machine learning model to obtain weights for each of a plurality of features of the second network data; and determining an inference result according to the plurality of characteristics of the second network data and the weights of the plurality of characteristics of the second network data.
Optionally, the second machine learning model includes a plurality of first task models; the processing unit 1502 is specifically configured to: according to the global priori features, calculating the weight of each of the plurality of first task models; inputting the characteristics of the second network data into a plurality of first task models to obtain the reasoning characteristics output by the plurality of first task models; and obtaining an inference result according to the weights of the first task models and the inference characteristics output by the first task models.
Optionally, the processing unit 1502 is specifically configured to: and calculating the weight of each of the plurality of first task models according to the global priori features and the second network data.
Optionally, the processing unit is further configured to: and extracting the characteristics of the second network data through a third machine learning model.
Optionally, the third machine learning model includes a plurality of second task models; the processing unit 1502 is specifically configured to: determining the weight of each of a plurality of second task models according to the second network data; inputting the second network data into a plurality of second task models to obtain sub-features of the second network data output by the plurality of second task models; and obtaining the characteristics of the second network data according to the weights of the second task models and the sub-characteristics of the second network data output by the second task models.
Optionally, each second task model is a one-layer self-encoder, and the reconstruction target of the r-th task model among the plurality of second task models is the residual error of the (r-1)-th task model, where r is an integer greater than 1 and not greater than the number of second task models.
Alternatively, in the case where the global a priori feature is the first machine learning model, the processing unit 1502 is specifically configured to: extracting characteristics of the second network data; and inputting the characteristics of the second network data into the first machine learning model to obtain an inference result output by the first machine learning model.
Alternatively, in the case where the global a priori feature is the first machine learning model, the processing unit 1502 is specifically configured to: training a first machine learning model using the sample data; extracting characteristics of the second network data; and inputting the characteristics of the second network data into the trained first machine learning model to obtain an inference result output by the trained first machine learning model.
Optionally, the transceiver unit 1501 is further configured to: and sending grouping information to the federal learning server, wherein the grouping information indicates a grouping where the first local feature is located, so that the federal learning server obtains a global priori feature according to the first local feature, the grouping where the first local feature is located, the second local feature and the grouping where the second local feature is located.
Optionally, the transceiver unit 1501 is further configured to: receiving task synchronization information from a federal learning server, wherein the task synchronization information is used for indicating a first time slot; and selecting the first network data from the local data according to the task synchronization information.
Fig. 16 is a schematic structural diagram of another inference apparatus provided in the embodiment of the present application, referring to fig. 16, the apparatus 160 includes:
the transceiver 1601 is configured to receive a first local feature from a first federal learning client, where the first local feature is extracted from first network data, the first network data is data related to a target network resource acquired by the first federal learning client in a first slot, and the target network resource is a network resource managed by the first federal learning client;
a processing unit 1602, configured to obtain a global prior feature according to a first local feature and a second local feature, where the second local feature is provided by a second federal learning client;
the transceiver 1601 is configured to send the global priori feature to the first federal learning client, so that the first federal learning client performs reasoning according to the global priori feature and second network data to obtain a reasoning result, where the second network data is data related to the target network resource acquired by the first federal learning client in a second time slot, and the reasoning result is used to manage the target network resource, and the second time slot is the same as or after the first time slot.
Wherein the transceiver unit 1601 is configured to perform step 204 and step 206 in the method embodiment of fig. 2, and the processing unit 1602 is configured to perform step 205 in the method embodiment of fig. 2.
Optionally, the first network data is a sampling value of data related to the target network resource in the first time slot or a statistical value from the third time slot to the first time slot, the second network data is a sampling value of data related to the target network resource in the second time slot, and the third time slot is before the first time slot.
Optionally, the global a priori feature is a feature vector or a first machine learning model, the first machine learning model being used for reasoning about the second network data.
Optionally, the transceiver unit 1601 is further configured to: receiving packet information from a first federal learning client, the packet information from the first federal learning client indicating a packet in which the first local feature is located;
the processing unit 1602 is specifically configured to: obtain the global prior feature according to the first local feature, the group in which the first local feature is located, the second local feature and the group in which the second local feature is located, where the group in which the second local feature is located is indicated by grouping information from the second federal learning client.
Optionally, the first local feature includes a first sub-feature and a second sub-feature, the second local feature includes a third sub-feature and a fourth sub-feature, the packet information from the first federal learning client indicates a packet in which the first sub-feature is located and a packet in which the second sub-feature is located, the packet information from the second federal learning client indicates a packet in which the third sub-feature is located and a packet in which the fourth sub-feature is located, and the packet in which the first sub-feature is located and the packet in which the third sub-feature is located are the same;
the processing unit 1602 is specifically configured to: based on the fact that the group in which the first sub-feature is located is the same as the group in which the third sub-feature is located, the federal learning server side processes the first sub-feature and the third sub-feature to obtain an intermediate feature; and obtaining the global priori feature according to the intermediate feature, the second sub-feature, the fourth sub-feature, the group in which the second sub-feature is located and the group in which the fourth sub-feature is located.
Optionally, the processing unit 1602 is specifically configured to: and obtaining a global priori feature according to the first local feature, the second local feature, the historical local feature from the first federal learning client and the historical local feature from the second federal learning client.
Optionally, the processing unit 1602 is specifically configured to: calculating the similarity of local features of a current reasoning process and a plurality of groups of historical local features, wherein the local features of the current reasoning process comprise a first local feature and a second local feature, and each group of historical local features comprise historical local features from a first federal learning client and historical local features from a second federal learning client in one historical reasoning process; and according to the similarity between the local features of the current reasoning process and the multiple groups of historical local features, weighting and summing the historical prior features corresponding to the multiple groups of historical local features to obtain the global prior features.
Optionally, the plurality of groups of history local features are provided with labels, and the labels are actual results of each group of history local features marked artificially; the processing unit 1602 is also configured to: receiving an inference result from a first federal learning client; determining a target reasoning result according to the reasoning result of the first federation learning client and the reasoning result of the second federation learning client; under the condition that the similarity between the local feature of the current reasoning process and the historical local feature of the target group is greater than or equal to a threshold value, updating the historical local feature of the target group according to the similarity between the local feature of the current reasoning process and the historical local feature of the target group, wherein the historical local feature of the target group is a target reasoning result in a plurality of groups of historical local features; and under the condition that the similarity between the local features of the current reasoning process and the historical local features of the target group is smaller than a threshold value, adding a group of historical local features on the basis of a plurality of groups of historical local features, wherein the added group of historical local features are the local features of the current reasoning process.
Optionally, the transceiver unit 1601 is further configured to: send task synchronization information to the first federal learning client, where the task synchronization information indicates the first time slot, so that the first federal learning client selects the first network data from local data according to the task synchronization information.
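For the task synchronization step, a client-side selection of the first network data might reduce to a simple time-slot filter such as the sketch below; the field names slot_start and slot_end are hypothetical, since the embodiment only states that the task synchronization information indicates the first time slot.

def select_first_network_data(local_data, task_sync):
    """local_data: iterable of (timestamp, record) pairs collected by the client.
    task_sync:  dict carrying the assumed boundaries of the first time slot."""
    start, end = task_sync["slot_start"], task_sync["slot_end"]
    return [record for timestamp, record in local_data if start <= timestamp < end]

If the first network data is a statistical value accumulated over a range of time slots rather than a sampled value, the filter boundaries would simply span that range.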
Fig. 17 is a schematic diagram of a possible logic structure of a computer device 170 according to an embodiment of the present application. The computer device 170 includes: a processor 1701, a communication interface 1702, a storage system 1703, and a bus 1704. The processor 1701, the communication interface 1702, and the storage system 1703 are interconnected by the bus 1704. In this embodiment of the present application, the processor 1701 is configured to control and manage the actions of the computer device 170; for example, the processor 1701 is configured to perform the steps performed by the first federal learning client in the method embodiment of fig. 2. The communication interface 1702 is configured to support communication of the computer device 170. The storage system 1703 is configured to store program code and data of the computer device 170.
The processor 1701 may be a central processing unit, a general purpose processor, a digital signal processor, an application-specific integrated circuit, a field programmable gate array or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or perform the various exemplary logic blocks, modules, and circuits described in connection with this disclosure. The processor 1701 may also be a combination that implements a computing function, for example, a combination including one or more microprocessors, or a combination of a digital signal processor and a microprocessor. The bus 1704 may be a peripheral component interconnect (Peripheral Component Interconnect, PCI) bus, an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in fig. 17, but this does not mean that there is only one bus or only one type of bus.
The transceiving unit 1501 in the apparatus 150 corresponds to the communication interface 1702 in the computer device 170, and the processing unit 1502 in the apparatus 150 corresponds to the processor 1701 in the computer device 170.
The computer device 170 of this embodiment may correspond to the first federal learning client in the embodiment of the method of fig. 2, and the communication interface 1702 in the computer device 170 may implement the functions and/or the various steps implemented by the first federal learning client in the embodiment of the method of fig. 2, which are not described herein for brevity.
Fig. 18 is a schematic diagram of another possible logic structure of a computer device 180 according to an embodiment of the present application. The computer device 180 includes: a processor 1801, a communication interface 1802, a storage system 1803, and a bus 1804. The processor 1801, the communication interface 1802, and the storage system 1803 are interconnected by the bus 1804. In this embodiment of the present application, the processor 1801 is configured to control and manage the actions of the computer device 180; for example, the processor 1801 is configured to perform the steps performed by the federal learning server in the method embodiment of fig. 2. The communication interface 1802 is configured to support communication of the computer device 180. The storage system 1803 is configured to store program code and data of the computer device 180.
The processor 1801 may be a central processing unit, a general purpose processor, a digital signal processor, an application-specific integrated circuit, a field programmable gate array or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or perform the various exemplary logic blocks, modules, and circuits described in connection with this disclosure. The processor 1801 may also be a combination that implements a computing function, for example, a combination including one or more microprocessors, or a combination of a digital signal processor and a microprocessor. The bus 1804 may be a peripheral component interconnect (Peripheral Component Interconnect, PCI) bus, an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in fig. 18, but this does not mean that there is only one bus or only one type of bus.
The transceiving unit 1601 in the apparatus 160 corresponds to the communication interface 1802 in the computer device 180, and the processing unit 1602 in the apparatus 160 corresponds to the processor 1801 in the computer device 180.
The computer device 180 of this embodiment may correspond to the federal learning server in the foregoing method embodiment of fig. 2, and the communication interface 1802 in the computer device 180 may implement the functions and/or the various steps implemented by the federal learning server in the method embodiment of fig. 2, which are not described herein again for brevity.
It should be understood that the division of the units in the above apparatuses is merely a division of logic functions; in actual implementation, the units may be fully or partially integrated into one physical entity or may be physically separate. The units in the apparatus may all be implemented in the form of software invoked by a processing element, or all in the form of hardware; alternatively, some units may be implemented in the form of software invoked by a processing element, and other units in the form of hardware. For example, each unit may be a separately disposed processing element, or may be integrated in a chip of the apparatus; alternatively, a unit may be stored in a memory in the form of a program, and a processing element of the apparatus invokes and executes the function of the unit. Furthermore, all or some of these units may be integrated together or implemented independently. The processing element described herein may be a processor, that is, an integrated circuit with signal processing capability. In implementation, the steps of the above method or the above units may be implemented by an integrated logic circuit of hardware in the processor element, or by software invoked by the processing element.
In one example, the units in any of the above apparatuses may be one or more integrated circuits configured to implement the above methods, for example: one or more application-specific integrated circuits (application specific integrated circuit, ASIC), one or more digital signal processors (digital signal processor, DSP), one or more field programmable gate arrays (field programmable gate array, FPGA), or a combination of at least two of these integrated circuit forms. For another example, when a unit in the apparatus is implemented in the form of a program scheduled by a processing element, the processing element may be a general-purpose processor, such as a central processing unit (central processing unit, CPU) or another processor capable of invoking the program. For another example, the units may be integrated together and implemented in the form of a system-on-a-chip (SOC).
In another embodiment of the present application, a computer readable storage medium is further provided, in which computer-executable instructions are stored. When a processor of a device executes the computer-executable instructions, the device performs the method performed by the first federal learning client in the method embodiment described above.
In another embodiment of the present application, a computer readable storage medium is further provided, in which computer-executable instructions are stored. When a processor of a device executes the computer-executable instructions, the device performs the method performed by the federal learning server in the method embodiment described above.
In another embodiment of the present application, there is also provided a computer program product comprising computer-executable instructions stored in a computer-readable storage medium. When the processor of the device executes the computer-executable instructions, the device performs the method performed by the first federal learning client in the method embodiment described above.
In another embodiment of the present application, there is also provided a computer program product comprising computer-executable instructions stored in a computer-readable storage medium. When the processor of the device executes the computer-executable instructions, the device executes the method executed by the federal learning server in the method embodiment described above.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application essentially, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disc, or any other medium capable of storing program code.

Claims (30)

1. A method of reasoning, the method comprising:
a first federal learning client sends first local features to a federal learning server, wherein the first local features are extracted from first network data, the first network data are data which are acquired by the first federal learning client in a first time slot and are related to a target network resource, and the target network resource is a network resource managed by the first federal learning client;
the first federal learning client receives a global priori feature from the federal learning server, wherein the global priori feature is obtained according to the first local features and second local features, and the second local features are provided by a second federal learning client;
the first federal learning client performs reasoning according to the global priori feature and second network data to obtain a reasoning result, wherein the second network data is data related to the target network resource, which is acquired by the first federal learning client in a second time slot, and the reasoning result is used for managing the target network resource, and the second time slot is the same as or after the first time slot.
2. The method of claim 1, wherein the first network data is a sampled value of the data related to the target network resource at the first time slot or a statistical value from a third time slot to the first time slot, and wherein the second network data is a sampled value of the data related to the target network resource at the second time slot, the third time slot preceding the first time slot.
3. The method according to claim 1 or 2, characterized in that the global a priori feature is a feature vector or a first machine learning model for reasoning about the second network data.
4. A method according to claim 3, wherein, in the case where the global a priori feature is the feature vector, the first federal learning client infers from the global a priori feature and second network data to obtain an inference result comprises:
the first federal learning client performs reasoning according to the global priori feature, the second network data and a local second machine learning model to obtain a reasoning result, wherein the second machine learning model is used for reasoning of the second network data.
5. The method of claim 4, wherein the first federal learning client infers from the global prior feature, second network data, and a local second machine learning model to obtain an inference result comprises:
the first federal learning client inputs second network data into a third learning model to obtain a plurality of features of the second network data output by the third learning model;
the first federal learning client inputs the global prior feature into the second machine learning model to obtain weights for each of a plurality of features of the second network data;
The first federal learning client determines the reasoning result according to the plurality of features of the second network data and the weights of the plurality of features of the second network data.
6. The method of claim 4, wherein the second machine learning model comprises a plurality of first task models;
the first federal learning client performs reasoning according to the global priori feature, the second network data and the local second machine learning model to obtain a reasoning result, wherein the reasoning result comprises:
the first federal learning client calculates the weight of each of the plurality of first task models according to the global priori features;
the first federal learning client inputs the characteristics of the second network data into the plurality of first task models to obtain the reasoning characteristics output by the plurality of first task models;
and the first federal learning client obtains an inference result according to the weights of the plurality of first task models and the inference characteristics output by the plurality of first task models.
7. The method of claim 6, wherein the first federal learning client computing weights for each of the plurality of first task models based on the global prior feature comprises:
And the first federal learning client calculates the respective weights of the plurality of first task models according to the global priori features and the second network data.
8. The method according to claim 6 or 7, characterized in that the method further comprises:
the first federal learning client extracts features of the second network data through a third machine learning model.
9. The method of claim 8, wherein the third machine learning model comprises a plurality of second task models;
the first federal learning client extracting features of the second network data through a third machine learning model includes:
the first federal learning client determines the weight of each of the plurality of second task models according to the second network data;
the first federation learning client inputs the second network data into the plurality of second task models to obtain sub-features of the second network data output by the plurality of second task models;
and obtaining the characteristics of the second network data according to the weights of the second task models and the sub-characteristics of the second network data output by the second task models.
10. The method of claim 9, wherein each second task model is a single-layer autoencoder, and a reconstruction target of an r-th task model of the plurality of second task models is a residual of an (r-1)-th task model, wherein r is an integer greater than 1 and not greater than the number of second task models.
11. A method according to claim 3, wherein, in the case where the global a priori feature is the first machine learning model, the first federal learning client infers from the global a priori feature and second network data to obtain an inference result comprises:
the first federal learning client extracts the features of the second network data;
the first federal learning client inputs the characteristics of the second network data into the first machine learning model to obtain an inference result output by the first machine learning model.
12. A method according to claim 3, wherein, in the case where the global a priori feature is the first machine learning model, the first federal learning client infers from the global a priori feature and second network data to obtain an inference result comprises:
The first federal learning client trains the first machine learning model with sample data;
the first federal learning client extracts the features of the second network data;
the first federal learning client inputs the characteristics of the second network data into the trained first machine learning model to obtain an inference result output by the trained first machine learning model.
13. The method according to any one of claims 1 to 12, further comprising:
the first federal learning client sends grouping information to the federal learning server, wherein the grouping information indicates a group in which the first local feature is located, so that the federal learning server obtains a global priori feature according to the first local feature, the group in which the first local feature is located, the second local feature and the group in which the second local feature is located.
14. The method according to any one of claims 1 to 13, further comprising:
the first federation learning client receives task synchronization information from the federation learning server, wherein the task synchronization information is used for indicating the first time slot;
And the first federal learning client selects the first network data from the local data according to the task synchronization information.
15. A method of reasoning, the method comprising:
a federation learning server receives first local features from a first federation learning client, wherein the first local features are extracted from first network data, the first network data are data related to a target network resource acquired by the first federation learning client in a first time slot, and the target network resource is a network resource managed by the first federation learning client;
the federation learning server obtains global priori features according to the first local features and the second local features, wherein the second local features are provided by a second federation learning client;
the federal learning server side sends the global priori feature to the first federal learning client side, so that the first federal learning client side performs reasoning according to the global priori feature and second network data to obtain a reasoning result, the second network data is data related to the target network resource, which is acquired by the first federal learning client side in a second time slot, and the reasoning result is used for managing the target network resource, wherein the second time slot is the same as or after the first time slot.
16. The method of claim 15, wherein the first network data is a sampled value of the data related to the target network resource at the first time slot or a statistical value from a third time slot to the first time slot, and wherein the second network data is a sampled value of the data related to the target network resource at the second time slot, the third time slot preceding the first time slot.
17. The method according to claim 15 or 16, wherein the global a priori feature is a feature vector or a first machine learning model for reasoning about the second network data.
18. The method according to any one of claims 15 to 17, further comprising:
the federation learning server receives grouping information from the first federation learning client, wherein the grouping information from the first federation learning client indicates a grouping in which the first local feature is located;
the federal learning server obtains global prior features according to the first local features and the second local features, wherein the obtaining global prior features comprises:
the federation learning server obtains global priori features according to the first local features, the group in which the first local features are located, the second local features and the group in which the second local features are located, wherein the group in which the second local features are located is indicated by grouping information from the second federation learning client.
19. The method of claim 18, wherein the first local feature comprises a first sub-feature and a second sub-feature, the second local feature comprises a third sub-feature and a fourth sub-feature, the grouping information from the first federal learning client indicates a group in which the first sub-feature is located and a group in which the second sub-feature is located, the grouping information from the second federal learning client indicates a group in which the third sub-feature is located and a group in which the fourth sub-feature is located, and the group in which the first sub-feature is located is the same as the group in which the third sub-feature is located;
the obtaining, by the federal learning server, a global priori feature according to the first local feature, the group in which the first local feature is located, the second local feature, and the group in which the second local feature is located includes:
based on the fact that the group in which the first sub-feature is located is the same as the group in which the third sub-feature is located, the federal learning server side processes the first sub-feature and the third sub-feature to obtain an intermediate feature;
and the federal learning server obtains a global priori feature according to the intermediate feature, the second sub-feature, the fourth sub-feature, the group in which the second sub-feature is located and the group in which the fourth sub-feature is located.
20. The method of any one of claims 15 to 19, wherein the obtaining, by the federal learning server, a global prior feature from the first local feature and the second local feature comprises:
and the federation learning server obtains global priori features according to the first local features, the second local features, the historical local features from the first federation learning client and the historical local features from the second federation learning client.
21. The method of claim 20, wherein the federation learning server obtains global prior features from the first local features, the second local features, historical local features from the first federation learning client, and historical local features from the second federation learning client, comprising:
the federation learning server calculates the similarity between the local features of the current reasoning process and a plurality of groups of historical local features, wherein the local features of the current reasoning process comprise the first local features and the second local features, and each group of historical local features comprise the historical local features from the first federation learning client and the historical local features from the second federation learning client in one historical reasoning process;
And the federal learning server performs weighted summation on the historical prior features corresponding to the plurality of groups of historical local features according to the similarity between the local features of the current reasoning process and the plurality of groups of historical local features so as to obtain the global prior feature.
22. The method of claim 21, wherein each of the plurality of groups of historical local features has a label, and the label is a manually annotated actual result of the group of historical local features;
the method further comprises the steps of:
the federal learning server receives an inference result from the first federal learning client;
the federation learning server determines a target reasoning result according to the reasoning result of the first federation learning client and the reasoning result of the second federation learning client;
under the condition that the similarity between the local features of the current reasoning process and a target group of historical local features is greater than or equal to a threshold value, the federal learning server updates the target group of historical local features according to the similarity between the local features of the current reasoning process and the target group of historical local features, wherein the target group of historical local features is the group, among the plurality of groups of historical local features, whose label is the target reasoning result;
And under the condition that the similarity between the local features of the current reasoning process and the historical local features of the target group is smaller than the threshold value, the federal learning server adds a group of historical local features on the basis of the plurality of groups of historical local features, and the added group of historical local features are the local features of the current reasoning process.
23. The method according to any one of claims 15 to 22, further comprising:
the federation learning server side sends task synchronization information to the first federation learning client side, wherein the task synchronization information is used for indicating the first time slot, so that the first federation learning client side selects the first network data from local data according to the task synchronization information.
24. An inference apparatus, applied to a client, comprising:
the receiving and transmitting unit is used for transmitting first local features to the federal learning server, wherein the first local features are extracted from first network data, the first network data are data related to a target network resource acquired by the first federal learning client in a first time slot, and the target network resource is a network resource managed by the first federal learning client;
the receiving and transmitting unit is further used for receiving a global priori feature from the federal learning server, wherein the global priori feature is obtained according to the first local features and second local features, and the second local features are provided by a second federal learning client;
the processing unit is used for carrying out reasoning according to the global priori feature and second network data to obtain a reasoning result, wherein the second network data is data related to the target network resource, which is acquired by the first federal learning client in a second time slot, and the reasoning result is used for managing the target network resource, and the second time slot is the same as or after the first time slot.
25. An inference apparatus, applied to a server, the apparatus comprising:
the receiving and transmitting unit is used for receiving first local features from a first federal learning client, wherein the first local features are extracted from first network data, the first network data are data related to a target network resource acquired by the first federal learning client in a first time slot, and the target network resource is a network resource managed by the first federal learning client;
the processing unit is used for obtaining a global priori feature according to the first local feature and the second local feature, wherein the second local feature is provided by a second federal learning client;
the receiving and transmitting unit is configured to send the global priori feature to the first federal learning client, so that the first federal learning client performs reasoning according to the global priori feature and second network data to obtain a reasoning result, where the second network data is data related to the target network resource, which is acquired by the first federal learning client in a second time slot, and the reasoning result is used to manage the target network resource, and the second time slot is the same as or after the first time slot.
26. A computer device, the computer device comprising: a memory and a processor;
the processor for executing a computer program or instructions stored in the memory to cause the computer device to perform the method of any one of claims 1-14.
27. A computer device, the computer device comprising: a memory and a processor;
The processor for executing a computer program or instructions stored in the memory to cause the computer device to perform the method of any of claims 15-23.
28. A computer readable storage medium having program instructions which, when executed directly or indirectly, cause the method of any one of claims 1 to 23 to be implemented.
29. A chip system comprising at least one processor for executing a computer program or instructions stored in a memory, which when executed in the at least one processor, causes the method of any one of claims 1 to 23 to be implemented.
30. A computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of any one of claims 1 to 23.
CN202210962642.4A 2022-08-11 2022-08-11 Reasoning method and related device Pending CN117648981A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210962642.4A CN117648981A (en) 2022-08-11 2022-08-11 Reasoning method and related device
PCT/CN2023/103784 WO2024032214A1 (en) 2022-08-11 2023-06-29 Reasoning method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210962642.4A CN117648981A (en) 2022-08-11 2022-08-11 Reasoning method and related device

Publications (1)

Publication Number Publication Date
CN117648981A true CN117648981A (en) 2024-03-05

Family

ID=89850661

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210962642.4A Pending CN117648981A (en) 2022-08-11 2022-08-11 Reasoning method and related device

Country Status (2)

Country Link
CN (1) CN117648981A (en)
WO (1) WO2024032214A1 (en)

Also Published As

Publication number Publication date
WO2024032214A1 (en) 2024-02-15

Legal Events

Date Code Title Description
PB01 Publication