CN117812564A

CN117812564A - Federal learning method, device, equipment and medium applied to Internet of vehicles

Info

Publication number: CN117812564A
Application number: CN202410229646.0A
Authority: CN
Inventors: 曹敦; 熊佳斯; 黄世锐
Original assignee: Xiangjiang Laboratory
Current assignee: Xiangjiang Laboratory
Priority date: 2024-02-29
Filing date: 2024-02-29
Publication date: 2024-04-02
Anticipated expiration: 2044-02-29
Also published as: CN117812564B

Abstract

The method constructs a federal learning distributed training network for local training, cluster model aggregation and server global aggregation of vehicles, and constructs a problem of minimizing convergence time of global model aggregation in order to meet the requirements of vehicle federal learning service quality and support model convergence as fast as possible; the method comprises the steps of dividing a plurality of vehicles within a server coverage range into a plurality of vehicle clusters, adopting a strategy that the number of the vehicle clusters participating in each round of training of a global model is not fixed, realizing a dynamic semi-asynchronous clustering type federal learning method, reducing the influence of Non-IID data on a learning process, relieving the problem of a user who falls behind the head, reducing training time and resource cost, and simultaneously keeping learning accuracy.

Description

Federal learning method, device, equipment and medium applied to Internet of vehicles

Technical Field

The embodiment of the application relates to the technical field of Internet of vehicles, in particular to a federal learning method, a federal learning device, federal learning equipment and federal learning media applied to Internet of vehicles.

Background

In a car networking (IoV) system, federal Learning (FL) is a novel distributed method of processing real-time vehicle data that can train a shared learning model while ensuring data privacy. However, existing federal learning still faces the following challenges in the field of internet of vehicles:

the existing federal learning aggregation method can be divided into synchronous and asynchronous methods according to different aggregation types, and for a synchronous federal learning protocol, a server needs to collect all parameters acquired from a vehicle user before executing an aggregation process, but a user's network resource or hardware resource of the vehicle user causes a dequeue effect; in contrast, in an asynchronous federal learning protocol, a parameter server can aggregate parameters, but the gradient divergence it causes can further degrade the performance of the model.

Disclosure of Invention

The following is a summary of the subject matter described in detail herein. This summary is not intended to limit the scope of the claims.

The embodiment of the invention mainly aims to provide a federal learning method, a federal learning device, federal learning equipment and federal learning media applied to the Internet of vehicles, which can reduce training time and resource cost.

To achieve the above object, a first aspect of the embodiments of the present invention provides a federal learning method applied to the internet of vehicles, which is applicable to an internet of vehicles clustered federal learning network model, the internet of vehicles clustered federal learning network model includes: a server and a plurality of vehicles located within a coverage area of the server; the vehicles are divided into a plurality of vehicle clusters, and each vehicle cluster has at least one vehicle as a leading node; the server is used for receiving and aggregating the vehicle cluster model parameters of the current wheel sent by the vehicle cluster participating in the training of the current wheel to obtain the global model parameters of the next wheel; the leading node of the vehicle cluster participating in the current wheel training is used for receiving local model parameters of the current wheel sent by all vehicles in the vehicle cluster, carrying out aggregation to obtain the vehicle cluster model parameters of the current wheel, and uploading the vehicle cluster model parameters to the server; the vehicles of the vehicle cluster participating in the current wheel training are used for training a local model according to the global model parameters of the current wheel issued by the server, so as to obtain the local model parameters of the current wheel, and the local model parameters are sent to the leading node;

The federal learning method applied to the Internet of vehicles comprises the following steps:

under the strategy that the number of the vehicle clusters participating in each round of training of the global model is not fixed, constructing an objective function for minimizing the federal learning training time of the global model; the training time of each round of training of the global model at least comprises the local calculation time of the vehicle and the communication time between the leader node and the server;

and solving the objective function to obtain a solving result, and executing global model federation training according to the solving result.

An embodiment of the application provides a federal learning method applied to the Internet of vehicles, which constructs a federal learning distributed training network for local training, cluster model aggregation and server global aggregation of vehicles, and constructs a problem of minimizing convergence time of global model aggregation in order to meet the requirements of service quality of the federal learning of vehicles and support model convergence as soon as possible; the method comprises the steps of dividing a plurality of vehicles within a server coverage range into a plurality of vehicle clusters, adopting a strategy that the number of the vehicle clusters participating in each round of training of a global model is not fixed, realizing a dynamic semi-asynchronous clustering type federal learning method, reducing the influence of Non-IID data on a learning process, relieving the problem of a user who falls behind the head, reducing training time and resource cost, and simultaneously keeping learning accuracy.

In some embodiments, the plurality of vehicles is divided into a plurality of vehicle clusters, including:

calculating the stay time of the vehicle in the coverage area of the server according to the constant speed of the vehicle;

determining the shortest stay time and a first neighborhood vehicle of the vehicle, wherein the first neighborhood vehicle of the vehicle refers to the rest vehicles of which the distance between the current time and the vehicle is not more than a first threshold value; after progressing from the current time to the shortest stay time, selecting remaining vehicles from the vehicles in the first field of vehicles, the distance between the vehicles being not greater than the first threshold, to form a second neighborhood vehicle of the vehicles;

building the vehicle cluster:

if the number of second neighborhood vehicles of a first vehicle exceeds a second threshold, establishing an initial cluster between the first vehicle and the second neighborhood vehicles of the first vehicle; the first vehicle is any one of the plurality of vehicles;

if the number of second neighborhood vehicles of the second vehicle exceeds the second threshold value, adding the vehicles which are not added into the initial cluster in the second neighborhood vehicles of the second vehicle into the initial cluster; the second vehicle is any one of second neighborhood vehicles of the first vehicle;

If the number of the second neighborhood vehicles of the third vehicle exceeds the second threshold value, adding the vehicles which are not added into the initial cluster in the second neighborhood vehicles of the third vehicle into the initial cluster; the third vehicle is any one of second neighborhood vehicles of the second vehicle;

and so on until the initial cluster is formed into a complete vehicle cluster.

In some embodiments, after forming the complete vehicle cluster, the federal learning method applied to the internet of vehicles further comprises:

selecting one vehicle from the vehicle cluster as a reference vehicle;

judging cosine similarity of local model parameters between any vehicle in the vehicle cluster and the reference vehicle;

and moving the corresponding vehicle with the cosine similarity smaller than a third threshold out of the vehicle cluster.

In some embodiments, the objective function is:

wherein the constraint conditionsRepresenting a stop condition in too long an edge iteration, +.>When->The global model will get an exact solution when +.>At the time, the global model does not evolve, +.>Indicate->The gradient of the global aggregation of the wheel,indicate->Gradient of global aggregation of the wheel; constraint->Representing the time constraints of each training round of federal learning, Maximum acceptable global training time per round representing federal learning,/>Express federal learning->Global training time of the wheel; constraint->Representing vehicle cluster->First->Total time consumption of round global aggregation +.>No more than vehicle cluster->Minimum stay time of the inner vehicle +.>，/>Indicating vehicle->Stay time at the server; constraint->Representation->Is a binary variable +.>Representing vehicle cluster->Do not participate in->Round federal learning aggregation, ->Representing vehicle cluster->Participation in->Round federal learning aggregation, ->Is a semi-asynchronous aggregation matrixThe method comprises the steps of carrying out a first treatment on the surface of the Constraint->Representing hyper-parameters->And->The sum is 1; />A partitioning strategy representing the plurality of vehicles into a plurality of vehicle clusters;

representing the number of clusters of vehicles +.>Representing the number of times the global model is trained.

In some embodiments, the solving the objective function includes:

converting the objective function into a markov decision process;

and adopting a TD3 algorithm to solve the Markov decision process.

In some embodiments, the leader node is the vehicle with the longest residence time within the coverage of the server.

In some embodiments, if the leader node does not respond, a vehicle having a residence time inferior to the longest residence time is selected as a new leader node.

To achieve the above object, a second aspect of the embodiments of the present invention provides a federal learning device applied to the internet of vehicles, which is applicable to a clustered federal learning network model of the internet of vehicles, the clustered federal learning network model of the internet of vehicles includes: a server and a plurality of vehicles located within a coverage area of the server; the vehicles are divided into a plurality of vehicle clusters, and each vehicle cluster has at least one vehicle as a leading node; the server is used for receiving and aggregating the vehicle cluster model parameters of the current wheel sent by the vehicle cluster participating in the training of the current wheel to obtain the global model parameters of the next wheel; the leading node of the vehicle cluster participating in the current wheel training is used for receiving local model parameters of the current wheel sent by all vehicles in the vehicle cluster, carrying out aggregation to obtain the vehicle cluster model parameters of the current wheel, and uploading the vehicle cluster model parameters to the server; the vehicles of the vehicle cluster participating in the current wheel training are used for training a local model according to the global model parameters of the current wheel issued by the server, so as to obtain the local model parameters of the current wheel, and the local model parameters are sent to the leading node;

the federal learning device applied to the internet of vehicles comprises:

The function construction unit is used for constructing an objective function for minimizing the federal learning training time of the global model under the strategy that the number of the vehicle clusters participating in each round of training of the global model is not fixed; the training time of each round of training of the global model at least comprises the local calculation time of the vehicle and the communication time between the leader node and the server;

and the function solving unit is used for solving the objective function to obtain a solving result and executing global model federation training according to the solving result.

To achieve the above object, a third aspect of the embodiments of the present invention provides an electronic device, including: at least one control processor and a memory for communication connection with the at least one control processor; the memory stores instructions executable by the at least one control processor to enable the at least one control processor to perform a federal learning method for use with the internet of vehicles.

To achieve the above object, a fourth aspect of the embodiments of the present invention provides a computer-readable storage medium storing computer-executable instructions for causing a computer to perform a federal learning method applied to the internet of vehicles as described above.

It is to be understood that the advantages of the second to fourth aspects compared with the related art are the same as those of the first aspect compared with the related art, and reference may be made to the related description in the first aspect, which is not repeated herein.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required for the embodiments or the description of the related art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort to a person having ordinary skill in the art.

FIG. 1 is a schematic flow chart of a federal learning method for Internet of vehicles according to an embodiment of the present disclosure;

FIG. 2 is a flow diagram of a clustering strategy provided by one embodiment of the present application;

FIG. 3 is a schematic flow chart of a kick-off vehicle in a cluster of selected vehicles according to cosine similarity according to an embodiment of the present application;

FIG. 4 is a schematic diagram of an architecture of a clustered federal learning network model for Internet of vehicles according to an embodiment of the present application;

FIG. 5 is a schematic diagram of semi-asynchronous clustered federal learning provided in accordance with one embodiment of the present application;

FIG. 6 is a schematic diagram of a TD3 algorithm provided by one embodiment of the present application;

FIG. 7 is a schematic structural diagram of a federal learning device for use in Internet of vehicles according to an embodiment of the present disclosure;

fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the present application.

Introducing related words;

federal study: as a distributed deep learning paradigm, vehicles are allowed to train their local deep learning models individually using local data and aggregate them into a global model, and the vehicles do not directly send the local data, only share local model parameters, protecting vehicle privacy to some extent. In addition, the process can integrate a network with global characteristics to realize information sharing among vehicles, so that the flexible learning method is more suitable for the Internet of vehicles.

Synchronous polymerization: in synchronous federal learning, each selected client needs to upload the trained model within a specified time slice, and the server needs to wait until all selected clients upload the local model or the time slice of the selected client runs out to start aggregation. Disadvantages of synchronous federal learning include: the server needs to wait for all clients to complete training and upload the model and then aggregate, and the device heterogeneity of the clients makes the time for them to complete training uneven, which means that the server needs to wait for devices with weak computing power or communication power all the time, which can cause the overall time for model training convergence to be prolonged, which is a later problem in synchronous federal learning. Another problem with the latter problem is that the server is idle in waiting for the latter, which results in a waste of computing resources.

Asynchronous aggregation: in asynchronous federal learning, the server first cooperates with those clients that complete training, rather than waiting for all clients to complete training and then aggregate. And the client communicates with the server immediately after training is finished, and the latest global model is obtained to start the next round of local training. Disadvantages of asynchronous federal learning include: on the one hand, the communication strategy makes the transmission amount of data large, which is equivalent to that each client end needs to communicate with the server end separately. On the other hand, the period of training a local model by a client with low computing power is long, during which the global model may have been communicated with the client with high computing power for several rounds, so that the global model used by the weak client is an old global model, and the trained local model is a lagged local model, and if the lagged local model parameters are uploaded to the global model, the global model quality may be reduced.

Non-independent co-distribution (Non-IID): in the machine learning optimization method, independent co-distribution assumption of data is important and necessary. It is because of this assumption that machine learning models can better predict non-occurring and non-learned scenarios by training limited samples. However, in the federal learning paradigm, devices typically generate and collect data in a non-uniformly distributed manner in a network, while the amount of data collected in the devices may also vary greatly. The most common sources of such data acquisition non-uniformities include: each user device corresponds to a different user, a difference in geographic location of the device, and a different data acquisition time of the device.

Reinforcement learning: reinforcement learning consists of two parts, namely an agent and an environment, and the biggest rewards are obtained through the interaction process of the agent and the environment. After the agent obtains a certain state in the environment, an action is output by using the state, and then the action is executed in the environment to output the next state and rewards brought by the current action. The goal of reinforcement learning is to make as much of the agent as possible to get the jackpot from the environment.

Description of the embodiments

Referring to fig. 1, in one embodiment of the present application, there is provided a federal learning method applied to the internet of vehicles, which is applicable to an internet of vehicles clustered federal learning network model, including: a server and a plurality of vehicles located within a coverage area of the server; the method comprises the steps that a plurality of vehicles are divided into a plurality of vehicle clusters, and each vehicle cluster has at least one vehicle as a leading node; the server is used for receiving and aggregating the vehicle cluster model parameters of the current wheel sent by the vehicle cluster participating in the training of the current wheel to obtain the global model parameters of the next wheel; the leading node of the vehicle cluster participating in the current wheel training is used for receiving the local model parameters of the current wheel sent by all vehicles in the vehicle cluster, carrying out aggregation to obtain the vehicle cluster model parameters of the current wheel, and uploading the vehicle cluster model parameters to a server; the vehicles of the vehicle cluster participating in the current wheel training are used for training the local model according to the global model parameters of the current wheel issued by the server, so as to obtain the local model parameters of the current wheel, and the local model parameters are sent to the leader node;

The federal learning method applied to the internet of vehicles comprises the following steps of S110 and S120:

step S110, under the strategy that the number of the vehicle clusters which participate in each round of training of the global model is not fixed, constructing an objective function which minimizes the federal learning training time of the global model; wherein the training time of each round of training of the global model at least comprises the local calculation time of the vehicle and the communication time between the leading node and the server.

And step S120, solving the objective function to obtain a solving result, and executing global model federation training according to the solving result.

The following describes in detail a clustered federal learning network model for the internet of vehicles, the model comprising:

a server (parameter server or edge server) and vehicles within the coverage of the server, the present embodiment divides all vehicles into a plurality of vehicle clusters, each vehicle having a local model, one vehicle in one vehicle cluster as a leader node. Taking a round of training as an example, the roles of the server, vehicle and leader node are introduced:

(1) The server transmits global model parameters of the current wheel to corresponding leader nodes of each vehicle cluster participating in the training of the current wheel;

(2) The leading node sends global model parameters of the current wheel to each vehicle in the vehicle cluster;

(3) The vehicle trains the local model according to the global model parameters of the current wheel to obtain the trained local model parameters of the current wheel;

(4) The vehicle sends the local model parameters of the current wheel to the leading node;

(5) The leader node aggregates the local model parameters of the current wheels of all vehicles in the vehicle cluster to obtain the vehicle cluster model parameters of the current wheels;

(6) The leader node uploads the vehicle cluster model parameters of the current wheel to a server;

(7) The server aggregates the vehicle cluster model parameters of the current round uploaded by the leading nodes of all the vehicle clusters to obtain the global model parameters of the next round. And then starts the next training round.

In step S110, an objective function that minimizes the federal learning training time is constructed under a strategy in which the number of clusters of vehicles involved in each round of training of the global model is not fixed. That is, each round of training selects a fixed number of vehicle clusters to participate in the training, but each round of vehicle clusters to participate in the training needs to be obtained according to a solution objective function, for example: there are 5 vehicle clusters, the first wheel training is performed by the 2 nd and 3 rd vehicle clusters, and the second wheel training is performed by the 1 st and 4 th vehicle clusters. The training time of each round of training of the global model at least comprises the local calculation time and the communication time of the vehicle clusters, wherein the vehicles forming each vehicle cluster are different, the calculation capability between the vehicles is different, the distance between the vehicles and the server is different, and different communication time exists (such as communication between the leader node and the server, and the communication time between the vehicles and the leader node is negligible, so the communication time between the vehicles and the leader node is temporarily not considered in the embodiment), so that the participation of different vehicle clusters is selected, and the integral training time can be directly influenced.

Because the existing federal learning aggregation method can be divided into synchronous and asynchronous methods according to different aggregation types, for a synchronous federal learning protocol, a server needs to collect all parameters acquired from a vehicle user before executing an aggregation process, but a dequeue effect is caused by poor network resources or hardware resources of the vehicle user; in contrast, in an asynchronous federal learning protocol, a parameter server can aggregate parameters, but the gradient divergence it causes can further degrade the performance of the model;

the embodiment aims to reduce communication consumption of the system as much as possible and accelerate training time of the federal learning model. The method comprises the steps of firstly designing a vehicle networking clustering type federal learning network model, realizing a federal learning distributed training network of vehicle local training, cluster internal model aggregation and server global aggregation by the model, and then constructing a convergence time minimization problem of global model aggregation in order to meet the requirements of vehicle federal learning service quality and support model convergence as fast as possible; the method comprises the steps of dividing a plurality of vehicles in a server coverage range into a plurality of vehicle clusters, aiming at solving the influence caused by non-independent identical distribution (non-IID) of vehicle data and resource constraint, and adopting a strategy that the number of the vehicle clusters participating in each round of training of a global model is not fixed, so that a dynamic semi-asynchronous clustering type federation learning method is realized, the problem of a straggler is relieved to a certain extent, the training time and the resource cost of federation learning are reduced, and meanwhile, the learning accuracy is kept.

In the scene of internet of vehicles, vehicles have high maneuverability, so a clustering strategy is needed to consider not only the influence caused by non-independent co-distribution (non-IID) of vehicle data, but also the problem caused by high maneuverability of vehicles. Some embodiments of the present application provide a clustering method, referring to fig. 2, the method includes steps S210 to S230:

step S210, calculating the stay time of the vehicle in the coverage area of the server according to the constant speed of the vehicle.

Step S220, determining the shortest stay time and a first neighborhood vehicle of the vehicle, wherein the first neighborhood vehicle of the vehicle refers to the rest vehicles of which the distance between the current time and the vehicle is not more than a first threshold value; after progressing from the current time to the shortest stay time, the remaining vehicles whose distance from the vehicle is not greater than a first threshold value are selected from the first-area vehicles of the vehicle to form a second neighborhood vehicle of the vehicle.

Step S230, building a vehicle cluster:

if the number of the second neighborhood vehicles of the first vehicle exceeds a second threshold, establishing an initial cluster between the first vehicle and the second neighborhood vehicles of the first vehicle; the first vehicle is any one of a plurality of vehicles;

if the number of the second neighborhood vehicles of the second vehicle exceeds a second threshold value, adding the vehicles which are not added into the initial cluster in the second neighborhood vehicles of the second vehicle into the initial cluster; the second vehicle is any one of the second neighborhood vehicles of the first vehicle;

If the number of the second neighborhood vehicles of the third vehicle exceeds a second threshold value, adding the vehicles which are not added into the initial cluster in the second neighborhood vehicles of the third vehicle into the initial cluster; the third vehicle is any one of the second neighborhood vehicles of the second vehicle;

and so on until the initial cluster is formed into a complete vehicle cluster.

In step S210, the time taken for the vehicle to move from the current position to the server coverage edge, i.e., the stay time, is calculated from the constant speed of the vehicle, the remaining distance between the current position of the vehicle and the coverage edge.

In step S220, the shortest residence time is determined, and for any one of the vehicles n, the vehicle n corresponds to the first neighborhood vehicle being the vehicle n of all vehicles whose distance from the vehicle n is not greater than a first threshold valueIs a vehicle of (a). After the minimum residence time, the distance from the vehicle n in the first neighborhood vehicle of the vehicle n is not more than a first threshold value +.>Constitute a second neighborhood vehicle.

In step S230, taking vehicle n as an example, if the number of second neighboring vehicles of vehicle n exceeds a second threshold (generally set to 1), an initial cluster is established, the initial cluster including vehicle n and the second neighboring vehicles of vehicle n; and adding a second neighborhood vehicle of the vehicle n into the candidate set, sequentially detecting each vehicle in the candidate set, assuming that the candidate set comprises vehicles o, judging whether the number of the vehicles in the second neighborhood of the vehicles o exceeds 1, if so, adding the vehicles which are not added into the initial cluster in the vehicles in the second neighborhood of the vehicles o into the initial cluster, deleting the vehicles o in the candidate set, and meanwhile, adding the vehicles in the second neighborhood of the vehicles o into the candidate set until the candidate set is an empty set.

The embodiment designs a clustering strategy, which not only considers the influence of non-IID of vehicle data, but also increases the speed and residence time constraint to ensure that the vehicle users in the cluster do not leave the range of the cluster in the coverage area of the server, thereby solving the problem caused by high mobility of the vehicle to a certain extent and improving the stability of the clustering of the vehicle cluster.

In some embodiments of the present application, after forming the complete vehicle cluster, as in fig. 3, the federal learning method applied to the internet of vehicles further includes steps S310 to S330:

step S310, selecting a vehicle from the vehicle cluster as a reference vehicle.

And step S320, judging cosine similarity of local model parameters between any vehicle in the vehicle cluster and the reference vehicle.

And step S330, moving the corresponding vehicle with the cosine similarity smaller than the third threshold out of the vehicle cluster.

According to the method and the device, the Non-IID characteristics among the vehicle-mounted data are utilized, the fact that the vehicle data in the clusters belong to the same distribution is guaranteed through calculating cosine similarity among gradient updating of the vehicle model, and stability and reliability of the vehicle clusters are improved.

In one embodiment of the present application, the leader node is the vehicle with the longest residence time within the coverage of the server. The vehicle with the longest residence time is selected as the leader node to be the most stable.

In one embodiment of the present application, if the leader node does not respond, the vehicle with the next to the longest residence time is selected as the new leader node. In this embodiment, the security redundancy design is performed on the leader node, that is, a vehicle with a residence time inferior to that of the leader node is selected as an alternative to the leader node, and if the leader node does not respond, the alternative node performs the related work of the leader node.

Referring to fig. 4 to 6, the present application provides an embodiment, and provides a clustered federal learning network model for internet of vehicles and a training method thereof, including:

referring to fig. 4, consider a one-way, straight-going, multi-lane internet of vehicles scenario, where there are Edge Servers (ESs) and Vehicle Users (VUs), all vehicles are divided into several vehicle clusters, each with a leader node and a secondary leader node. Assume thatThe individual vehicles are randomly distributed to form a set->，/>The individual vehicles are divided into->Cluster, forming a cluster of vehicles. Assume that the federal learning global model is at +.>Round global post-aggregation convergence, wherein->. Vehicles within a certain safe distance are highly similar due to the information collected by the close distance, and under the condition that the trained models are also highly similar, the vehicle data sets in the vehicle clusters can be assumed to be identical and can be segmented within a period of time, and the vehicle data sets are called as shared data blocks (Shared Data Block, SDB). During the training process, the vehicles in the vehicle cluster only need to train the model by respective historical experience Data blocks (Data blocks, DB), and the original Data is not required to be transmitted.

Suppose that the vehicles are all alongAxle one-way driving, vehicle +_at time t>Is +.>Travel, position information isThe edge server where it is located +.>The fixing position is +.>The covering radius is +.>. Thus, the +.>The remaining distance of an individual vehicle within its ES coverage is defined as:

（1）

vehicle onlyParameters can be uploaded to the edge server only in the coverage area of the edge server. Thus, the vehicleThe residence time at the edge server is defined as:

（2）

to calculate arbitrary nodesAnd->(including vehicle user and edgeServer), introducing a euclidean distance formula:

（3）

calculation of arbitrary nodes using shannon capacity formulaAnd->Between (1->Data rate and node of round global aggregation +.>And->Functional relationship of distance between:

（4）

wherein,for node->Transmission power of>Is node->And->Distance between->Channel gain at>Is noise power->For node->Is provided.

In the first placeIn the global aggregation process of the wheels, the vehicle cluster +.>Uplink transmission time for transmitting local model parameters to the edge server:

（5）

edge server to vehicle clusterDownlink transmission time for transmitting global model parameters:

（6）

since the time for transmitting the parameters from the vehicles in the cluster to the leader node is short, the embodiment ignores the transmission time, so the edge server is the first Global aggregation of wheels, vehicle cluster->The communication time of the node is composed of uplink transmission time and downlink transmission time of each leader node:

（7）

(1) Training a vehicle model;

local areaLoss function: each VU trains a local model based on Shared Data Blocks (SDBs), wherein the VUFirst->The loss function of the wheel model is defined as:

（8）

wherein,representing the model at parameter->The following training set sample->Its corresponding label->Is a loss function of->And->For the number of local training history experience Data Blocks (DB) and VU samples after partitioning SDB。

Vehicle with a vehicle body having a vehicle body supportReceiving global model parameters->After that, execute->Iterating for local parameter updates:

（9）

wherein,for learning rate, for vehicle->Is in the parameters->The gradient calculated below.

Vehicle with a vehicle body having a vehicle body supportThe local calculation time and energy consumption of the federal study are as follows:

（10）

wherein,for sample->Floating point operands required, each VU +.>Is +.>Every turn (or) go(s)>For CPU frequency, ++>Is the computational power of the VU.

(2) Cluster-in-mold polymerization;

vehicle clusterModel parameters for the internal vehicle to be trained>Transferring to a leader node for cluster polymerization to obtain a vehicle cluster +.>Model parameters of (2):

（11）

to determine the convergence of the global model, the gradient of each vehicle cluster needs to be uploaded to an edge server. Thus, the vehicle cluster The gradient polymerization of (2) is:

（12）

vehicle clusterThe local calculation time of the federal learning is as follows:

（13）

(3) Global model aggregation;

transmitting the data to an edge server for final aggregation to obtain global model parameters:

（14）

wherein,is->All at round global aggregationTotal number of samples of participating vehicle nodes.

Global gradient aggregation is:

（15）

each vehicle cluster participating server dynamic semi-asynchronous aggregation matrix is defined as:

（16）/>

wherein,is a binary variable +.>Representing vehicle cluster->Do not participate in->The study and aggregation of the round of federal mode,representing vehicle cluster->Participation in->And (5) performing federal learning aggregation.

Assume that the number of local rounds per round isFirst->The difference between the start time of the global aggregation of the wheels and the end time of the local training is the vehicle cluster +.>Is expressed as:

（17）

wherein,for vehicle cluster->In->The round of global aggregation differs from the last round of global aggregation by several times of global aggregation, and each round of global aggregation selects 2 vehicle clusters to perform global aggregation, referring to fig. 5, namely:

；

in the first placeIn the global aggregation of the wheel edge servers, the vehicle cluster +.>Is the sum of the local computation time and the communication time, expressed as:

（18）

for parameter server NoThe round global aggregation time is +. >The longest time spent in a cluster of vehicles with wheels participating in the aggregation is denoted as:

（19）

the goal of federal learning is to train a machine learning model through the cooperation of a vehicle cluster to obtain the appropriate global model parameters, but the high mobility of the vehicle makes the time of the vehicle in the edge server limited, and the real-time service request of the internet of vehicles system requires the learning algorithm to converge quickly. The embodiment provides dynamic semi-asynchronous federal learning of vehicles to reduce training time and resource cost of federal learning while maintaining learning accuracy. The objective function P1 is designed according to the above-mentioned internet of vehicles clustered federal learning network model:

（20）/>

wherein the objective function P1 is a dynamic semi-asynchronous aggregate number matrixAnd clustering strategy->Under the condition that constraint conditions are met simultaneously, the federal learning training time is minimized, and as many vehicle clusters as possible participate in training. Constraint->Is a stop condition in too long an edge iteration, wherein +.>When->The time model will get an accurate solution +.>Meaning that the model has not progressed at all; />Representing the time constraint of each training round of federal learning, wherein +.>Representing a maximum acceptable global training time per round of federal learning; />Representing vehicle cluster- >First->The total time consumption of the global aggregation of the wheels does not exceed the shortest residence time of the vehicles in the cluster; />Representation->Is a binary variable +.>Representing vehicle cluster->Do not participate in->Round federal learning aggregation, ->Representing vehicle cluster->Participation in->Performing round federal learning aggregation; />Indicating that the sum of all super parameters is 1.

1. For equation (20), a clustering strategy is designed；

Aiming at the performances of cluster stability, reliability, high efficiency and the like, a cluster algorithm mechanism of DBSCAN is utilized, speed and residence time constraint is added to ensure that vehicle users in the cluster do not leave the range of the cluster in the coverage area of an edge server, the same distribution of the vehicle data in the cluster is ensured by calculating cosine similarity between gradient updates of the vehicle model by utilizing the Non-IID characteristics between the vehicle data, and finally, a stable leader node is selected by latest residence time and a secondary leader node is introduced to improve the stability of the model.

Mainly determined by four parameters:-neighborhood (i.e. first neighborhood vehicle as described above), density threshold +.>Vehicle stay +.>Constant speed->Vehicle initial model parameter set, similarity threshold +.>To determine, wherein->Andis a system-determined hyper-parameter. For vehicle- >It->-the neighborhood vehicle node is a vehicle set +.>All of which are->Distance is not more than->Is a vehicle of (1), namely:

（21）

determining the set by equation (2)The residence time between the vehicle users and comparing them to determineMinimum residence time of (2):

（22）

in a car networking system, computingThe middle node is associated with the vehicle after the shortest stay time +.>The distance of (2) is also satisfied at->The set of nodes of the neighborhood is denoted->A + -neighborhood (i.e. second neighborhood vehicle), i.e.:

（23）

if the vehicle isAt->The + -neighborhood contains at least +.>The other vehicles, namely:

（24）

establishing clustersVehicle +.>And->Added to the cluster, will->All nodes join the candidate set +.>Sequentially checkingMiddle vehicle node->Whether or not to meet->The + -neighborhood contains +.>Other vehicles, if->If the cluster is not added, adding the cluster +.>In the candidate set->Delete->At the same time it->Vehicle node in the + -neighborhood +.>Joining candidate set->Up to->. Checking the nodes of the vehicles which do not form clusters, if the number of other vehicles is not less than +.>(including vehicles), then a cluster and candidate set are created. Density threshold according to the system characteristics>=1。

Randomly selecting clustersIs>For reference, calculate cluster->Cosine similarity among the parameters of the vehicle models in the vehicle model, namely:

（25）

Judgment clusterWhether the user data distribution of the internal vehicle is similar, if +.>Indicating that the similarity is not high, thenRemove Cluster->Is +.>Sub-clusters, wherein->Similarity threshold value->The closer to 1 means the more similar +.>Closer to-1 means less similar.

(2) Solving the semi-asynchronous aggregation matrix for equation (20)；

After fixing the clustering strategy by the method described above, equation (20) is redetermined to solve the semi-asynchronous aggregation matrixIs to:

（26）

selecting more vehicle clusters to participate in federal learning can accelerate the convergence rate of the global model. However, selecting more clusters of vehicles in each federal learning global aggregation would likely result in parameter server number oneRound-robin global aggregate time delay. Therefore, it is necessary to consider the parameter server +.>And the global aggregation time is rolled, the number of clients is maximized, and the balance between the client and the client is achieved. Finally, according to P2 +.>The wheel problem is expressed as the formula:

（27）/>

since each round of globally aggregated computing resources and communication environments has markov properties, the optimization problem formula P3 is modeled as MDP, i.e., MDP < S, a, P, r > is defined as follows:

s is a state space containing:

vehicle clusterIn->Round global aggregation data set size for training +. >Vehicle cluster->In->Model precision achieved by round global aggregation +.>Vehicle cluster->In->Shortest residence time of round global aggregation +.>。

A is an action space containing:

ESin->The dynamic semi-asynchronous matrix of the vehicle cluster selected by the global aggregation of the wheels is +.>R is a bonus function, as shown in equation (28):

（28）

the optimization problem described above is an NP-hard problem. P is a state transition probability, and since it is difficult to predict the state transition probability, the MDP problem is handled by using the model-free deep reinforcement learning algorithm TD 3. The TD3 algorithm avoids the limitations of traditional heuristic algorithms in solving the scheduling problem. It does not require the researcher to make specific decision making processes and objective functions. Instead, it can train the initial network into a desired network with action space, state space, rewarding functions and some variable constraints, requiring less number of samples and faster computation when dealing with high-dimensional action space problems.

The TD3 algorithm adopted in this embodiment has six network structures, as shown in fig. 6. The algorithm adopts two key networks with the same structure to calculate the Q value, and selects a smaller value as an updating target. The TD3 algorithm adopts a delay updating method in strategy, the main network updates the target network after x times of updating, the updating times of the main network can be set by people so as to reduce error accumulation and variance, and random noise is added in the motion estimation of the target network, so that the updating of the value function is smooth.

In Critic networks, adjustments、/>Minimizing the mean square error loss function, optimizing the objective function as the formula:

（29）

wherein the method comprises the steps ofIs a discount factor->、/>And->Random parameters of Critic and Actor online networks, respectively.

The optimization objective of the TD3 algorithm is as follows:

（30）

the network error is represented by a TD residual. Under the current environment, the output of the Actor network is calculated through softmax, and a specific action value is selected to obtain the maximum expected value of the rewards by using a deterministic strategy [ mu ]. And finally, performing iterative updating by adopting a batch training method of deep learning.

The embodiment provides a vehicle networking clustering type federal learning network model and a training method thereof, which comprehensively consider the high mobility, the computing power and the Non-independent co-distribution (Non-IID) among vehicle-mounted data of a vehicle, construct a distributed training network for local training of the vehicle and global aggregation of a server under federal learning, support model convergence as fast as possible while meeting the requirements of vehicle federal learning service quality, construct the problem of minimizing convergence time of global model aggregation, and solve the model by adopting a TD3 reinforcement learning algorithm. Because of the problem of a user who falls behind and the problem of high mobility of vehicles, and considering the characteristics of similar data acquired by vehicles in space-time, and the like, a vehicle cluster clustering strategy is provided by using non-IID among vehicle-mounted data, information such as parameter similarity, speed, residence time and the like of a vehicle model is introduced as constraint, the problem of the user who falls behind is efficiently solved, the training of a local model is accelerated, a dynamic semi-asynchronous federal aggregation method is provided on the basis of determining a vehicle clustering method, and resources and communication cost are further reduced by adjusting the waiting time of a server.

Referring to fig. 7, in one embodiment of the present application, there is provided a federal learning device applied to the internet of vehicles, and applicable to an internet of vehicles clustered federal learning network model, where the internet of vehicles clustered federal learning network model includes: a server and a plurality of vehicles located within a coverage area of the server; the method comprises the steps that a plurality of vehicles are divided into a plurality of vehicle clusters, wherein each vehicle cluster at least comprises two vehicles, and at least one vehicle serves as a leading node; in each round of training of the global model, the server is used for receiving the vehicle cluster model parameters of the current wheel sent by part of vehicle clusters participating in the current wheel training, and aggregating the vehicle cluster model parameters of the current wheel to obtain the global model parameters of the next round; the leading node of the vehicle cluster participating in the current wheel training is used for receiving the local model parameters of the current wheel sent by all vehicles in the vehicle cluster, carrying out aggregation to obtain the vehicle cluster model parameters of the current wheel, and uploading the vehicle cluster model parameters to a server; the vehicles of the vehicle cluster participating in the current wheel training are used for training the local model according to the global model parameters of the current wheel issued by the server, so as to obtain the local model parameters of the current wheel after the training is completed, and the local model parameters are sent to the corresponding leader nodes;

The federal learning device 1000 applied to the internet of vehicles includes:

the function construction unit 1100 is configured to construct an objective function that minimizes federal learning training time under a strategy that the number of vehicle clusters that participate in each round of training of the global model is not fixed; the training time of each round of training of the global model at least comprises the local calculation time of the vehicle cluster and the communication time between the vehicle cluster and the server.

The function solving unit 1200 is configured to solve an objective function, obtain a solution result, and execute global model federation training according to the solution result.

It should be noted that, the federal learning method device applied to the internet of vehicles and the federal learning method embodiment applied to the internet of vehicles provided in this embodiment are based on the same inventive concept, so that the relevant content of the federal learning method embodiment applied to the internet of vehicles is also applicable to the federal learning method device embodiment applied to the internet of vehicles, and is not described in detail herein.

As shown in fig. 8, the embodiment of the present application further provides an electronic device, where the electronic device includes:

at least one memory;

at least one processor;

at least one program;

the program is stored in the memory, and the processor executes at least one program to implement the federal learning method of the present disclosure for application to the internet of vehicles.

The electronic device can be any intelligent terminal including a mobile phone, a tablet personal computer, a personal digital assistant (Personal Digital Assistant, PDA), a vehicle-mounted computer and the like.

The electronic device according to the embodiment of the present application is described in detail below.

Processor 1600, which may be implemented by a general-purpose central processing unit (Central Processing Unit, CPU), microprocessor, application specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, etc., is configured to execute related programs to implement the technical solutions provided by the embodiments of the present invention;

the Memory 1700 may be implemented in the form of Read Only Memory (ROM), static storage, dynamic storage, or random access Memory (Random Access Memory, RAM). Memory 1700 may store an operating system and other application programs, program code that when implementing the techniques provided by the embodiments of the present specification by software or firmware is stored in memory 1700 and invoked by processor 1600 to perform the federal learning method of the embodiments of the present invention for application to the internet of vehicles.

An input/output interface 1800 for implementing information input and output;

The communication interface 1900 is used for realizing communication interaction between the device and other devices, and can realize communication in a wired manner (such as USB, network cable, etc.), or can realize communication in a wireless manner (such as mobile network, WIFI, bluetooth, etc.);

bus 2000, which transfers information between the various components of the device (e.g., processor 1600, memory 1700, input/output interface 1800, and communication interface 1900);

wherein processor 1600, memory 1700, input/output interface 1800, and communication interface 1900 enable communication connections within the device between each other via bus 2000.

The embodiment of the invention also provides a storage medium which is a computer readable storage medium, wherein the computer readable storage medium stores computer executable instructions for causing a computer to execute the federal learning method applied to the internet of vehicles.

The memory, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. In addition, the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory remotely located relative to the processor, the remote memory being connectable to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The embodiments described in the present invention are for more clearly describing the technical solutions of the embodiments of the present invention, and do not constitute a limitation on the technical solutions provided by the embodiments of the present invention, and those skilled in the art can know that, with the evolution of technology and the appearance of new application scenarios, the technical solutions provided by the embodiments of the present invention are applicable to similar technical problems.

It will be appreciated by persons skilled in the art that the embodiments of the invention are not limited by the illustrations, and that more or fewer steps than those shown may be included, or certain steps may be combined, or different steps may be included.

The above described apparatus embodiments are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, i.e. may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

Those of ordinary skill in the art will appreciate that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof.

The terms "first," "second," "third," "fourth," and the like in the description of the present application and in the above-described figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that embodiments of the present application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

It should be understood that in this application, "at least one" means one or more, and "a plurality" means two or more. "and/or" for describing the association relationship of the association object, the representation may have three relationships, for example, "a and/or B" may represent: only a, only B and both a and B are present, wherein a, B may be singular or plural. The character "/" generally indicates that the context-dependent object is an "or" relationship. "at least one of" or the like means any combination of these items, including any combination of single item(s) or plural items(s). For example, at least one (one) of a, b or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", wherein a, b, c may be single or plural.

In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of elements is merely a logical functional division, and there may be additional divisions of actual implementation, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.

The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.

The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be embodied in essence or a part contributing to the prior art or all or part of the technical solution, in the form of a software product stored in a storage medium, including multiple instructions for causing an electronic device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods of the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, an optical disk, or other various media capable of storing a program.

While the preferred embodiments of the present application have been described in detail, the embodiments are not limited to the above-described embodiments, and various equivalent modifications and substitutions can be made by those skilled in the art without departing from the spirit of the embodiments, and these equivalent modifications and substitutions are intended to be included in the scope of the embodiments of the present application as defined in the appended claims.

Claims

1. The federal learning method applied to the internet of vehicles is characterized by being applicable to an internet of vehicles clustered federal learning network model, and the internet of vehicles clustered federal learning network model comprises: a server and a plurality of vehicles located within a coverage area of the server; the vehicles are divided into a plurality of vehicle clusters, and each vehicle cluster has at least one vehicle as a leading node; the server is used for receiving and aggregating the vehicle cluster model parameters of the current wheel sent by the vehicle cluster participating in the training of the current wheel to obtain the global model parameters of the next wheel; the leading node of the vehicle cluster participating in the current wheel training is used for receiving local model parameters of the current wheel sent by all vehicles in the vehicle cluster, carrying out aggregation to obtain the vehicle cluster model parameters of the current wheel, and uploading the vehicle cluster model parameters to the server; the vehicles of the vehicle cluster participating in the current wheel training are used for training a local model according to the global model parameters of the current wheel issued by the server, so as to obtain the local model parameters of the current wheel, and the local model parameters are sent to the leading node;

2. The federal learning method for use in internet of vehicles according to claim 1, wherein the plurality of vehicles are divided into a plurality of vehicle clusters, comprising:

building the vehicle cluster:

and so on until the initial cluster is formed into a complete vehicle cluster.

3. The federal learning method for use in the internet of vehicles according to claim 2, wherein after forming a complete vehicle cluster, the federal learning method for use in the internet of vehicles further comprises:

selecting one vehicle from the vehicle cluster as a reference vehicle;

4. The federal learning method for use in the internet of vehicles according to claim 2, wherein the objective function is:

；

wherein the constraint conditionsRepresenting a stop condition in too long an edge iteration, +.>When->The global model will get an exact solution when +.>At the time, the global model does not evolve, +. >Indicate->Gradient of global aggregation round,/>Indicate->Gradient of global aggregation of the wheel; constraint->Time constraint representing each training round of federal learning, < +.>Maximum acceptable global training time per round representing federal learning,/>Representing coupletWest of bang study>Global training time of the wheel; constraint->Representing vehicle cluster->First->Total time consumption of round global aggregation +.>No more than vehicle cluster->Minimum stay time of the inner vehicle +.>，/>Indicating vehicle->Stay time at the server; constraint->Representation->Is a binary variable +.>Representing vehicle cluster->Do not participate in->Round federal learning aggregation, ->Representing vehicle cluster->Participation in->Round federal learning aggregation, ->Is a semi-asynchronous aggregation matrix; constraint->Representing hyper-parameters->And->The sum is 1; />A partitioning strategy representing the plurality of vehicles into a plurality of vehicle clusters;

；

5. The federal learning method for use in internet of vehicles according to claim 4, wherein the solving the objective function comprises:

converting the objective function into a markov decision process;

and adopting a TD3 algorithm to solve the Markov decision process.

6. The federal learning method for use in internet of vehicles according to claim 2, wherein the leader node is a vehicle having the longest residence time within the coverage of the server.

7. The federal learning method for use in internet of vehicles according to claim 6, wherein if the leader node does not respond, a vehicle having a residence time inferior to the longest residence time is selected as a new leader node.

8. Be applied to car networking's federal learning device, its characterized in that is applicable to car networking cluster formula federal learning network model, car networking cluster formula federal learning network model includes: a server and a plurality of vehicles located within a coverage area of the server; the vehicles are divided into a plurality of vehicle clusters, and each vehicle cluster has at least one vehicle as a leading node; the server is used for receiving and aggregating the vehicle cluster model parameters of the current wheel sent by the vehicle cluster participating in the training of the current wheel to obtain the global model parameters of the next wheel; the leading node of the vehicle cluster participating in the current wheel training is used for receiving local model parameters of the current wheel sent by all vehicles in the vehicle cluster, carrying out aggregation to obtain the vehicle cluster model parameters of the current wheel, and uploading the vehicle cluster model parameters to the server; the vehicles of the vehicle cluster participating in the current wheel training are used for training a local model according to the global model parameters of the current wheel issued by the server, so as to obtain the local model parameters of the current wheel, and the local model parameters are sent to the leading node;

The federal learning device applied to the internet of vehicles comprises:

9. An electronic device, comprising: at least one control processor and a memory for communication connection with the at least one control processor; the memory stores instructions executable by the at least one control processor to enable the at least one control processor to perform the federal learning method for use in the internet of vehicles of any one of claims 1 to 7.

10. A computer-readable storage medium storing computer-executable instructions for causing a computer to perform the federal learning method for use in the internet of vehicles according to any one of claims 1 to 7.