WO2021227069A1 - Model update method and apparatus, and communication device - Google Patents

Model update method and apparatus, and communication device

Info

Publication number
WO2021227069A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
global model
update information
global
information
Prior art date
Application number
PCT/CN2020/090663
Other languages
English (en)
French (fr)
Other versions
WO2021227069A9 (zh)
Inventor
田文强
Original Assignee
Oppo广东移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司 filed Critical Oppo广东移动通信有限公司
Priority to CN202080100094.3A (CN115427969A)
Priority to PCT/CN2020/090663 (WO2021227069A1)
Publication of WO2021227069A1
Publication of WO2021227069A9

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology

Definitions

  • the embodiments of the present application relate to the field of mobile communication technology, and specifically relate to a model update method and device, and communication equipment.
  • the embodiments of the present application provide a model update method and device, and communication equipment.
  • the master node receives the first model update information sent by the child node, where the first model update information is model update information of the global model expected by the child node relative to the first global model;
  • the master node updates the second global model according to the first model update information and the first global model to obtain a third global model.
  • the model update device provided by the embodiment of the present application is applied to a master node, and the device includes:
  • a receiving unit configured to receive first model update information sent by a child node, where the first model update information is model update information of the global model expected by the child node relative to the first global model;
  • the update unit is configured to update the second global model according to the first model update information and the first global model to obtain a third global model.
  • the communication device provided by the embodiment of the present application includes a processor and a memory.
  • the memory is used to store a computer program
  • the processor is used to call and run the computer program stored in the memory to execute the above-mentioned model update method.
  • the chip provided in the embodiment of the present application is used to implement the above-mentioned model update method.
  • the chip includes: a processor, configured to call and run a computer program from the memory, so that the device installed with the chip executes the above-mentioned model update method.
  • the computer-readable storage medium provided by the embodiment of the present application is used to store a computer program, and the computer program causes a computer to execute the above-mentioned model update method.
  • the computer program product provided by the embodiment of the present application includes computer program instructions, and the computer program instructions cause a computer to execute the above-mentioned model update method.
  • the computer program provided in the embodiments of the present application, when running on a computer, causes the computer to execute the above-mentioned model update method.
  • for a child node that provides model update information, the child node participates in the update of the global model;
  • for a child node that does not provide model update information, the child node does not participate in the update of the global model.
  • After the master node receives the first model update information sent by a child node, it updates the current global model (i.e., the second global model) based on the first model update information to obtain the updated global model (i.e., the third global model). In this way, the update of the global model in the federated learning process can, to a certain extent, avoid being limited by the restricted transmission of model update information from some child nodes.
  • the first model update information from the child node acquired by the master node is the model update information of the global model expected by the child node relative to the first global model.
  • the first global model is the historical global model of the master node.
  • the master node updates the current global model (that is, the second global model) in combination with the first model update information and the historical global model. Since the influence of the historical global model involved in the model update is taken into account, the performance degradation that occurs when the global model is updated asynchronously is avoided.
  • FIG. 1 is a schematic diagram of a communication system architecture provided by an embodiment of the present application.
  • Figure 2(a) is a schematic diagram of the neural network training phase provided by an embodiment of the present application.
  • Figure 2(b) is a schematic diagram of the neural network inference phase provided by an embodiment of the present application.
  • Figure 3 is a schematic diagram of federated learning provided by an embodiment of the present application.
  • FIG. 4 is a flowchart of the basic scheme provided by an embodiment of the present application.
  • FIG. 5 is a first schematic flowchart of a model update method provided by an embodiment of the application.
  • FIG. 6 is a second schematic flowchart of the model update method provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of the transmission of model reference version information provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of the structural composition of a model update device provided by an embodiment of the application.
  • FIG. 9 is a schematic structural diagram of a communication device provided by an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of a chip of an embodiment of the present application.
  • FIG. 11 is a schematic block diagram of a communication system provided by an embodiment of the present application.
  • the technical solutions of the embodiments of the present application can be applied to various communication systems, such as a Long Term Evolution (LTE) system, an LTE Frequency Division Duplex (FDD) system, an LTE Time Division Duplex (TDD) system, a 5G communication system, or a future communication system.
  • the communication system 100 applied in the embodiment of the present application is shown in FIG. 1.
  • the communication system 100 may include a network device 110, and the network device 110 may be a device that communicates with a terminal 120 (or called a communication terminal or terminal).
  • the network device 110 may provide communication coverage for a specific geographic area, and may communicate with terminals located in the coverage area.
  • the network device 110 may be an evolved base station (Evolutional Node B, eNB, or eNodeB) in an LTE system, or a wireless controller in a cloud radio access network (Cloud Radio Access Network, CRAN), or
  • the network equipment can be a mobile switching center, a relay station, an access point, an in-vehicle device, a wearable device, a hub, a switch, a bridge, a router, a network side device in a 5G network, or a network device in a future communication system, etc.
  • the communication system 100 also includes at least one terminal 120 located within the coverage area of the network device 110.
  • the "terminal” used here includes, but is not limited to, connection via wired lines, such as via Public Switched Telephone Networks (PSTN), Digital Subscriber Line (DSL), digital cable, and direct cable connection; And/or another data connection/network; and/or via a wireless interface, such as for cellular networks, wireless local area networks (WLAN), digital TV networks such as DVB-H networks, satellite networks, AM-FM A broadcast transmitter; and/or a device of another terminal configured to receive/send communication signals; and/or an Internet of Things (IoT) device.
  • PSTN Public Switched Telephone Networks
  • DSL Digital Subscriber Line
  • WLAN wireless local area networks
  • TV networks such as DVB-H networks
  • satellite networks such as DVB-H networks
  • AM-FM A broadcast transmitter AM-FM A broadcast transmitter
  • IoT Internet of Things
  • a terminal set to communicate through a wireless interface may be referred to as a "wireless communication terminal", a “wireless terminal” or a “mobile terminal”.
  • mobile terminals include, but are not limited to, satellite or cellular phones; Personal Communications System (PCS) terminals that can combine a cellular radio phone with data processing, fax, and data communication capabilities; PDAs that can include a radio phone, a pager, Internet/intranet access, a web browser, a memo pad, a calendar, and/or a Global Positioning System (GPS) receiver; and conventional laptop and/or palmtop receivers or other electronic devices including a radio telephone transceiver.
  • Terminal can refer to an access terminal, user equipment (UE), a user unit, a user station, a mobile station, a mobile terminal, a remote station, a remote terminal, a mobile device, a user terminal, a terminal, a wireless communication device, a user agent, or a user apparatus.
  • the access terminal can be a cellular phone, a cordless phone, a Session Initiation Protocol (SIP) phone, a Wireless Local Loop (WLL) station, a Personal Digital Assistant (PDA), a handheld device with wireless communication capability, a computing device or other processing device connected to a wireless modem, an in-vehicle device, a wearable device, a terminal in a 5G network, a terminal in a future evolved PLMN, etc.
  • direct Device to Device (D2D) communication may be performed between the terminals 120.
  • the 5G communication system or 5G network may also be referred to as a New Radio (NR) system or NR network.
  • FIG. 1 exemplarily shows one network device and two terminals.
  • the communication system 100 may include multiple network devices, and the coverage of each network device may include other numbers of terminals, which is not limited in the embodiments of the present application.
  • the communication system 100 may also include other network entities such as a network controller and a mobility management entity, which is not limited in the embodiments of the present application.
  • the devices with communication functions in the network/system in the embodiments of the present application may be referred to as communication devices.
  • the communication device may include a network device 110 and a terminal 120 with communication functions, and the network device 110 and the terminal 120 may be the specific devices described above, which will not be repeated here; the communication device may also include other devices in the communication system 100, such as other network entities like a network controller and a mobility management entity, which is not limited in the embodiments of the present application.
  • the use of a neural network includes two phases: a training phase and an inference phase.
  • In the training phase, a large amount of data first needs to be obtained as a training set. The training set is then used as the input of the neural network to be trained (also called the model to be trained), and, based on a specific training algorithm, a large number of training and parameter iterations finally determine the network parameters of the neural network to be trained. This completes the training process of the neural network and yields a trained neural network (that is, the network parameters of the neural network have been optimized).
  • a neural network that recognizes puppies can be trained through a large number of pictures of puppies, as shown in Figure 2(a).
  • the neural network can be used for inference operations such as recognition, classification, and information recovery. This process is called the inference process of the neural network.
  • a trained neural network can be used to identify the puppy in the image, as shown in Figure 2(b).
  • when a user trains a neural network with a training set, (1) each update of the neural network parameters (that is, the network parameters of the neural network) may use all the training set data, or (2) each update of the neural network parameters may use a single training sample, or (3) each update of the neural network parameters may use a part of the training set, that is, a batch of data.
  • a method of updating the neural network parameters once per batch of data is usually used in neural network training, and the batch size can be configured as a hyperparameter.
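  • As a minimal illustration of the batch-based update just described (a sketch, not part of the original disclosure; `grad_fn`, the learning rate `lr`, and the array-style parameters are illustrative assumptions):

```python
def train_one_epoch(params, train_set, batch_size, lr, grad_fn):
    """One pass over the training set, updating the network parameters
    once per batch; `batch_size` is the hyperparameter mentioned above.
    `grad_fn(params, batch)` stands in for the gradient of the loss."""
    for start in range(0, len(train_set), batch_size):
        batch = train_set[start:start + batch_size]
        params = params - lr * grad_fn(params, batch)  # one update per batch
    return params
```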
  • the "local neural network" in the embodiments of the present application may also be referred to as the "local model", and the "global neural network" in the embodiments of the present application may also be referred to as the "global model".
  • the child nodes are not necessarily clock synchronized, so it is difficult to ensure that different child nodes feed back their respective model update information synchronously.
  • the data sources of different child nodes are different; because of this it is also difficult to ensure that different child nodes feed back their model update information synchronously. For example, the local data of child node 1 is obtained once per minute, and the local data of child node 2 is obtained once per second. When both child nodes update their local models once every 100 local data samples, the frequencies at which they feed back model update information are obviously different.
  • the master node can update the global model according to the latest arrival time of the model update information. For example, if child node 1 transmits model update information once per minute and child node 2 transmits model update information once per second, the master node also updates the global model once per minute. As another example, if, due to a poor transmission channel or problems with the local data input, one transmission from child node 1 is delayed by 0.5 minutes beyond the agreed once-per-minute schedule, the master node also needs to wait an extra 0.5 minutes before it can update the global model.
  • the master node updates the global model according to a pre-agreed time period. For example, it is agreed that the global model is updated every 10 seconds. If child node 1 transmits model update information to the master node within the 10 seconds, the model update information provided by child node 1 participates in the update of the global model; if child node 1 does not transmit model update information to the master node within the 10 seconds, child node 1 does not participate in this update of the global model.
  • This processing method can, to a certain extent, prevent the training of the federated-learning global model from being limited by the restricted transmission of model update information from some child nodes.
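  • A minimal sketch of this periodic-window aggregation, assuming additive update information and equal weighting of the reporting child nodes (both assumptions are for illustration only; the combination rule is detailed in the embodiments below):

```python
def aggregate_window(global_model, window_updates, step=1.0):
    """Basic scheme: at the end of each agreed period (e.g. every 10 s),
    fold in whatever model update information arrived in that window.
    Child nodes that did not report in time simply skip this round.
    `window_updates` maps child-node id -> update information (delta)."""
    if not window_updates:
        return global_model              # nobody reported: model unchanged
    weight = 1.0 / len(window_updates)   # assumed equal weight per reporter
    for delta in window_updates.values():
        global_model = global_model + step * weight * delta
    return global_model
```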
  • the above-mentioned improved method can be referred to as the basic scheme. However, it should be pointed out that the basic scheme here has a problem with the timeliness of model update information. For example:
  • At time t0, child node 1 and child node 2 simultaneously update their respective local models to N_0 according to the global model N_0 at time t0. Here, N_x represents the global model at time tx.
  • At time t1, child node 2 determines that the model update information of its local model relative to N_0 is ΔN_{1,0,2}, and transmits this model update information to the master node.
  • the information corresponding to ΔN_{1,0,2} can be the model update information between the global model N_1 at time t1 expected by child node 2 and the global model N_0 at time t0; for example, the model update information can be identified through gradient information or other means.
  • the reason for the aforementioned model update information requirement is that there is a difference between the global model N_1 at time t1 expected by child node 2 and the global model N_0 at time t0.
  • At time t1, the master node updates the global model N_0 at time t0 to the global model N_1 at time t1 according to the model update information ΔN_{1,0,2} provided by child node 2.
  • At time t2, child node 1 determines that the model update information of its local model relative to N_0 is ΔN_{2,0,1}, and transmits this model update information to the master node.
  • the information corresponding to ΔN_{2,0,1} can be the model update information between the global model N_2 at time t2 expected by child node 1 and the global model N_0 at time t0; for example, the model update information can be identified through gradient information or other means.
  • the reason for the aforementioned update information requirement is that there is a difference between the global model N_2 at time t2 expected by child node 1 and the global model N_0 at time t0.
  • the master node updates the global model N_1 at time t1 to the global model N_2' at time t2 according to the model update information ΔN_{2,0,1} provided by child node 1.
  • ΔN_{2,0,1} is determined by child node 1 according to N_0 and N_2, which means that what child node 1 expects is to update the global model from N_0, through ΔN_{2,0,1}, to the N_2 expected by child node 1.
  • N_2 can also be written as N_{2,0,1}.
  • N_{2,0,1} corresponds to the global model at time t2 expected by child node 1 based on the global model at time t0.
  • After the master node receives ΔN_{2,0,1}, since the master node has already updated the global model to N_1 at this point, it is problematic to directly use ΔN_{2,0,1} to update N_1 to N_2', because the global model at the master node has already changed from N_0 to N_1. Analysis shows that the problem with directly using ΔN_{2,0,1} to update N_1 to N_2' is mainly that the update information ΔN_{2,0,1} is based on N_0 rather than on N_1. Based on this, the embodiments of the present application improve the above-mentioned basic scheme and propose a more optimized global model update scheme, which is described in detail below.
  • FIG. 5 is a schematic flowchart of a model update method provided by an embodiment of the application. As shown in FIG. 5, the model update method includes the following steps:
  • Step 501 The master node receives first model update information sent by a child node, where the first model update information is model update information of the global model expected by the child node relative to the first global model.
  • the master node receives the first model update information sent by a child node.
  • the master node receives the model update information 1 sent by the child node 1, and the model update information 1 is the model update information of the global model expected by the child node 1 relative to the first global model.
  • the master node receives multiple first model update information sent by multiple child nodes (such as two or more child nodes). For example, the master node receives model update information 1 sent by child node 1 and receives model update information 2 sent by child node 2.
  • the model update information 1 is the model update information of the global model expected by the child node 1 relative to the first global model
  • the model update information 2 is the model update information of the global model expected by the child node 2 relative to the first global model.
  • the first global model is a historical global model.
  • the first global model is a global model most recently adopted by the child node, or a local model most recently updated by the first node.
  • Step 502 The master node updates the second global model according to the first model update information and the first global model to obtain a third global model.
  • the second global model is the current global model
  • the master node needs to update the current global model according to the first model update information and the first global model to obtain the updated global model (i.e., the third global model).
  • the master node may update the second global model in any of the following ways.
  • 1-1) The master node determines a fourth global model according to the first global model and the first model update information, where the fourth global model is the global model expected by the child node; 1-2) the master node updates the second global model according to the fourth global model to obtain the third global model.
  • the above 1-2) can be implemented in the following manner: a1) the master node determines second model update information according to the fourth global model and the second global model, where the second model update information is the model update information of the fourth global model relative to the second global model; b1) the master node updates the second global model according to the second model update information to obtain the third global model.
  • the above b1) may be implemented in the following manner: the master node multiplies the second model update information by the first parameter and/or the second parameter and adds the result to the second global model to obtain the third global model; wherein the first parameter represents the weight factor of the child node, and the second parameter represents the update step size.
  • the first global model is N_0 (that is, the global model at time t0)
  • the second global model is N_1 (that is, the global model at time t1)
  • the first model update information is ΔN_{2,0,1} (that is, the model update information of the global model at time t2 expected by child node 1 relative to the global model at time t0).
  • the master node receives the ΔN_{2,0,1} determined by child node 1 at time t2
  • the master node uses ΔN_{2,0,1}, N_0, and N_1 to determine the updated global model N_2 (that is, the third global model).
  • the specific process of the master node using ΔN_{2,0,1}, N_0, and N_1 to update the global model is as follows:
  • the master node determines N_{2,0,1} (that is, the global model at time t2 expected by child node 1 based on the global model at time t0) through N_0 and ΔN_{2,0,1}.
  • the master node determines that the difference between N_{2,0,1} and N_1 is ΔN_{2,1,1} (that is, the model update information of the global model at time t2 expected by child node 1 relative to the global model at time t1).
  • the master node updates the global model to N_2 = N_1 + η·P_1·ΔN_{2,1,1}.
  • η is the update step size.
  • P_1 is the weight factor of child node 1; for example, P_1 is equal to 1/K.
  • K is the number of child nodes participating in federated learning.
  • In general, N_i = N_j + η·Σ_k (P_k·ΔN_{i,j,k}).
  • k indexes the child nodes participating in federated learning.
  • each child node can use a different weight factor. For example, when different child nodes are of different importance to the federated learning (for example, the importance of the sample data used in training differs), the weight factor of each child node differs.
  • ΔN_{i,j,k} is the model update information of the global model at time ti expected by child node k relative to the global model at time tj (i.e., N_j). It can also be said that ΔN_{i,j,k} is the update, expected by child node k at time ti, of the global model at time tj.
  • η is the update step size of the global model.
  • N_x is the global model at time tx; for example, N_i is the global model at time ti, and N_j is the global model at time tj.
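  • Read additively, the rule N_i = N_j + η·Σ_k (P_k·ΔN_{i,j,k}) can be sketched as follows (an illustrative Python rendering under the assumption that models and update information are parameter arrays; the function and variable names are ours, not the patent's):

```python
def update_global(N_j, deltas, weights, eta):
    """Apply N_i = N_j + eta * sum_k(P_k * dN_{i,j,k}).

    N_j     -- global model at time tj (e.g. a numpy array of parameters)
    deltas  -- {k: dN_{i,j,k}}: update of the global model at ti expected by
               child node k, expressed relative to the global model at tj
    weights -- {k: P_k}: weight factor of child node k (e.g. 1/K)
    eta     -- update step size of the global model
    """
    step = sum(weights[k] * deltas[k] for k in deltas)
    return N_j + eta * step
```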
  • ΔN_{i,j,k} can be determined in the following way:
  • N_p is the global model at time tp.
  • ΔN_{i,p,k} is the model update information of the global model at time ti expected by child node k relative to the global model at time tp.
  • N_{i,p,k} is the global model at time ti expected by child node k based on the global model at time tp.
  • N_j is the global model at time tj.
  • ΔN_{i,j,k} is the model update information of the global model at time ti expected by child node k relative to the global model at time tj.
  • the child node k can determine ΔN_{i,j,k} based on N_{i,p,k} and N_j.
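  • Putting the first approach together for the worked example above (a hedged sketch assuming the update information is an additive delta, which is consistent with the subtraction used in the second approach later; the names are illustrative):

```python
def method1_update(N_0, N_1, dN_201, P_1, eta):
    """First approach, for child node 1 reporting at time t2:
    1) reconstruct the expected global model (fourth global model):
         N_{2,0,1} = N_0 + dN_{2,0,1}
    2) re-express it against the current global model (second model
       update information):
         dN_{2,1,1} = N_{2,0,1} - N_1
    3) weighted, stepped update (third global model):
         N_2 = N_1 + eta * P_1 * dN_{2,1,1}
    """
    N_201 = N_0 + dN_201
    dN_211 = N_201 - N_1
    return N_1 + eta * P_1 * dN_211
```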
  • 2-1) The master node determines third model update information according to the second global model and the first global model, where the third model update information is the model update information of the second global model relative to the first global model; 2-2) the master node updates the second global model according to the third model update information and the first model update information to obtain a third global model.
  • the above 2-2) can be implemented in the following manner: a2) the master node determines fourth model update information according to the first model update information and the third model update information, where the fourth model update information is the model update information of the global model expected by the child node relative to the second global model; b2) the master node updates the second global model according to the fourth model update information to obtain the third global model.
  • the above a2) may be implemented in the following manner: the master node subtracts the third model update information from the first model update information to obtain the fourth model update information.
  • the above b2) may be implemented in the following manner: the master node multiplies the fourth model update information by the first parameter and/or the second parameter and adds the result to the second global model to obtain the third global model; wherein the first parameter represents the weight factor of the child node, and the second parameter represents the update step size.
  • the first global model is N_0 (i.e., the global model at time t0)
  • the second global model is N_1 (i.e., the global model at time t1)
  • the first model update information is ΔN_{2,0,1} (that is, the model update information of the global model at time t2 expected by child node 1 relative to the global model at time t0).
  • the master node receives the ΔN_{2,0,1} determined by child node 1 at time t2
  • the master node uses ΔN_{2,0,1}, N_0, and N_1 to determine the updated global model N_2 (that is, the third global model).
  • the specific process of the master node using ΔN_{2,0,1}, N_0, and N_1 to update the global model is as follows:
  • the master node determines that the difference of N_1 relative to N_0 is ΔN_{1,0} (that is, the model update information of the global model at time t1 relative to the global model at time t0).
  • the master node determines ΔN_{2,1,1} (that is, the model update information of the global model at time t2 expected by child node 1 relative to the global model at time t1) through ΔN_{1,0} and ΔN_{2,0,1}.
  • the master node updates the global model to N_2 = N_1 + η·P_1·ΔN_{2,1,1}.
  • η is the update step size.
  • P_1 is the weight factor of child node 1; for example, P_1 is equal to 1/K.
  • K is the number of child nodes participating in federated learning.
  • In general, N_i = N_j + η·Σ_k (P_k·ΔN_{i,j,k}).
  • k indexes the child nodes participating in federated learning.
  • each child node can use a different weight factor. For example, when different child nodes are of different importance to the federated learning (for example, the importance of the sample data used in training differs), the weight factor of each child node differs.
  • ΔN_{i,j,k} is the model update information of the global model at time ti expected by child node k relative to the global model at time tj (i.e., N_j). It can also be said that ΔN_{i,j,k} is the update, expected by child node k at time ti, of the global model at time tj.
  • η is the update step size of the global model.
  • N_x is the global model at time tx; for example, N_i is the global model at time ti, and N_j is the global model at time tj.
  • ΔN_{i,j,k} can be determined in the following way:
  • N_j is the global model at time tj, and N_p is the global model at time tp.
  • ΔN_{j,p} is the model update information of the global model at time tj relative to the global model at time tp.
  • ΔN_{i,p,k} is the model update information of the global model at time ti expected by child node k relative to the global model at time tp.
  • ΔN_{i,j,k} is the model update information of the global model at time ti expected by child node k relative to the global model at time tj.
  • the child node k determines ΔN_{i,j,k} based on ΔN_{i,p,k} and ΔN_{j,p}.
  • For example, ΔN_{2,1,1} = ΔN_{2,0,1} − ΔN_{1,0}.
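  • The second approach works entirely in update-information space; for the same worked example it can be sketched as follows (under the same additive-delta assumption it coincides algebraically with the first approach):

```python
def method2_update(N_0, N_1, dN_201, P_1, eta):
    """Second approach, for child node 1 reporting at time t2:
    dN_{1,0}   = N_1 - N_0               (third model update information)
    dN_{2,1,1} = dN_{2,0,1} - dN_{1,0}   (fourth model update information)
    N_2        = N_1 + eta * P_1 * dN_{2,1,1}
    """
    dN_10 = N_1 - N_0
    dN_211 = dN_201 - dN_10
    return N_1 + eta * P_1 * dN_211
```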
  • when the master node updates the global model, the child nodes not only need to provide model update information, but also need to indicate the global model (i.e., the first global model) on which the model update information is based; in other words, a child node needs to feed back information on the model (i.e., the first global model) based on which it generated its model update information.
  • the above-mentioned information is referred to as model reference version information.
  • the technical solution of the embodiment of the present application further includes: the master node receives model reference version information, corresponding to the first model update information, sent by the child node, where the model reference version information is used to determine the first global model on which the first model update information is based.
  • the model reference version information includes at least one of the following: a version number of the global model, and a time sequence number corresponding to the global model.
  • the time sequence number corresponding to the global model refers to the time when the global model is generated (or updated) on the master node side or the time when the master node issues the global model.
  • the above model reference version information is transmitted from the child node to the master node.
  • when the master node is a network device (such as a base station) and the child node is a terminal device, the model reference version information is transmitted by the terminal device to the network device through at least one of the following: an application layer message, Non-Access Stratum (NAS) signaling, Radio Resource Control (RRC) signaling, a Media Access Control Control Element (MAC CE), Uplink Control Information (UCI), the Physical Uplink Shared Channel (PUSCH), and the Physical Uplink Control Channel (PUCCH).
  • when the master node is a first terminal device and the child node is a second terminal device, the model reference version information is transmitted by the second terminal device to the first terminal device through a PC5 interface message.
  • the PC5 interface message includes at least one of the following: the Physical Sidelink Shared Channel (PSSCH), the Physical Sidelink Control Channel (PSCCH), and Sidelink Control Information (SCI).
  • the master node transmits the global model at time t0 to child node 1 and child node 2 for use.
  • child node 2 feeds back model update information and model reference version information to the master node, and the master node uses the model update information and model reference version information to update the global model.
  • child node 1 feeds back model update information and model reference version information to the master node, and the master node uses the model update information and model reference version information to update the global model.
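  • One possible bookkeeping for this exchange is sketched below. The message layout (`ModelUpdateReport`) and the version-indexed history are illustrative assumptions; the patent only requires that the model reference version information lets the master node identify the first global model on which a delta is based:

```python
from dataclasses import dataclass

@dataclass
class ModelUpdateReport:
    """Hypothetical report from a child node (for illustration only)."""
    child_id: int
    ref_version: int   # model reference version information
    delta: object      # first model update information (dN), additive

class MasterNode:
    def __init__(self, initial_model, eta=1.0):
        self.history = {0: initial_model}  # version -> historical global model
        self.version = 0
        self.eta = eta

    def on_report(self, report, weight):
        """Look up the historical global model (first global model) that the
        child's delta was computed against, then apply the first approach."""
        N_ref = self.history[report.ref_version]   # first global model
        N_cur = self.history[self.version]         # second global model
        expected = N_ref + report.delta            # child's expected global model
        new_model = N_cur + self.eta * weight * (expected - N_cur)
        self.version += 1
        self.history[self.version] = new_model     # third global model
        return new_model
```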
  • the master node may configure whether the child node needs to transmit the above model reference version information. Accordingly, the technical solution of the embodiment of the present application further includes: the master node sends first configuration information to the child node, where the first configuration information is used to indicate whether the child node transmits model reference version information to the master node.
  • the above-mentioned first configuration information is transmitted from the master node to the child node.
  • when the master node is a network device (such as a base station) and the child node is a terminal device, the first configuration information is transmitted by the base station to the terminal device through at least one of the following: an application layer message, NAS signaling, a broadcast message, an RRC message, a MAC CE, Downlink Control Information (DCI), the Physical Downlink Shared Channel (PDSCH), and the Physical Downlink Control Channel (PDCCH).
  • when the master node is a first terminal device and the child node is a second terminal device, the first configuration information is transmitted by the first terminal device to the second terminal device through a PC5 interface message.
  • the PC5 interface message includes at least one of the following: PSSCH, PSCCH, and SCI.
  • the technical solution of the embodiment of the present application provides a method for model update in federated learning, which specifically includes: multiple child nodes participating in federated learning report their model update information asynchronously; the master node participating in federated learning updates the global model currently to be updated according to the received model update information, the characteristics of the global model on which the model update information is based, and the characteristics of the global model currently to be updated at the master node.
  • the information reported by a child node to the master node may also include the characteristics (for example, the version information) of the global model on which the model update information fed back by that child node is based.
  • FIG. 8 is a schematic diagram of the structural composition of the model updating device provided by an embodiment of the application, which is applied to the master node. As shown in FIG. 8, the model updating device includes:
  • the receiving unit 801 is configured to receive first model update information sent by a child node, where the first model update information is model update information of the global model expected by the child node relative to the first global model;
  • the update unit 802 is configured to update the second global model according to the first model update information and the first global model to obtain a third global model.
  • the update unit 802 is configured to determine a fourth global model according to the first global model and the first model update information, where the fourth global model is the child node The desired global model; the second global model is updated according to the fourth global model to obtain the third global model.
  • the update unit 802 is configured to determine second model update information according to the fourth global model and the second global model, where the second model update information is the fourth The model update information of the global model relative to the second global model; the second global model is updated according to the second model update information to obtain a third global model.
  • the update unit 802 is configured to multiply the second model update information by the first parameter and/or the second parameter and add the result to the second global model to obtain the third global model;
  • wherein the first parameter represents the weight factor of the child node, and the second parameter represents the update step size.
  • the update unit 802 is configured to determine third model update information according to the second global model and the first global model, where the third model update information is the model update information of the second global model relative to the first global model; and to update the second global model according to the third model update information and the first model update information to obtain a third global model.
  • the update unit 802 is configured to determine fourth model update information according to the first model update information and the third model update information, where the fourth model update information is the model update information of the global model expected by the child node relative to the second global model; and to update the second global model according to the fourth model update information to obtain the third global model.
  • the update unit 802 is configured to subtract the third model update information from the first model update information to obtain fourth model update information.
  • the update unit 802 is configured to multiply the fourth model update information by the first parameter and/or the second parameter and add the result to the second global model to obtain the third global model;
  • wherein the first parameter represents the weight factor of the child node, and the second parameter represents the update step size.
  • the first global model is a global model most recently adopted by the child node, or a local model most recently updated by the first node.
  • the receiving unit 801 is further configured to receive model reference version information, corresponding to the first model update information, sent by the child node, where the model reference version information is used to determine the first global model.
  • the model reference version information includes at least one of the following: a version number of the global model, and a time sequence number corresponding to the global model.
  • the master node is a network device
  • the child node is a terminal device
  • the model reference version information is transmitted by the terminal device to the network device through at least one of the following: application layer message, NAS signaling, RRC signaling, MAC CE, UCI, PUSCH, PUCCH.
  • when the master node is a first terminal device and the child node is a second terminal device, the model reference version information is transmitted by the second terminal device to the first terminal device through a PC5 interface message.
  • the device further includes:
  • the sending unit (not shown in the figure) is configured to send first configuration information to the child node, where the first configuration information is used to indicate whether the child node transmits model reference version information to the master node.
  • the master node is a network device
  • the child node is a terminal device
  • the first configuration information is transmitted by the base station to the terminal device through at least one of the following: application layer message, NAS signaling, broadcast message, RRC message, MAC CE, DCI, PDSCH, PDCCH.
  • when the master node is a first terminal device and the child node is a second terminal device, the first configuration information is transmitted by the first terminal device to the second terminal device through a PC5 interface message.
  • the PC5 interface message includes at least one of the following: PSSCH, PSCCH, and SCI.
  • FIG. 9 is a schematic structural diagram of a communication device 900 provided by an embodiment of the present application.
  • the communication device may be a master node, and the master node may be a terminal device or a network device (such as a base station).
  • the communication device 900 shown in FIG. 9 includes a processor 910, and the processor 910 can call and run a computer program from a memory to implement the method in the embodiments of the present application.
  • the communication device 900 may further include a memory 920.
  • the processor 910 can call and run a computer program from the memory 920 to implement the method in the embodiment of the present application.
  • the memory 920 may be a separate device independent of the processor 910, or may be integrated in the processor 910.
  • the communication device 900 may further include a transceiver 930, and the processor 910 may control the transceiver 930 to communicate with other devices; specifically, it may send information or data to other devices, or receive information or data sent by other devices.
  • the transceiver 930 may include a transmitter and a receiver.
  • the transceiver 930 may further include an antenna, and the number of antennas may be one or more.
  • the communication device 900 may specifically be a network device of an embodiment of the present application, and the communication device 900 may implement the corresponding process implemented by the network device in each method of the embodiments of the present application, which will not be repeated here for brevity.
  • the communication device 900 may specifically be a mobile terminal/terminal device of an embodiment of the present application, and the communication device 900 may implement the corresponding process implemented by the mobile terminal/terminal device in each method of the embodiments of the present application, which will not be repeated here for brevity.
  • FIG. 10 is a schematic structural diagram of a chip of an embodiment of the present application.
  • the chip 1000 shown in FIG. 10 includes a processor 1010, and the processor 1010 can call and run a computer program from the memory to implement the method in the embodiment of the present application.
  • the chip 1000 may further include a memory 1020.
  • the processor 1010 can call and run a computer program from the memory 1020 to implement the method in the embodiment of the present application.
  • the memory 1020 may be a separate device independent of the processor 1010, or may be integrated in the processor 1010.
  • the chip 1000 may further include an input interface 1030.
  • the processor 1010 can control the input interface 1030 to communicate with other devices or chips, and specifically, can obtain information or data sent by other devices or chips.
  • the chip 1000 may further include an output interface 1040.
  • the processor 1010 can control the output interface 1040 to communicate with other devices or chips, and specifically, can output information or data to other devices or chips.
  • the chip can be applied to the network device in the embodiments of the present application, and the chip can implement the corresponding process implemented by the network device in each method of the embodiments of the present application.
  • the chip can be applied to the mobile terminal/terminal device in the embodiments of the present application, and the chip can implement the corresponding process implemented by the mobile terminal/terminal device in each method of the embodiments of the present application.
  • the chip mentioned in the embodiments of the present application may also be referred to as a system-level chip, a system chip, a chip system, or a system-on-chip, etc.
  • FIG. 11 is a schematic block diagram of a communication system 1100 according to an embodiment of the present application. As shown in FIG. 11, the communication system 1100 includes a terminal device 1110 and a network device 1120.
  • the terminal device 1110 can be used to implement the corresponding function implemented by the terminal device in the above method
  • the network device 1120 can be used to implement the corresponding function implemented by the network device in the above method, which will not be repeated here for brevity.
  • the processor of the embodiment of the present application may be an integrated circuit chip with signal processing capability.
  • the steps of the foregoing method embodiments may be completed by hardware integrated logic circuits in the processor or instructions in the form of software.
  • the above-mentioned processor may be a general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the methods, steps, and logical block diagrams disclosed in the embodiments of the present application can be implemented or executed.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a storage medium mature in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers.
  • the storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
  • the memory in the embodiments of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory.
  • the non-volatile memory can be Read-Only Memory (ROM), Programmable ROM (PROM), Erasable PROM (EPROM), Electrically Erasable PROM (EEPROM), or flash memory.
  • the volatile memory may be a random access memory (Random Access Memory, RAM), which is used as an external cache.
  • the memory in the embodiments of the present application may also be Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), Synchronous Link Dynamic Random Access Memory (SLDRAM), Direct Rambus Random Access Memory (DR RAM), and so on. That is to say, the memory in the embodiments of the present application is intended to include, but is not limited to, these and any other suitable types of memory.
  • the embodiments of the present application also provide a computer-readable storage medium for storing computer programs.
  • the computer-readable storage medium can be applied to the network device in the embodiments of the present application, and the computer program causes the computer to execute the corresponding process implemented by the network device in each method of the embodiments of the present application, which will not be repeated here for brevity.
  • the computer-readable storage medium can be applied to the mobile terminal/terminal device in the embodiments of the present application, and the computer program causes the computer to execute the corresponding process implemented by the mobile terminal/terminal device in each method of the embodiments of the present application, which will not be repeated here for brevity.
  • the embodiments of the present application also provide a computer program product, including computer program instructions.
  • the computer program product can be applied to the network device in the embodiments of the present application, and the computer program instructions cause the computer to execute the corresponding process implemented by the network device in each method of the embodiments of the present application, which will not be repeated here for brevity.
  • the computer program product can be applied to the mobile terminal/terminal device in the embodiments of the present application, and the computer program instructions cause the computer to execute the corresponding process implemented by the mobile terminal/terminal device in each method of the embodiments of the present application, which will not be repeated here for brevity.
  • the embodiment of the present application also provides a computer program.
  • the computer program can be applied to the network device in the embodiments of the present application. When the computer program runs on a computer, it causes the computer to execute the corresponding process implemented by the network device in each method of the embodiments of the present application, which will not be repeated here for brevity.
  • the computer program can be applied to the mobile terminal/terminal device in the embodiments of the present application. When the computer program runs on a computer, it causes the computer to execute the corresponding process implemented by the mobile terminal/terminal device in each method of the embodiments of the present application, which will not be repeated here for brevity.
  • the disclosed system, device, and method can be implemented in other ways.
  • the device embodiments described above are merely illustrative. For example, the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • if the function is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solution of the present application, in essence, or the part that contributes to the existing technology, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned storage media include various media that can store program code, such as a USB flash drive, a mobile hard disk, Read-Only Memory (ROM), Random Access Memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present application provide a model update method and apparatus, and a communication device. The method includes: a master node receives first model update information sent by a child node, where the first model update information is model update information of a global model expected by the child node relative to a first global model; and the master node updates a second global model according to the first model update information and the first global model to obtain a third global model.

Description

Model update method and apparatus, and communication device
Technical Field
The embodiments of the present application relate to the field of mobile communication technology, and in particular to a model update method and apparatus, and a communication device.
Background
In the process of federated learning, it is difficult to ensure that different child nodes can all feed back their respective model update information to the master node synchronously in time. For this situation, a simple approach is for the master node to update the global model according to the latest arrival time of the model update information. However, if a large number of nodes participate in federated learning training, the update rate of the global model at the master node will be limited by the child node that transmits its model update information most slowly, and the efficiency of updating the global model will be very low.
Summary
The embodiments of the present application provide a model update method and apparatus, and a communication device.
The model update method provided by the embodiments of the present application includes:
a master node receives first model update information sent by a child node, where the first model update information is model update information of a global model expected by the child node relative to a first global model;
the master node updates a second global model according to the first model update information and the first global model to obtain a third global model.
The model update apparatus provided by the embodiments of the present application is applied to a master node, and the apparatus includes:
a receiving unit configured to receive first model update information sent by a child node, where the first model update information is model update information of a global model expected by the child node relative to a first global model;
an update unit configured to update a second global model according to the first model update information and the first global model to obtain a third global model.
The communication device provided by the embodiments of the present application includes a processor and a memory. The memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory to execute the above model update method.
The chip provided by the embodiments of the present application is used to implement the above model update method.
Specifically, the chip includes a processor configured to call and run a computer program from a memory, so that a device installed with the chip executes the above model update method.
The computer-readable storage medium provided by the embodiments of the present application is used to store a computer program, and the computer program causes a computer to execute the above model update method.
The computer program product provided by the embodiments of the present application includes computer program instructions, and the computer program instructions cause a computer to execute the above model update method.
The computer program provided by the embodiments of the present application, when running on a computer, causes the computer to execute the above model update method.
In the technical solutions of the embodiments of the present application, a child node that provides model update information participates in the update of the global model, and a child node that does not provide model update information does not participate in the update of the global model. On this basis, after the master node receives the first model update information sent by a child node, it updates the current global model (i.e., the second global model) based on the first model update information to obtain an updated global model (i.e., the third global model). In this way, the update of the global model in the federated learning process can, to a certain extent, avoid being limited by the restricted transmission of model update information from some child nodes. In addition, the first model update information obtained by the master node from a child node is the model update information of the global model expected by that child node relative to the first global model, where the first global model is a historical global model of the master node. On this basis, the master node updates the current global model (i.e., the second global model) in combination with the first model update information and the historical global model. Since the influence of the historical global model involved in the model update is taken into account, the performance degradation that occurs when the global model is updated asynchronously is avoided.
Brief Description of the Drawings
The drawings described here are used to provide a further understanding of the present application and constitute a part of the present application. The exemplary embodiments of the present application and their descriptions are used to explain the present application and do not constitute an improper limitation of the present application. In the drawings:
FIG. 1 is a schematic diagram of a communication system architecture provided by an embodiment of the present application;
FIG. 2(a) is a schematic diagram of the neural network training phase provided by an embodiment of the present application;
FIG. 2(b) is a schematic diagram of the neural network inference phase provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of federated learning provided by an embodiment of the present application;
FIG. 4 is a flowchart of the basic scheme provided by an embodiment of the present application;
FIG. 5 is a first schematic flowchart of a model update method provided by an embodiment of the present application;
FIG. 6 is a second schematic flowchart of a model update method provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of the transmission of model reference version information provided by an embodiment of the present application;
FIG. 8 is a schematic diagram of the structural composition of a model update apparatus provided by an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a communication device provided by an embodiment of the present application;
FIG. 10 is a schematic structural diagram of a chip according to an embodiment of the present application;
FIG. 11 is a schematic block diagram of a communication system provided by an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are some, rather than all, of the embodiments of the present application. Based on the embodiments of the present application, all other embodiments obtained by a person of ordinary skill in the art without creative work fall within the protection scope of the present application.
The technical solutions of the embodiments of the present application can be applied to various communication systems, such as a Long Term Evolution (LTE) system, an LTE Frequency Division Duplex (FDD) system, LTE Time Division Duplex (TDD), a 5G communication system, or a future communication system.
Exemplarily, the communication system 100 to which the embodiments of the present application are applied is shown in FIG. 1. The communication system 100 may include a network device 110, and the network device 110 may be a device that communicates with a terminal 120 (also called a communication terminal or a terminal). The network device 110 may provide communication coverage for a specific geographic area and may communicate with terminals located within that coverage area. Optionally, the network device 110 may be an evolved base station (Evolutional Node B, eNB or eNodeB) in an LTE system, or a wireless controller in a Cloud Radio Access Network (CRAN); or the network device may be a mobile switching center, a relay station, an access point, an in-vehicle device, a wearable device, a hub, a switch, a bridge, a router, a network-side device in a 5G network, a network device in a future communication system, etc.
The communication system 100 also includes at least one terminal 120 located within the coverage of the network device 110. The "terminal" used here includes, but is not limited to, a device configured to receive/send communication signals via a wired connection, such as via Public Switched Telephone Networks (PSTN), Digital Subscriber Line (DSL), digital cable, or direct cable connection, and/or another data connection/network; and/or via a wireless interface, such as for a cellular network, a Wireless Local Area Network (WLAN), a digital TV network such as a DVB-H network, a satellite network, or an AM-FM broadcast transmitter; and/or an apparatus of another terminal configured to receive/send communication signals; and/or an Internet of Things (IoT) device. A terminal configured to communicate through a wireless interface may be referred to as a "wireless communication terminal", a "wireless terminal", or a "mobile terminal". Examples of mobile terminals include, but are not limited to, satellite or cellular phones; Personal Communications System (PCS) terminals that can combine a cellular radio phone with data processing, fax, and data communication capabilities; PDAs that can include a radio phone, a pager, Internet/intranet access, a web browser, a memo pad, a calendar, and/or a Global Positioning System (GPS) receiver; and conventional laptop and/or palmtop receivers or other electronic devices including a radio telephone transceiver. A terminal may refer to an access terminal, user equipment (UE), a user unit, a user station, a mobile station, a mobile terminal, a remote station, a remote terminal, a mobile device, a user terminal, a terminal, a wireless communication device, a user agent, or a user apparatus. An access terminal may be a cellular phone, a cordless phone, a Session Initiation Protocol (SIP) phone, a Wireless Local Loop (WLL) station, a Personal Digital Assistant (PDA), a handheld device with wireless communication capability, a computing device or other processing device connected to a wireless modem, an in-vehicle device, a wearable device, a terminal in a 5G network, a terminal in a future evolved PLMN, etc.
Optionally, direct Device to Device (D2D) communication may be performed between the terminals 120.
Optionally, the 5G communication system or 5G network may also be referred to as a New Radio (NR) system or NR network.
FIG. 1 exemplarily shows one network device and two terminals. Optionally, the communication system 100 may include multiple network devices, and the coverage of each network device may include other numbers of terminals, which is not limited in the embodiments of the present application.
Optionally, the communication system 100 may also include other network entities such as a network controller and a mobility management entity, which is not limited in the embodiments of the present application.
It should be understood that a device with a communication function in the network/system in the embodiments of the present application may be referred to as a communication device. Taking the communication system 100 shown in FIG. 1 as an example, the communication devices may include the network device 110 and the terminal 120 with communication functions; the network device 110 and the terminal 120 may be the specific devices described above, which will not be repeated here. The communication devices may also include other devices in the communication system 100, such as other network entities like a network controller and a mobility management entity, which is not limited in the embodiments of the present application.
It should be understood that the terms "system" and "network" are often used interchangeably herein. The term "and/or" herein merely describes an association relationship between associated objects, indicating that three relationships may exist; for example, A and/or B may mean: A exists alone, both A and B exist, or B exists alone. In addition, the character "/" herein generally indicates an "or" relationship between the associated objects.
为便于理解本申请实施例的技术方案,以下对本申请实施例相关的技术方案进行说明。
●神经网络
对于一个特定的神经网络来说,在使用过程中包括训练阶段和推理阶段两个过程。在训练阶段,首先需要获得大量的数据作为训练集合(简称为训练集),然后将训练集作为待训练神经网络(也可以称为待训练模型)的输入参数,并基于特定的训练算法,通过大量的训练和参数迭代,最终确定待训练神经网络的网络参数,这样也就完成了神经网络的训练过程,得到一个训练好的神经网络(即对神经网络的网络参数进行了优化)。例如可通过大量小狗的图片训练一个识别小狗的神经网络,如图2(a)所示。有别于训练阶段,对于一个神经网络来说,当神经网络训练完毕之后就可以使用该神经网络做识别、分类、信息恢复等推理操作,这一过程称之为神经网络的推理过程。例如可通过训 练好的神经网络识别出图像中的小狗,如图2(b)。
关于神经网络的训练,需要说明的是,当用户利用训练集训练神经网络时可以(1)神经网络参数(即神经网络的网络参数)的每次更新需要利用所有的训练集数据,或者(2)神经网络参数的每次更新利用一个训练集数据,或者(3)神经网络参数的每次更新利用所有训练集数据中的一部分数据,既利用一批量(batch)数据。一般来讲,在神经网络训练中通常采用的是利用一批量数据更新一次神经网络参数的方法,其中一批量数据的大小可作为超参数配置。
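Purely for illustration and not as part of the claimed method, the following is a minimal NumPy sketch of option (3) above, in which each parameter update uses one batch of training data; the function and parameter names (sgd_train, grad_fn, etc.) are assumptions of this sketch:

    import numpy as np

    def sgd_train(params, train_set, grad_fn, batch_size=32, lr=0.01, epochs=1):
        # One parameter update per batch; batch_size is the hyperparameter
        # mentioned above, and grad_fn computes the gradient on one batch.
        n = len(train_set)
        for _ in range(epochs):
            for start in range(0, n, batch_size):
                batch = train_set[start:start + batch_size]
                grad = grad_fn(params, batch)   # gradient on this batch only
                params = params - lr * grad     # gradient-descent step
        return params

Here params and the gradient are assumed to be NumPy arrays; options (1) and (2) correspond to batch_size equal to the whole set or to 1, respectively.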
●Federated Learning
Traditional neural network training is centralized; for example, model training is performed in a data center after a large amount of training-set data has been received. However, after factors such as user privacy protection and the distribution of computing power are taken into account, a special neural network training approach, "federated learning", was proposed. Its characteristic is that during the training of the neural network, the training set is distributed across the child nodes (users). First, (1) each child node generates a local neural network based on its local training set and uploads this local neural network to the master node (network); second, (2) the master node can synthesize the current global neural network from the obtained local neural networks and transmit the global neural network to each child node; then, (3) the child nodes continue to use the new global neural network for the next training iteration. Finally, the training of the neural network is completed through the cooperation of multiple nodes, as shown in Fig. 3 and in the sketch after this paragraph.
It should be noted that the "local neural network" in the embodiments of the present application may also be called a "local model", and the "global neural network" in the embodiments of the present application may also be called a "global model".
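As an illustrative sketch only, and not the scheme claimed below, one synchronous round of steps (1) to (3) above could look as follows; the weighted average used to synthesize the global model is one common choice (FedAvg-style), and the node interface (train_locally, receive) is an assumption of this sketch:

    import numpy as np

    def federated_round(global_model, child_nodes, weights):
        # (1) each child node trains a local model from the current global model
        local_models = [node.train_locally(global_model) for node in child_nodes]
        # (2) the master node synthesizes the new global model, here as a
        # weighted average of the uploaded local models
        new_global = sum(w * m for w, m in zip(weights, local_models))
        # (3) the new global model is sent back for the next training iteration
        for node in child_nodes:
            node.receive(new_global)
        return new_global

With equal weights 1/K for K child nodes this reduces to a plain average; the scheme below departs from this synchronous picture.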
The above federated learning process has the following problems:
1. It is difficult to guarantee that different child nodes can all feed back their respective model update information synchronously in time.
First, the child nodes are not necessarily clock-synchronized, so it is difficult to guarantee that different child nodes feed back their respective model update information synchronously.
Second, different child nodes have different data sources, and because of this it is also difficult to guarantee that different child nodes feed back their respective model update information synchronously. For example, the local data of child node 1 is obtained once per minute, while the local data of child node 2 is obtained once per second; when both child nodes update their local models once every 100 pieces of local data, the frequencies at which they feed back model update information are obviously different.
2. It can be seen from the above description that different child nodes feeding back their respective model update information to the master node asynchronously in time is a typical scenario and requirement that actually exists. For this situation, a simple approach is for the master node to update the global model according to the latest arrival time of the model update information. For example, if child node 1 transmits model update information once per minute and child node 2 transmits it once per second, the master node also updates the global model once per minute. For another example, if one transmission of child node 1 is delayed by 0.5 minutes beyond the agreed once-per-minute schedule because the transmission channel is too poor or the local data input has a problem, the master node also has to wait an additional 0.5 minutes before it can update the global model.
It should be pointed out that when the number of nodes participating in federated learning training is large, the update rate of the global model at the master node will be limited by the child node that transmits its model update information most slowly, and the efficiency of global model updating will be very low.
To this end, the following technical solutions of the embodiments of the present application are proposed. Analysis shows that an improved method for handling the above problem is: the master node updates the global model according to a pre-agreed time period. For example, it is agreed that the global model is updated every 10 seconds; if child node 1 transmits model update information to the master node within the 10 seconds, the model update information provided by child node 1 participates in the update of the global model, and if child node 1 does not transmit model update information to the master node within the 10 seconds, child node 1 does not participate in this update of the global model. This processing method can, to a certain extent, prevent the training of the federated learning global model from being limited by the restricted transmission of model update information of some child nodes. For ease of subsequent description, the above improved method may be called the basic scheme. It should be pointed out, however, that this basic scheme has a problem with the timeliness of model update information. For example:
At time t0, child node 1 and child node 2 simultaneously update their respective local models to N_0 according to the global model N_0 at time t0. Here, N_x denotes the global model at time tx.
At time t1, child node 2 determines that the model update information of its local model relative to N_0 is ΔN_{1,0,2}, and transmits this model update information to the master node. Here, the information corresponding to ΔN_{1,0,2} may be the model update information between the global model N_1 expected by child node 2 at time t1 and the global model N_0 at time t0; for example, the model update information may be identified by means of gradient information or the like. The reason such model update information is needed is that there is a difference between the global model N_1 expected by child node 2 at time t1 and the global model N_0 at time t0.
At time t1, the master node updates the global model N_0 at time t0 to the global model N_1 at time t1 according to the model update information ΔN_{1,0,2} provided by child node 2.
At time t2, child node 1 determines that the model update information of its local model relative to N_0 is ΔN_{2,0,1}, and transmits this model update information to the master node. Here, the information corresponding to ΔN_{2,0,1} may be the model update information between the global model N_2 expected by child node 1 at time t2 and the global model N_0 at time t0; for example, the model update information may be identified by means of gradient information or the like. The reason such update information is needed is that there is a difference between the global model N_2 expected by child node 1 at time t2 and the global model N_0 at time t0.
At time t2, the master node updates the global model N_1 at time t1 to the global model N_2' at time t2 according to the model update information ΔN_{2,0,1} provided by child node 1.
The flow of the above process is shown in Fig. 4, but at this point the problem becomes apparent. ΔN_{2,0,1} was determined by child node 1 from N_0 and N_2; that is, what child node 1 expects is for the global model to be updated from N_0, via ΔN_{2,0,1}, to the N_2 that child node 1 expects. Here, N_2 can also be written as N_{2,0,1}; N_{2,0,1} corresponds to the global model at time t2 expected by child node 1 on the basis of the global model at time t0. After the master node receives ΔN_{2,0,1}, since the master node has by then already updated the global model to N_1, directly using ΔN_{2,0,1} to update N_1 into N_2' is problematic, because the global model at the master node has already changed from N_0 to N_1. Analysis shows that the problem with the master node directly using ΔN_{2,0,1} to update N_1 into N_2' arises mainly because ΔN_{2,0,1} provides model update information based on N_0 rather than on N_1. On this basis, the embodiments of the present application improve the above basic scheme and propose a more optimized global model update scheme, which is described in detail below.
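To make the staleness issue concrete, here is a small numeric sketch (an editorial illustration, not part of the application) that treats the models as scalars, with λ and the weight factors taken as 1, and compares the naive stale update against the history-aware correction described below:

    # Scalar toy example: models are numbers, updates are differences.
    N0 = 1.0                    # global model at t0
    dN_1_0_2 = 0.3              # child 2: expected N1 minus N0
    N1 = N0 + dN_1_0_2          # master updates to N1 = 1.3

    dN_2_0_1 = 0.5              # child 1: expected N2 minus N0 (stale, based on N0)
    naive_N2 = N1 + dN_2_0_1    # naive: 1.8, double-counts child 2's update
    corrected_N2 = N1 + (dN_2_0_1 - (N1 - N0))   # history-aware: 1.5 = N0 + 0.5

    print(naive_N2, corrected_N2)   # 1.8 vs 1.5

The corrected value 1.5 is exactly the global model that child node 1 expected, while the naive value 1.8 is not.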
Fig. 5 is a schematic flowchart of the model update method provided by an embodiment of the present application. As shown in Fig. 5, the model update method includes the following steps:
Step 501: A master node receives first model update information sent by a child node, where the first model update information is model update information of a global model expected by the child node relative to a first global model.
In an optional implementation, the master node receives first model update information sent by one child node. For example, the master node receives model update information 1 sent by child node 1, where model update information 1 is the model update information of the global model expected by child node 1 relative to the first global model.
In another optional implementation, the master node receives multiple pieces of first model update information sent by multiple child nodes (e.g., two or more child nodes). For example, the master node receives model update information 1 sent by child node 1 and model update information 2 sent by child node 2, where model update information 1 is the model update information of the global model expected by child node 1 relative to the first global model, and model update information 2 is the model update information of the global model expected by child node 2 relative to the first global model.
In the embodiments of the present application, the first global model is a historical global model. In an optional manner, the first global model is the global model most recently adopted by the child node, or the local model most recently updated by the child node.
Step 502: The master node updates a second global model according to the first model update information and the first global model to obtain a third global model.
In the embodiments of the present application, the second global model is the current global model, and the master node needs to update the current global model according to the first model update information and the first global model to obtain an updated global model (i.e., the third global model). In a specific implementation, the master node may update the second global model in either of the following manners.
●Manner One
1-1) The master node determines a fourth global model according to the first global model and the first model update information, where the fourth global model is the global model expected by the child node; 1-2) the master node updates the second global model according to the fourth global model to obtain the third global model.
Further, in an optional manner, the above 1-2) may be implemented as follows: a1) the master node determines second model update information according to the fourth global model and the second global model, where the second model update information is the model update information of the fourth global model relative to the second global model; b1) the master node updates the second global model according to the second model update information to obtain the third global model.
Further, in an optional manner, the above b1) may be implemented as follows: the master node multiplies the second model update information by a first parameter and/or by a second parameter and then adds the second global model to obtain the third global model, where the first parameter represents the weight factor of the child node, and the second parameter represents the update step size.
In an example, referring to Fig. 6, the first global model is N_0 (i.e., the global model at time t0), the second global model is N_1 (i.e., the global model at time t1), and the first model update information is ΔN_{2,0,1} (i.e., the model update information of the global model at time t2 expected by child node 1 relative to the global model at time t0). After the master node receives ΔN_{2,0,1} determined by child node 1 at time t2, the master node determines the updated global model N_2 (i.e., the third global model) using ΔN_{2,0,1}, N_0, and N_1. Here, the specific flow of the master node updating the global model using ΔN_{2,0,1}, N_0, and N_1 is as follows:
1. The master node determines N_{2,0,1} from N_0 and ΔN_{2,0,1} (i.e., the global model at time t2 expected by child node 1 on the basis of the global model at time t0).
2. The master node determines the difference between N_{2,0,1} and N_1 as ΔN_{2,1,1} (i.e., the model update information of the global model at time t2 expected by child node 1 relative to the global model at time t1).
3. The master node updates the global model N_2 to N_1 + λP_1·ΔN_{2,1,1}.
Here, λ is the update step size, and P_1 is the weight factor of child node 1; for example, P_1 equals 1/K, where K is the number of child nodes participating in federated learning. A code sketch of these three steps follows.
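A minimal sketch of Manner One for this single-child example, treating models and updates as NumPy arrays; the function name and argument names are assumptions of this sketch, and the reconstruction in step 1 mirrors the formula N_{i,p,k} = N_p + λP_k·ΔN_{i,p,k} given below:

    import numpy as np

    def manner_one_update(N_first, N_second, delta_first, lam=1.0, P=1.0):
        # Step 1: reconstruct the global model expected by the child node
        # (the fourth global model) from the first global model and the
        # first model update information.
        N_expected = N_first + lam * P * delta_first
        # Step 2: second model update information = the expected model
        # relative to the current (second) global model.
        delta_second = (N_expected - N_second) / (lam * P)
        # Step 3: third global model = second global model plus the second
        # model update information scaled by weight factor and step size.
        return N_second + lam * P * delta_second

With λ = 1 and P_1 = 1, manner_one_update(N0, N1, dN_2_0_1) on the scalar toy values above returns the corrected 1.5.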
More generally, for the case where multiple child nodes (users) participate in federated learning, the principle of the above embodiment of the present application can be expressed by the following formula: N_i = N_j + λ∑_k(P_k·ΔN_{i,j,k}).
Here, k indexes the child nodes participating in federated learning.
P_k is the weight factor of child node k, reflecting the magnitude of child node k's contribution to federated learning. In an optional manner, all child nodes may use the same weight factor, e.g., P_k = 1/K, where K is the number of child nodes participating in federated learning. In another optional manner, the child nodes may use different weight factors; for example, when different child nodes are of different importance to federated learning (e.g., the sample data used for training differs in importance), the weight factors of the child nodes differ.
ΔN_{i,j,k} is the model update information of the global model at time ti expected by child node k relative to the global model at time tj (i.e., N_j); in other words, ΔN_{i,j,k} is the update to the global model at time tj that child node k expects at time ti.
λ is the update step size of the global model. In an optional manner, the update step size is a preset parameter, e.g., λ = 1.
It should be noted that N_x is the global model at time tx; for example, N_i is the global model at time ti, and N_j is the global model at time tj.
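Restated in standard notation (an editorial rendering of the formula above, with a worked two-node instance added for clarity; the values K = 2 and λ = 1 are illustrative only):

    N_i \;=\; N_j \;+\; \lambda \sum_{k} P_k \, \Delta N_{i,j,k}

    \text{For } K = 2,\ \lambda = 1,\ P_1 = P_2 = \tfrac{1}{2}:\qquad
    N_2 \;=\; N_1 \;+\; \tfrac{1}{2}\,\Delta N_{2,1,1} \;+\; \tfrac{1}{2}\,\Delta N_{2,1,2}.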
In the above scheme, ΔN_{i,j,k} can be determined in the following way:
1) Determine N_{i,p,k} = N_p + λP_k·ΔN_{i,p,k}.
Here, N_p is the global model at time tp, ΔN_{i,p,k} is the model update information of the global model at time ti expected by child node k relative to the global model at time tp, and N_{i,p,k} is the global model at time ti expected by child node k on the basis of the global model at time tp.
It can be seen from the above formula that the result of updating the global model at time tp based on N_p and ΔN_{i,p,k} from child node k is N_{i,p,k}.
2) Determine ΔN_{i,j,k} = (N_{i,p,k} − N_j)/(λP_k).
Here, N_{i,p,k} is the global model at time ti expected by child node k on the basis of the global model at time tp, N_j is the global model at time tj, and ΔN_{i,j,k} is the model update information of the global model at time ti expected by child node k relative to the global model at time tj.
It can be seen from the above formula that ΔN_{i,j,k} can be determined based on N_{i,p,k} and N_j.
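A sketch of the general Manner-One aggregation across several child nodes; the report tuple layout and the name manner_one_aggregate are assumptions of this sketch, and each child's reference model N_ref is the historical global model identified by the model reference version information introduced later:

    import numpy as np

    def manner_one_aggregate(N_current, reports, lam=1.0):
        # reports: list of (N_ref, delta, P), where N_ref is the historical
        # global model the child's update refers to, delta is ΔN_{i,p,k},
        # and P is that child's weight factor P_k.
        N_new = N_current.copy()
        for N_ref, delta, P in reports:
            N_expected = N_ref + lam * P * delta                      # step 1: N_{i,p,k}
            delta_vs_current = (N_expected - N_current) / (lam * P)   # step 2: ΔN_{i,j,k}
            N_new = N_new + lam * P * delta_vs_current                # sum term of N_i
        return N_new

This accumulates N_current + λ∑_k P_k·ΔN_{i,j,k}, matching the general formula above.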
●Manner Two
2-1) The master node determines third model update information according to the second global model and the first global model, where the third model update information is the model update information of the second global model relative to the first global model; 2-2) the master node updates the second global model according to the third model update information and the first model update information to obtain the third global model.
Further, in an optional manner, the above 2-2) may be implemented as follows: a2) the master node determines fourth model update information according to the first model update information and the third model update information, where the fourth model update information is the model update information of the global model expected by the child node relative to the second global model; b2) the master node updates the second global model according to the fourth model update information to obtain the third global model.
Further, in an optional manner, the above a2) may be implemented as follows: the master node subtracts the third model update information from the first model update information to obtain the fourth model update information.
Further, in an optional manner, the above b2) may be implemented as follows: the master node multiplies the fourth model update information by a first parameter and/or by a second parameter and then adds the second global model to obtain the third global model, where the first parameter represents the weight factor of the child node, and the second parameter represents the update step size.
In an example, referring to Fig. 6, the first global model is N_0 (i.e., the global model at time t0), the second global model is N_1 (i.e., the global model at time t1), and the first model update information is ΔN_{2,0,1} (i.e., the model update information of the global model at time t2 expected by child node 1 relative to the global model at time t0). After the master node receives ΔN_{2,0,1} determined by child node 1 at time t2, the master node determines the updated global model N_2 (i.e., the third global model) using ΔN_{2,0,1}, N_0, and N_1. Here, the specific flow of the master node updating the global model using ΔN_{2,0,1}, N_0, and N_1 is as follows:
1. The master node determines the difference between N_0 and N_1 as ΔN_{1,0} (i.e., the model update information of the global model at time t1 relative to the global model at time t0).
2. The master node determines ΔN_{2,1,1} from ΔN_{1,0} and ΔN_{2,0,1} (i.e., the model update information of the global model at time t2 expected by child node 1 relative to the global model at time t1).
3. The master node updates the global model N_2 to N_1 + λP_1·ΔN_{2,1,1}.
Here, λ is the update step size, and P_1 is the weight factor of child node 1; for example, P_1 equals 1/K, where K is the number of child nodes participating in federated learning. A code sketch of these three steps follows.
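A minimal sketch of Manner Two for the same example, again with illustrative names; unlike Manner One, all arithmetic here stays in the update (delta) domain:

    import numpy as np

    def manner_two_update(N_first, N_second, delta_first, lam=1.0, P=1.0):
        # Step 1: third model update information = second global model
        # relative to the first global model.
        delta_third = N_second - N_first
        # Step 2: fourth model update information = first model update
        # information minus the third model update information.
        delta_fourth = delta_first - delta_third
        # Step 3: third global model = second global model plus the fourth
        # model update information scaled by weight factor and step size.
        return N_second + lam * P * delta_fourth

With λ = 1 and P_1 = 1 this coincides with the Manner-One result on the scalar toy values above.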
More generally, for the case where multiple child nodes (users) participate in federated learning, the principle of the above embodiment of the present application can be expressed by the following formula: N_i = N_j + λ∑_k(P_k·ΔN_{i,j,k}).
Here, k indexes the child nodes participating in federated learning.
P_k is the weight factor of child node k, reflecting the magnitude of child node k's contribution to federated learning. In an optional manner, all child nodes may use the same weight factor, e.g., P_k = 1/K, where K is the number of child nodes participating in federated learning. In another optional manner, the child nodes may use different weight factors; for example, when different child nodes are of different importance to federated learning (e.g., the sample data used for training differs in importance), the weight factors of the child nodes differ.
ΔN_{i,j,k} is the model update information of the global model at time ti expected by child node k relative to the global model at time tj (i.e., N_j); in other words, ΔN_{i,j,k} is the update to the global model at time tj that child node k expects at time ti.
λ is the update step size of the global model. In an optional manner, the update step size is a preset parameter, e.g., λ = 1.
It should be noted that N_x is the global model at time tx; for example, N_i is the global model at time ti, and N_j is the global model at time tj.
In the above scheme, ΔN_{i,j,k} can be determined in the following way:
1) Determine ΔN_{j,p} = N_j − N_p.
Here, N_j is the global model at time tj, N_p is the global model at time tp, and ΔN_{j,p} is the model update information of the global model at time tj relative to the global model at time tp.
2) Determine ΔN_{i,j,k} = ΔN_{i,p,k} − ΔN_{j,p}.
Here, ΔN_{i,p,k} is the model update information of the global model at time ti expected by child node k relative to the global model at time tp, ΔN_{j,p} is the model update information of the global model at time tj relative to the global model at time tp, and ΔN_{i,j,k} is the model update information of the global model at time ti expected by child node k relative to the global model at time tj.
It can be seen from the above formulas that ΔN_{i,j,k} is determined based on ΔN_{i,p,k} and ΔN_{j,p}. For example: ΔN_{2,1,1} = ΔN_{2,0,1} − ΔN_{1,0}.
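The same derivation in code form, a tiny helper whose name is an assumption of this sketch:

    def delta_vs_current(delta_i_p_k, N_j, N_p):
        # ΔN_{j,p} = N_j - N_p;  ΔN_{i,j,k} = ΔN_{i,p,k} - ΔN_{j,p}.
        return delta_i_p_k - (N_j - N_p)

For the example above, delta_vs_current(dN_2_0_1, N1, N0) yields ΔN_{2,1,1}.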
In the above technical solutions of the embodiments of the present application, when the master node updates the global model, the child node not only needs to provide model update information, but also needs to provide information on the to-be-updated global model (i.e., the first global model) to which the model update information corresponds; in other words, the child node needs to feed back information on the local model (i.e., the first global model) on which it based the generation of the model update information. The embodiments of the present application refer to this information simply as model reference version information. Accordingly, the technical solutions of the embodiments of the present application further include: the master node receives model reference version information, corresponding to the first model update information, sent by the child node, where the model reference version information is used to determine the first global model. In an optional manner, the model reference version information includes at least one of the following: a version number of the global model, a sequence number corresponding to the global model. Here, the sequence number corresponding to the global model refers to the time at which the global model was generated (or updated) on the master node side, or the time at which the master node delivered the global model.
The above model reference version information is transmitted by the child node to the master node. In an optional manner, in the case where the master node is a network device (such as a base station) and the child node is a terminal device, the model reference version information is transmitted by the terminal device to the network device through at least one of the following: an application layer message, Non Access Stratum (NAS) signaling, Radio Resource Control (RRC) signaling, a Media Access Control Control Element (MAC CE), Uplink Control Information (UCI), a Physical Uplink Shared Channel (PUSCH), a Physical Uplink Control Channel (PUCCH). In another optional manner, in the case where the master node is a first terminal device and the child node is a second terminal device, the model reference version information is transmitted by the second terminal device to the first terminal device through a PC5 interface message. Further, optionally, the PC5 interface message includes at least one of the following: a Physical Sidelink Shared Channel (PSSCH), a Physical Sidelink Control Channel (PSCCH), Sidelink Control Information (SCI).
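As a data-structure sketch only (the application does not specify a concrete encoding; all field names are assumptions), a child node's report could bundle the update with its model reference version information like this:

    from dataclasses import dataclass
    from typing import Optional
    import numpy as np

    @dataclass
    class ModelUpdateReport:
        # First model update information (e.g., gradient-style information).
        delta: np.ndarray
        # Model reference version information identifying the first global
        # model: a version number and/or a sequence number (the generation
        # or delivery time of that global model on the master-node side).
        version_number: Optional[int] = None
        sequence_number: Optional[int] = None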
Referring to Fig. 7, at time t0 the master node transmits the global model at time t0 to child node 1 and child node 2 for their use. At time t1, child node 2 feeds back model update information and model reference version information to the master node, and the master node updates the global model using the model update information and the model reference version information. At time t2, child node 1 feeds back model update information and model reference version information to the master node, and the master node updates the global model using the model update information and the model reference version information.
In the above technical solutions of the embodiments of the present application, the master node may configure whether a child node needs to transmit the above model reference version information. Accordingly, the technical solutions of the embodiments of the present application further include: the master node sends first configuration information to the child node, where the first configuration information is used to indicate whether the child node transmits model reference version information to the master node.
The above first configuration information is transmitted by the master node to the child node. In an optional manner, in the case where the master node is a network device (such as a base station) and the child node is a terminal device, the first configuration information is transmitted by the base station to the terminal device through at least one of the following: an application layer message, NAS signaling, a broadcast message, an RRC message, a MAC CE, Downlink Control Information (DCI), a Physical Downlink Shared Channel (PDSCH), a Physical Downlink Control Channel (PDCCH). In another optional manner, in the case where the master node is a first terminal device and the child node is a second terminal device, the first configuration information is transmitted by the first terminal device to the second terminal device through a PC5 interface message. Further, optionally, the PC5 interface message includes at least one of the following: PSSCH, PSCCH, SCI.
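Again purely as a sketch (the field name is assumed), the first configuration information reduces to a single flag sent by the master node:

    from dataclasses import dataclass

    @dataclass
    class FirstConfiguration:
        # Whether the child node shall attach model reference version
        # information to each model update report it sends to the master node.
        report_model_reference_version: bool

On receipt, a child node would populate the version fields of the ModelUpdateReport sketch above only when this flag is set.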
The technical solutions of the embodiments of the present application provide a model update method in federated learning, which specifically includes: multiple child nodes participating in federated learning report their respective model update information asynchronously, and the master node participating in federated learning updates the global model currently to be updated at the master node according to the received model update information, according to the characteristics of the global model on which that model update information is based, and according to the characteristics of the global model currently to be updated at the master node. Further, in addition to the model update information, the information reported by a child node to the master node may also include the characteristics of the global model on which the child node's feedback of model update information is based (for example, version information of the global model). The above solutions provide a method for handling the problem of limited federated learning training efficiency under asynchronous conditions: through a specific model update algorithm, and by taking into account the influence of the historical global models involved in model updating, the performance degradation that arises when the federated learning global model is updated asynchronously is further avoided. In addition, considering the air interface transmission requirements when federated learning is applied directly across different network nodes, as well as the characteristics of the above federated learning algorithm, the model reference version information needed for the above model update is correspondingly added in the network transmission process, so that the above model update scheme can be completed smoothly in distributed scenarios.
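Pulling the pieces together, here is a minimal end-to-end sketch of the asynchronous master-node loop under the stated assumptions λ = 1 and equal weight factors; keeping a history of past global models keyed by version number is an implementation choice of this sketch (not mandated by the application), and the report object follows the ModelUpdateReport sketch above:

    import numpy as np

    class MasterNode:
        # Illustrative asynchronous maintenance of the global model.

        def __init__(self, initial_model, num_children, lam=1.0):
            self.version = 0
            self.global_model = np.asarray(initial_model, dtype=float)
            self.history = {0: self.global_model.copy()}  # version -> model
            self.lam = lam
            self.P = 1.0 / num_children                   # equal weight factors

        def on_report(self, report):
            # First global model: looked up via the model reference version
            # information carried in the child node's report.
            N_first = self.history[report.version_number]
            N_second = self.global_model                  # current (second) model
            # Manner Two: subtract the drift of the global model since the
            # referenced version, then apply the corrected update.
            delta_fourth = report.delta - (N_second - N_first)
            self.global_model = N_second + self.lam * self.P * delta_fourth
            self.version += 1
            self.history[self.version] = self.global_model.copy()
            return self.global_model

Each incoming report triggers an update immediately, so no child node ever blocks the others; a production system would additionally bound the history store.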
Fig. 8 is a schematic structural composition diagram of the model update apparatus provided by an embodiment of the present application, which is applied to a master node. As shown in Fig. 8, the model update apparatus includes:
a receiving unit 801, configured to receive first model update information sent by a child node, where the first model update information is model update information of a global model expected by the child node relative to a first global model;
an update unit 802, configured to update a second global model according to the first model update information and the first global model to obtain a third global model.
In an optional manner, the update unit 802 is configured to determine a fourth global model according to the first global model and the first model update information, where the fourth global model is the global model expected by the child node; and to update the second global model according to the fourth global model to obtain the third global model.
In an optional manner, the update unit 802 is configured to determine second model update information according to the fourth global model and the second global model, where the second model update information is the model update information of the fourth global model relative to the second global model; and to update the second global model according to the second model update information to obtain the third global model.
In an optional manner, the update unit 802 is configured to multiply the second model update information by a first parameter and/or by a second parameter and then add the second global model to obtain the third global model;
where the first parameter represents the weight factor of the child node, and the second parameter represents the update step size.
In an optional manner, the update unit 802 is configured to determine third model update information according to the second global model and the first global model, where the third model update information is the model update information of the second global model relative to the first global model; and to update the second global model according to the third model update information and the first model update information to obtain the third global model.
In an optional manner, the update unit 802 is configured to determine fourth model update information according to the first model update information and the third model update information, where the fourth model update information is the model update information of the global model expected by the child node relative to the second global model; and to update the second global model according to the fourth model update information to obtain the third global model.
In an optional manner, the update unit 802 is configured to subtract the third model update information from the first model update information to obtain the fourth model update information.
In an optional manner, the update unit 802 is configured to multiply the fourth model update information by a first parameter and/or by a second parameter and then add the second global model to obtain the third global model;
where the first parameter represents the weight factor of the child node, and the second parameter represents the update step size.
In an optional manner, the first global model is the global model most recently adopted by the child node, or the local model most recently updated by the child node.
In an optional manner, the receiving unit 801 is further configured to receive model reference version information, corresponding to the first model update information, sent by the child node, where the model reference version information is used to determine the first global model.
In an optional manner, the model reference version information includes at least one of the following: a version number of the global model, a sequence number corresponding to the global model.
In an optional manner, in the case where the master node is a network device and the child node is a terminal device,
the model reference version information is transmitted by the terminal device to the network device through at least one of the following: an application layer message, NAS signaling, RRC signaling, MAC CE, UCI, PUSCH, PUCCH.
In an optional manner, in the case where the master node is a first terminal device and the child node is a second terminal device,
the model reference version information is transmitted by the second terminal device to the first terminal device through a PC5 interface message.
In an optional manner, the apparatus further includes:
a sending unit (not shown in the figure), configured to send first configuration information to the child node, where the first configuration information is used to indicate whether the child node transmits model reference version information to the master node.
In an optional manner, in the case where the master node is a network device and the child node is a terminal device,
the first configuration information is transmitted by the base station to the terminal device through at least one of the following: an application layer message, NAS signaling, a broadcast message, an RRC message, MAC CE, DCI, PDSCH, PDCCH.
In an optional manner, in the case where the master node is a first terminal device and the child node is a second terminal device,
the first configuration information is transmitted by the first terminal device to the second terminal device through a PC5 interface message.
In an optional manner, the PC5 interface message includes at least one of the following: PSSCH, PSCCH, SCI.
Those skilled in the art should understand that the above description of the model update apparatus of the embodiments of the present application can be understood with reference to the description of the model update method of the embodiments of the present application.
Fig. 9 is a schematic structural diagram of a communication device 900 provided by an embodiment of the present application. The communication device may be a master node, and the master node may be a terminal device or a network device (such as a base station). The communication device 900 shown in Fig. 9 includes a processor 910, and the processor 910 can call and run a computer program from a memory to implement the methods in the embodiments of the present application.
Optionally, as shown in Fig. 9, the communication device 900 may further include a memory 920. The processor 910 can call and run a computer program from the memory 920 to implement the methods in the embodiments of the present application.
The memory 920 may be a separate device independent of the processor 910, or may be integrated in the processor 910.
Optionally, as shown in Fig. 9, the communication device 900 may further include a transceiver 930, and the processor 910 can control the transceiver 930 to communicate with other devices; specifically, it can send information or data to other devices, or receive information or data sent by other devices.
The transceiver 930 may include a transmitter and a receiver. The transceiver 930 may further include one or more antennas.
Optionally, the communication device 900 may specifically be the network device of the embodiments of the present application, and the communication device 900 can implement the corresponding processes implemented by the network device in the various methods of the embodiments of the present application, which will not be repeated here for brevity.
Optionally, the communication device 900 may specifically be the mobile terminal/terminal device of the embodiments of the present application, and the communication device 900 can implement the corresponding processes implemented by the mobile terminal/terminal device in the various methods of the embodiments of the present application, which will not be repeated here for brevity.
Fig. 10 is a schematic structural diagram of a chip according to an embodiment of the present application. The chip 1000 shown in Fig. 10 includes a processor 1010, and the processor 1010 can call and run a computer program from a memory to implement the methods in the embodiments of the present application.
Optionally, as shown in Fig. 10, the chip 1000 may further include a memory 1020. The processor 1010 can call and run a computer program from the memory 1020 to implement the methods in the embodiments of the present application.
The memory 1020 may be a separate device independent of the processor 1010, or may be integrated in the processor 1010.
Optionally, the chip 1000 may further include an input interface 1030. The processor 1010 can control the input interface 1030 to communicate with other devices or chips; specifically, it can obtain information or data sent by other devices or chips.
Optionally, the chip 1000 may further include an output interface 1040. The processor 1010 can control the output interface 1040 to communicate with other devices or chips; specifically, it can output information or data to other devices or chips.
Optionally, the chip can be applied to the network device in the embodiments of the present application, and the chip can implement the corresponding processes implemented by the network device in the various methods of the embodiments of the present application, which will not be repeated here for brevity.
Optionally, the chip can be applied to the mobile terminal/terminal device in the embodiments of the present application, and the chip can implement the corresponding processes implemented by the mobile terminal/terminal device in the various methods of the embodiments of the present application, which will not be repeated here for brevity.
It should be understood that the chip mentioned in the embodiments of the present application may also be referred to as a system-level chip, a system chip, a chip system, or a system-on-chip, etc.
Fig. 11 is a schematic block diagram of a communication system 1100 provided by an embodiment of the present application. As shown in Fig. 11, the communication system 1100 includes a terminal device 1110 and a network device 1120.
The terminal device 1110 can be used to implement the corresponding functions implemented by the terminal device in the above methods, and the network device 1120 can be used to implement the corresponding functions implemented by the network device in the above methods, which will not be repeated here for brevity.
It should be understood that the processor of the embodiments of the present application may be an integrated circuit chip with signal processing capability. In an implementation, the steps of the above method embodiments may be completed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The above processor may be a general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor, etc. The steps of the methods disclosed in conjunction with the embodiments of the present application may be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above methods in combination with its hardware.
It can be understood that the memory in the embodiments of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically EPROM (EEPROM), or a flash memory. The volatile memory may be a Random Access Memory (RAM), which is used as an external cache. By way of example but not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and Direct Rambus RAM (DR RAM). It should be noted that the memories of the systems and methods described herein are intended to include, but are not limited to, these and any other suitable types of memory.
It should be understood that the above description of the memory is exemplary but not limiting; for example, the memory in the embodiments of the present application may also be a static RAM (SRAM), a dynamic RAM (DRAM), a synchronous DRAM (SDRAM), a double data rate SDRAM (DDR SDRAM), an enhanced SDRAM (ESDRAM), a synch link DRAM (SLDRAM), a Direct Rambus RAM (DR RAM), and so on. That is, the memory in the embodiments of the present application is intended to include, but is not limited to, these and any other suitable types of memory.
The embodiments of the present application also provide a computer-readable storage medium for storing a computer program.
Optionally, the computer-readable storage medium can be applied to the network device in the embodiments of the present application, and the computer program causes a computer to execute the corresponding processes implemented by the network device in the various methods of the embodiments of the present application, which will not be repeated here for brevity.
Optionally, the computer-readable storage medium can be applied to the mobile terminal/terminal device in the embodiments of the present application, and the computer program causes a computer to execute the corresponding processes implemented by the mobile terminal/terminal device in the various methods of the embodiments of the present application, which will not be repeated here for brevity.
The embodiments of the present application also provide a computer program product, including computer program instructions.
Optionally, the computer program product can be applied to the network device in the embodiments of the present application, and the computer program instructions cause a computer to execute the corresponding processes implemented by the network device in the various methods of the embodiments of the present application, which will not be repeated here for brevity.
Optionally, the computer program product can be applied to the mobile terminal/terminal device in the embodiments of the present application, and the computer program instructions cause a computer to execute the corresponding processes implemented by the mobile terminal/terminal device in the various methods of the embodiments of the present application, which will not be repeated here for brevity.
The embodiments of the present application also provide a computer program.
Optionally, the computer program can be applied to the network device in the embodiments of the present application; when the computer program runs on a computer, it causes the computer to execute the corresponding processes implemented by the network device in the various methods of the embodiments of the present application, which will not be repeated here for brevity.
Optionally, the computer program can be applied to the mobile terminal/terminal device in the embodiments of the present application; when the computer program runs on a computer, it causes the computer to execute the corresponding processes implemented by the mobile terminal/terminal device in the various methods of the embodiments of the present application, which will not be repeated here for brevity.
A person of ordinary skill in the art may realize that the units and algorithm steps of the examples described in conjunction with the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. Professionals may use different methods for each specific application to implement the described functions, but such implementations should not be considered beyond the scope of the present application.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may refer to the corresponding processes in the foregoing method embodiments, and will not be repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for example, the division of the units is only a division by logical function, and there may be other division manners in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may physically exist alone, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the existing technology, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application. The aforementioned storage media include: a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disk, or other media that can store program code.
The above are only specific implementations of the present application, but the protection scope of the present application is not limited thereto. Any person skilled in the art can easily think of changes or substitutions within the technical scope disclosed in the present application, which shall all be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (39)

  1. A model update method, the method comprising:
    a master node receiving first model update information sent by a child node, wherein the first model update information is model update information of a global model expected by the child node relative to a first global model;
    the master node updating a second global model according to the first model update information and the first global model to obtain a third global model.
  2. The method according to claim 1, wherein the master node updating the second global model according to the first model update information and the first global model to obtain the third global model comprises:
    the master node determining a fourth global model according to the first global model and the first model update information, wherein the fourth global model is the global model expected by the child node;
    the master node updating the second global model according to the fourth global model to obtain the third global model.
  3. The method according to claim 2, wherein the master node updating the second global model according to the fourth global model to obtain the third global model comprises:
    the master node determining second model update information according to the fourth global model and the second global model, wherein the second model update information is model update information of the fourth global model relative to the second global model;
    the master node updating the second global model according to the second model update information to obtain the third global model.
  4. The method according to claim 3, wherein the master node updating the second global model according to the second model update information to obtain the third global model comprises:
    the master node multiplying the second model update information by a first parameter and/or by a second parameter and then adding the second global model to obtain the third global model;
    wherein the first parameter represents a weight factor of the child node, and the second parameter represents an update step size.
  5. The method according to claim 1, wherein the master node updating the second global model according to the first model update information and the first global model to obtain the third global model comprises:
    the master node determining third model update information according to the second global model and the first global model, wherein the third model update information is model update information of the second global model relative to the first global model;
    the master node updating the second global model according to the third model update information and the first model update information to obtain the third global model.
  6. The method according to claim 5, wherein the master node updating the second global model according to the third model update information and the first model update information to obtain the third global model comprises:
    the master node determining fourth model update information according to the first model update information and the third model update information, wherein the fourth model update information is model update information of the global model expected by the child node relative to the second global model;
    the master node updating the second global model according to the fourth model update information to obtain the third global model.
  7. The method according to claim 6, wherein the master node determining the fourth model update information according to the first model update information and the third model update information comprises:
    the master node subtracting the third model update information from the first model update information to obtain the fourth model update information.
  8. The method according to claim 6 or 7, wherein the master node updating the second global model according to the fourth model update information to obtain the third global model comprises:
    the master node multiplying the fourth model update information by a first parameter and/or by a second parameter and then adding the second global model to obtain the third global model;
    wherein the first parameter represents a weight factor of the child node, and the second parameter represents an update step size.
  9. The method according to any one of claims 1 to 8, wherein the first global model is the global model most recently adopted by the child node, or the local model most recently updated by the child node.
  10. The method according to any one of claims 1 to 9, wherein the method further comprises:
    the master node receiving model reference version information, corresponding to the first model update information, sent by the child node, the model reference version information being used to determine the first global model.
  11. The method according to claim 10, wherein the model reference version information comprises at least one of the following: a version number of the global model, a sequence number corresponding to the global model.
  12. The method according to claim 10 or 11, wherein in the case where the master node is a network device and the child node is a terminal device,
    the model reference version information is transmitted by the terminal device to the network device through at least one of the following: an application layer message, Non Access Stratum (NAS) signaling, Radio Resource Control (RRC) signaling, a Media Access Control Control Element (MAC CE), Uplink Control Information (UCI), a Physical Uplink Shared Channel (PUSCH), a Physical Uplink Control Channel (PUCCH).
  13. The method according to claim 10 or 11, wherein in the case where the master node is a first terminal device and the child node is a second terminal device,
    the model reference version information is transmitted by the second terminal device to the first terminal device through a PC5 interface message.
  14. The method according to any one of claims 10 to 13, wherein the method further comprises:
    the master node sending first configuration information to the child node, the first configuration information being used to indicate whether the child node transmits model reference version information to the master node.
  15. The method according to claim 14, wherein in the case where the master node is a network device and the child node is a terminal device,
    the first configuration information is transmitted by the network device to the terminal device through at least one of the following: an application layer message, NAS signaling, a broadcast message, an RRC message, a MAC CE, Downlink Control Information (DCI), a Physical Downlink Shared Channel (PDSCH), a Physical Downlink Control Channel (PDCCH).
  16. The method according to claim 14, wherein in the case where the master node is a first terminal device and the child node is a second terminal device,
    the first configuration information is transmitted by the first terminal device to the second terminal device through a PC5 interface message.
  17. The method according to claim 13 or 16, wherein the PC5 interface message comprises at least one of the following: a Physical Sidelink Shared Channel (PSSCH), a Physical Sidelink Control Channel (PSCCH), Sidelink Control Information (SCI).
  18. A model update apparatus, applied to a master node, the apparatus comprising:
    a receiving unit, configured to receive first model update information sent by a child node, wherein the first model update information is model update information of a global model expected by the child node relative to a first global model;
    an update unit, configured to update a second global model according to the first model update information and the first global model to obtain a third global model.
  19. The apparatus according to claim 18, wherein the update unit is configured to determine a fourth global model according to the first global model and the first model update information, wherein the fourth global model is the global model expected by the child node; and to update the second global model according to the fourth global model to obtain the third global model.
  20. The apparatus according to claim 19, wherein the update unit is configured to determine second model update information according to the fourth global model and the second global model, wherein the second model update information is model update information of the fourth global model relative to the second global model; and to update the second global model according to the second model update information to obtain the third global model.
  21. The apparatus according to claim 20, wherein the update unit is configured to multiply the second model update information by a first parameter and/or by a second parameter and then add the second global model to obtain the third global model;
    wherein the first parameter represents a weight factor of the child node, and the second parameter represents an update step size.
  22. The apparatus according to claim 18, wherein the update unit is configured to determine third model update information according to the second global model and the first global model, wherein the third model update information is model update information of the second global model relative to the first global model; and to update the second global model according to the third model update information and the first model update information to obtain the third global model.
  23. The apparatus according to claim 22, wherein the update unit is configured to determine fourth model update information according to the first model update information and the third model update information, wherein the fourth model update information is model update information of the global model expected by the child node relative to the second global model; and to update the second global model according to the fourth model update information to obtain the third global model.
  24. The apparatus according to claim 23, wherein the update unit is configured to subtract the third model update information from the first model update information to obtain the fourth model update information.
  25. The apparatus according to claim 23, wherein the update unit is configured to multiply the fourth model update information by a first parameter and/or by a second parameter and then add the second global model to obtain the third global model;
    wherein the first parameter represents a weight factor of the child node, and the second parameter represents an update step size.
  26. The apparatus according to any one of claims 18 to 25, wherein the first global model is the global model most recently adopted by the child node, or the local model most recently updated by the child node.
  27. The apparatus according to any one of claims 18 to 26, wherein the receiving unit is further configured to receive model reference version information, corresponding to the first model update information, sent by the child node, the model reference version information being used to determine the first global model.
  28. The apparatus according to claim 27, wherein the model reference version information comprises at least one of the following: a version number of the global model, a sequence number corresponding to the global model.
  29. The apparatus according to claim 27 or 28, wherein in the case where the master node is a network device and the child node is a terminal device,
    the model reference version information is transmitted by the terminal device to the network device through at least one of the following: an application layer message, NAS signaling, RRC signaling, MAC CE, UCI, PUSCH, PUCCH.
  30. The apparatus according to claim 27 or 28, wherein in the case where the master node is a first terminal device and the child node is a second terminal device,
    the model reference version information is transmitted by the second terminal device to the first terminal device through a PC5 interface message.
  31. The apparatus according to any one of claims 27 to 30, wherein the apparatus further comprises:
    a sending unit, configured to send first configuration information to the child node, the first configuration information being used to indicate whether the child node transmits model reference version information to the master node.
  32. The apparatus according to claim 31, wherein in the case where the master node is a network device and the child node is a terminal device,
    the first configuration information is transmitted by the network device to the terminal device through at least one of the following: an application layer message, NAS signaling, a broadcast message, an RRC message, MAC CE, DCI, PDSCH, PDCCH.
  33. The apparatus according to claim 31, wherein in the case where the master node is a first terminal device and the child node is a second terminal device,
    the first configuration information is transmitted by the first terminal device to the second terminal device through a PC5 interface message.
  34. The apparatus according to claim 30 or 33, wherein the PC5 interface message comprises at least one of the following: PSSCH, PSCCH, SCI.
  35. A communication device, comprising: a processor and a memory, the memory being used to store a computer program, and the processor being used to call and run the computer program stored in the memory to execute the method according to any one of claims 1 to 17.
  36. A chip, comprising: a processor, configured to call and run a computer program from a memory, so that a device installed with the chip executes the method according to any one of claims 1 to 17.
  37. A computer-readable storage medium for storing a computer program, wherein the computer program causes a computer to execute the method according to any one of claims 1 to 17.
  38. A computer program product, comprising computer program instructions, wherein the computer program instructions cause a computer to execute the method according to any one of claims 1 to 17.
  39. A computer program, wherein the computer program causes a computer to execute the method according to any one of claims 1 to 17.
PCT/CN2020/090663 2020-05-15 2020-05-15 Model update method and apparatus, and communication device WO2021227069A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202080100094.3A 2020-05-15 2020-05-15 Model update method and apparatus, and communication device
PCT/CN2020/090663 2020-05-15 2020-05-15 Model update method and apparatus, and communication device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/090663 2020-05-15 2020-05-15 Model update method and apparatus, and communication device

Publications (2)

Publication Number Publication Date
WO2021227069A1 true WO2021227069A1 (zh) 2021-11-18
WO2021227069A9 WO2021227069A9 (zh) 2022-10-20

Family

ID=78526113

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/090663 Model update method and apparatus, and communication device 2020-05-15 2020-05-15

Country Status (2)

Country Link
CN (1) CN115427969A (zh)
WO (1) WO2021227069A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116208492A (zh) * 2021-11-30 2023-06-02 Vivo Software Technology Co., Ltd. Information interaction method and apparatus, and communication device
WO2024036526A1 (zh) * 2022-08-17 2024-02-22 Huawei Technologies Co., Ltd. Model scheduling method and apparatus

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109871702A (zh) * 2019-02-18 2019-06-11 Shenzhen Qianhai WeBank Co., Ltd. Federated model training method, system, device, and computer-readable storage medium
CN110263936A (zh) * 2019-06-14 2019-09-20 Shenzhen Qianhai WeBank Co., Ltd. Horizontal federated learning method, apparatus, device, and computer storage medium
US20190385043A1 (en) * 2018-06-19 2019-12-19 Adobe Inc. Asynchronously training machine learning models across client devices for adaptive intelligence
CN110929880A (zh) * 2019-11-12 2020-03-27 Shenzhen Qianhai WeBank Co., Ltd. Federated learning method, apparatus, and computer-readable storage medium


Also Published As

Publication number Publication date
WO2021227069A9 (zh) 2022-10-20
CN115427969A (zh) 2022-12-02

Similar Documents

Publication Publication Date Title
WO2021147001A1 (zh) Power control parameter determination method, terminal, network device, and storage medium
WO2021179196A1 (zh) Federated learning-based model training method, electronic device, and storage medium
WO2020001585A1 (zh) Clock synchronization method and apparatus
WO2020200068A1 (zh) Method and apparatus for uplink timing synchronization
WO2020037447A9 (zh) Power control method and apparatus, and terminal
TW202013921A (zh) Method and device for sending an uplink signal
WO2021184263A1 (zh) Data transmission method and apparatus, and communication device
WO2021016774A1 (zh) Wireless communication method and device
WO2021227069A1 (zh) Model update method and apparatus, and communication device
US11363619B2 (en) Relay network duplex coordination method and relay node device
WO2020164077A1 (zh) Resource configuration method, terminal device, and network device
WO2021237715A1 (zh) Channel state information processing method, electronic device, and storage medium
WO2019140667A1 (zh) Data transmission method and terminal device
WO2018010488A1 (zh) Transmit power determination method, terminal, network device, and system
TW202019210A (zh) Wireless communication method and terminal device
TW202019208A (zh) Feedback resource multiplexing method, terminal device, and network device
TW202014015A (zh) Message transmission method and terminal device
WO2020082218A1 (zh) Wireless communication method and network device
WO2021072609A1 (zh) Wireless communication method and device
WO2021056385A1 (zh) Wireless communication method, terminal device, and network device
WO2021203230A1 (zh) Uplink control information transmission method and apparatus, and terminal device
WO2021027904A1 (zh) Wireless communication method and apparatus, and communication device
WO2021146950A1 (zh) Method and apparatus for determining retransmission resources, and terminal device
WO2020199213A1 (zh) Signal transmission method and apparatus, and network device
WO2021072705A1 (zh) Wireless communication method and device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20935389

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20935389

Country of ref document: EP

Kind code of ref document: A1