WO2021029797A1 - Network nodes and methods for handling machine learning models in a communications network - Google Patents

Network nodes and methods for handling machine learning models in a communications network

Info

Publication number
WO2021029797A1
Authority
WO
WIPO (PCT)
Prior art keywords
network node
machine learning
learning model
model
data
Prior art date
Application number
PCT/SE2019/050747
Other languages
French (fr)
Inventor
Miljenko OPSENICA
Patrik Salmela
Joel REIJONEN
Original Assignee
Telefonaktiebolaget Lm Ericsson (Publ)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Telefonaktiebolaget Lm Ericsson (Publ) filed Critical Telefonaktiebolaget Lm Ericsson (Publ)
Priority to PCT/SE2019/050747 priority Critical patent/WO2021029797A1/en
Publication of WO2021029797A1 publication Critical patent/WO2021029797A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 63/00 Network architectures or network communication protocols for network security
    • H04L 63/12 Applying verification of the received information
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L 9/32 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L 9/3247 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials involving digital signatures

Definitions

  • Embodiments herein relate to methods and a first and a second network node in a communications network. In particular, they relate to handling machine learning models in the communications network.
  • a computational graph model is a directed graph model where nodes correspond to operations or variables. Variables can feed their value into operations, and operations can feed their output into other operations. This way, every node in the graph model defines a function of the variables. Training of these computational graph models is typically an offline process, meaning that it usually happens in datacenters and takes several minutes to hours and days, depending on the underlying technology, the capabilities of the infrastructure used for training and the complexity of the computational graph model, e.g. amount of input data, parameters, etc. On the other hand, execution of these computational graph models is done anywhere from an edge of the communication network also called network edge, e.g. in devices, gateways or radio access infrastructure, to centralized clouds e.g. data centers.
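  • Purely as an illustration of the computational graph concept described above (and not taken from the embodiments themselves), a minimal directed graph of variables and operations could be sketched in Python as follows; all class and function names are hypothetical.

```python
# Minimal sketch of a computational graph: nodes are variables or operations.
# Illustrative only; names and structure are assumptions, not the patent's implementation.

class Variable:
    def __init__(self, value):
        self.value = value

    def evaluate(self):
        return self.value

class Operation:
    def __init__(self, fn, *inputs):
        self.fn = fn          # e.g. addition, multiplication
        self.inputs = inputs  # upstream variables or operations

    def evaluate(self):
        # Every node defines a function of the variables feeding into it.
        return self.fn(*(node.evaluate() for node in self.inputs))

# y = (a + b) * c
a, b, c = Variable(2.0), Variable(3.0), Variable(4.0)
add = Operation(lambda x, y: x + y, a, b)
mul = Operation(lambda x, y: x * y, add, c)
print(mul.evaluate())  # 20.0
```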
  • In existing industry solutions, machine learning is executed primarily in a central cloud.
  • However, the latest industry trends bring computing and cloud functions closer to the edge, where machine learning can also gain the benefits of being involved in closer local functional loops and being closer to the data sources.
  • In such an expanding edge ecosystem, there are also very big threats in the security areas.
  • With a growing number of edge cloud players, there is a growing demand for authentication and/or authorization of different edge cloud interactions.
  • Different protective mechanisms such as authentication of interacting peers, validation of shared data and software (SW), and authorization of remote management actions can be used in order to protect local system from unauthorized interactions and violations of privacy.
  • the machine learning model context can be, for instance, a machine learning model, a model training algorithm, a machine learning function, machine learning inferences or any other machine learning optimization artifact.
  • some pioneer edge solutions utilize pre-trained machine learning models that are “pre-baked” or tightly coupled with the application.
  • Data for training the ML model is preferably collected and processed in the central cloud and deployed in further improved application versions together with the most recent machine learning models.
  • The machine learning model context is installed on the providers’ premises as part of a tightly coupled application or service with limited dynamics regarding updates and optimization improvements.
  • An object of embodiments herein is to provide a distributed ML model concept in a secure and efficient manner.
  • the object is achieved by a method performed by a first network node in a communications network for handling a first machine learning model being associated with a second machine learning model in a distributed machine learning model architecture.
  • the first network node transmits to a second network node, data related to the first machine learning model, wherein the data is signed with a signature that is verifiable by the second network node as a signature generated by a trusted party.
  • the object is achieved by a method performed by a second network node in a communications network for handling a second machine learning model being associated with a first machine learning model in a distributed machine learning model architecture.
  • the second network node receives from a first network node, data related to the first machine learning model, wherein the data is signed with a signature that is verifiable by the second network node as a signature generated by a trusted party.
  • the second network node authenticates the data based on the signature; and upon authentication, the second network node updates the second machine learning model taking the received data into account.
  • the object is achieved by providing a first network node for handling a first machine learning model being associated with a second machine learning model in a distributed machine learning model architecture.
  • the first network node is configured to transmit to a second network node, data related to the first machine learning model, wherein the data is signed with a signature that is verifiable by the second network node as a signature generated by a trusted party.
  • the object is achieved by providing a second network node for handling a second machine learning model being associated with a first machine learning model in a distributed machine learning model architecture.
  • the second network node is configured to receive from a first network node, data related to the first machine learning model, wherein the data is signed with a signature that is verifiable by the second network node as a signature generated by a trusted party.
  • the second network node is configured to authenticate the data based on the signature; and upon authentication, the second network node is configured to update the second machine learning model taking the received data into account.
  • a computer program product comprising instructions, which, when executed on at least one processor, cause the at least one processor to carry out the method above, as performed by the first and second network node, respectively.
  • a computer-readable storage medium having stored thereon a computer program product comprising instructions which, when executed on at least one processor, cause the at least one processor to carry out the method according to the method above, as performed by the first and second network node, respectively.
  • Fig. 1 is a schematic block diagram illustrating embodiments of a communications network.
  • Fig. 2 is a combined signalling scheme and flowchart depicting embodiments herein.
  • Fig. 3a is a flowchart depicting embodiments of a method performed by a first network node in the communications network.
  • Fig. 3b is a flowchart depicting embodiments of a method performed by a second network node in the communications network.
  • Fig. 4 is a combined signalling scheme and flowchart depicting embodiments herein.
  • Fig. 5 is a schematic block diagram depicting embodiments herein.
  • Fig. 6 is a flowchart depicting some embodiments herein.
  • Fig. 7 is a flowchart depicting some embodiments herein.
  • Fig. 8 is a flowchart depicting some embodiments herein.
  • Fig. 9 is a schematic block diagram illustrating a first network node according to embodiments herein.
  • Fig. 10 is a schematic block diagram illustrating a second network node according to embodiments herein.

DETAILED DESCRIPTION
  • Fig. 1 is a schematic overview depicting a communications network 100 wherein embodiments herein may be implemented.
  • the communications network 100 may for example comprise one or more edge cloud networks, one or more Access Networks (ANs), e.g. in the form of Radio Access Network(s) (RANs) for providing radio access to the communication network 100, and/or one or more Core Networks (CNs).
  • the communications network 100 may use any technology such as 5G new radio (NR) but may further use a number of other different technologies, such as, Wi-Fi, long term evolution (LTE), LTE-Advanced, wideband code division multiple access (WCDMA), global system for mobile communications/enhanced data rate for GSM evolution (GSM/EDGE), worldwide interoperability for microwave access (WiMax), or ultra-mobile broadband (UMB), just to mention a few possible implementations.
  • Embodiments herein disclose a distributed machine learning model architecture wherein network nodes operate in the communications network 100.
  • a first network node 10 and a second network node 12 e.g. network nodes comprising agents, agents for short, or network node acting as peers, are comprised in the communications network 100.
  • a network node may be a cloud based server or an application server providing processing capacity for e.g. executing ML models.
  • The network node may alternatively be a transmission and reception point, e.g. a radio access network node such as a base station, e.g. a radio base station such as a NodeB, an evolved Node B (eNB, eNode B), an NR Node B (gNB), a base transceiver station, a radio remote unit, an Access Point Base Station, a base station router, a transmission arrangement of a radio base station, a stand-alone access point, a Wireless Local Area Network (WLAN) access point, an Access Point Station (AP STA), an access controller, a UE acting as an access point or a peer in a Mobile device to Mobile device (D2D) communication, or any other network unit capable of communicating with a UE within an area, e.g. a cell, served by the network node depending e.g. on the radio access technology and terminology used.
  • the communications network 100 may comprise another network node such as a central network node 14 operable to communicate with all network nodes in the communications network.
  • the first network node 10 comprises a first ML model and the second network node 12 comprises a second ML model, wherein the first ML model is associated with the second ML model.
  • the first ML model may be a different version of the second ML model but based on a same base ML model.
  • the first edge system could offer upload/update/patch/optimization of a pre-trained machine learning model or related model training sequence to the neighborhood edge systems (peers).
  • An unauthorized party acting as a trusted party can cause hazardous behavior with catastrophic incidents for the system and the users of the system.
  • What is needed is an open solution to seamlessly authenticate and validate the model context itself and, in many cases, also the owner of the machine learning model.
  • Embodiments herein provide a secure and an open solution for machine learning model context exchange between edge machine learning domains, i.e. between the first and second network node.
  • Embodiments herein enable a secure decentralized machine learning and open peer-to-peer interactions including any collaborative type of learnings.
  • the first network node 10 and the second network node 12 act as collaboration peers.
  • the first and second network nodes exchange data that may be any ML model context such as machine learning algorithms, configuration, input/output data, metadata, sharing policies and required capabilities, system/subsystem references, etc.
  • the embodiments herein may comprise one or more of the following three trust establishment segments to enable such an open machine learning model context exchange and to protect local systems from unauthorized machine learning interactions and hazardous system behavior: 1) authentication of the machine learning model exchanger, 2) authorization of machine learning model, 3) validation of the machine learning model.
  • This enables trusted decentralized learning on e.g. edge network nodes and enables secured machine learning model exchange across the communications network. Furthermore, distributed edge machine learning is protected from hazardous behavior triggered by unauthorized parties.
  • Fig. 2 is a combined signalling scheme and flowchart depicting embodiments herein.
  • The second network node 12 may set up a secure connection with the first network node 10 for securely communicating with one another. E.g. the second network node 12 may perform an action that establishes a secure connection with the peer. Additionally or alternatively, if the first network node 10 is about to “push” its model to the second network node 12, then the first network node 10 may set up the secure connection. It should be noted that a secure session between two entities is set up together by those two entities. It should further be noted that one or more secure connections may be set up to the central network node 14.
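  • As one possible, hedged illustration of such a secure connection set-up, the sketch below shows a mutually authenticated TLS connection from the second network node to the first network node using Python's standard ssl module; the host name and certificate file names are placeholders, and the embodiments do not prescribe any particular protocol.

```python
# Sketch only: mutually authenticated TLS between the two network nodes
# (second network node acting as client). Host and file names are hypothetical.
import socket
import ssl

context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH,
                                     cafile="trusted_group_ca.pem")
# Presenting a client certificate enables mutual authentication.
context.load_cert_chain(certfile="second_node_cert.pem",
                        keyfile="second_node_key.pem")

with socket.create_connection(("first-network-node.example", 8443)) as sock:
    with context.wrap_socket(sock,
                             server_hostname="first-network-node.example") as tls:
        tls.sendall(b"request: update of the second ML model")
        signed_data = tls.recv(4096)   # e.g. signed model data or weight difs
```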
  • the second network node 12 may request an update of the second ML model at the second network node 12.
  • the request may be sent to the first network node 10 or another node such as the central network node 14.
  • the first network node 10 may negotiate, e.g. agree, about data to exchange and/or what is available, e.g. by checking sharing policies for model etc. before sharing the model and/or the updated data.
  • the central network node 14 may trigger the second network node 12 to request the model and/or update.
  • the request may thus come from e.g. the second network node 12 directly or via the central network node 14 and may be triggered by the second network node 12 or the central network node 14.
  • the first network node 10 retrieves data for the first ML model, e.g. updated input parameters, weights, inferences.
  • the first network node 10 may then upon receiving the request from the second network node 12 or another node such as the central network node 14, send data e.g. retrieved data to the second network node 12.
  • the data, or message carrying the data is signed with a signature that can be verified, i.e. that is verifiable, by the second network node 12 as a signature generated by a trusted party.
  • the first network node 10 may e.g. sign the data it shares with the second network node 12 in order to show who provided the modifications to the model, or who provided the model. Alternatively or additionally the model creator signs the base model it provides to all clients to show who created/provided the model.
  • the second network node 12 may then verify the first network node 10. This may be performed as a part of establishing a secure connection with one another, but may alternatively be performed after the secure connection has been established.
  • Both the first network node 10 and data related to the first ML model being exchanged may be authenticated.
  • The reason for authenticating the first network node 10 is firstly to know the identity of the first network node 10 that the second network node 12 is communicating with.
  • the benefit is that the second network node 12 can verify that the data is coming from a trusted party, e.g. agents belonging to a same group, e.g. business group for agents owned by the same enterprise, or hobbyist group of agents owned by individuals sharing a hobby, e.g. tennis enthusiasts.
  • the second network node 12 prefers to get the model from a trusted party.
  • the second network node 12 can also get data such as a model from a previously unknown network node.
  • The identity of the network node can be verified, and the identity may then be used for signing, e.g. with a digital signature, MAC, keyed hash etc., the data such as the first ML model or difference values of the first ML model. This provides proof that the specific network node has provided the data.
  • the signature of the data done with the network node identity binds the data to the network node and thus the network node cannot deny providing the data.
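  • A minimal sketch of how such a signature could bind the shared data to the providing node, here using Ed25519 via the Python cryptography package (an illustrative assumption; the embodiments do not mandate any particular signature scheme or library):

```python
# Sketch: sign serialized ML model data with the sender's key so the receiver
# can verify its origin. Ed25519 and the `cryptography` package are assumptions.
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# First network node: sign the serialized model data (or dif of weights).
sender_key = Ed25519PrivateKey.generate()
model_data = b"...serialized first ML model or weight difs..."
signature = sender_key.sign(model_data)

# Second network node: verify the signature against the sender's public key,
# e.g. obtained via a certificate of a trusted party / the same group.
sender_public_key = sender_key.public_key()
try:
    sender_public_key.verify(signature, model_data)
    print("data authenticated: provided by the claimed network node")
except InvalidSignature:
    print("signature invalid: discard the received data")
```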
  • The first network node 10 may share the data only with authorized network nodes, e.g. only with peers belonging to the same group. This is typically preferred since the data may be something of a business secret that should not be given to just anyone, but typically only to network nodes in the same group, e.g. belonging to the same enterprise.
  • the second network node 12 authenticates the data based on the signature of the signed data. Thus, after successful verification, the received data is required to be authorized in order to be deployed in the system.
  • the data may be wrapped in a wrapping format, e.g. Open Neural Network Exchange (ONNX) format or Predictive Model Markup Language (PMML) format.
  • the second network node 12 may unwrap the data and verify that the data is signed with a signature that can be verified by the second network node as a signature generated by a trusted party, typically by the model creator.
  • When signing the data, some metadata of e.g. the first ML model may be added and consequently signed, e.g.:
  • a base model signature created by the model creator;
  • a dif signature created by the dif provider, i.e. the peer providing the update.
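  • One conceivable shape of such a signed payload, carrying both a base model signature from the model creator and a dif signature from the peer providing the update, is sketched below; every field name is a hypothetical illustration rather than a defined format.

```python
# Hypothetical payload layout combining the two signatures mentioned above.
# Real deployments would wrap the model itself in e.g. ONNX or PMML separately.
import hashlib
import json

payload = {
    "base_model_hash": hashlib.sha256(b"...base model bytes...").hexdigest(),
    "base_model_signature": "<signature created by the model creator>",
    "dif": {"layer_1": [0.01, -0.02], "layer_2": [0.003]},   # weight differences
    "metadata": {"model_type": "image-recognition", "input": "RGB tensor"},
}
dif_bytes = json.dumps(payload["dif"], sort_keys=True).encode()
payload["dif_signature"] = "<signature over dif_bytes created by the dif provider>"
print(json.dumps(payload, indent=2))
```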
  • the second network node 12 may then use the authenticated data to update the second ML model in the second network node 12.
  • the second network node 12 may further validate the updated second ML model. After successful authorization of the data, the second network node 12 may perform a validation of the received data in practice while protecting a local working system of the second network node 12, e.g. providing a safe mode validation enforcing system isolation limiting impacts of the shared machine learning inferences, i.e. the output. Inferences may be disregarded and kept in local isolated loops until the functionality of the updated second ML model is validated. After the second network node 12 has finalized the validation, the updated second ML model may thus replace the old second ML model so the second network node 12 may perform more accurate inferences based on the usage of the updated second ML model.
  • Embodiments herein enable decentralized machine learning across the multitude of edge machine learning domains, and support any machine learning interaction, such as peer-to-peer interaction, between the edge domains but also distribution interaction with a central cloud.
  • Embodiments herein disclose a solution that may use three dependent segments that are designed to be runnable even in computationally light environments such as a constrained edge cloud. These three segments are: 1) authentication of the machine learning model exchanger, 2) authorization of the machine learning model, and 3) validation of the machine learning model.
  • the network nodes may thus act as agents.
  • An agent may comprise an independent software that is responsible for machine learning activities such as model training, model evaluation, resource monitoring and agent-to-agent communication.
  • the agent is capable of performing machine learning in both centralized and decentralized manner. Agents here may also be referred to as peers.
  • the first and second network nodes may thus act as collaborating peers.
  • When e.g. the second network node 12 detects a demand to update and improve its own local machine learning model context due to a detected system optimization problem, it can request potential updates across neighborhood machine learning agents such as edge peers, e.g. the first network node 10.
  • Models here can be related to optimization of the whole system or a subsystem, where the whole system may contain multiple independent ML models that are interconnected with one another to form the whole system.
  • A similar situation can be triggered in the opposite direction, where the first network node 10, that has a “good” model, may offer updates to the neighboring second network node 12.
  • Another machine learning interaction case is related to the distributed or collaborative machine learning where one machine learning network node can propose collaborative machine learning utilizing other network nodes and exchange any related machine learning model context i.e. data related to ML modelling.
  • Embodiments herein provide a secure authentication and validation solution for the machine learning related request(s) prior to any machine learning model context exchange, which in turn also has to be verified and validated.
  • the second network node 12 receiving the data may after authenticating the sending first network node 10, analyze the status of the local machine learning model and may, after verification and validation of the offered model, accept the offered model update.
  • the receiving second network node 12 processes the received data related to the first ML model against e.g. local machine learning related policies and may compose a response.
  • Received data here may contain multiple elements such as machine learning algorithms, configuration, input/output data, metadata, sharing policies and required capabilities, system/subsystem references, etc.
  • The second network node 12, that receives the data, may analyze the received data to identify, firstly, compatibility with the second ML model and, secondly, the scope of the collaboration where, for instance, the received update can be violating sharing policies or be too wide.
  • the ML model may be for the whole (wide) system, where the second network node 12 may decide to apply only an isolated subset of the ML model update, to optimize only a subset of the system rather than the whole system.
  • the original offered model can be partly incompatible or against the local machine learning policies.
  • Data related to the first ML model may be shared in multiple ways.
  • the full model of the first network node 10 with weights etc. could be shared. However, then it is difficult to know the origin of the model. Instead the original model, e.g. signed by its creator, can be shared (with the signature of the creator) to provide provenance of the model.
  • the first and second network nodes may verify that the first and second network nodes work with the same base model e.g. by sharing the signature, a hash of the signature, or a hash of the base model.
  • the changes that have been done to the model, or more specifically to its weights, through training may be provided as a separate dataset.
  • This dataset may be presented as a difference (dif) between original weights and trained weights.
  • the second network node 12 may verify the original model and who created it, and then apply the changes (dif) to the model as received from the first network node 10.
  • the received dif may be verified to originate from the authenticated first network node 10. If only some weights have been changed, providing the dif will reduce the amount of data that needs to be communicated. It should be noted that weights may be shared instead of the difs e.g. if all weights have changed. If the second network node 12 already has the same original ML model, it is enough to receive the dif from the first network node 10.
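  • A minimal sketch of how such a dif could be computed by the first network node and applied by the second network node, assuming both hold the same base model weights (NumPy arrays are used purely for illustration):

```python
# Sketch: dif = trained weights - original weights; the receiver applies it
# to its copy of the same base model. NumPy usage is an assumption.
import numpy as np

# First network node (dif provider):
original_weights = np.array([0.10, -0.30, 0.05])   # weights of the shared base model
trained_weights  = np.array([0.12, -0.28, 0.02])   # weights after local training
dif = trained_weights - original_weights            # only this needs to be sent (signed)

# Second network node (after authenticating the signed dif):
local_base_weights = np.array([0.10, -0.30, 0.05])  # same base model
updated_weights = local_base_weights + dif
print(updated_weights)  # [ 0.12 -0.28  0.02]
```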
  • The second network node 12 may receive data from multiple network nodes, and if the second network node 12 wants to compute an average of weights from multiple network nodes, the second network node 12 may receive difs or weights from the network nodes.
  • the method actions performed by the first network node 10 in the communications network for handling the first machine learning model being associated with the second machine learning model in the distributed machine learning model architecture will now be described with reference to a flowchart depicted in Fig. 3a.
  • the actions do not have to be taken in the order stated below, but may be taken in any suitable order.
  • Actions performed in some embodiments are marked with dashed boxes.
  • the first ML model may e.g. be: an updated or optimized version of the second ML model, a partly same ML model as the second ML model; a ML model of a different manufacturer; or a different version of the second ML model but of the same base ML model.
  • The first network node 10 may set up, or the second network node 12 may set up in cooperation with the first network node 10, a secure connection to the second network node 12, and wherein the secure connection is used for transmitting the data.
  • the exchange of the data and negotiation of the exchange of the data may be performed over said secure connection.
  • Action 302. The first network node 10 may receive a request to retrieve data of the first ML model. This request may be sent over the secure connection.
  • the first network node 10 may verify that the second network node 12 is a trusted party, e.g. verifying that the second network node 12 belongs to a same group such as a same company or similar.
  • Action 304. The first network node 10 transmits to the second network node 12, data related to the first ML model, wherein the data is signed with a signature that is verifiable by the second network node 12 as a signature generated by a trusted party.
  • the data may comprise the first ML model and/or data related to one or more weight values of the first ML model.
  • The data may e.g. comprise one or more difference values, see dif above, relative to previous weights used in the first ML model.
  • the signature may be that of a creator of the model and/or of the first network node 10.
  • the method actions performed by the second network node 12 in the communications network for handling the second machine learning model being associated with the first machine learning model in the distributed machine learning model architecture will now be described with reference to a flowchart depicted in Fig. 3b.
  • the actions do not have to be taken in the order stated below, but may be taken in any suitable order.
  • Actions performed in some embodiments are marked with dashed boxes.
  • The second network node 12 may set up the secure connection to the first network node 10 for securely communicating with the first network node 10.
  • the second network node 12 may transmit to the first network node 10 or another network node, a request to update the second machine learning model.
  • the second network node 12 receives from the first network node 10, data related to the first ML model, wherein the data is signed with the signature that is verifiable by the second network node 12 as a signature generated by a trusted party.
  • the data may comprise the first ML model and/or data related to one or more weight values of the first ML model.
  • the data may e.g. comprise one or more difference values, see dif above, relative to previous weights used in the first ML model.
  • The second network node 12 may verify that the first network node 10 is a trusted party.
  • the second network node 12 authenticates the data based on the signature.
  • the second network node 12 may average the data received with other data received from one or more other network nodes.
  • the second network node upon authentication, updates the second machine learning model taking the received data into account.
  • the second network node 12 may further validate that the updated second machine learning model is executing correctly.
  • The first network node 10 signs the data to verify its own identity, proving that it is the first network node 10 that provides the data.
  • Digital signatures based on certificates or raw public keys may be used since symmetric key solutions are more difficult to scale when it is unknown beforehand which network nodes the first network node 10 will communicate with.
  • the source agent and destination agent may be identified as suitable communicating peers for model exchange.
  • The agents may establish a secure session between each other and typically mutually authenticate each other, typically using public key certificates.
  • the destination agent may request the model from the source agent. It should here be noted that the agents may negotiate what exactly to send: e.g. a fully trained model of the source agent; an original model and a dif of weights or trained weights; or only send the dif(s) or full set of trained weights if both agents have the same base model.
  • the source agent provides the data such as the model and/or potential additional data to the destination agent, all the data is signed.
  • the model may be signed by only the source agent, and that requires that the destination agent fully trusts the source agent.
  • The model may be signed by the model creator, and the source agent provides the modifications, e.g. changes to weights or a dif, that it has created to the model; the modifications may be signed by the source agent.
  • the data may be signed with one or more signatures.
  • the destination agent verifies the received data such as the model, including verifying that it is signed by a trusted party, is the type of model requested etc., may further verify it works as expected and then applies the model in its own system.
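  • The destination agent's checks described above could be sketched as the following simple pipeline; the package layout and the injected verify_signature callable are hypothetical stand-ins for the mechanisms described in this disclosure.

```python
# Sketch of the destination agent's handling of received data; all helpers are
# simplified stand-ins, not an actual implementation of the embodiments.
def handle_received_package(package, verify_signature, requested_model_type):
    """package: dict with 'model', 'signature', 'metadata', optional 'dif'/'dif_signature'."""
    if not verify_signature(package["model"], package["signature"]):
        raise ValueError("model is not signed by a trusted party")
    if package["metadata"].get("model_type") != requested_model_type:
        raise ValueError("received model is not the type of model that was requested")
    if "dif" in package and not verify_signature(package["dif"], package["dif_signature"]):
        raise ValueError("dif is not signed by the providing peer")
    return package   # accepted; next step would be safe-mode validation

# Toy usage with a trivially accepting verifier, for illustration only.
accepted = handle_received_package(
    {"model": b"m", "signature": b"s", "metadata": {"model_type": "regression"}},
    verify_signature=lambda data, sig: True,
    requested_model_type="regression",
)
```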
  • Some embodiments herein relate to negotiating what data to exchange, such as a full model, an original model, dif values, or only dif values, and securing the exchanged data.
  • the exchange of the data could be recorded in a blockchain. It would provide proof of the transaction, i.e. who sent what to whom and when.
  • network nodes may register to a trusted blockchain system and may use a smart contract exchange format with corresponding business model and sharing policies.
  • Registered peers i.e. network nodes, may be graded based on trust considering previous exchanges and may also share smart tags indicating machine learning model type and sharing polices.
  • a first peer such as the second network node 12, may discover registered peers providing wished model and acceptable sharing policies. Best matching peer(s) may be contracted in the smart contract format under validated agreed business model. Contract and related transactions establishment follow regular smart contract mechanisms.
  • the first peer may indicate a model exchange transaction where model provider may validate and contribute transaction context. All such transactions may be validated by involved contractors and recorded as blockchain transactions. This is depicted in Fig. 4. Note that decentralized trust system mechanisms are simplified in this sequence flow.
  • Agents, i.e. network nodes, register to such a system.
  • Agents may create smart contracts under an agreed set of business policies on the model sharing business. Once agents with the same interest are contracted, they may trigger related model exchange transactions and use the trusted system to openly authenticate such transactions and establish secure model exchange sessions.
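  • A full blockchain or smart-contract integration is beyond the scope of a short sketch, but the tamper-evidence idea of recording who sent what to whom and when can be illustrated with a simple hash-chained log (a deliberately simplified, hypothetical stand-in, not a real blockchain):

```python
# Simplified, hypothetical illustration of tamper-evident transaction records.
import hashlib
import json
import time

chain = []

def record_exchange(sender, receiver, data_hash):
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    entry = {"sender": sender, "receiver": receiver,
             "data_hash": data_hash, "time": time.time(), "prev": prev_hash}
    entry["hash"] = hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()
    chain.append(entry)

record_exchange("first_network_node", "second_network_node",
                hashlib.sha256(b"...signed model dif...").hexdigest())
print(chain[-1]["hash"])   # each record chains to the previous one
```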
  • Fig. 5 discloses an embodiment of an alternative model exchange by averaging the weights.
  • The agent that detects non-optimal performance of the model requests weight matrices from network nodes that are utilizing a similar model in machine learning based inferences. Averaging the received weight matrices reduces the possible domination of those weight values that might be suspicious.
  • The second network node 12 may perform the actions stated above to set up the secure connections and may then detect a need to update the second ML model, e.g. an increased loss of the model, see Action 501.
  • the second network node 12 requests an update of the second ML model, e.g. from the first network node 10 or a central network node or another network node, see Action 502.
  • the first network node 10 receives the request from the second network node 12 or another network node and transmits the data e.g. model and/or weights to the second network node 12 directly or via another network node, see Action 503.
  • the second network node 12 authenticates the data and may then average the data such as the weight values, see action 504.
  • When averaging, at least the verification of the identity of the network node providing the data is not necessary in all cases, i.e. when the second network node collects models, or weights of models, from multiple network nodes in order to e.g. average the weights used by those network nodes to obtain a new model.
  • Since the data collected from the network nodes is not used directly, but rather as input to a function that e.g. averages the data, the data received from any individual network node does not necessarily have to be signed by the respective network node.
  • The effect of any single network node’s input is not so big, especially as extreme values would typically be trimmed off before e.g. averaging the received data.
  • The network node doing e.g. the averaging of multiple weights is aware that the result of the e.g. averaging may not be the optimal weights for the model (no-one has provided and/or used those specific weights nor said they will work), and thus the network node takes the risk and is itself responsible for the choice. Authenticating the network node and recording whose input has been used may further be applied.
  • the creator of the model can create just one signature, which can be carried regardless of how the model is wrapped and later potentially re-wrapped. This means that the creator just has to create one signature, instead of one per wrapping format. It also means that if one network node receives the data such as the model in e.g. PMML wrapper, and later wants to forward the model to another network node that only supports e.g. ONNX wrapper, the network node can just wrap it however the network node prefers and the signature will still be usable. However, the receiver of a model signed this way first has to unwrap it before it can verify the signature, i.e. that the model is from a trusted source. This is especially undesirable in cases where unwrapping is a heavy operation, which is the case e.g. for ONNX wrapper as it makes it possible to consume the receiver’s resources with invalid data.
  • the creator may thus sign e.g. the model wrapped in each wrapping format in which it will share the model, i.e. multiple signatures, one per wrapping format.
  • the benefit is that the receiver of the model, that is the second network node, can immediately verify the source and integrity of the received model without unwrapping it. However, if the receiver of the model wants to further share the model, it has to share it in the same wrapper as it was received as the signature is of the wrapped representation of the model.
  • the second alternative is often the preferred option as it does not introduce vulnerabilities as is the case with alternative 1.
  • forwarding the data to other network nodes is limited due to lack of possibility to change wrapping format.
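  • The trade-off between the two signing alternatives can be sketched as follows; wrap() and sign() are trivial placeholders (real ONNX/PMML wrapping and certificate-based signing are far more involved), so the sketch only illustrates what is signed in each alternative.

```python
# Sketch of the two signing alternatives discussed above; wrap() is a stand-in
# for a real wrapping format such as ONNX or PMML, sign() for a real signature.
import hashlib

def wrap(model_bytes, fmt):
    return fmt.encode() + b"|" + model_bytes                 # placeholder wrapping

def sign(data):
    return hashlib.sha256(b"creator-key" + data).hexdigest()  # placeholder signature

model = b"...model bytes..."

# Alternative 1: sign the raw model once; the signature survives re-wrapping,
# but the receiver must unwrap before it can verify.
sig_raw = sign(model)
wrapped_pmml = wrap(model, "PMML")

# Alternative 2: sign each wrapped representation; the receiver can verify
# immediately, but forwarding is bound to the received wrapping format.
sig_per_format = {fmt: sign(wrap(model, fmt)) for fmt in ("ONNX", "PMML")}
```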
  • Embodiments may provide different steps of machine learning model authorization, such as one or more of the following: a. metadata and machine learning model weights authorization and validation; b. model type validation; c. machine learning model software compatibility validation; d. model delta or dif values validation. Note that the order of a and b could be changed if the data is first signed and then wrapped, as discussed earlier.
  • a. Metadata and machine learning model weights authorization and validation
  • One way of providing trust in received data, such as a model, is to have all models issued by the manufacturer signed by the manufacturer. In addition to signing the model, the manufacturer may also sign metadata about the model, such as what the model does, what type of input and output it uses, etc.
  • the new set of weights may be provided instead of the difference of the weights (difs).
  • The second network node receiving the model and dif can, from the model signature, verify who has created the model, and optionally from the attached metadata what type of model it is, the input value set/type and output value set/type, as well as a potential mapping from output value to output category, e.g. in an image recognition model: 0.001 < x < 0.002 means the output is a tree.
  • the second network node thus knows what the model does and how it should be used, and who has created it. Then it only has to trust that the signed dif provided by the network node actually improves the model.
  • The second network node may verify that the dif is signed by the network node. Again, the most natural type of signature would be a public key certificate signature, and the certificate would typically be included with the signature and e.g. the model.
  • b. Model type validation
  • The agent, i.e. the second network node 12, verifies that the model complies with the type of the model that was initially requested.
  • the agent validates that the received model is in a whitelisted model exchange format e.g. ONNX. This may be performed by looking at the metadata of the received model and verifying that it, i.e. the model + the metadata, has been signed by a trusted party, such as the manufacturer.
  • Before the model exchange takes place, the model is converted to a format that represents the model, i.e. the model is wrapped. Conversion, such as the wrapping, of the model facilitates the provisioning of the machine learning model over the network by utilizing commonly used protocols such as Hypertext Transfer Protocol (HTTP), and it enables the models to be used between multiple state-of-the-art tools.
  • Previously mentioned formats include formats such as open neural network exchange (ONNX), predictive model markup language (PMML) and neural network exchange format (NNEF) that are popular in the field of machine learning.
  • c. Machine learning model software compatibility validation
  • In this step, the agent validates that the package can be converted/unwrapped into an executable format by utilizing the local backend. If the model is shared by sharing the original model and a dif (instead of sharing the full trained model, which requires more trust in the sender of the trained model), the original model might already be available in an accepted format. In this case the agent sharing its model may just create and provide the dif, by comparing the weights of the original model and the trained model that is about to be shared. The agent then signs the dif so that the receiver can verify the origin of the dif.
  • After the machine learning agent has received a machine learning model context from another agent, the agent will conduct the authorization of the model by ensuring that the received file is packaged in a model specific format, extracting metadata, validating that the local backend can be used to convert the received model into the desired executable format, and validating that the model is signed by a trusted party. These steps are particularly designed to guarantee that the received package is actually a pre-trained machine learning model that matches what was requested. If some of these steps do not meet the expectations, the machine learning agent is able to prevent the utilization of the model and avoid abusive actions from third party members. The agent also verifies that the received dif is signed by the network node that provided the model.
  • d. Model delta or dif validation
  • the agent may analyze the dif and the (optionally) received model to verify the dif can be applied (i.e. correct number of weights), and that the dif is signed by a trusted party (typically the peer agent).
  • The learning is often referred to as training, which aims to tune the weight values of the model in order to achieve higher accuracy of the model. For that reason, the dif indicates the differences in the weights of the original and trained model. If the dif includes weights that do not fit the model (e.g. too many weights), the base structure of the model is not compatible with the dif, and the agent may prevent the utilization of the received model.
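  • A minimal compatibility check of this kind could look as follows, assuming both the base model weights and the dif are held as per-layer NumPy arrays (an assumption for illustration only):

```python
# Sketch: refuse a dif whose shape does not fit the base model's weights.
import numpy as np

def dif_is_compatible(base_weights, dif):
    """base_weights, dif: dicts mapping layer name -> weight array."""
    if set(base_weights) != set(dif):
        return False
    return all(base_weights[layer].shape == dif[layer].shape for layer in dif)

base = {"layer_1": np.zeros((2, 3)), "layer_2": np.zeros(3)}
good_dif = {"layer_1": np.ones((2, 3)) * 0.01, "layer_2": np.ones(3) * 0.01}
bad_dif  = {"layer_1": np.ones((2, 4))}          # too many weights / wrong shape
print(dif_is_compatible(base, good_dif))  # True
print(dif_is_compatible(base, bad_dif))   # False -> do not apply, discard the model
```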
  • The base structures of the models are tuned, by data scientists, to promote improved performance of the system, so changes in the base structure do not necessarily indicate the interference of a third party member.
  • the updated model with the new base structure needs to be provided with a signature of a trusted party (e.g. manufacturer that has changed the base structure).
  • The machine learning agent can receive/request only the weights of the model, from one or multiple network nodes, instead of the complete model. If weights from multiple network nodes are requested, the agent may merge the received weights by utilizing e.g. averaging. Averaging multiple received weights reduces the dominance of weight values that are caused by a shortage in training or, in the worst case, by abusive actions against the system. Averaging of the weights is mainly utilized in highly scaled peer-to-peer model exchange where the peer is assumed to be trusted, e.g. devices, such as cars, produced by the same company could be thought of as trusted peers.
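  • Merging weights from several peers by averaging, optionally with the most extreme values trimmed off as mentioned earlier, could be sketched as follows (purely illustrative; function and variable names are assumptions):

```python
# Sketch: merge weight matrices received from several peers by (trimmed) averaging,
# which reduces the dominance of suspicious or poorly trained weight values.
import numpy as np

def merge_weights(weight_sets, trim=1):
    """weight_sets: list of arrays of equal shape, one per peer; trim extremes per element."""
    stacked = np.stack(weight_sets)                 # shape: (num_peers, ...)
    ordered = np.sort(stacked, axis=0)
    if trim and stacked.shape[0] > 2 * trim:
        ordered = ordered[trim:-trim]               # drop the most extreme values
    return ordered.mean(axis=0)

peers = [np.array([0.10, 0.11, 0.09]),
         np.array([0.12, 0.10, 0.08]),
         np.array([0.90, 0.10, 0.10])]              # one suspicious outlier
print(merge_weights(peers))                         # the 0.90 outlier is trimmed away
```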
  • Figs 6 and 7 are overviews of flowcharts that describe both complete authorization and non-authorized model exchange.
  • the second network node 12 may determine that the model is convertible into an executable format.
  • the second network node 12 may determine that the model is a correct model.
  • The second network node 12 may further determine that the model is signed by a trusted party. Action 605. The second network node 12 may further determine that the difs of weights are signed by a trusted party.
  • The second network node 12 may in that case consider the model as authorized. In case the second network node 12 determines that the model fails any of the actions 602-605, the second network node 12 may discard the received model, action 607. Action 701. The second network node 12 may receive weights from one or more network nodes.
  • the second network node 12 may determine whether the weights are compatible with a model.
  • Action 703. The second network node 12 may then average the received weights.
  • Action 704. The second network node 12 may then update the model with the averaged weights.
  • the second network node 12 may discard the received model, action 705.
  • The segments in the flowchart of the complete authorization sequence can be executed in a different order based on the security policies of the machine learning agent. For example, if the agent receives a model that was signed before being converted to an exchange format, e.g. ONNX, the agent may need to unwrap the model before it can inspect the signature. In weight averaging, there would not need to be authorization, since the averaging removes the dominance of suspicious weight matrices when the received number of weights is high.
  • the second network node 12 may obtain an authorized model.
  • the second network node 12 may mount the model in safe mode.
  • the second network node 12 may further determine whether the model improves overall performance.
  • the second network node 12 may replace the model, i.e. use the received model.
  • the second network node 12 may discard the received model.
  • Validation is done directly in the system allowing processing of the real system inputs, e.g. data, and generation of the related outputs without authorizing end actions on the rest of the system. That can be done by allocating a redundant ML subsystem for the received model in parallel to the original model where data inputs are shared to both the old and new model/subsystem while outputs are considered only from the original ML subsystem. Basically, there would in the system be two different instances of a model performing a specific action, both getting the same live system input, but only the output of the original model being used in the live system.
  • the output of the new model could be compared to the output of the original and the difference could be recorded. From this it is possible to see if there are some cases where the models give very different outputs which could indicate an issue with the new model. After suitable time of evaluation of the new model, if it performs as expected, the original model can be replaced by the new model.
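  • The parallel safe-mode validation described above can be sketched as follows, where both model instances receive the same live inputs, only the original model's output drives the live system, and the output differences are recorded; the models are stood in for by simple callables and the threshold is an arbitrary illustrative value.

```python
# Sketch of safe-mode (shadow) validation: both models see the same live inputs,
# only the original model's output is used, and disagreements are recorded.
differences = []

def shadow_validate(original_model, new_model, live_inputs, threshold=0.1):
    for x in live_inputs:
        live_output = original_model(x)      # used by the live system
        candidate   = new_model(x)           # isolated: output not acted upon
        differences.append(abs(live_output - candidate))
    # After a suitable evaluation period, decide whether to replace the model.
    return max(differences) <= threshold

original = lambda x: 2.0 * x
candidate = lambda x: 2.05 * x               # slightly different trained version
if shadow_validate(original, candidate, live_inputs=[0.5, 1.0, 1.5]):
    print("validated: candidate may replace the original model")
else:
    print("large output differences observed: keep the original model")
```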
  • Validation in a local sandbox: This alternative would allocate a complete system sandbox where the model can be freely validated without affecting the physical edge system. Thus, there would be two complete systems running: the live system and the sandbox with the new model replacing the old model. This alternative may be disregarded in constrained edge domains due to high resource requirements.
  • Validation in an external trusted sandbox: This alternative would use a trusted external sandbox offered by a trusted entity.
  • The trusted entity can be business domain specific (e.g. an enterprise sandbox) or an entity regarded as trusted by the trusted system, e.g. a blockchain smart contract, or an authorized model provider.
  • the received model should be compatible with the data that the system/subsystem utilizes.
  • the outputs are monitored, and the difference between outputs of received and old models are observed. If the absolute difference between these two outputs is relatively high, it is likely that the new model is not a good choice as typically changes provide more fine-tuning optimizations than full change of operation. This is something that may be analyzed and decided case by case.
  • Fig. 8 shows an overview of a model validation procedure.
  • The authorized model is mounted in a safe mode, which refers to a mode where the outputs of the model are not utilized until the model is validated.
  • The model is validated if it actually manages to improve overall performance after testing with a comprehensive sample of data.
  • Fig. 9 is a block diagram depicting the first network node 10 for handling a first machine learning model being associated with a second machine learning model in a distributed machine learning model architecture according to embodiments herein.
  • the first network node 10 may comprise processing circuitry 901 , e.g. one or more processors, configured to perform the methods herein.
  • processing circuitry 901 e.g. one or more processors, configured to perform the methods herein.
  • the first network node 10 may comprise a transmitting unit 902.
  • the first network node 10, the processing circuitry 901 , and/or the transmitting unit 902 is configured to transmit to the second network node 12, data related to the first machine learning model, wherein the data is signed with a signature that is verifiable by the second network node 12 as the signature generated by the trusted party.
  • the data may comprise the first machine learning model.
  • the data may comprise data related to one or more weight values of the first machine learning model.
  • The data may comprise one or more difference values relative to previous weights used in the first machine learning model.
  • the first network node 10 may comprise a setting up unit 903.
  • the first network node 10, the processing circuitry 901 , and/or the setting up unit 903 may be configured to set up a secure connection to the second network node 12, and wherein the secure connection is used for transmitting the data.
  • The first network node 10 may comprise a receiving unit 904.
  • the first network node 10, the processing circuitry 901 , and/or the receiving unit 904 may be configured to receive the request to retrieve data of the first machine learning model.
  • the first network node 10 may comprise a verifying unit 905.
  • the first network node 10, the processing circuitry 901 , and/or the verifying unit 905 may be configured to verify that the second network node 12 is a trusted party.
  • the first network node 10 further comprises a memory 906.
  • the memory 906 comprises one or more units to be used to store data on, such as signatures, loss values, weight values, data such as machine learning algorithms, configuration, input/output data, metadata, sharing policies and required capabilities, system/subsystem references, etc. and applications to perform the methods disclosed herein when being executed, and similar.
  • the first network node 10 may further comprise a communication interface comprising e.g. one or more antenna or antenna elements.
  • The methods according to the embodiments described herein for the first network node 10 are respectively implemented by means of e.g. a computer program product 907 or a computer program comprising instructions, i.e., software code portions, which, when executed on at least one processor, cause the at least one processor to carry out the actions described herein, as performed by the first network node 10.
  • the computer program product 907 may be stored on a computer-readable storage medium 908, e.g. a disc, a universal serial bus (USB) stick or similar.
  • the computer-readable storage medium 908, having stored thereon the computer program product may comprise the instructions which, when executed on at least one processor, cause the at least one processor to carry out the actions described herein, as performed by the first network node 10.
  • the computer-readable storage medium may be a transitory or a non-transitory computer-readable storage medium.
  • Fig. 10 is a block diagram depicting the second network node 12 for handling a second machine learning model being associated with a first machine learning model in a distributed machine learning model architecture according to embodiments herein.
  • the second network node 12 may comprise processing circuitry 1001 , e.g. one or more processors, configured to perform the methods herein.
  • processing circuitry 1001 e.g. one or more processors, configured to perform the methods herein.
  • the second network node 12 may comprise a receiving unit 1002.
  • the second network node 12, the processing circuitry 1001 , and/or the receiving unit 1002 is configured to receive from the first network node 10, data related to the first machine learning model, wherein the data is signed with the signature that is verifiable by the second network node 12 as the signature generated by a trusted party.
  • the data may comprise the first machine learning model.
  • the data may comprise data related to one or more weight values of the first machine learning model.
  • The data may comprise one or more difference values relative to previous weights used in the first machine learning model.
  • the second network node 12 may comprise an authenticating unit 1003.
  • the second network node 12, the processing circuitry 1001 , and/or the authenticating unit 1003 is configured to authenticate the data based on the signature.
  • the second network node 12 may comprise an updating unit 1004.
  • the second network node 12, the processing circuitry 1001 , and/or the updating unit 1004 is configured to, upon authentication, update the second machine learning model taking the received data into account.
  • the second network node 12 may comprise a verifying unit 1005.
  • the second network node 12, the processing circuitry 1001 , and/or the verifying unit 1005 may be configured to verify that the first network node 10 is a trusted party.
  • the second network node 12 may comprise a validating unit 1006.
  • the second network node 12, the processing circuitry 1001 , and/or the validating unit 1006 may be configured to validate that the updated second machine learning model is executing correctly.
  • the second network node 12 may comprise a setting up unit 1007.
  • the second network node 12, the processing circuitry 1001 , and/or the setting up unit 1007 may be configured to set up a secure connection to the first network node 10 for securely communicating with the first network node 10.
  • the second network node 12 may comprise a transmitting unit 1008.
  • the second network node 12, the processing circuitry 1001 , and/or the transmitting unit 1008 may be configured to transmit to the first network node 10 or another network node, a request to update the second machine learning model.
  • the second network node 12 may comprise an averaging unit 1009.
  • the second network node 12, the processing circuitry 1001 , and/or the averaging unit 1009 may be configured to average the data received with other data received from one or more other network nodes.
  • the second network node 12 further comprises a memory 1010.
  • The memory 1010 comprises one or more units to be used to store data on, such as signatures, loss values, weight values, data such as machine learning algorithms, configuration, input/output data, metadata, sharing policies and required capabilities, system/subsystem references, etc. and applications to perform the methods disclosed herein when being executed, and similar.
  • the second network node 12 may further comprise a communication interface comprising e.g. one or more antenna or antenna elements.
  • the methods according to the embodiments described herein for the second network node 12 are respectively implemented by means of e.g. a computer program product 1011 or a computer program, comprising instructions, i.e., software code portions, which, when executed on at least one processor, cause the at least one processor to carry out the actions described herein, as performed by the second network node 12.
  • the computer program product 1011 may be stored on a computer-readable storage medium 1012, e.g. a disc, a universal serial bus (USB) stick or similar.
  • the computer-readable storage medium 1012, having stored thereon the computer program product may comprise the instructions which, when executed on at least one processor, cause the at least one processor to carry out the actions described herein, as performed by the second network node 12.
  • the computer-readable storage medium may be a transitory or a non-transitory computer-readable storage medium.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

Embodiments herein relate to a method performed by a first network node in a communications network for handling a first machine learning model being associated with a second machine learning model in a distributed machine learning model architecture, the method comprising: - transmitting (304) to a second network node, data related to the first machine learning model, wherein the data is signed with a signature that is verifiable by the second network node as a signature generated by a trusted party.

Description

NETWORK NODES AND METHODS FOR HANDLING MACHINE LEARNING MODELS
IN A COMMUNICATIONS NETWORK
TECHNICAL FIELD Embodiments herein relate to methods and a first and a second network node in a communications network. In particular, they relate to handling machine learning models in the communications network.
BACKGROUND Computational graph models such as machine learning models are currently used in different applications and are based on different technologies. A computational graph model is a directed graph model where nodes correspond to operations or variables. Variables can feed their value into operations, and operations can feed their output into other operations. This way, every node in the graph model defines a function of the variables. Training of these computational graph models is typically an offline process, meaning that it usually happens in datacenters and takes several minutes to hours and days, depending on the underlying technology, the capabilities of the infrastructure used for training and the complexity of the computational graph model, e.g. amount of input data, parameters, etc. On the other hand, execution of these computational graph models is done anywhere from an edge of the communication network also called network edge, e.g. in devices, gateways or radio access infrastructure, to centralized clouds e.g. data centers.
In the existing industry solutions, machine learning (ML) is executed primarily in a central cloud. However, the latest industry trends bring computing and cloud functions closer to the edge, where machine learning can also benefit from being involved in closer local functional loops and from being closer to the data sources.
Some solutions have already been proposed, including placing computing nodes executing ML models closer to the network edge, i.e. closer to the nodes requesting ML executions. Thus, one may place capable computing nodes closer to an entity requesting execution of an ML model in order to reduce the time required to communicate input data to an ML model and get a response.
Decentralization of machine learning introduces many challenges in the management of related processing and data sharing across a growing multitude of edge cloud domains. In addition, a growing number of edge cloud providers, as well as a growing number of applications and services on top, further increases management complexity.
In such an expanding edge ecosystem, there are also significant security threats. With a growing number of edge cloud players, there is a growing demand for authentication and/or authorization of different edge cloud interactions. Different protective mechanisms, such as authentication of interacting peers, validation of shared data and software (SW), and authorization of remote management actions, can be used in order to protect the local system from unauthorized interactions and violations of privacy.
Existing industry machine learning solutions are executed primarily in the central cloud, where the context of the machine learning model is distributed privately in the same cloud domain. The machine learning model context can be, for instance, a machine learning model, a model training algorithm, a machine learning function, machine learning inferences or any other machine learning optimization artifact. With the latest industry trend of moving machine learning closer to the edge, some pioneer edge solutions utilize pre-trained machine learning models that are “pre-baked” or tightly coupled with the application. In the latter case, the data training the ML model is preferably collected and processed in the central cloud and deployed in further improved application versions with the most recent machine learning models. In the majority of edge machine learning cases, the machine learning model context is installed on the providers’ premises as part of a tightly coupled application or service with limited dynamics for updates and optimization improvements.
SUMMARY
An object of embodiments herein is to provide a distributed ML model concept in a secure and efficient manner. According to an aspect of embodiments herein, the object is achieved by a method performed by a first network node in a communications network for handling a first machine learning model being associated with a second machine learning model in a distributed machine learning model architecture. The first network node transmits to a second network node, data related to the first machine learning model, wherein the data is signed with a signature that is verifiable by the second network node as a signature generated by a trusted party.
According to another aspect of embodiments herein, the object is achieved by a method performed by a second network node in a communications network for handling a second machine learning model being associated with a first machine learning model in a distributed machine learning model architecture. The second network node receives from a first network node, data related to the first machine learning model, wherein the data is signed with a signature that is verifiable by the second network node as a signature generated by a trusted party. The second network node authenticates the data based on the signature; and upon authentication, the second network node updates the second machine learning model taking the received data into account.
According to yet another aspect of embodiments herein, the object is achieved by providing a first network node for handling a first machine learning model being associated with a second machine learning model in a distributed machine learning model architecture. The first network node is configured to transmit to a second network node, data related to the first machine learning model, wherein the data is signed with a signature that is verifiable by the second network node as a signature generated by a trusted party.
According to still another aspect of embodiments herein, the object is achieved by providing a second network node for handling a second machine learning model being associated with a first machine learning model in a distributed machine learning model architecture. The second network node is configured to receive from a first network node, data related to the first machine learning model, wherein the data is signed with a signature that is verifiable by the second network node as a signature generated by a trusted party. The second network node is configured to authenticate the data based on the signature; and upon authentication, the second network node is configured to update the second machine learning model taking the received data into account.
It is furthermore provided herein a computer program product comprising instructions, which, when executed on at least one processor, cause the at least one processor to carry out the method above, as performed by the first and second network node, respectively. It is additionally provided herein a computer-readable storage medium, having stored thereon a computer program product comprising instructions which, when executed on at least one processor, cause the at least one processor to carry out the method according to the method above, as performed by the first and second network node, respectively. Embodiments herein provide a secure authentication and validation solution for the machine learning related requests prior to any machine learning model context exchange. Thus, allowing a distributed ML model concept in a secure and efficient manner.
BRIEF DESCRIPTION OF THE DRAWINGS Examples of embodiments herein are described in more detail with reference to attached drawings in which:
Fig. 1 is a schematic block diagram illustrating embodiments of a communications network.
Fig. 2 is a combined signalling scheme and flowchart depicting embodiments herein.
Fig. 3a is a flowchart depicting embodiments of a method performed by a first network node in the communications network.
Fig. 3b is a flowchart depicting embodiments of a method performed by a second network node in the communications network.
Fig. 4 is a combined signalling scheme and flowchart depicting embodiments herein.
Fig. 5 is a schematic block diagram depicting embodiments herein.
Fig. 6 is a flowchart depicting some embodiments herein.
Fig. 7 is a flowchart depicting some embodiments herein.
Fig. 8 is a flowchart depicting some embodiments herein.
Fig. 9 is a schematic block diagram illustrating a first network node according to embodiments herein.
Fig. 10 is a schematic block diagram illustrating a second network node according to embodiments herein.
DETAILED DESCRIPTION
Fig. 1 is a schematic overview depicting a communications network 100 wherein embodiments herein may be implemented. The communications network 100 may for example comprise one or more edge cloud networks, one or more Access Networks (ANs), e.g. in the form of Radio Access Network(s) (RANs) for providing radio access to the communication network 100, and/or one or more Core Networks (CNs). The communications network 100 may use any technology such as 5G new radio (NR) but may further use a number of other different technologies, such as, Wi-Fi, long term evolution (LTE), LTE-Advanced, wideband code division multiple access (WCDMA), global system for mobile communications/enhanced data rate for GSM evolution (GSM/EDGE), worldwide interoperability for microwave access (WiMax), or ultra-mobile broadband (UMB), just to mention a few possible implementations.
Embodiments herein disclose a distributed machine learning model architecture wherein network nodes operate in the communications network 100. For example, a first network node 10 and a second network node 12, e.g. network nodes comprising agents, agents for short, or network node acting as peers, are comprised in the communications network 100. Such a network node may be a cloud based server or an application server providing processing capacity for e.g. executing ML models. The network node may alternatively be a transmission and reception point e.g. a radio access network node such as a base station, e.g. a radio base station such as a NodeB, an evolved Node B (eNB, eNode B), an NR Node B (gNB), a base transceiver station, a radio remote unit, an Access Point Base Station, a base station router, a transmission arrangement of a radio base station, a stand-alone access point, a Wireless Local Area Network (WLAN) access point, an Access Point Station (AP STA), an access controller, a UE acting as an access point or a peer in a Mobile device to Mobile device (D2D) communication, or any other network unit capable of communicating with a UE within an area, e.g. a cell, served by the network node depending e.g. on the radio access technology and terminology used.
Furthermore, the communications network 100 may comprise another network node such as a central network node 14 operable to communicate with all network nodes in the communications network.
The first network node 10 comprises a first ML model and the second network node 12 comprises a second ML model, wherein the first ML model is associated with the second ML model. The first ML model may be a different version of the second ML model but based on a same base ML model.
In all existing solutions, there is no support for open and flexible exchange of data related to the machine learning models of the network nodes, also referred to as the machine learning model context, across the edge clouds, where any compatible edge peer could build a trust level with a second peer in order to share a desired machine learning model context. In addition to the lack of an open distributed exchange solution, there is no existing solution for how to validate machine learning interactions openly across the edge clouds in order to protect local systems from unauthorized machine learning interactions and hazardous system behavior.
Because these capabilities are missing, there are many features that cannot be implemented. For instance, in the case of edge machine learning interactions, the first edge system (edge domain) could offer upload/update/patch/optimization of a pre-trained machine learning model or related model training sequence to the neighborhood edge systems (peers). However, without a security solution supporting this, there are risks. An unauthorized party acting as a trusted party can cause hazardous behavior with catastrophic incidents for the system and its users. In order to protect the edge machine learning peer and system in the growing edge ecosystem, there is a need for an open solution to seamlessly authenticate and validate the model context itself and, in many cases, also the owner of the machine learning model. At the same time, there is a need to enable edge ecosystem interactions where machine learning and related model context can be shared more openly across distributed edge domains. Any related edge machine learning optimization should preferably be less dependent on the application/service deployments, so that a machine learning model context can be independently exchanged at any state of the system lifecycle.
Embodiments herein provide a secure and an open solution for machine learning model context exchange between edge machine learning domains, i.e. between the first and second network node. Embodiments herein enable a secure decentralized machine learning and open peer-to-peer interactions including any collaborative type of learnings. The first network node 10 and the second network node 12 act as collaboration peers.
The first and second network nodes exchange data that may be any ML model context such as machine learning algorithms, configuration, input/output data, metadata, sharing policies and required capabilities, system/subsystem references, etc.
The embodiments herein may comprise one or more of the following three trust establishment segments to enable such an open machine learning model context exchange and to protect local systems from unauthorized machine learning interactions and hazardous system behavior: 1) authentication of the machine learning model exchanger, 2) authorization of the machine learning model, 3) validation of the machine learning model. By providing a trust process according to embodiments herein, trusted decentralized learning on e.g. edge network nodes is enabled, as is secured machine learning model exchange across the communications network. Furthermore, distributed edge machine learning is protected from hazardous behavior triggered by unauthorized parties.
Fig. 2 is a combined signalling scheme and flowchart depicting embodiments herein.
Action 201. The second network node 12 may setup a secure connection with the first network node 10 for securely communicating with one another. E.g. the second network node 12 may perform an action that establishes a secure connection with the peer. Additionally or alternatively, if the first network node 10 is about to “push” its model to second network node 12, then the first network node 10 may set up the secure connection. It should be noted that a secure session between two entities is set up together by those two entities. It should further be noted that one or more secure connections may be setup to the central network node 14.
Action 202. The second network node 12 may request an update of the second ML model at the second network node 12. The request may be sent to the first network node 10 or another node such as the central network node 14. It should be noted that the first network node 10 may negotiate, e.g. agree, about the data to exchange and/or what is available, e.g. by checking sharing policies for the model etc., before sharing the model and/or the updated data. In some cases, the central network node 14 may trigger the second network node 12 to request the model and/or the update. The request may thus come from e.g. the second network node 12 directly or via the central network node 14 and may be triggered by the second network node 12 or the central network node 14.
Action 203. The first network node 10 retrieves data for the first ML model, e.g. updated input parameters, weights, inferences.
Action 204. The first network node 10 may then upon receiving the request from the second network node 12 or another node such as the central network node 14, send data e.g. retrieved data to the second network node 12. The data, or message carrying the data, is signed with a signature that can be verified, i.e. that is verifiable, by the second network node 12 as a signature generated by a trusted party. The first network node 10 may e.g. sign the data it shares with the second network node 12 in order to show who provided the modifications to the model, or who provided the model. Alternatively or additionally the model creator signs the base model it provides to all clients to show who created/provided the model.
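As an illustration, a minimal sketch of such signing is given below, assuming the Python cryptography package and Ed25519 keys; the payload contents, model identifier and key handling are illustrative only and not mandated by the embodiments herein.

```python
import json
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat

# Hypothetical payload: difference values for two weights of the first ML model.
payload = json.dumps({"model_id": "base-model-v1",
                      "dif": {"w1": 0.123, "w2": -0.5}}).encode()

private_key = Ed25519PrivateKey.generate()   # in practice a provisioned node identity key
signature = private_key.sign(payload)        # transmitted together with the payload
public_key = private_key.public_key().public_bytes(Encoding.Raw, PublicFormat.Raw)
# The public key would typically be conveyed in a certificate so that the second
# network node 12 can verify that the signature was generated by a trusted party.
```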
Action 205. The second network node 12 may then verify the first network node 10. This may be performed as a part of establishing a secure connection with one another, but may alternatively be performed after the secure connection has been established.
Both the first network node 10 and data related to the first ML model being exchanged may be authenticated. The reason for authenticating the first network node 10 is firstly to know the identity of the first network node 10 that the second network node 12 is communicating with. When receiving the data, the benefit is that the second network node 12 can verify that the data is coming from a trusted party, e.g. agents belonging to the same group, e.g. a business group for agents owned by the same enterprise, or a hobbyist group of agents owned by individuals sharing a hobby, e.g. tennis enthusiasts. Typically, the second network node 12 prefers to get the model from a trusted party. However, the second network node 12 can also get data such as a model from a previously unknown network node. In both cases, the identity of the network node can be verified, and the identity may then be used for signing the data, such as the first ML model or difference values of the first ML model, e.g. with a digital signature, a MAC, a keyed hash, etc. This provides proof that the specific network node has provided the data. In case a network node maliciously provides a rogue model or difference values, the signature of the data created with the network node identity binds the data to the network node, and thus the network node cannot deny providing the data.
Also, for a network node such as the first network node 10 that shares its data with other network nodes, authentication of the receiving network node, such as the second network node 12, may be performed. Thus, the first network node 10 may share the data only with authorized network nodes, e.g. only with peers belonging to the same group. This is typically preferred, since the data may be something of a business secret that should not be given to just anyone, but typically only to network nodes in the same group, e.g. belonging to the same enterprise.
Action 206. The second network node 12 authenticates the data based on the signature of the signed data. Thus, after successful verification, the received data is required to be authorized in order to be deployed in the system. When the second network node 12 receives the data, the data may be wrapped in a wrapping format, e.g. the Open Neural Network Exchange (ONNX) format or the Predictive Model Markup Language (PMML) format. To be able to use the data, the second network node 12 may unwrap the data and verify that the data is signed with a signature that can be verified by the second network node as a signature generated by a trusted party, typically by the model creator. When signing the data, also some metadata of e.g. the first ML model may be added and consequently signed, e.g. the type of model, the type of input and output, a possible interpretation of the output or output ranges, etc. Thus, there may be two different signatures that may be verified: e.g. a base model signature created by the model creator, and a dif signature created by the dif provider, i.e. the peer providing the update.
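A minimal sketch of such authentication is shown below, continuing the Ed25519 assumption above; the two-signature case (base model signature and dif signature) is illustrated, and the trusted public keys are assumed to have been obtained e.g. from certificates.

```python
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

def authenticate(base_model_bytes, base_sig, creator_pub,
                 dif_bytes, dif_sig, peer_pub):
    """Return True only if the base model signature verifies against the model
    creator's key and the dif signature verifies against the peer's key."""
    try:
        Ed25519PublicKey.from_public_bytes(creator_pub).verify(base_sig, base_model_bytes)
        Ed25519PublicKey.from_public_bytes(peer_pub).verify(dif_sig, dif_bytes)
        return True
    except InvalidSignature:
        return False
```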
Action 207. The second network node 12 may then use the authenticated data to update the second ML model in the second network node 12.
Action 208. The second network node 12 may further validate the updated second ML model. After successful authorization of the data, the second network node 12 may perform a validation of the received data in practice while protecting a local working system of the second network node 12, e.g. providing a safe mode validation enforcing system isolation limiting impacts of the shared machine learning inferences, i.e. the output. Inferences may be disregarded and kept in local isolated loops until the functionality of the updated second ML model is validated. After the second network node 12 has finalized the validation, the updated second ML model may thus replace the old second ML model so the second network node 12 may perform more accurate inferences based on the usage of the updated second ML model.
Embodiments herein enable decentralized machine learning across the multitude of edge machine learning domains, and support any machine learning interaction, such as peer-to-peer interaction, between the edge domains but also distribution interaction with a central cloud.
Embodiments herein disclose a solution that may use three dependent segments that are designed to be runnable even in computationally light environments such as a constrained edge cloud. These three segments are
- Authentication of the machine learning model exchanger.
- Authorization of machine learning model.
- Validation of the machine learning model.
The network nodes may thus act as agents. An agent may comprise independent software that is responsible for machine learning activities such as model training, model evaluation, resource monitoring and agent-to-agent communication. The agent is capable of performing machine learning in both a centralized and a decentralized manner. Agents here may also be referred to as peers. The first and second network nodes may thus act as collaborating peers.
When e.g. the second network node 12 detects a demand to update and improve its own local machine learning model context due to a detected system optimization problem, it can request potential updates across neighborhood machine learning agents such as edge peers, e.g. the first network node 10. Models here can be related to optimization of the whole system or a subsystem, where the whole system may contain multiple independent ML models that are interconnected with one another to form the whole system. A similar situation can be triggered in the opposite direction, where the first network node 10, which has a “good” model, may offer updates to a neighborhood network node such as the second network node 12. Another machine learning interaction case is related to distributed or collaborative machine learning, where one machine learning network node can propose collaborative machine learning utilizing other network nodes and exchange any related machine learning model context, i.e. data related to ML modelling.
Embodiments herein provide a secure authentication and validation solution for the machine learning related request(s) prior to any machine learning model context exchange, which in turn also has to be verified and validated. The second network node 12 receiving the data, valid for all cases, may after authenticating the sending first network node 10, analyze the status of the local machine learning model and may, after verification and validation of the offered model, accept the offered model update.
After collaboration peer authentication, the receiving second network node 12 processes the received data related to the first ML model against e.g. local machine learning related policies and may compose a response. Received data here may contain multiple elements such as machine learning algorithms, configuration, input/output data, metadata, sharing policies and required capabilities, system/subsystem references, etc. The second network node 12, which receives the data, may analyze the received data to identify, firstly, compatibility with the second ML model and, secondly, the scope of the collaboration where, for instance, the received update can violate sharing policies or be too wide. In the latter case, the ML model may be for the whole (wide) system, where the second network node 12 may decide to apply only an isolated subset of the ML model update, to optimize only a subset of the system rather than the whole system. In that case, the originally offered model can be partly incompatible with or against the local machine learning policies.
Data related to the first ML model may be shared in multiple ways. The full model of the first network node 10 with weights etc. could be shared. However, then it is difficult to know the origin of the model. Instead, the original model, e.g. signed by its creator, can be shared (with the signature of the creator) to provide provenance of the model. Alternatively or additionally, when the first and second network nodes already have the base model, they may verify that they work with the same base model, e.g. by sharing the signature, a hash of the signature, or a hash of the base model. The changes that have been done to the model, or more specifically to its weights, through training may be provided as a separate dataset. This dataset may be presented as a difference (dif) between the original weights and the trained weights. This way, the second network node 12 may verify the original model and who created it, and then apply the changes (dif) to the model as received from the first network node 10. The received dif may be verified to originate from the authenticated first network node 10. If only some weights have been changed, providing the dif will reduce the amount of data that needs to be communicated. It should be noted that weights may be shared instead of the difs, e.g. if all weights have changed. If the second network node 12 already has the same original ML model, it is enough to receive the dif from the first network node 10. This is something that can be negotiated between the network nodes after authentication but may be performed before the data exchange. Likewise, in some embodiments, the second network node 12 may receive data from multiple network nodes, and if the second network node 12 wants to average the weights from multiple network nodes, it may receive difs or weights from the network nodes.
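A minimal sketch of the dif idea is given below, assuming the weights are available as NumPy arrays; the helper names and example values are illustrative only.

```python
import numpy as np

def compute_dif(base_weights, trained_weights):
    # One difference array per weight tensor of the shared base model.
    return [trained - base for base, trained in zip(base_weights, trained_weights)]

def apply_dif(base_weights, dif):
    # Applying the dif to the base model reproduces the trained weights.
    return [base + delta for base, delta in zip(base_weights, dif)]

base = [np.zeros((2, 2)), np.zeros(3)]
trained = [np.full((2, 2), 0.123), np.array([0.0, -0.5, 0.25])]
dif = compute_dif(base, trained)
updated = apply_dif(base, dif)   # equal to the trained weights
```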
The method actions performed by the first network node 10 in the communications network for handling the first machine learning model being associated with the second machine learning model in the distributed machine learning model architecture according to embodiments will now be described with reference to a flowchart depicted in Fig. 3a. The actions do not have to be taken in the order stated below, but may be taken in any suitable order. Actions performed in some embodiments are marked with dashed boxes. The first ML model may e.g. be: an updated or optimized version of the second ML model; a partly same ML model as the second ML model; an ML model of a different manufacturer; or a different version of the second ML model but of the same base ML model.
Action 301. The first network node 10 may set up, or the second network node 12 may set up in cooperation with the first network node 10, a secure connection to the second network node 12, wherein the secure connection is used for transmitting the data. The exchange of the data and the negotiation of the exchange of the data may be performed over said secure connection.
Action 302. The first network node 10 may receive a request to retrieve data of the first ML model. This request may be sent over the secure connection.
Action 303. The first network node 10 may verify that the second network node 12 is a trusted party, e.g. by verifying that the second network node 12 belongs to the same group, such as the same company or similar.
Action 304. The first network node 10 transmits to the second network node 12, data related to the first ML model, wherein the data is signed with a signature that is verifiable by the second network node 12 as a signature generated by a trusted party. The data may comprise the first ML model and/or data related to one or more weight values of the first ML model. The data may e.g. comprise one or more difference values, see dif above, relative to previous weights used in the first ML model. The signature may be that of a creator of the model and/or of the first network node 10.
The method actions performed by the second network node 12 in the communications network for handling the second machine learning model being associated with the first machine learning model in the distributed machine learning model architecture according to embodiments will now be described with reference to a flowchart depicted in Fig. 3b. The actions do not have to be taken in the order stated below, but may be taken in any suitable order. Actions performed in some embodiments are marked with dashed boxes.
Action 311. The second network node 12 may set up the secure connection to the first network node 10 for securely communicating with the first network node 10.
Action 312. The second network node 12 may transmit to the first network node 10 or another network node, a request to update the second machine learning model.
Action 313. The second network node 12 receives from the first network node 10, data related to the first ML model, wherein the data is signed with the signature that is verifiable by the second network node 12 as a signature generated by a trusted party. The data may comprise the first ML model and/or data related to one or more weight values of the first ML model. The data may e.g. comprise one or more difference values, see dif above, relative to previous weights used in the first ML model.
Action 314. The second network node 12 may verify that the first network node 10 is a trusted party.
Action 315. The second network node 12 authenticates the data based on the signature.
Action 316. The second network node 12 may average the data received with other data received from one or more other network nodes.
Action 317. The second network node 12, upon authentication, updates the second machine learning model taking the received data into account.
Action 318. The second network node 12 may further validate that the updated second machine learning model is executing correctly.
Embodiments herein such as those mentioned above will now be further described and exemplified. The text below is applicable to and may be combined with any suitable embodiment described above.
Referring back to action 304, the first network node 10 signs the data to verify its own identity, proving that it is the first network node 10 that provides the data. Digital signatures based on certificates or raw public keys may be used, since symmetric key solutions are more difficult to scale when it is unknown beforehand which network nodes the first network node 10 will communicate with. Below is an example of an authentication exchange between two agents, wherein a source agent is an example of the first network node 10 and a destination agent is an example of the second network node 12:
- The source agent and destination agent may be identified as suitable communicating peers for model exchange.
- The agents may establish a secure session between each other, and typically mutually authenticate each other. Typically using public key certificates.
- The destination agent may request the model from the source agent. It should here be noted that the agents may negotiate what exactly to send: e.g. a fully trained model of the source agent; an original model and a dif of weights or trained weights; or only send the dif(s) or full set of trained weights if both agents have the same base model.
- The source agent provides the data such as the model and/or potential additional data to the destination agent, all the data is signed. E.g. the model may be signed by only the source agent, and that requires that the destination agent fully trusts the source agent. Alternatively or additionally, the model (with original weights etc.) may be signed by the model creator, and the source agent provides modifications, e.g. changes to weights or dif, it has created to the model, the modifications may be signed by the source agent. Thus, the data may be signed with one or more signatures.
- This step is described in more detail below: The destination agent verifies the received data, such as the model, including verifying that it is signed by a trusted party, that it is the type of model requested, etc., may further verify that it works as expected, and then applies the model in its own system. A minimal sketch of how the secure session and the model request could be set up is given after this list.
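The sketch below uses the Python standard-library ssl module with hypothetical file names and host names to illustrate how the destination agent could establish the mutually authenticated session and request the model; the actual protocol and message format are not mandated by the embodiments herein.

```python
import socket
import ssl

context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)          # verifies the peer by default
context.load_cert_chain("destination_agent.crt", "destination_agent.key")  # own identity
context.load_verify_locations("trusted_group_ca.pem")       # CA of the trusted group

with socket.create_connection(("source-agent.example", 8443)) as raw_sock:
    with context.wrap_socket(raw_sock, server_hostname="source-agent.example") as tls:
        peer_cert = tls.getpeercert()        # authenticated identity of the source agent
        tls.sendall(b"MODEL_REQUEST model_id=base-model-v1 want=dif")
        response = tls.recv(65536)           # signed model context, e.g. dif and signature
```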
Some embodiments herein relate to negotiating what data to exchange, such as a full model, an original model, dif values, or only dif values, and securing the exchanged data.
In addition to, or instead of, using signatures for verifying the network node (except for the model creator's signature of the original model) in the model exchanged between network nodes, the exchange of the data could be recorded in a blockchain. This would provide proof of the transaction, i.e. who sent what to whom and when.
In this case, network nodes may register to a trusted blockchain system and may use a smart contract exchange format with a corresponding business model and sharing policies. Registered peers, i.e. network nodes, may be graded based on trust considering previous exchanges and may also share smart tags indicating machine learning model type and sharing policies. A first peer, such as the second network node 12, may discover registered peers providing the wished-for model and acceptable sharing policies. The best matching peer(s) may be contracted in the smart contract format under a validated, agreed business model. Establishment of the contract and the related transactions follows regular smart contract mechanisms. The first peer may indicate a model exchange transaction where the model provider may validate and contribute transaction context. All such transactions may be validated by the involved contractors and recorded as blockchain transactions. This is depicted in Fig. 4. Note that the decentralized trust system mechanisms are simplified in this sequence flow. A pre-requirement here is the agents' (network nodes') registration to such a system. Agents may create smart contracts under an agreed set of business policies on the model sharing business. Once agents with the same interest are contracted, they may trigger related model exchange transactions and use the trusted system to openly authenticate such transactions and establish secure model exchange sessions.
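A minimal, hypothetical sketch of such a transaction record is given below; the field names are illustrative, and the smart contract format and ledger API are outside the scope of this example.

```python
import hashlib
import json
import time

def model_exchange_record(sender_id, receiver_id, model_bytes, dif_bytes):
    record = {
        "sender": sender_id,
        "receiver": receiver_id,
        "model_hash": hashlib.sha256(model_bytes).hexdigest(),
        "dif_hash": hashlib.sha256(dif_bytes).hexdigest(),
        "timestamp": int(time.time()),
    }
    # Both peers would sign the record and submit it as a blockchain transaction,
    # providing proof of who sent what to whom and when.
    record["record_hash"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()).hexdigest()
    return record
```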
Fig. 5 discloses an embodiment of an alternative model exchange by averaging the weights. In this approach, the agent that detects non-optimal performance of the model requests weight matrices from network nodes that are utilizing a similar model in machine learning based inferences. Averaging the received weight matrices reduces the possible domination of those weight values that might be suspicious. The second network node 12 may perform the actions stated above to set up the secure connections and may then detect a need to update the second ML model, e.g. an increased loss of the model, see Action 501. The second network node 12 then requests an update of the second ML model, e.g. from the first network node 10 or a central network node or another network node, see Action 502. The first network node 10 receives the request from the second network node 12 or another network node and transmits the data, e.g. the model and/or weights, to the second network node 12 directly or via another network node, see Action 503. The second network node 12 authenticates the data and may then average the data such as the weight values, see Action 504. Thus, when averaging, at least the verification of the identity of the network node providing the data is not necessary in all cases, i.e. when the second network node collects models, or weights of models, from multiple network nodes in order to e.g. average the weights used by multiple network nodes in order to get a new model. As the data collected from the network nodes is not used directly, but rather as input to a function that e.g. averages the data, the data received from any individual network node does not necessarily have to be signed by the respective network node. Especially when the number of network nodes whose data is being used is large, the effect of one single network node's input is small, especially as extreme values would typically be trimmed off before e.g. averaging the received data. Also, the network node doing e.g. the averaging of multiple weights is aware that the result of the averaging may not be the optimal weights for the model (no one has provided and/or used those specific weights nor said they will work), and thus the network node takes the risk and is itself responsible for the choice. Authenticating the network node and recording whose input has been used may further be applied.
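A minimal sketch of such weight averaging is given below, assuming each peer's weights arrive as a list of NumPy arrays; the optional per-element trimming of extreme values is illustrative and only meaningful when sufficiently many peers contribute.

```python
import numpy as np

def average_weights(weight_sets, trim=0):
    """weight_sets: one list of weight tensors per peer; trim: values cut from
    each end (per element) before averaging, to limit domination by outliers."""
    averaged = []
    for tensors in zip(*weight_sets):                 # same tensor from every peer
        stacked = np.sort(np.stack(tensors), axis=0)  # sort per element across peers
        kept = stacked[trim:len(weight_sets) - trim] if trim else stacked
        averaged.append(kept.mean(axis=0))
    return averaged
```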
When considering the creation and distribution of the data, the actions of signing and wrapping the data, may be performed in different orders, resulting in different benefits and drawbacks as follows:
- First signing data and then wrapping it. The creator of the model can create just one signature, which can be carried regardless of how the model is wrapped and later potentially re-wrapped. This means that the creator just has to create one signature, instead of one per wrapping format. It also means that if one network node receives the data such as the model in e.g. PMML wrapper, and later wants to forward the model to another network node that only supports e.g. ONNX wrapper, the network node can just wrap it however the network node prefers and the signature will still be usable. However, the receiver of a model signed this way first has to unwrap it before it can verify the signature, i.e. that the model is from a trusted source. This is especially undesirable in cases where unwrapping is a heavy operation, which is the case e.g. for ONNX wrapper as it makes it possible to consume the receiver’s resources with invalid data.
- First wrapping the data and then signing it. The creator may thus sign e.g. the model wrapped in each wrapping format in which it will share the model, i.e. multiple signatures, one per wrapping format. The benefit is that the receiver of the model, that is the second network node, can immediately verify the source and integrity of the received model without unwrapping it. However, if the receiver of the model wants to further share the model, it has to share it in the same wrapper as it was received as the signature is of the wrapped representation of the model.
The second alternative is often the preferred option, as it does not introduce vulnerabilities as is the case with alternative 1. Of course, forwarding the data to other network nodes is limited due to the lack of a possibility to change the wrapping format.
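The two orders can be sketched as below, with a hypothetical wrap() placeholder standing in for real ONNX or PMML packaging and the Ed25519 assumption from earlier; only the order of the operations differs.

```python
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

def wrap(model_bytes, fmt):
    # Placeholder for real ONNX/PMML packaging of the serialized model.
    return fmt.encode() + b"|" + model_bytes

creator_key = Ed25519PrivateKey.generate()
model_bytes = b"...serialized base model..."

# Alternative 1: sign first, then wrap. One signature regardless of wrapping
# format, but the receiver must unwrap before it can verify.
alt1 = (wrap(model_bytes, "onnx"), creator_key.sign(model_bytes))

# Alternative 2: wrap first, then sign. One signature per wrapping format, but
# the receiver can verify the wrapped package before unwrapping it.
wrapped = wrap(model_bytes, "onnx")
alt2 = (wrapped, creator_key.sign(wrapped))
```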
Embodiments may provide different steps of machine learning model authorization, such as one or more of the following:
a. Metadata and machine learning model weights authorization and validation
b. Model type validation
c. Machine learning model software compatibility validation
d. Model delta or dif values validation
Note that the order of a and b could be changed if the data is first signed and then wrapped, as discussed earlier.
a. Metadata and machine learning model weights authorization and validation
One way of providing trust in received data, such as a model, is to have all models issued by the manufacturer signed by the manufacturer. In addition to signing the model, the manufacturer may also sign metadata about the model, such as what the model does, what type of input and output it uses, etc. When the model is trained, the weights of the model will change due to training, which means that the signature created by the manufacturer would no longer be valid. Therefore, the network node about to share its model may provide the data in two parts; the original model, if the second network node does not have it, with an accompanying valid manufacturer signature, and the difference of the weights when comparing the original model and the trained model, i.e. a set of data indicating how the weights of the original model should be adjusted to reach the same state as the trained model, e.g. w1' = w1 + 0.123, w2' = w2 - 0.5, etc. We call this data set dif. If all weights, or a certain part of the weights, have been modified, the new set of weights may be provided instead of the difference of the weights (difs). Using this approach, the second network node receiving the model and dif can from the model signature verify who has created the model, and optionally from attached metadata what type of model it is, the input value set/type and output value set/type, as well as a potential mapping from output value to output category, e.g. in an image recognition model: 0.001 < x < 0.002 means the output is a tree. The second network node thus knows what the model does and how it should be used, and who has created it. Then it only has to trust that the signed dif provided by the network node actually improves the model. The second network node may verify that the dif is signed by the network node. Again, the most natural type of signature would be a public key certificate signature, and the certificate would typically be included with the signature and e.g. the model.
b. Model type validation
After the agent, i.e. the second network node 12, has successfully received the offered or updated model, the agent verifies that the model complies with the type of the model that was initially requested. In addition to the model type, the agent validates that the received model is in a whitelisted model exchange format, e.g. ONNX. This may be performed by looking at the metadata of the received model and verifying that it, i.e. the model + the metadata, has been signed by a trusted party, such as the manufacturer.
c. Machine learning model software compatibility validation
Before the model exchange takes place, the model is converted to a format that represents the model, i.e. the received model is wrapped. Conversion, such as the wrapping, of the model facilitates the provisioning of the machine learning model over the network by utilizing commonly used protocols such as Hypertext Transfer Protocol (HTTP), and it enables the models to be used between multiple state-of-the-art tools. Previously mentioned formats include formats such as open neural network exchange (ONNX), predictive model markup language (PMML) and neural network exchange format (NNEF) that are popular in the field of machine learning.
After the agent has executed step b (model type validation), the agent validates that the package can be converted/unwrapped into an executable format by utilizing the local backend. If the model is shared by sharing the original model and a dif (instead of sharing the full trained model, which requires more trust in the sender of the trained model), the original model might already be available in an accepted format. In this case, the agent sharing its model may just create and provide the dif, by comparing the weights of the original model and the trained model that is about to be shared. The agent then signs the dif so that the receiver can verify the origin of the dif.
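A minimal sketch of such a local backend check is given below, assuming the model is shared in ONNX format and the onnx Python package is available as the local backend; the file path is illustrative.

```python
import onnx

def can_unwrap(path="received_model.onnx"):
    try:
        model = onnx.load(path)           # parse the received ONNX package
        onnx.checker.check_model(model)   # structural validity of the graph
        return True
    except Exception:
        return False                      # not a usable model for this backend
```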
After the machine learning agent has received a machine learning model context from another agent, the agent will conduct the authorization of the model by ensuring that the received file is packaged in a model-specific format, extracting metadata, validating that the local backend can be used to convert the received model into the desired executable format, and validating that the model is signed by a trusted party. These steps are particularly designed to guarantee that the received package is actually a pre-trained machine learning model that matches what was requested. If some of these steps do not meet the expectations, the machine learning agent is able to prevent the utilization of the model and avoid abusive actions from third-party members. The agent also verifies that the received dif is signed by the network node that provided the model.
d. Model delta/dif validation
After the received model is converted into an executable format, the agent may analyze the dif and the (optionally) received model to verify that the dif can be applied (i.e. the correct number of weights), and that the dif is signed by a trusted party (typically the peer agent). In machine learning, the learning is often referred to as training, which aims to tune the weight values of the model in order to achieve higher accuracy of the model. For that reason, the dif indicates the differences in the weights of the original and trained models. If the dif includes weights that do not fit the model (e.g. too many weights), the base structure of the model is not compatible with the dif, and the agent may prevent the utilization of the received model. From time to time, the base structures of the models are tuned, by data scientists, to promote improved performance of the system, so changes in the base structure do not necessarily indicate the interference of a third-party member. When the base structure is changed, the updated model with the new base structure needs to be provided with a signature of a trusted party (e.g. the manufacturer that has changed the base structure).
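A minimal sketch of this compatibility check is given below, assuming the base weights and the dif are lists of NumPy arrays; a dif that does not match the base structure is rejected.

```python
import numpy as np

def dif_is_compatible(base_weights, dif):
    if len(dif) != len(base_weights):          # e.g. too many weights in the dif
        return False
    return all(np.shape(d) == np.shape(w) for d, w in zip(dif, base_weights))
```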
If the machine learning agent already has the original model, it can receive/request only the weights of the model, from one or multiple network nodes, instead of the complete model. If weights from multiple network nodes are requested, the agent may merge the received weights by utilizing e.g. averaging. Averages of multiple received weights reduce the dominance of weight values that are caused by a shortage in training or, in the worst case, by abusive actions against the system. Averaging of the weights is mainly utilized in highly scaled peer-to-peer model exchange where the peer is assumed to be trusted, e.g. devices, such as cars, produced by the same company could be thought of as trusted peers.
Figs 6 and 7 are overviews of flowcharts that describe both complete authorization and non-authorized model exchange.
Action 601. A model is received at the second network node 12.
Action 602. The second network node 12 may determine that the model is convertible into an executable format.
Action 603. The second network node 12 may determine that the model is a correct model.
Action 604. The second network node 12 may further determine that the model is signed by a trusted party.
Action 605. The second network node 12 may further determine that the dif of weights is signed by a trusted party.
Action 606. The second network node 12 may in that case consider the model as authorized. In case the second network node 12 determines that the model does not fulfil one or more of the actions 602-605, the second network node 12 may discard the received model, action 607.
Action 701. The second network node 12 may receive weights from one or more network nodes.
Action 702. The second network node 12 may determine whether the weights are compatible with a model.
Action 703. The second network node 12 may then average the received weights.
Action 704. The second network node 12 may then update the model with the averaged weights.
In case the second network node 12 determines that the weights are not compatible with the model, the second network node 12 may discard the received model, action 705. Please note that the segments in the flowchart of the complete authorization sequence can be executed in a different order based on the security policies of the machine learning agent. For example, if the agent receives a model that was signed before being converted to an exchange format, e.g. ONNX, the agent may have to convert the model before it can inspect the signature. In weight averaging, there need not be authorization, since the averaging removes the dominance of suspicious weight matrices when the received number of weights is high.
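The authorization sequence of Fig. 6 can be sketched as a simple chain of checks, as below; the check functions are hypothetical stand-ins for actions 602-605, and their order may be changed according to the local security policies.

```python
def authorize_model(package, checks):
    """checks: ordered callables such as is_convertible, is_correct_model,
    model_signed_by_trusted_party, dif_signed_by_trusted_party."""
    for check in checks:
        if not check(package):
            return False    # corresponds to discarding the model (action 607)
    return True             # model authorized (action 606)
```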
For validating the updated second ML model, see action 318 above, some embodiments herein consider preferred protective model validation applicable for the edge machine learning as disclosed in Fig. 8.
Action 801. The second network node 12 may obtain an authorized model.
Action 802. The second network node 12 may mount the model in safe mode.
Action 803. The second network node 12 may further determine whether the model improves overall performance.
Action 804. If the model improves overall performance, the second network node 12 may replace the model, i.e. use the received model.
Action 805. If the model does not improve overall performance, the second network node 12 may discard the received model.
Using redundant isolated loops: Validation is done directly in the system, allowing processing of the real system inputs, e.g. data, and generation of the related outputs without authorizing end actions on the rest of the system. This can be done by allocating a redundant ML subsystem for the received model in parallel to the original model, where data inputs are shared with both the old and the new model/subsystem while outputs are considered only from the original ML subsystem. Basically, there would be two different instances of a model in the system performing a specific action, both getting the same live system input, but only the output of the original model being used in the live system. The output of the new model could be compared to the output of the original and the difference could be recorded. From this it is possible to see if there are some cases where the models give very different outputs, which could indicate an issue with the new model. After a suitable time of evaluation of the new model, if it performs as expected, the original model can be replaced by the new model.
Validation in a local sandbox: This alternative would allocate a complete system sandbox where the model can be freely validated without affecting the physical edge system. Thus, there would be two complete systems running, the live system and the sandbox with the new model replacing the old model. This alternative may be disregarded in constrained edge domains due to high resource requirements.
Validation in an external trusted sandbox: This alternative would use a trusted external sandbox offered by a trusted entity. The trusted entity can be business domain specific (e.g. enterprise sandbox) or entity regarded trusted by the trusted system, e.g. blockchain smart contract, or authorized model provider.
The functional performance of the received machine learning model is validated in the previously mentioned validation environments. The received model should be compatible with the data that the system/subsystem utilizes. In addition, the outputs are monitored, and the difference between the outputs of the received and old models is observed. If the absolute difference between these two outputs is relatively high, it is likely that the new model is not a good choice, as changes typically provide fine-tuning optimizations rather than a full change of operation. This is something that may be analyzed and decided case by case.
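A minimal sketch of the redundant-isolated-loop validation is given below, assuming callable models with scalar outputs; the tolerance threshold is illustrative and would be decided case by case.

```python
def shadow_validate(original_model, new_model, live_inputs, tolerance=0.05):
    divergences = []
    for x in live_inputs:
        used_output = original_model(x)   # only this output drives the live system
        candidate = new_model(x)          # evaluated in an isolated loop
        divergences.append(abs(candidate - used_output))
    # Accept the new model only if it never diverges too far from the old one.
    return max(divergences) <= tolerance
```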
As mentioned above, Fig. 8 shows an overview of a model validation procedure. In this flowchart, the authorized model is mounted in a safe mode, which refers to a mode where the outputs of the model are not utilized until after the model is validated. The model is validated if it actually manages to improve overall performance after testing with a comprehensive sample of data. By providing a validation of the updated second ML model, embodiments herein:
  • provide a solution for machine learning model validation on the edge
  • provide a solution for machine learning peer validation on the edge
  • enable edge machine learning optimization independently of the application lifecycle.
Fig. 9 is a block diagram depicting the first network node 10 for handling a first machine learning model being associated with a second machine learning model in a distributed machine learning model architecture according to embodiments herein.
The first network node 10 may comprise processing circuitry 901 , e.g. one or more processors, configured to perform the methods herein.
The first network node 10 may comprise a transmitting unit 902. The first network node 10, the processing circuitry 901, and/or the transmitting unit 902 is configured to transmit to the second network node 12, data related to the first machine learning model, wherein the data is signed with a signature that is verifiable by the second network node 12 as the signature generated by the trusted party. The data may comprise the first machine learning model. The data may comprise data related to one or more weight values of the first machine learning model. The data may comprise one or more difference values relative to previous weights used in the first machine learning model.
The first network node 10 may comprise a setting up unit 903. The first network node 10, the processing circuitry 901 , and/or the setting up unit 903 may be configured to set up a secure connection to the second network node 12, and wherein the secure connection is used for transmitting the data.
The first network node 10 may comprise a receiving unit 904. The first network node 10, the processing circuitry 901, and/or the receiving unit 904 may be configured to receive the request to retrieve data of the first machine learning model.
The first network node 10 may comprise a verifying unit 905. The first network node 10, the processing circuitry 901 , and/or the verifying unit 905 may be configured to verify that the second network node 12 is a trusted party.
The first network node 10 further comprises a memory 906. The memory 906 comprises one or more units to be used to store data on, such as signatures, loss values, weight values, data such as machine learning algorithms, configuration, input/output data, metadata, sharing policies and required capabilities, system/subsystem references, etc. and applications to perform the methods disclosed herein when being executed, and similar. The first network node 10 may further comprise a communication interface comprising e.g. one or more antenna or antenna elements. The methods according to the embodiments described herein for the first network node 10 are respectively implemented by means of e.g. a computer program product 907 or a computer program, comprising instructions, i.e., software code portions, which, when executed on at least one processor, cause the at least one processor to carry out the actions described herein, as performed by the first network node 10. The computer program product 907 may be stored on a computer-readable storage medium 908, e.g. a disc, a universal serial bus (USB) stick or similar. The computer-readable storage medium 908, having stored thereon the computer program product, may comprise the instructions which, when executed on at least one processor, cause the at least one processor to carry out the actions described herein, as performed by the first network node 10. In some embodiments, the computer-readable storage medium may be a transitory or a non-transitory computer-readable storage medium.
Fig. 10 is a block diagram depicting the second network node 12 for handling a second machine learning model being associated with a first machine learning model in a distributed machine learning model architecture according to embodiments herein.
The second network node 12 may comprise processing circuitry 1001, e.g. one or more processors, configured to perform the methods herein.
The second network node 12 may comprise a receiving unit 1002. The second network node 12, the processing circuitry 1001, and/or the receiving unit 1002 is configured to receive from the first network node 10, data related to the first machine learning model, wherein the data is signed with the signature that is verifiable by the second network node 12 as the signature generated by a trusted party. The data may comprise the first machine learning model. The data may comprise data related to one or more weight values of the first machine learning model. The data may comprise one or more difference values relative to previous weights used in the first machine learning model.
The second network node 12 may comprise an authenticating unit 1003. The second network node 12, the processing circuitry 1001, and/or the authenticating unit 1003 is configured to authenticate the data based on the signature.
The second network node 12 may comprise an updating unit 1004. The second network node 12, the processing circuitry 1001, and/or the updating unit 1004 is configured to, upon authentication, update the second machine learning model taking the received data into account.
The second network node 12 may comprise a verifying unit 1005. The second network node 12, the processing circuitry 1001, and/or the verifying unit 1005 may be configured to verify that the first network node 10 is a trusted party.
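As a non-limiting sketch only, the authentication of the received data and the subsequent model update could be realized as below, assuming the same Ed25519 scheme and payload format as in the signing sketch above; the helper authenticate_and_update and the weight representation are illustrative assumptions.

```python
# Minimal sketch: verify the signature against the trusted party's public key and,
# only upon successful authentication, apply the received difference values to the
# weights of the local (second) model. Payload format matches the signing sketch.
import json
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

def authenticate_and_update(message: dict, trusted_public_key_bytes: bytes, local_weights: dict) -> dict:
    public_key = Ed25519PublicKey.from_public_bytes(trusted_public_key_bytes)
    payload = bytes.fromhex(message["payload"])
    try:
        public_key.verify(bytes.fromhex(message["signature"]), payload)  # raises if not authentic
    except InvalidSignature:
        return local_weights  # unauthenticated data is discarded, model left unchanged
    deltas = json.loads(payload)
    return {name: [w + d for w, d in zip(local_weights[name], deltas[name])]
            for name in deltas}
```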
The second network node 12 may comprise a validating unit 1006. The second network node 12, the processing circuitry 1001, and/or the validating unit 1006 may be configured to validate that the updated second machine learning model is executing correctly.
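One illustrative, non-limiting way to perform such validation is to run the updated model on held-out validation data and accept the update only if an example loss metric stays below a threshold; the mean-squared-error metric and the threshold in the sketch below are assumptions.

```python
# Minimal sketch: validate that the updated second model executes correctly by
# checking an example loss metric on held-out data against a threshold.
import numpy as np

def validate_update(predict, x_val: np.ndarray, y_val: np.ndarray, max_loss: float = 0.5) -> bool:
    """predict: a callable running the updated model on x_val."""
    predictions = predict(x_val)
    loss = float(np.mean((predictions - y_val) ** 2))  # mean squared error as an example metric
    return bool(np.isfinite(loss) and loss <= max_loss)
```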
The second network node 12 may comprise a setting up unit 1007. The second network node 12, the processing circuitry 1001, and/or the setting up unit 1007 may be configured to set up a secure connection to the first network node 10 for securely communicating with the first network node 10.
The second network node 12 may comprise a transmitting unit 1008. The second network node 12, the processing circuitry 1001, and/or the transmitting unit 1008 may be configured to transmit to the first network node 10 or another network node, a request to update the second machine learning model.
The second network node 12 may comprise an averaging unit 1009. The second network node 12, the processing circuitry 1001, and/or the averaging unit 1009 may be configured to average the data received with other data received from one or more other network nodes.
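By way of a hedged, non-limiting illustration in the spirit of federated averaging, the averaging of received data could be realized element-wise as in the sketch below; equal weighting of the contributions from the different network nodes is an assumption, and the names are illustrative.

```python
# Minimal sketch: element-wise average of weight data received from several
# network nodes. Equal weighting is assumed; a deployment could e.g. weight
# contributions by the amount of local training data at each node.
import numpy as np

def average_updates(updates: list) -> dict:
    """updates: one dict of {parameter_name: np.ndarray} per contributing network node."""
    names = updates[0].keys()
    return {name: np.mean([u[name] for u in updates], axis=0) for name in names}

averaged = average_updates([
    {"layer_0/kernel": np.array([0.01, -0.02])},
    {"layer_0/kernel": np.array([0.03, 0.00])},
])
```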
The second network node 12 further comprises a memory 1010. The memory 1010 comprises one or more units used to store data such as signatures, loss values, weight values, machine learning algorithms, configurations, input/output data, metadata, sharing policies and required capabilities, system/subsystem references, and similar, as well as applications that, when executed, perform the methods disclosed herein. The second network node 12 may further comprise a communication interface comprising e.g. one or more antennas or antenna elements.
The methods according to the embodiments described herein for the second network node 12 are respectively implemented by means of e.g. a computer program product 1011 or a computer program, comprising instructions, i.e., software code portions, which, when executed on at least one processor, cause the at least one processor to carry out the actions described herein, as performed by the second network node 12. The computer program product 1011 may be stored on a computer-readable storage medium 1012, e.g. a disc, a universal serial bus (USB) stick or similar. The computer-readable storage medium 1012, having stored thereon the computer program product, may comprise the instructions which, when executed on at least one processor, cause the at least one processor to carry out the actions described herein, as performed by the second network node 12. In some embodiments, the computer-readable storage medium may be a transitory or a non-transitory computer-readable storage medium.
When using the word "comprise" or "comprising" it shall be interpreted as non-limiting, i.e. meaning "consist at least of".
It will be appreciated that the foregoing description and the accompanying drawings represent non-limiting examples of the methods and apparatus taught herein. As such, the apparatus and techniques taught herein are not limited by the foregoing description and accompanying drawings. Instead, the embodiments herein are limited only by the following claims and their legal equivalents.

Claims

1. A method performed by a first network node (10) in a communications network (100) for handling a first machine learning model being associated with a second machine learning model in a distributed machine learning model architecture, the method comprising:
- transmitting (304) to a second network node (12), data related to the first machine learning model, wherein the data is signed with a signature that is verifiable by the second network node (12) as a signature generated by a trusted party.
2. The method according to claim 1, wherein the data comprises the first machine learning model.
3. The method according to any of the claims 1-2, wherein the data comprises data related to one or more weight values of the first machine learning model.
4. The method according to claim 3, wherein the data comprises one or more difference values relative to previous weights used in the first machine learning model.
5. The method according to any of the claims 1-4, further comprising
- setting up (301) a secure connection to the second network node (12), and wherein the secure connection is used for transmitting the data.
6. The method according to any of the claims 1-5, further comprising - receiving (302) a request to retrieve data of the first machine learning model.
7. The method according to any of the claims 1-6, further comprising - verifying (303) that the second network node (12) is a trusted party.
8. A method performed by a second network node (12) in a communications network (100) for handling a second machine learning model being associated with a first machine learning model in a distributed machine learning model architecture, the method comprising:
- receiving (313) from a first network node (10), data related to the first machine learning model, wherein the data is signed with a signature that is verifiable by the second network node (12) as a signature generated by a trusted party;
- authenticating (315) the data based on the signature; and upon authentication,
- updating (317) the second machine learning model taking the received data into account.
9. The method according to claim 8, wherein the data comprises the first machine learning model.
10. The method according to any of the claims 8-9, wherein the data comprises data related to one or more weight values of the first machine learning model.
11. The method according to claim 10, wherein the data comprises one or more difference values relative to one or more previous weight values used in the first machine learning model.
12. The method according to any of the claims 8-11, further comprising
- verifying (314) that the first network node is a trusted party.
13. The method according to any of the claims 8-12, further comprising
- validating (318) that the updated second machine learning model is executing correctly.
14. The method according to any of the claims 8-13, further comprising
- setting up (311) a secure connection to the first network node (10) for securely communicating with the first network node (10).
15. The method according to any of the claims 8-14, further comprising
- transmitting (312) to the first network node (10) or another network node, a request to update the second machine learning model.
16. The method according to any of the claims 8-15, further comprising
- averaging (316) the data received with other data received from one or more other network nodes.
17. A first network node (10) for handling a first machine learning model being associated with a second machine learning model in a distributed machine learning model architecture, wherein the first network node (10) is configured to: transmit to a second network node (12), data related to the first machine learning model, wherein the data is signed with a signature that is verifiable by the second network node (12) as a signature generated by a trusted party.
18. The first network node (10) according to claim 17, wherein the data comprises the first machine learning model.
19. The first network node (10) according to any of the claims 17-18, wherein the data comprises data related to one or more weight values of the first machine learning model.
20. The first network node (10) according to claim 19, wherein the data comprises one or more difference values relative to previous weights used in the first machine learning model.
21. The first network node (10) according to any of the claims 17-20, wherein the first network node is further configured to set up a secure connection to the second network node (12), and wherein the secure connection is used for transmitting the data.
22. The first network node (10) according to any of the claims 17-21, wherein the first network node (10) is further configured to receive a request to retrieve data of the first machine learning model.
23. The first network node (10) according to any of the claims 17-22, wherein the first network node (10) is further configured to verify that the second network node (12) is a trusted party.
24. A second network node (12) for handling a second machine learning model being associated with a first machine learning model in a distributed machine learning model architecture, wherein the second network node (12) is configured to: receive from a first network node (10), data related to the first machine learning model, wherein the data is signed with a signature that is verifiable by the second network node as a signature generated by a trusted party; authenticate the data based on the signature; and upon authentication, update the second machine learning model taking the received data into account.
25. The second network node (12) according to claim 24, wherein the data comprises the first machine learning model.
26. The second network node (12) according to any of the claims 24-25, wherein the data comprises data related to one or more weight values of the first machine learning model.
27. The second network node (12) according to claim 26, wherein the data comprises one or more difference values relative to one or more previous weight values used in the first machine learning model.
28. The second network node (12) according to any of the claims 24-27, wherein the second network node (12) is further configured to verify that the first network node (10) is a trusted party.
29. The second network node (12) according to any of the claims 24-28, wherein the second network node (12) is further configured to validate that the updated second machine learning model is executing correctly.
30. The second network node (12) according to any of the claims 24-29, wherein the second network node (12) is further configured to set up a secure connection to the first network node (10) for securely communicating with the first network node (10).
31. The second network node (12) according to any of the claims 24-30, wherein the second network node (12) is further configured to transmit to the first network node (10) or another network node, a request to update the second machine learning model.
32. The second network node (12) according to any of the claims 24-31, wherein the second network node (12) is further configured to average the data received with other data received from one or more other network nodes.
33. A computer program product comprising instructions, which, when executed on at least one processor, cause the at least one processor to carry out the method according to any of the claims 1-16, as performed by the first network node (10) and the second network node (12), respectively.
34. A computer-readable storage medium having stored thereon a computer program product, comprising instructions which, when executed on at least one processor, cause the at least one processor to carry out the method according to any of the claims 1-16, as performed by the first network node (10) and the second network node (12), respectively.
PCT/SE2019/050747 (priority date 2019-08-15, filing date 2019-08-15): Network nodes and methods for handling machine learning models in a communications network, WO2021029797A1 (en)

Priority Applications (1)

Application Number: PCT/SE2019/050747 | Priority Date: 2019-08-15 | Filing Date: 2019-08-15 | Title: Network nodes and methods for handling machine learning models in a communications network (WO2021029797A1, en)

Publications (1)

Publication Number: WO2021029797A1 (en) | Publication Date: 2021-02-18

Family ID: 67688815

Family Applications (1)

Application Number: PCT/SE2019/050747 | Publication: WO2021029797A1 (en) | Priority Date: 2019-08-15 | Filing Date: 2019-08-15

Country Status (1)

Country: WO | Publication: WO2021029797A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party

WO2023012074A1 (en) * | Priority date: 2021-08-05 | Publication date: 2023-02-09 | Assignee: Interdigital Ce Patent Holdings, Sas | Title: Methods, architectures, apparatuses and systems for ai/ml model distribution
WO2023125869A1 (en) * | Priority date: 2021-12-31 | Publication date: 2023-07-06 | Assignee: 维沃移动通信有限公司 | Title: Ai network information transmission method and apparatus, and communication device
WO2023146461A1 (en) * | Priority date: 2022-01-28 | Publication date: 2023-08-03 | Assignee: Telefonaktiebolaget Lm Ericsson (Publ) | Title: Concealed learning

Patent Citations (2)

* Cited by examiner, † Cited by third party

US20150193697A1 * | Priority date: 2014-01-06 | Publication date: 2015-07-09 | Assignee: Cisco Technology, Inc. | Title: Cross-validation of a learning machine model across network devices
US20190012595A1 * | Priority date: 2017-07-07 | Publication date: 2019-01-10 | Assignee: Pointr Data, Inc. | Title: Neural network consensus using blockchain

Legal Events

121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19756264 | Country of ref document: EP | Kind code of ref document: A1

NENP | Non-entry into the national phase | Ref country code: DE

122 | Ep: pct application non-entry in european phase | Ref document number: 19756264 | Country of ref document: EP | Kind code of ref document: A1