WO2023277780A1 - Enabling downloadable AI - Google Patents

Enabling downloadable AI

Info

Publication number
WO2023277780A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
wcd
machine code
capabilities
node
Prior art date
Application number
PCT/SE2022/050659
Other languages
French (fr)
Inventor
Roy TIMO
Henrik RYDÉN
Mårten SUNDBERG
Rakesh Ranjan
Original Assignee
Telefonaktiebolaget Lm Ericsson (Publ)
Priority date
Filing date
Publication date
Application filed by Telefonaktiebolaget Lm Ericsson (Publ)
Publication of WO2023277780A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0495 Quantised networks; Sparse networks; Compressed networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W8/00 Network data management
    • H04W8/22 Processing or transfer of terminal data, e.g. status or physical capabilities
    • H04W8/24 Transfer of terminal data

Definitions

  • the present disclosure relates to a wireless communication system and, more specifically, to enabling downloadable Artificial Intelligence (AI) or Machine Learning (ML) models in a wireless communication system.
  • AI Artificial Intelligence
  • ML Machine Learning
  • the network signals an ML model to the User Equipment (UE) (i.e., the UE downloads an ML model from the network).
  • UE User Equipment
  • the UE runs the ML model locally on its hardware (i.e., the UE performs inference).
  • Downloadable AI enables the network to run custom ML models on a UE using data it would otherwise not have access to. For example, the network does not have access to the UE's downlink channel estimates from Channel State Information Reference Signal (CSI-RS). Downloadable AI allows the network to compute an ML-based channel state information (CSI) report directly from the UE's channel estimates, using an ML model of the network's choice.
  • CSI-RS Channel State Information Reference Signal
  • the UE does not need to signal the ML model's inputs to the network (to, for example, enable the network to run the ML model), because the ML model is already located on the UE.
  • the ML model can be executed more frequently at the UE, for example, whenever the UE receives new information.
  • Downloadable AI can be viewed as an advanced UE configuration where the network signals an advanced algorithm to the UE.
  • the Open Radio Access Network (ORAN) working group on AI/ML has identified the need to have an entity responsible for compiling ML models to a "binary" format optimized for efficient execution of the ML model. See, for example, Figure 1, which is taken from ORAN Working Group 2 AI/ML, "AI/ML workflow description and requirements," Technical Report (v01.02.04), 2021.
  • Example use cases of downloadable AI are as follows.
  • Example 1 Compression of Channel State Information (CSI)
  • CSI Channel State Information
  • AE autoencoders
  • An AE is an artificial neural network consisting of two parts: an encoder side (here located at the UE), and a decoder side (here located at the network).
  • the interface layer between the encoder and decoders is called the bottleneck layer, and it has few nodes in comparison to the input.
  • the UE uses the encoder to compress its channel estimate.
  • the output of the encoder (the bottleneck layer output) is signaled over the uplink to the network, where it is decompressed.
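  • For illustration only, the following is a minimal sketch of the encoder/bottleneck/decoder split described above, written in Python with PyTorch. All layer sizes (a 256-value flattened channel estimate, a 32-node bottleneck) and names are hypothetical, not taken from the disclosure.

```python
# Illustrative only: an autoencoder with a UE-side encoder, a small
# bottleneck, and a network-side decoder. Sizes are hypothetical.
import torch
import torch.nn as nn

class CsiEncoder(nn.Module):
    """UE-side encoder: compresses a flattened channel estimate."""
    def __init__(self, input_dim: int = 256, bottleneck_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, 128),
            nn.ReLU(),
            # Bottleneck layer: few nodes in comparison to the input.
            nn.Linear(128, bottleneck_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

class CsiDecoder(nn.Module):
    """Network-side decoder: reconstructs the channel estimate from
    the bottleneck output signaled over the uplink."""
    def __init__(self, bottleneck_dim: int = 32, output_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(bottleneck_dim, 128),
            nn.ReLU(),
            nn.Linear(128, output_dim),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z)

encoder, decoder = CsiEncoder(), CsiDecoder()
channel_estimate = torch.randn(1, 256)     # UE's downlink channel estimate
compressed = encoder(channel_estimate)     # 32 values sent over the uplink
reconstructed = decoder(compressed)        # decompressed at the network
```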
  • downloadable AI (e.g., signalling of part or all of the encoder from the network to the UE).
  • OTDOA Observed Time Difference of Arrival
  • LTE Long Term Evolution
  • downloadable AI could be used to enhance the proposed solutions by, for example, enabling the network to configure part or all of the ML models used in the UEs:
  • transfer learning has been proposed as a solution to reduce model signaling overhead. Essentially, the network uses transfer learning to modify its decoder to work well in different deployment scenarios with a single UE encoder.
  • A new signal from the UE to the network, a channel compression quality indicator, has been proposed.
  • the UE can use an ML model to reduce its measurements related to beamforming.
  • one can request a UE to measure on a set of CSI-RS beams.
  • a stationary UE typically experiences less variations in beam quality in comparison to a moving UE.
  • the stationary UE can, therefore, save battery by reducing its beam measurements, using an ML model to predict beam strength instead of measuring it. It can do this, for example, by measuring a subset of the beams and predicting the rest of the beams (a toy sketch follows below).
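  • As an illustration of the subset-measure-and-predict idea (not part of the disclosure), the following sketch fits a linear predictor that maps RSRP measured on a subset of beams to the remaining beams. The beam counts, indices, and training data are all hypothetical.

```python
# Illustrative only: predict unmeasured beam strengths from a measured
# subset using a least-squares linear model. All data is synthetic.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical history: RSRP (dBm) for 8 CSI-RS beams per sample.
history = rng.normal(-90.0, 5.0, size=(500, 8))
measured_idx = [0, 2, 4, 6]     # beams the UE keeps measuring
predicted_idx = [1, 3, 5, 7]    # beams whose strength is predicted

X = history[:, measured_idx]
Y = history[:, predicted_idx]

# Fit Y ~ X @ W (with an intercept column) via least squares.
X1 = np.hstack([X, np.ones((X.shape[0], 1))])
W, *_ = np.linalg.lstsq(X1, Y, rcond=None)

# At run time the UE measures only 4 beams and predicts the other 4.
new_measurement = rng.normal(-90.0, 5.0, size=(1, 4))
predicted_rsrp = np.hstack([new_measurement, np.ones((1, 1))]) @ W
print("predicted RSRP for unmeasured beams:", predicted_rsrp)
```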
  • a method to configure a UE with one or more ML models for executing radio networking operations has been described. This can enable less signaling in comparison to use cases where the model input is located at the device side but the ML model is at the network side.
  • One such use case is the secondary carrier prediction use case.
  • In the secondary carrier prediction use case, in order to detect a node on another frequency using target carrier prediction as described in U.S. Patent Application Publication No. US2019/0357057A1 ("Target Carrier Radio Predictions using Source Carrier Measurements"), the UE must signal source carrier information: a mobile UE periodically transmits source carrier information to enable the macro node to hand the UE over to another node operating at a higher frequency.
  • the UE does not need to perform inter-frequency measurements, leading to energy savings at the UE.
  • frequent signalling of source carrier information that enables prediction of the secondary frequency can lead to an additional overhead and should thus be minimized.
  • the risk of not performing frequent periodic signalling is missing an opportunity of doing an inter-frequency handover to a less-loaded cell on another carrier.
  • the UE can instead receive the model and use source carrier information as input to the model, which then triggers an output indicating coverage on the frequency-2 node at location 2. This reduces the need for frequent source carrier information signaling, while enabling the UE to predict the coverage on frequency 2 whenever its model input changes. This is illustrated in Figure 1.
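  • A toy sketch of such a prediction (illustrative only; the model form, weights, and threshold are assumptions, not the disclosed method): a stand-in logistic model maps source carrier measurements to a coverage indication for frequency 2.

```python
# Illustrative only: a stand-in logistic model mapping source carrier
# measurements (frequency 1) to a coverage indication for frequency 2.
import numpy as np

def predict_frequency2_coverage(source_rsrp_dbm: np.ndarray,
                                weights: np.ndarray,
                                bias: float) -> bool:
    """Returns True when coverage on the frequency-2 node is predicted."""
    score = float(source_rsrp_dbm @ weights + bias)
    prob = 1.0 / (1.0 + np.exp(-score))
    return prob > 0.5

# Hypothetical parameters; in the described embodiments the model would
# be selected and delivered by the network.
weights = np.array([0.08, 0.05, 0.03])
measurement = np.array([-85.0, -92.0, -99.0])   # serving + neighbour RSRP
if predict_frequency2_coverage(measurement, weights, bias=15.0):
    print("coverage expected on frequency 2: consider inter-frequency handover")
```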
  • Example 4 Signal quality drop prediction as described in International Publication No. WO2020/226542A1 ("Network Node, User Equipment and Methods for Handling Signal Quality Variations").
  • a method performed by a WCD comprises receiving, from a server node, a compiled machine code version of a ML model, the compiled machine code version of the ML model being one of a set of compiled machine code versions of ML models, wherein the compiled machine code version of the ML model received by the WCD is precompiled for a hardware of the WCD.
  • the method further comprises performing one or more radio network operations using the compiled machine code version of the ML model. In this manner, the WCD does not need to download and compile the ML model, thus saving time and energy resources at the WCD.
  • the method further comprises receiving a unique ML model identity (ID) from a Radio Access Network (RAN) node and sending the unique ML model ID to the server node, wherein receiving the compiled machine code version of the ML model comprises receiving one of the set of compiled machine code versions of the ML models associated to the unique ML model ID from the server node.
  • the method further comprises sending additional information to the server node in addition to the unique ML model ID, wherein receiving the compiled machine code version of the ML model comprises receiving one of the set of compiled machine code versions of the ML models associated to the unique ML model ID and the additional information.
  • the additional information comprises information about one or more capabilities of one or more chipsets of the WCD that enable execution of a ML model.
  • the additional information comprises a unique model radio network identifier that is associated to one or more hardware capabilities of the WCD, one or more software capabilities of the WCD, or both one or more hardware capabilities of the WCD and one or more software capabilities of the WCD.
  • the model radio network identifier is defined, by a registry, to be associated to the one or more hardware and/or software capabilities of the WCD, the registry associating different hardware and/or software capabilities to different model radio network identifiers.
  • the method further comprises sending, to a RAN node, information about one or more hardware capabilities of the WCD related to execution of a ML model, one or more software capabilities of the WCD related to execution of a ML model, or both one or more hardware capabilities of the WCD related to execution of a ML model and one or more software capabilities of the WCD related to execution of a ML model, wherein receiving the compiled machine code version of the ML model comprises receiving one of the set of compiled machine code versions of the ML models from the server node responsive to sending the information to the RAN node.
  • the information sent to the RAN node comprises a unique model radio network identifier that is associated to one or more hardware capabilities of the WCD, one or more software capabilities of the WCD, or both one or more hardware capabilities of the WCD and one or more software capabilities of the WCD.
  • the method further comprises sending, to a RAN node, information about one or more hardware capabilities of the WCD related to execution of a ML model, one or more software capabilities of the WCD related to execution of a ML model, or both one or more hardware capabilities of the WCD related to execution of a ML model and one or more software capabilities of the WCD related to execution of a ML model, wherein receiving the compiled machine code version of the ML model comprises receiving one of the set of compiled machine code versions of the ML models from the RAN node responsive to sending the information to the RAN node.
  • the information sent to the RAN node comprises a unique model radio network identifier that is associated to one or more hardware capabilities of the WCD, one or more software capabilities of the WCD, or both one or more hardware capabilities of the WCD and one or more software capabilities of the WCD.
  • a WCD comprises one or more transmitters, one or more receivers, and processing circuitry associated with the one or more transmitters and the one or more receivers.
  • the processing circuitry is configured to cause the WCD to receive, from a server node, a compiled machine code version of a ML model, the compiled machine code version of the ML model being one of a set of compiled machine code versions of ML models, wherein the compiled machine code version of the ML model received by the WCD is precompiled for a hardware of the WCD.
  • the processing circuitry is further configured to cause the WCD to perform one or more radio network operations using the compiled machine code version of the ML model.
  • a method performed by a RAN node comprises selecting a ML model for a particular WCD and causing delivery of one of a set of compiled machine code versions of the selected ML model from a ML model repository to the WCD.
  • selecting the ML model for the WCD comprises selecting the ML model for the WCD based on one or more hardware capabilities of the WCD related to execution of a ML model, one or more software capabilities of the WCD related to execution of a ML model, or both one or more hardware capabilities of the WCD related to execution of a ML model and one or more software capabilities of the WCD related to execution of a ML model.
  • causing delivery of the one of the set of compiled machine code versions of the selected ML model from the ML model repository to the WCD comprises sending a unique ML model ID associated to the selected ML model to the WCD.
  • causing delivery of the compiled machine code version of the selected ML model from the ML model repository to the WCD comprises sending a unique ML model ID associated to the selected ML model to a server node as part of a request for the server node to push the one of the set of compiled machine code versions of the selected ML model from the ML model repository to the WCD.
  • causing delivery of the compiled machine code version of the selected ML model from the ML model repository to the WCD comprises sending a unique ML model ID associated to the selected ML model to a server node as part of a request for the one of the set of compiled machine code versions of the selected ML model from the ML model repository, receiving the one of the set of compiled machine code versions of the selected ML model from the server node, and sending the one of the set of compiled machine code versions of the selected ML model to the WCD.
  • a RAN node comprises processing circuitry configured to cause the RAN node to select a ML model for a particular WCD and cause delivery of one of a set of compiled machine code versions of the selected ML model from a ML model repository to the WCD.
  • Embodiments of a method performed by a server node are also disclosed.
  • a method performed by a server node comprises receiving, either from a particular WCD or a RAN node, a request for a compiled machine code version of a ML model, the request comprising a unique ML model ID and additional information about one or more hardware capabilities of the WCD, one or more software capabilities of the WCD, or both one or more hardware capabilities of the WCD and one or more software capabilities of the WCD.
  • the method further comprises selecting a compiled machine code version of the ML model from a set of compiled machine code versions of the ML model based on the unique ML model ID and the additional information and sending the selected compiled machine code version of the ML model either to the WCD or to the RAN node.
  • the additional information comprises information about one or more capabilities of one or more chipsets of the WCD that enable execution of a ML model.
  • the additional information comprises a unique model radio network identifier that is associated to one or more hardware capabilities of the WCD, one or more software capabilities of the WCD, or both one or more hardware capabilities of the WCD and one or more software capabilities of the WCD.
  • a server node comprises processing circuitry configured to cause the server node to receive, either from a particular WCD or a RAN node, a request for a compiled machine code version of a ML model, the request comprising a unique ML model ID and additional information about one or more hardware capabilities of the WCD, one or more software capabilities of the WCD, or both one or more hardware capabilities of the WCD and one or more software capabilities of the WCD.
  • the processing circuitry is further configured to cause the server node to select a compiled machine code version of the ML model from a set of compiled machine code versions of the ML model based on the unique ML model ID and the additional information and to send the selected compiled machine code version of the ML model either to the WCD or to the RAN node.
  • a method performed by a server node comprises obtaining a compiled machine code version of a ML model associated to a unique ML model ID and one or more hardware and/or software WCD capabilities related to ML model execution and storing the compiled machine code version of the ML model in association with the unique ML model ID and the one or more hardware and/or software WCD capabilities.
  • the method further comprises receiving one or more parameters that define the ML model associated to the unique ML model ID, wherein obtaining the compiled machine code version of the ML model comprises compiling the ML model as defined by the one or more parameters based on the one or more hardware and/or software capabilities.
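  • The repository bookkeeping described in these embodiments might look like the following sketch: compiled machine code versions are stored keyed by (model ID, capability identifier), with the chipset-specific compile step stubbed out. All class and field names are hypothetical.

```python
# Illustrative only: repository keyed by (model ID, M-RNI); the compile
# step is a stub standing in for a chipset vendor's tool chain.
from dataclasses import dataclass, field

@dataclass
class ModelRepository:
    store: dict = field(default_factory=dict)

    def compile_for(self, model_params: bytes, m_rni: int) -> bytes:
        # Placeholder "compiler": a real server node would invoke the
        # appropriate software tool chain for this chipset here.
        return b"BIN:" + m_rni.to_bytes(4, "big") + model_params[:16]

    def add_model(self, model_id: int, model_params: bytes, m_rnis: list) -> None:
        # Compile and store one machine code version per capability set.
        for m_rni in m_rnis:
            self.store[(model_id, m_rni)] = self.compile_for(model_params, m_rni)

    def lookup(self, model_id: int, m_rni: int) -> bytes:
        return self.store[(model_id, m_rni)]

repo = ModelRepository()
repo.add_model(model_id=7, model_params=b"\x00" * 64, m_rnis=[0x0001, 0x0002])
binary = repo.lookup(model_id=7, m_rni=0x0001)
```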
  • obtaining the compiled machine code version of the ML model comprises receiving the compiled machine code version of the ML model from a WCD.
  • a server node comprises processing circuitry configured to cause the server node to obtain a compiled machine code version of a ML model associated to a unique ML model ID and one or more hardware and/or software WCD capabilities related to ML model execution and store the compiled machine code version of the ML model in association with the unique ML model ID and the one or more hardware and/or software WCD capabilities.
  • Figure 1 illustrates an Open Radio Access Network (ORAN) architecture related to Artificial Intelligence (AI) / Machine Learning (ML) including a model compiling host responsible for compiling ML models to a binary format optimized for efficient execution of the ML model;
  • ORAN Open Radio Access Network
  • AI Artificial Intelligence
  • ML Machine Learning
  • Figure 2 illustrates one example of a cellular communications system in which embodiments of the present disclosure may be implemented
  • FIG. 3 illustrates the operation of a wireless communication device (WCD), a Radio Access Network (RAN) node, and a server node in accordance with some embodiments of the present disclosure
  • Figure 4 illustrates an example of inter-vendor operation using a common registration to collect Model Radio Network Identifiers (M-RNIs) for a M-RNI database in accordance with one embodiment of the present disclosure
  • M-RNIs Model Radio Network Identifiers
  • Figures 5, 6, and 7 illustrate example embodiments of the use of M-RNI and a registry
  • Figure 8 illustrates two options for storing and compiling (e.g., new or updated) ML model(s) in accordance with some embodiments of the present disclosure
  • Figures 9, 10, and 11 are schematic block diagrams of example embodiments of a network node.
  • Figures 12 and 13 are schematic block diagrams of example embodiments of a WCD.
  • Radio Node As used herein, a "radio node” is either a radio access node or a wireless communication device.
  • Radio Access Node As used herein, a “radio access node” or “radio network node” or “radio access network node” is any node in a Radio Access Network (RAN) of a cellular communications network that operates to wirelessly transmit and/or receive signals.
  • RAN Radio Access Network
  • Some examples of a radio access node include, but are not limited to, a base station (e.g., a New Radio (NR) base station (gNB) in a Third Generation Partnership Project (3GPP) Fifth Generation (5G) NR network or an enhanced or evolved Node B (eNB) in a 3GPP Long Term Evolution (LTE) network), a high-power or macro base station, a low-power base station (e.g., a micro base station, a pico base station, a home eNB, or the like), a relay node, a network node that implements part of the functionality of a base station (e.g., a network node that implements a gNB Central Unit (gNB-CU) or a network node that implements a gNB Distributed Unit (gNB-DU)), or a network node that implements part of the functionality of some other type of radio access node.
  • a base station e.g., a New Radio (NR) base station (gNB)
  • a "core network node” is any type of node in a core network or any node that implements a core network function.
  • Some examples of a core network node include, e.g., a Mobility Management Entity (MME), a Packet Data Network Gateway (P-GW), a Service Capability Exposure Function (SCEF), a Home Subscriber Server (HSS), or the like.
  • MME Mobility Management Entity
  • P-GW Packet Data Network Gateway
  • SCEF Service Capability Exposure Function
  • HSS Home Subscriber Server
  • Some other examples of a core network node include a node implementing an Access and Mobility Management Function (AMF), a User Plane Function (UPF), a Session Management Function (SMF), an Authentication Server Function (AUSF), a Network Slice Selection Function (NSSF), a Network Exposure Function (NEF), a Network Function (NF) Repository Function (NRF), a Policy Control Function (PCF), a Unified Data Management (UDM), or the like.
  • AMF Access and Mobility Management Function
  • UPF User Plane Function
  • SMF Session Management Function
  • AUSF Authentication Server Function
  • NSSF Network Slice Selection Function
  • NEF Network Exposure Function
  • NRF Network Function Repository Function
  • PCF Policy Control Function
  • UDM Unified Data Management
  • a "communication device” is any type of device that has access to an access network.
  • Some examples of a communication device include, but are not limited to: mobile phone, smart phone, sensor device, meter, vehicle, household appliance, medical appliance, media player, camera, or any type of consumer electronic, for instance, but not limited to, a television, radio, lighting arrangement, tablet computer, laptop, or Personal Computer (PC).
  • the communication device may be a portable, hand-held, computer-comprised, or vehicle- mounted mobile device, enabled to communicate voice and/or data via a wireless or wireline connection.
  • One type of communication device is a wireless communication device, which may be any type of wireless device that has access to (i.e., is served by) a wireless network (e.g., a cellular network).
  • Some examples of a wireless communication device include, but are not limited to: a User Equipment device (UE) in a 3GPP network, a Machine Type Communication (MTC) device, and an Internet of Things (IoT) device.
  • UE User Equipment
  • MTC Machine Type Communication
  • IoT Internet of Things
  • Such wireless communication devices may be, or may be integrated into, a mobile phone, smart phone, sensor device, meter, vehicle, household appliance, medical appliance, media player, camera, or any type of consumer electronic, for instance, but not limited to, a television, radio, lighting arrangement, tablet computer, laptop, or PC.
  • the wireless communication device may be a portable, hand-held, computer-comprised, or vehicle-mounted mobile device, enabled to communicate voice and/or data via a wireless connection.
  • Network Node As used herein, a "network node” is any node that is either part of the RAN or the core network of a cellular communications network/system.
  • an ML model (or Artificial Intelligence (AI) model) is a data driven model that applies ML or AI techniques to generate a set of outputs based on a set of inputs.
  • ML models may use different architectures and/or different ML models may use the same architecture (e.g., the same neural network architecture) but have different parameters, weights, biases, or the like within that architecture.
  • the ML model will need to match the UE's chipset capabilities (e.g., the ML model will often need to run on special AI-accelerated hardware in the UE).
  • the ML model, with tunable parameters possibly fixed, will need to be compiled down to appropriate compiled machine code (binary) optimized for efficient execution of the ML model.
  • This compiled machine code may also be referred to herein as a "binary" format to which the ML model is compiled.
  • the process of compiling the ML model can be costly in time, compute, power, and memory resources, and, therefore, cannot be done at run time for a (near) physical layer application.
  • the tunable parameters of an ML model (e.g., weights and biases of a neural network) will likely be described in high-precision float resolution (e.g., needed for stochastic gradient descent algorithms). These parameters will need to be quantized before they can be signaled from the network to the UE. The performance of the ML model can be negatively affected by this quantization.
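  • To make the quantization step concrete, the following sketch applies symmetric per-tensor int8 quantization, one common scheme (chosen here for illustration; the disclosure does not prescribe a particular method), and shows the resulting precision loss.

```python
# Illustrative only: symmetric per-tensor int8 quantization of weights
# that would otherwise be signaled in high-precision float resolution.
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Map float weights to int8 plus a single scale factor."""
    scale = float(np.max(np.abs(weights))) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

weights = np.random.randn(1000).astype(np.float32)  # trained float weights
q, scale = quantize_int8(weights)                   # what would be signaled
restored = dequantize(q, scale)
# The gap below is the quantization loss that can degrade performance.
print("max abs quantization error:", np.max(np.abs(weights - restored)))
```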
  • the ML model can be defined by many parameters (e.g., weights, biases, and architecture) - possibly several hundred thousand or more.
  • a UE ML model repository that hosts trained and compiled ML models for different UE chipsets (e.g., AI-accelerated hardware) is provided.
  • a procedure in which a ML model repository is utilized includes the following actions:
  • a wireless communication device (e.g., a UE) optionally signals its ML model capability to the Radio Access Network (RAN).
  • RAN Radio Access Network
  • the network (i.e., the RAN node) selects an ML model that the wireless communication device is capable of running.
  • the network (i.e., the RAN node) signals a unique ML model Identifier (ID), which is associated to the selected ML model, to the wireless communication device.
  • the wireless communication device downloads an appropriate compiled machine code (binary) version of the selected ML model from the UE ML model repository, which may be hosted by a server node.
  • This compiled machine code may also be referred to herein as a "binary" format to which the selected ML model is compiled.
  • the server node is part of the RAN (e.g., hosted by the RAN node or another RAN node) or is part of another network node such as, e.g., an Operations and Management (OAM) node.
  • OAM Operations and Management
  • the server node may alternatively be, or be part of, a Service Management and Orchestration (SMO) entity, a near Real Time RAN Intelligent Controller (RIC), or a Non-Real Time RIC, as defined in Open RAN (ORAN).
  • SMO Service Management and Orchestration
  • RIC RAN Intelligent Controller
  • ORAN Open RAN
  • the server node may be implemented as an external node such as, for example, a node hosted by a UE vendor.
  • the wireless communication device uses the downloaded ML model in one or more radio network operations (RNOs).
  • RNOs radio network operations
  • the network e.g., a RAN node
  • a wireless communication device e.g., a UE
  • the ML model repository hosts compiled ML models for different wireless communication device hardware.
  • the network configures the wireless communication device to download an appropriately compiled ML model from the repository.
  • the wireless communication device downloads the appropriately compiled ML model from the repository.
  • Embodiments of the proposed solution may have one or more of the following advantages over existing solutions that quantize and signal ML model architectures and parameters over the air:
  • Wireless communication devices do not need to download and compile ML models, thus saving time and energy resources at the WCD.
  • The WCD does not need to compile the model; hence, it can use the model in the RNOs faster than if it first had to compile the model. This can, for example, lead to improved CSI compression (the WCD can directly use an ML-based approach).
  • the ML model repository can be implemented with security protocols to, for example, avoid malicious code being executed in the network and/or WCD.
  • Compiler optimizations can both reduce the compiled ML model size (reduce over-the-air signaling) and offer improved runtime (inference) performance on, for example, AI-accelerated hardware.
  • the internals of the compiled ML model are not visible to the entity downloading it, in contrast to explicitly downloading an ML model following a defined protocol. Hence, using compiled ML models provides a level of protection against misuse (e.g., downloading a competitor's model) of proprietary models.
  • There is no limitation on model architecture, as opposed to a protocol that accommodates signaling of a model with quantized parameters.
  • FIG. 2 illustrates one example of a cellular communications system 200 in which embodiments of the present disclosure may be implemented.
  • the cellular communications system 200 is a 5G system (5GS) including a Next Generation RAN (NG-RAN) and a 5G Core (5GC); however, the present disclosure is not limited thereto.
  • Embodiments of the present disclosure may be utilized in any type of wireless network or cellular communications system (e.g., an Evolved Packet System (EPS) including an Evolved Universal Terrestrial RAN (E-UTRAN) and an Evolved Packet Core (EPC)) in which downloadable ML models are desired.
  • EPS Evolved Packet System
  • E-UTRAN Evolved Universal Terrestrial RAN
  • EPC Evolved Packet Core
  • the RAN includes base stations 202-1 and 202-2, which in the 5GS include NR base stations (gNBs) and optionally next generation eNBs (ng-eNBs) (e.g., LTE RAN nodes connected to the 5GC), controlling corresponding (macro) cells 204-1 and 204-2.
  • the base stations 202-1 and 202-2 are generally referred to herein collectively as base stations 202 and individually as base station 202.
  • the (macro) cells 204-1 and 204-2 are generally referred to herein collectively as (macro) cells 204 and individually as (macro) cell 204.
  • the RAN may also include a number of low power nodes 206-1 through 206-4 controlling corresponding small cells 208-1 through 208-4.
  • the low power nodes 206-1 through 206-4 can be small base stations (such as pico or femto base stations), Remote Radio Heads (RRHs), or the like. Notably, while not illustrated, one or more of the small cells 208-1 through 208-4 may alternatively be provided by the base stations 202.
  • the low power nodes 206-1 through 206-4 are generally referred to herein collectively as low power nodes 206 and individually as low power node 206.
  • the small cells 208-1 through 208-4 are generally referred to herein collectively as small cells 208 and individually as small cell 208.
  • the cellular communications system 200 also includes a core network 210, which in the 5G System (5GS) is referred to as the 5GC.
  • the base stations 202 (and optionally the low power nodes 206) are connected to the core network 210.
  • the base stations 202 and the low power nodes 206 provide service to wireless communication devices (WCDs) 212-1 through 212-5 in the corresponding cells 204 and 208.
  • WCDs 212-1 through 212-5 are generally referred to herein collectively as WCDs 212 and individually as WCD 212.
  • the WCDs 212 are oftentimes UEs, but the present disclosure is not limited thereto.
  • the "RAN node” and “server node” may be any network entities such as OAM, NG-RAN node (e.g., a gNB, gNB-CU, or a gNB-DU), or the like. It can also comprise for example the SMO, the near Real Time RIC, and the Non-Real Time RIC defined in ORAN.
  • the server node can also comprise an external cloud node, for example, a node hosted by a WCD vendor.
  • the WCD 212 is the inference host in the embodiments described herein, and its model capabilities could be transmitted to the ML compiling host in the "ML Inference host info" ORAN message. Moreover, the model ID could be transmitted to the Model management host via the ML compiling host in the same info message.
  • Figure 3 illustrates the operation of a WCD 212, a RAN node 300, and a server node 302 in accordance with some embodiments of the present disclosure.
  • Optional steps are represented by dashed lines/boxes.
  • In one embodiment, the RAN (e.g., the RAN node 300 or another RAN node) hosts the server node 302, but the present disclosure is not limited thereto. Note that embodiments about how to compile and store ML models in the server node 302 are described in subsequent sections.
  • Step 304 (Optional): The WCD 212 (e.g., a UE) signals its model capability to the RAN node 300.
  • Steps 306 and 308 The RAN node 300 selects an ML model that the WCD 212 is capable of running, and the RAN node 300 signals a unique ML model ID, which is associated to the selected ML model, to the WCD 212.
  • Steps 310, 312, and 314 The WCD 212 downloads an appropriately compiled machine code version of the selected ML model from a ML model repository, which in this example is hosted by the server node 302. More specifically, in this example, the WCD 212 sends the ML model ID that it obtained from the RAN node 300 and optionally WCD information to the server node 302 (step 310).
  • the WCD information is information about hardware and/or software capabilities of the WCD 212 related to ML model execution or inference. This may include information that directly or indirectly indicates which ML models and/or which precompiled machine code version(s) of a ML model(s) that the WCD 212 can execute.
  • the server node 302 selects the precompiled ML model corresponding to the ML model ID and optionally the WCD information, e.g., from the ML model repository (step 312), and sends the precompiled ML model to the WCD 212 (step 314).
  • Step 316 The WCD 212 uses the downloaded precompiled machine code version of the ML model in one or more radio network operations.
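  • The message flow of Figure 3 can be summarized as plain function calls, as in the following self-contained sketch. The repository contents, selection table, and identifiers are hypothetical.

```python
# Illustrative only: the Figure 3 flow as plain function calls.
# Repository contents, tables, and identifiers are hypothetical.
repository = {(7, 0x0001): b"compiled-binary-for-chipset-1"}

def ran_select_model(m_rni: int) -> int:
    """Steps 306/308: the RAN node picks a model the WCD can run and
    signals the unique ML model ID."""
    best_model_for = {0x0001: 7}   # hypothetical selection table
    return best_model_for[m_rni]

def server_lookup(model_id: int, m_rni: int) -> bytes:
    """Step 312: the server node selects the precompiled version that
    matches the ML model ID and the WCD information."""
    return repository[(model_id, m_rni)]

m_rni = 0x0001                           # step 304: WCD capability (optional)
model_id = ran_select_model(m_rni)       # steps 306 and 308
binary = server_lookup(model_id, m_rni)  # steps 310, 312, and 314
# Step 316: the WCD executes `binary` in one or more radio network operations.
```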
  • the WCD 212 optionally signals a model capability to the network (i.e., to the RAN node 300 in the example of Figure 3).
  • the intention of the capability is, for example, to inform the network of the WCD's ability to run different ML models.
  • the model capability signaling (e.g., in step 304) includes explicit information on the WCD's hardware and software capabilities.
  • the model capability signaling may include a summary of the WCD's AI/ML-enabled chipset(s) and any compiler optimizations that it supports.
  • the model capability is a unique ID, referred to herein as a M-RNI (Model Radio Network Identifier).
  • M-RNI Model Radio Network Identifier
  • the M-RNI is, e.g., defined in specifications (e.g., 3GPP specifications) as an identifier of a specific length (e.g., a 32-bit field).
  • the M-RNI is defined in a registry, e.g., WCD vendor A is assigned X different possible M-RNIs, where "X" here is a positive integer that is greater than or equal to 1. It is then up to the interoperability testing between the network vendor and WCD or chipset vendors to assign the hardware and software capabilities associated to each M-RNI.
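  • One way such a registry-assigned identifier could be structured (an assumption for illustration; the disclosure only specifies a fixed-length field such as 32 bits) is a vendor prefix plus a vendor-assigned index, analogous to the OUI scheme discussed below:

```python
# Illustrative only: a 32-bit M-RNI split into a registry-assigned
# vendor prefix and a vendor-assigned index (the 16/16 split is assumed).
def make_m_rni(vendor_id: int, index: int) -> int:
    assert 0 <= vendor_id < 2**16 and 0 <= index < 2**16
    return (vendor_id << 16) | index

def split_m_rni(m_rni: int) -> tuple:
    return m_rni >> 16, m_rni & 0xFFFF

# WCD vendor A (hypothetical ID 0x00A1) assigns capability index 3.
m_rni = make_m_rni(vendor_id=0x00A1, index=3)
assert split_m_rni(m_rni) == (0x00A1, 3)
```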
  • An example of such inter-vendor operation using a common registration to collect M-RNIs for a M-RNI database is illustrated in Figure 4.
  • a registry 400 stores a M-RNI database that associates hardware and software capabilities to different M-RNIs.
  • Server nodes 402 and 404 of different network vendors can then obtain the M- RNI database from the registry 400.
  • MAC Medium Access Control
  • OUI Organizationally Unique Identifier
  • OUIs are purchased from the Institute of Electrical and Electronics Engineers (IEEE) and are used by vendors across all markets on which they place products. Hence, such a registry is unique and the same between vendors / operator networks / markets.
  • Figure 5 illustrates an embodiment in which the model ID is sent from the RAN node 300 to the WCD 212 and the WCD 212 sends a model request to the server node 302 to acquire the ML model.
  • the steps of the process of Figure 5 are as follows:
  • Step 502 The server node 302 obtains the M-RNI database from a registry 500.
  • Step 504 The WCD 212 sends its M-RNI to the RAN node 300.
  • Steps 506 and 508 The RAN node 300 selects an ML model that the WCD 212 is capable of running based on the M-RNI, and the RAN node 300 signals a unique ML model ID, which is associated to the selected ML model, to the WCD 212.
  • Steps 510, 512, and 514 The WCD 212 downloads an appropriate compiled machine code (binary) version of the selected ML model from a UE ML model repository, which in this example is hosted by the server node 302. More specifically, in this example, the WCD 212 sends the ML model ID and its M-RNI to the server node 302 (step 510).
  • the server node 302 selects the precompiled ML model corresponding to the ML model ID and the M-RNI of the WCD 212, e.g., from the ML model repository (step 512), and sends the precompiled ML model to the WCD 212 (step 514).
  • Step 516 The WCD 212 uses the downloaded ML model in one or more radio network operations (RNOs).
  • RNOs radio network operations
  • Figure 6 illustrates an embodiment in which the RAN node 300 sends a request with a model ID and M-RNI to the server node 302, which then pushes the model to the WCD 212. More specifically, the steps of the process of Figure 6 are as follows:
  • Step 600 The server node 302 obtains the M-RNI database from the registry 500.
  • Step 602 The WCD 212 sends its M-RNI to the RAN node 300.
  • Steps 604 and 606 The RAN node 300 selects an ML model that the WCD 212 is capable of running based on the M-RNI, and the RAN node 300 signals a unique ML model ID, which is associated to the selected ML model, and the M-RNI of the WCD 212 to the server node 302.
  • Steps 608 and 610 The server node 302 pushes the precompiled/optimized "binary" version of the selected ML model to the WCD 212. More specifically, in this example, the server node 302 selects the precompiled ML model corresponding to the ML model ID and the M-RNI of the WCD 212, e.g., from the ML model repository (step 608), and sends the precompiled ML model to the WCD 212 (step 610).
  • Step 612 The WCD 212 uses the downloaded ML model in one or more radio network operations (RNOs).
  • RNOs radio network operations
  • Figure 7 illustrates an embodiment in which the RAN node 300 sends a request with a model ID and M-RNI to the server node 302, receives the precompiled ML model from the server node 302, and sends the precompiled ML model to the WCD 212. More specifically, the steps of the process of Figure 7 are as follows:
  • Step 700 (Optional): The server node 302 obtains the M-RNI database from the registry 500.
  • Step 702 The WCD 212 sends its M-RNI to the RAN node 300.
  • Steps 704 and 706 The RAN node 300 selects an ML model that the WCD 212 is capable of running based on the M-RNI and sends a request to the server node 302 including a unique ML model ID, which is associated to the selected ML model, and the M-RNI of the WCD 212.
  • Steps 708 and 710 The server node 302 selects a corresponding compiled machine code (binary) version of the selected ML model based on the received ML model ID and M-RNI and sends the compiled machine code (binary) version of the selected ML model to the RAN node 300.
  • Step 712 The RAN node 300 sends the compiled machine code (binary) version of the selected ML model to the WCD 212.
  • Step 714 The WCD 212 uses the downloaded ML model in one or more radio network operations (RNOs).
  • RNOs radio network operations
  • the RAN node 300 selects an appropriate ML model and either configures the WCD 212 to download the corresponding pre-compiled model (e.g., for its chipset) from the ML model repository hosted by the server node 302 (e.g., as in steps 308-314), causes the server node 302 to push the corresponding pre-compiled model to the WCD 212, or obtains the corresponding pre-compiled model from the server node 302 and sends it to the WCD 212.
  • the RAN node 300 is, in one embodiment, informed by the WCD 212 upon completion of compiling the ML model.
  • the RAN node 300 then, in one embodiment, schedules the WCD 212 to upload its model to the ML model repository (where it can then be reused by other WCDs).
  • a particular precompiled version of a ML model is uniquely identified by the model-ID and the WCD information / M-RNI.
  • a specific model ID refers to a specific model for a pre-defined functionality in the specification; for example, an ML model for compressing channel state information (see Example 1 in the Introduction section above).
  • For a given functionality (e.g., compressing channel state information), the WCD 212 should use one of the identified range of models for inference.
  • the ML model-ID is separated into separate fields.
  • the first field specifies the functional block (e.g., CSI compression), and the second field specifies a specific ML model for that function (e.g., a specific ML model the UE should use for CSI compression).
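  • A sketch of such a two-field model ID follows; the 8/24-bit split and the functional-block codes are assumptions for illustration, not values from the disclosure.

```python
# Illustrative only: a model ID with a functional-block field and a
# model-index field; the 8/24-bit split and codes are assumptions.
from enum import IntEnum

class FunctionalBlock(IntEnum):
    CSI_COMPRESSION = 0x01
    BEAM_PREDICTION = 0x02
    SECONDARY_CARRIER_PREDICTION = 0x03

def make_model_id(block: FunctionalBlock, model_index: int) -> int:
    return (int(block) << 24) | (model_index & 0xFFFFFF)

model_id = make_model_id(FunctionalBlock.CSI_COMPRESSION, model_index=42)
block = FunctionalBlock(model_id >> 24)   # -> FunctionalBlock.CSI_COMPRESSION
index = model_id & 0xFFFFFF               # -> 42
```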
  • the RAN node 300 maintains a table for the best-suited model(s) for each of multiple WCD categories.
  • When the RAN node 300 knows the category of an attaching WCD (e.g., via the provided WCD information or M-RNI), the RAN node 300 signals the ML model ID that best suits that WCD.
  • the conditions may change, and there may be a need to change the ML model that the WCD 212 uses.
  • the WCD 212 indicates a change in availability of resources.
  • the RAN node 300 sends the new ML model ID to the WCD 212.
  • updates to the ML models in the ML model repository are performed periodically (or aperiodically) to address model decay and data drift.
  • the ML models in the repository are updated, possibly with a new release / version number.
  • FIG. 8 illustrates two options for storing and compiling (e.g., new or updated) ML model(s) in accordance with some embodiments of the present disclosure.
  • In the first option, the ML model(s) is (are) compiled in the server node 302.
  • In the second option, the ML model(s) is (are) compiled in the WCD(s) 212 and provided to the server node 302 for storage in the repository.
  • the WCD 212 and RAN node 300 perform RNO data collection (step 800).
  • the RAN node 300 creates or updates a ML model and associated model ID based on the collected data (step 802).
  • the RAN node 300 signals the ML model (i.e., the parameters of the ML model) and the associated model ID to the server node 302 (step 804), and the server node 302 compiles the ML model for one or more WCDs (e.g., one or more types of WCDs or one or more chipsets that may be used by WCDs) to provide corresponding compiled machine code (binary) versions of the ML model for different WCDs or different types of WCDs that are then stored in the ML model repository in association with the model ID and the respective WCD information or M-RNIs (step 806).
  • WCDs e.g., one or more types of WCDs or one or more chipsets that may be used by WCDs
  • the RAN node 300 signals the ML model (i.e., the parameters of the ML model) and the associated model ID to the WCD 212 (step 808), and the WCD 212 compiles the ML model to provide a precompiled version of the ML model for the WCD 212 or for the WCD type or chipset of the WCD 212 (step 810).
  • the WCD 212 then sends the compiled machine code (binary) version of the ML model and the associated model ID to the server node 302 (step 812), and the server node 302 stores the compiled machine code (binary) version of the ML model in association with the associated model ID and WCD information or M-RNI of the WCD 212 (step 814).
  • the illustrated steps may be repeated for multiple WCDs of different types or having different AI/ML enabled chipsets to obtain different precompiled versions of the ML model for different types of WCDs/different WCD capabilities/different M-RNIs.
  • the server node 302 may, in some embodiments, notify the WCD 212 of an outdated ML model(s) (step 816). The WCD 212 may then delete the outdated ML model(s) (step 818).
  • the server node 302 is the entity that hosts a copy of the ML model repository.
  • the server node 302 can, for example, comprise a baseband unit of a base station (e.g., a gNB-DU).
  • the server node 302 can update the model repository upon a model update/creation in a RAN node.
  • the server node 302 can, for example, receive information from WCDs 212 or the RAN node 300 comprising appropriate software tool chains for different WCD AI/ML-enabled chipsets. These software tool chains will be used to pre-compile ML models for different WCD chipset architectures, for example, precompiling the models when new model parameters are received from the RAN node 300.
  • Instead of signaling the model parameters to the server node 302 hosting the ML model repository, the RAN node 300 provides compiled versions of an ML model to the ML model repository (note: different WCD chipsets will likely need different compiled versions of the model).
  • the model repository is directly updated with the ML model by the WCD(s).
  • Option 2 can, for example, be used in case there is no pre-compiled model in the server node model repository for a certain WCD 212.
  • the WCD 212 can supply a compiled model or an explicit description of the model and its parameters (e.g., weights and biases). In the latter case, the ML model is compiled and put in the repository by the network.
  • the server node 302 hosting the ML model repository compresses the compiled ML models. This would further reduce the overhead when the WCD 212 downloads a compressed compiled model. However, the WCD 212 needs to decompress the compiled model before using it and may need assistance information describing how the model was compressed (e.g., which compression method was used), as sketched below.
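  • A minimal sketch of this compress-then-download idea, assuming zlib as the compression method and a simple metadata field as the assistance information (both assumptions, not specified by the disclosure):

```python
# Illustrative only: compress a compiled model before transfer and carry
# the compression method as assistance information (field name assumed).
import zlib

compiled_model = b"\x00\x01" * 4096          # placeholder compiled binary

# Server node side: compress and record the method used.
payload = zlib.compress(compiled_model, 9)
assistance_info = {"compression": "zlib"}

# WCD side: decompress before use, guided by the assistance information.
assert assistance_info["compression"] == "zlib"
restored = zlib.decompress(payload)
assert restored == compiled_model
print(f"{len(compiled_model)} bytes -> {len(payload)} bytes over the air")
```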
  • Examples of operations executed by the WCD 212 with a ML model may comprise, in addition to the examples listed in the Introduction section, one or more operations in the group of:
  • channel quality or other performance metrics, such as:
    o radio channel estimation in uplink and downlink,
    o channel quality indicator (CQI) estimation/selection,
    o signal to noise estimation for uplink and downlink,
    o signal to noise and interference estimation,
    o reference signal received power (RSRP) estimation,
    o reference signal received quality (RSRQ) estimation, etc.
  • CQI channel quality indicator
  • RSRP reference signal received power
  • RSRQ reference signal received quality
  • Mobility-related operations, such as cell reselection and handover triggering
  • FIG. 9 is a schematic block diagram of a network node 900 according to some embodiments of the present disclosure. Optional features are represented by dashed boxes.
  • the network node 900 may be, for example, the RAN node 300, a network node that implements or hosts the server node 302, a network node that implements or hosts the ML model repository, or the like.
  • the network node 900 includes a control system 902 that includes one or more processors 904 (e.g., Central Processing Units (CPUs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), and/or the like), memory 906, and a network interface 908.
  • processors 904 e.g., Central Processing Units (CPUs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), and/or the like
  • the one or more processors 904 are also referred to herein as processing circuitry.
  • the network node 900 may include one or more radio units 910 that each includes one or more transmitters 912 and one or more receivers 914 coupled to one or more antennas 916.
  • the radio units 910 may be referred to or be part of radio interface circuitry.
  • the radio unit(s) 910 is external to the control system 902 and connected to the control system 902 via, e.g., a wired connection (e.g., an optical cable).
  • the radio unit(s) 910 and potentially the antenna(s) 916 are integrated together with the control system 902.
  • the one or more processors 904 operate to provide one or more functions of the network node 900 as described herein (e.g., one or more functions of the RAN node 300 or one or more functions of the server node 302, as described herein).
  • the function(s) are implemented in software that is stored, e.g., in the memory 906 and executed by the one or more processors 904.
  • FIG. 10 is a schematic block diagram that illustrates a virtualized embodiment of the network node 900 according to some embodiments of the present disclosure. Again, optional features are represented by dashed boxes.
  • a "virtualized" network node is an implementation of the network node 900 in which at least a portion of the functionality of the network node 900 is implemented as a virtual component(s) (e.g., via a virtual machine(s) executing on a physical processing node(s) in a network(s)).
  • the network node 900 may include the control system 902 and/or the one or more radio units 910, as described above.
  • the control system 902 may be connected to the radio unit(s) 910 via, for example, an optical cable or the like.
  • the network node 900 includes one or more processing nodes 1000 coupled to or included as part of a network(s) 1002. If present, the control system 902 or the radio unit(s) 910 are connected to the processing node(s) 1000 via the network 1002.
  • Each processing node 1000 includes one or more processors 1004 (e.g., CPUs, ASICs, FPGAs, and/or the like), memory 1006, and a network interface 1008.
  • functions 1010 of the network node 900 described herein are implemented at the one or more processing nodes 1000 or distributed across the one or more processing nodes 1000 and the control system 902 and/or the radio unit(s) 910 in any desired manner.
  • some or all of the functions 1010 of the network node 900 described herein are implemented as virtual components executed by one or more virtual machines implemented in a virtual environment(s) hosted by the processing node(s) 1000.
  • In some embodiments in which the control system 902 is included, additional signaling or communication between the processing node(s) 1000 and the control system 902 is used in order to carry out at least some of the desired functions 1010.
  • the control system 902 may not be included, in which case the radio unit(s) 910 communicate directly with the processing node(s) 1000 via an appropriate network interface(s).
  • a computer program including instructions which, when executed by at least one processor, causes the at least one processor to carry out the functionality of the network node 900 or a node (e.g., a processing node 1000) implementing one or more of the functions 1010 of the network node 900 in a virtual environment according to any of the embodiments described herein is provided.
  • a carrier comprising the aforementioned computer program product is provided. The carrier is one of an electronic signal, an optical signal, a radio signal, or a computer readable storage medium (e.g., a non-transitory computer readable medium such as memory).
  • Figure 11 is a schematic block diagram of the network node 900 according to some other embodiments of the present disclosure.
  • the network node 900 includes one or more modules 1100, each of which is implemented in software.
  • the module(s) 1100 provide the functionality of the network node 900 described herein. This discussion is equally applicable to the processing node 1000 of Figure 10 where the modules 1100 may be implemented at one of the processing nodes 1000 or distributed across multiple processing nodes 1000 and/or distributed across the processing node(s) 1000 and the control system 902.
  • Figure 12 is a schematic block diagram of the WCD 212 (e.g., a UE) according to some embodiments of the present disclosure.
  • the WCD 212 includes one or more processors 1202 (e.g., CPUs, ASICs, FPGAs, and/or the like), memory 1204, and one or more transceivers 1206 each including one or more transmitters 1208 and one or more receivers 1210 coupled to one or more antennas 1212.
  • the transceiver(s) 1206 includes radio-front end circuitry connected to the antenna(s) 1212 that is configured to condition signals communicated between the antenna(s) 1212 and the processor(s) 1202, as will be appreciated by one of ordinary skill in the art.
  • the processors 1202 are also referred to herein as processing circuitry.
  • the transceivers 1206 are also referred to herein as radio circuitry.
  • the functionality of the WCD 212 (or UE) described above may be fully or partially implemented in software that is, e.g., stored in the memory 1204 and executed by the processor(s) 1202.
  • the WCD 212 may include additional components not illustrated in Figure 12 such as, e.g., one or more user interface components (e.g., an input/output interface including a display, buttons, a touch screen, a microphone, a speaker(s), and/or the like and/or any other components for allowing input of information into the WCD 212 and/or allowing output of information from the WCD 212), a power supply (e.g., a battery and associated power circuitry), etc.
  • a power supply e.g., a battery and associated power circuitry
  • a computer program including instructions which, when executed by at least one processor, causes the at least one processor to carry out the functionality of the WCD 212 according to any of the embodiments described herein is provided.
  • a carrier comprising the aforementioned computer program product is provided.
  • the carrier is one of an electronic signal, an optical signal, a radio signal, or a computer readable storage medium (e.g., a non-transitory computer readable medium such as memory).
  • FIG. 13 is a schematic block diagram of the WCD 212 according to some other embodiments of the present disclosure.
  • the WCD 212 includes one or more modules 1300, each of which is implemented in software.
  • the module(s) 1300 provide the functionality of the WCD 212 (or UE) described herein.
  • any appropriate steps, methods, features, functions, or benefits disclosed herein may be performed through one or more functional units or modules of one or more virtual apparatuses.
  • Each virtual apparatus may comprise a number of these functional units.
  • These functional units may be implemented via processing circuitry, which may include one or more microprocessor or microcontrollers, as well as other digital hardware, which may include Digital Signal Processor (DSPs), special-purpose digital logic, and the like.
  • the processing circuitry may be configured to execute program code stored in memory, which may include one or several types of memory such as Read Only Memory (ROM), Random Access Memory (RAM), cache memory, flash memory devices, optical storage devices, etc.
  • Program code stored in memory includes program instructions for executing one or more telecommunications and/or data communications protocols as well as instructions for carrying out one or more of the techniques described herein.
  • the processing circuitry may be used to cause the respective functional unit to perform corresponding functions according one or more embodiments of the present disclosure.
  • Embodiment 1 A method performed by a wireless communication device, WCD, (212) comprising: receiving (314; 514; 610; 712), from a server node (302), a precompiled binary (i.e., compiled machine code (binary)) version of a ML model; and performing (316; 516; 612; 714) one or more radio network operations using the precompiled binary version of the ML model.
  • a precompiled binary i.e., compiled machine code (binary)
  • Embodiment 2 The method of embodiment 1 further comprising: receiving (308; 508) a unique machine learning, ML, model identity, ID, from a radio access network, RAN, node (300); wherein receiving (314; 514) the precompiled binary version of the ML model comprises downloading (314; 514) the precompiled binary version of the ML model associated to the unique ML model ID from the server node (302).
  • Embodiment 3 The method of embodiment 2 wherein downloading (314; 514) the precompiled binary version of the ML model associated to the unique ML model ID from the server node (302) comprises: sending (310; 510) the unique ML model ID and additional information to the server node (302); and receiving (314; 514) the precompiled binary version of the ML model associated to the unique ML model ID responsive to sending (310; 510) the unique ML model ID and additional information to the server node (302).
  • Embodiment 4 The method of embodiment 3 wherein the additional information comprises information about one or more capabilities of one or more chipsets of the WCD (212) that enable execution of a ML model.
  • Embodiment 5 The method of embodiment 3 wherein the additional information comprises a M-RNI that is associated to one or more hardware and/or software capabilities of the WCD (212).
  • Embodiment 6 The method of embodiment 1 further comprising: sending (602), to a radio access network, RAN, node (300), information about one or more hardware and/or software capabilities of the WCD (212) related to execution of a ML model; wherein receiving (610) the precompiled binary version of the ML model comprises receiving (610) the precompiled binary version of the ML model from a server node (302) responsive to sending (602) the information to the RAN node (300).
  • Embodiment 7 The method of embodiment 6 wherein the additional information comprises a M-RNI that is associated to one or more hardware and/or software capabilities of the WCD (212).
  • Embodiment 8 The method of embodiment 1 further comprising: sending (702), to a radio access network, RAN, node (300), information about one or more hardware and/or software capabilities of the WCD (212) related to execution of a ML model; wherein receiving (712) the precompiled binary version of the ML model comprises receiving (712) the precompiled binary version of the ML model from the RAN node (300) responsive to sending (702) the information to the RAN node (300).
  • Embodiment 9 The method of embodiment 8 wherein the additional information comprises a M-RNI that is associated to one or more hardware and/or software capabilities of the WCD (212).
  • Embodiment 10 A wireless communication device, WCD, (212) adapted to perform the method of any of embodiments 1 to 9.
  • Embodiment 11 A method performed by a radio access network, RAN, node (300) comprising: selecting (306; 506; 604; 704) a machine learning, ML, model for a particular wireless communication device, WCD, (212); causing delivery of a precompiled binary (i.e., a compiled machine code (binary)) version of the selected ML model from a ML model repository to the WCD (212).
  • Embodiment 12 The method of embodiment 11 wherein selecting (306; 506; 604; 704) the ML model for the WCD (212) comprises selecting the ML model based on one or more hardware and/or software capabilities of the WCD (212) related to execution of a ML model.
  • Embodiment 13 The method of embodiment 11 or 12 wherein causing delivery of the precompiled binary version of the selected ML model from the ML model repository to the WCD (212) comprises sending (308; 508) a unique ML model identity, ID, associated to the selected ML model to the WCD (212).
  • Embodiment 14 The method of embodiment 11 or 12 wherein causing delivery of the precompiled binary version of the selected ML model from the ML model repository to the WCD (212) comprises sending (606) a unique ML model identity, ID, associated to the selected ML model to a server node (302) as part of a request for the server node (302) to push the precompiled binary version of the selected ML model from the ML model repository to the WCD (212).
  • Embodiment 15 The method of embodiment 11 or 12 wherein causing delivery of the precompiled binary version of the selected ML model from the ML model repository to the WCD (212) comprises: sending (706) a unique ML model identity, ID, associated to the selected ML model to a server node (302) as part of a request for the precompiled binary version of the selected ML model from the ML model repository; receiving (710) the precompiled binary version of the selected ML model from the server node (302); and sending (712) the precompiled binary version of the selected ML model to the WCD (212).
  • Embodiment 16 A radio access network, RAN, node (300) adapted to perform the method of any of embodiments 11 to 15.
  • Embodiment 17 A method performed by a server node (302) comprising: receiving (310; 510; 606; 706), either from a particular wireless communication device, WCD, (212) or a radio access network, RAN, node (300), a request for a precompiled binary (i.e., a compiled machine code (binary)) version of a machine learning, ML, model, the request comprising a unique ML model identity, ID, and additional information about one or more hardware and/or software capabilities of the WCD (212); selecting (312; 512; 608; 706) a precompiled binary version of the ML model based on the unique ML model ID and the additional information; and sending (314; 514; 610; 706) the selected precompiled binary version of the ML model either to the WCD (212) or to the RAN node (300).
  • Embodiment 18 The method of embodiment 17 wherein the additional information comprises information about one or more capabilities of one or more chipsets of the WCD (212) that enable execution of a ML model.
  • Embodiment 19 The method of embodiment 17 wherein the additional information comprises a M-RNI that is associated to one or more hardware and/or software capabilities of the WCD (212).
  • Embodiment 20 A server node (302) adapted to perform the method of any of embodiments 17 to 19.
  • Embodiment 21 A method performed by a server node (302) comprising: obtaining (806; 812) a precompiled binary (i.e., a compiled machine code (binary)) version of a ML model associated to a unique ML model identity, ID, and one or more hardware and/or software wireless communication device, WCD, capabilities related to ML model execution; and storing (806; 814) the precompiled binary version of the ML model in association with the unique ML model ID and the one or more hardware and/or software WCD capabilities.
  • Embodiment 22 The method of embodiment 21 further comprising: receiving (804) one or more parameters that define the ML model associated to the unique ML model ID; wherein obtaining (806) the precompiled binary version of the ML model comprises compiling (806) the ML model as defined by the one or more parameters based on the one or more hardware and/or software capabilities.
  • Embodiment 23 The method of embodiment 21 wherein obtaining (812) the precompiled binary version of the ML model comprises receiving (812) the precompiled binary version of the ML model from a WCD (212).
  • Embodiment 24 A server node (302) adapted to perform the method of any of embodiments 21 to 23.
  • Embodiment 25 A method performed by a wireless communication device, WCD, (212), the method comprising: receiving (808) one or more parameters that define a machine learning, ML, model associated to a unique ML model identity, ID; compiling (810) the ML model as defined by the one or more parameters based on one or more hardware and/or software capabilities of the WCD (212) to thereby provide a precompiled binary (i.e., a compiled machine code (binary)) version of the ML model; and sending (812), to a server node (302), the precompiled binary version of the ML model together with the unique ML model ID and information about or that indicates the one or more hardware and/or software capabilities of the WCD (212).
  • Embodiment 26 A wireless communication device, WCD, (212) adapted to perform the method of embodiment 25.


Abstract

Systems and methods related to downloading of a compiled machine code version of a Machine Learning (ML) model to a wireless communication device (WCD) are disclosed. In one embodiment, a method performed by a WCD comprises receiving, from a server node, a compiled machine code version of a ML model, the compiled machine code version of the ML model being one of a set of compiled machine code versions of ML models, wherein the compiled machine code version of the ML model received by the WCD is precompiled for a hardware of the WCD. The method further comprises performing one or more radio network operations using the compiled machine code version of the ML model. Related embodiments of a server node and a Radio Access Network (RAN) node and methods of operation thereof are also disclosed.

Description

ENABLING DOWNLOADABLE AI
Related Applications
This application claims the benefit of provisional patent application serial number 63/217,642, filed July 1, 2021, the disclosure of which is hereby incorporated herein by reference in its entirety.
Technical Field
The present disclosure relates to a wireless communication system and, more specifically, enabling downloadable Artificial Intelligence (AI) or Machine Learning (ML) models in a wireless communication system.
Background
There are ongoing discussions in Third Generation Partnership Project (3GPP) concerning the integration of Artificial Intelligence / Machine Learning (AI/ML) in 3GPP networks. For example,
• There is an ongoing study item on data collection, led by the Radio Access Network (RAN) Working Group 3 (WG3) (see 3GPP Technical Report (TR) 37.817 V0.1.0, "Study on enhancement for data collection for NR and EN-DC", Release 17).
• There is strong support for a RAN-1 study item on AI/ML on the physical layer, and Ericsson has submitted a draft study item description (see 3GPP RWS-210382, "Draft New SID: Study on AI/ML for PHY Enhancements", Ericsson, June 28 to July 2, 2021).
One way to enable AI/ML technologies in 3GPP networks is via downloadable AI, the basic idea of which is as follows:
• The network signals an ML model to the User Equipment (UE) (i.e., the UE downloads an ML model from the network).
• The UE runs the ML model locally on its hardware (i.e., the UE performs inference).
The general concept of downloadable AI has been proposed.
Downloadable AI enables the network to run custom ML models on a UE using data it would otherwise not have access to. For example, the network does not have access to the UE's downlink channel estimates from Channel State Information Reference Signal (CSI-RS). Downloadable AI allows the network to compute an ML-based channel state information (CSI) report directly from the UE's channel estimates, using an ML model of the network's choice.
Some other benefits of downloadable AI are as follows:
• The UE does not need to signal the ML model's inputs to the network (to, for example, enable the network to run the ML model), because the ML model is already located on the UE.
• The ML model can be executed more frequently at the UE, for example, whenever the UE receives new information.
Downloadable AI can be viewed as an advanced UE configuration where the network signals an advanced algorithm to the UE.
The Open Radio Access Network (ORAN) working group on AI/ML has identified the need to have an entity responsible for compiling ML models to a "binary" format optimized for efficient execution of the ML model. See, for example, Figure 1, which is taken from ORAN Working Group 2 AI/ML, "AI/ML workflow description and requirements," Technical Report (v01.02.04), 2021.
The description for the compiling host from table 1 is reproduced below.
Table 1. [Table 1 is reproduced as an image in the original publication.]
Example use cases of downloadable AI are as follows.
Example 1: Compression of Channel State Information (CSI)
The idea of compressing CSI using ML models is being actively studied in academia, for example, see Zhilin Lu, Xudong Zhang, Hongyi He, Jintao Wang, and Jian Song, "Binarized Aggregated Network with Quantization: Flexible Deep Learning Deployment for CSI Feedback in Massive MIMO System", arXiv:2105.00354v1, May 2021 and the references therein. In addition, ML-based CSI compression is a commonly proposed topic for Release 18 by 3GPP companies.
Some existing work on ML-based CSI compression is as follows. International Patent Application Publication No. WO/2020180221 A1 (hereinafter referred to as "the '221 Application") proposed using autoencoders (AE) to compress the UE's CSI-RS channel estimate. An AE is an artificial neural network consisting of two parts: an encoder side (here located at the UE), and a decoder side (here located at the network). The interface layer between the encoder and decoder is called the bottleneck layer, and it has few nodes in comparison to the input. The UE uses the encoder to compress its channel estimate. The output of the encoder (the bottleneck layer output) is signaled over the uplink to the network, where it is decompressed. Several embodiments were proposed involving downloadable AI (e.g., signalling of part or all of the encoder from the network to the UE).
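For illustration only, the encoder/decoder split described above can be sketched in a few lines. This is a minimal sketch under assumed dimensions and layer sizes (none of which are taken from the '221 Application), using PyTorch as one possible framework:

```python
# Minimal autoencoder sketch for CSI compression (illustrative only; the
# dimensions, layer sizes, and training details are assumptions).
import torch
import torch.nn as nn

class CsiEncoder(nn.Module):
    """Encoder side, conceptually located at the UE/WCD."""
    def __init__(self, csi_dim=256, bottleneck_dim=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(csi_dim, 64), nn.ReLU(),
            nn.Linear(64, bottleneck_dim),  # bottleneck: few nodes vs. the input
        )
    def forward(self, x):
        return self.net(x)

class CsiDecoder(nn.Module):
    """Decoder side, conceptually located at the network."""
    def __init__(self, csi_dim=256, bottleneck_dim=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(bottleneck_dim, 64), nn.ReLU(),
            nn.Linear(64, csi_dim),
        )
    def forward(self, z):
        return self.net(z)

# The UE compresses its channel estimate; the bottleneck output is what
# would be signaled over the uplink, where the network decompresses it.
encoder, decoder = CsiEncoder(), CsiDecoder()
channel_estimate = torch.randn(1, 256)   # stand-in for a CSI-RS channel estimate
compressed = encoder(channel_estimate)   # 16 values instead of 256
reconstructed = decoder(compressed)
```

The point of the sketch is the structural split: only the encoder side would need to be signaled to the UE under the downloadable AI concept, while the decoder stays at the network.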
The methods described in the '221 Application have been developed for compressing the channel to improve the Observed Time Difference of Arrival (OTDOA) positioning accuracy in multipath environments. OTDOA is one of the positioning methods introduced for Long Term Evolution (LTE) in Release 9. The richer channel information can enable the network to test multiple hypotheses for the position estimation at the network side and increases the potential for a more accurate position estimation. In this example, the encoder part of the neural network needs to be signaled from the network to the device using the downloadable AI concept.
The following existing technologies do not explicitly refer to downloadable AI. However, downloadable AI could be used to enhance the proposed solutions by, for example, enabling the network to configure part or all of the ML models used in the UEs:
• Building on the '221 Application, transfer learning has been proposed as a solution to reduce model signaling overhead. Essentially, the network uses transfer learning to modify its decoder to work well in different deployment scenarios with a single UE encoder.
• A new signal from the UE to the network - a channel compression quality indicator - has been proposed.
• A method for training ML models in live 3GPP networks, using CSI-RS and Sounding Reference Signal (SRS), has been proposed.
Example 2: Beam Measurement Prediction
The UE can use an ML model to reduce its measurements related to beamforming. In NR, one can request a UE to measure on a set of CSI-RS beams. A stationary UE typically experiences less variation in beam quality in comparison to a moving UE. The stationary UE can, therefore, save battery by reducing its beam measurements, using an ML model to predict beam strength instead of measuring it. It can do this, for example, by measuring a subset of the beams and predicting the rest of the beams.
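As a hedged illustration of this subset-and-predict idea (not a method defined in this disclosure), the prediction step could be as simple as a linear model fitted on earlier full beam sweeps; all shapes and signal values below are synthetic:

```python
# Illustrative sketch: a stationary WCD measures only a subset of beams and
# predicts the rest with a linear model fitted on earlier full sweeps.
import numpy as np

rng = np.random.default_rng(0)
n_beams, n_history = 8, 200
measured_idx = [0, 2, 4, 6]    # beams the WCD keeps measuring
predicted_idx = [1, 3, 5, 7]   # beams it stops measuring

# Historical full sweeps (RSRP-like values) collected while measuring all beams.
history = rng.normal(-80.0, 5.0, size=(n_history, n_beams))

# Fit: predicted beams Y ~= X @ W, solved by least squares.
X, Y = history[:, measured_idx], history[:, predicted_idx]
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

# At run time, only the subset is measured; the rest is predicted.
new_subset = rng.normal(-80.0, 5.0, size=(1, len(measured_idx)))
predicted_strengths = new_subset @ W
print(predicted_strengths)
```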
Example 3: Secondary Carrier Prediction
A method to configure a UE with one or more ML models for executing radio networking operations has been described. This can enable less signaling in comparison to use cases where the model input is located at the device side but the ML model is at the network side. One such use case is the secondary carrier prediction use case. To detect a node on another frequency using target carrier prediction as described in U.S. Patent Application Publication No. US2019/0357057A1 ("Target Carrier Radio Predictions using Source Carrier Measurements"), the UE is required to signal source carrier information; a mobile UE periodically transmits source carrier information to enable the macro node to hand the UE over to another node operating at a higher frequency. Using target carrier prediction, the UE does not need to perform inter-frequency measurements, leading to energy savings at the UE. However, frequent signalling of source carrier information that enables prediction of the secondary frequency can lead to additional overhead and should thus be minimized. The risk of not performing frequent periodic signalling is missing an opportunity for an inter-frequency handover to a less-loaded cell on another carrier.
The UE can instead receive the model and use source carrier information as input to the model, which then triggers an output indicating coverage on the frequency-2 node at location 2. This reduces the need for frequent source carrier information signaling, while enabling the UE to predict the coverage on frequency 2 whenever its model input changes. This is illustrated in Figure 1.
Example 4: Signal quality drop prediction, as described in International Publication No. WO2020/226542A1 ("Network Node, User Equipment and Methods for Handling Signal Quality Variations").
Summary
Systems and methods related to downloading of a compiled machine code version of a Machine Learning (ML) model to a wireless communication device (WCD) are disclosed. In one embodiment, a method performed by a WCD comprises receiving, from a server node, a compiled machine code version of a ML model, the compiled machine code version of the ML model being one of a set of compiled machine code versions of ML models, wherein the compiled machine code version of the ML model received by the WCD is precompiled for a hardware of the WCD. The method further comprises performing one or more radio network operations using the compiled machine code version of the ML model. In this manner, the WCD does not need to download and compile the ML model, thus saving time and energy resources at the WCD.
In one embodiment, the method further comprises receiving a unique ML model identity (ID) from a Radio Access Network (RAN) node and sending the unique ML model ID to the server node, wherein receiving the compiled machine code version of the ML model comprises receiving one of the set of compiled machine code versions of the ML models associated to the unique ML model ID from the server node. In one embodiment, the method further comprises sending additional information to the server node in addition to the unique ML model ID, wherein receiving the compiled machine code version of the ML model comprises receiving one of the set of compiled machine code versions of the ML models associated to the unique ML model ID and the additional information. In one embodiment, the additional information comprises information about one or more capabilities of one or more chipsets of the WCD that enable execution of a ML model. In another embodiment, the additional information comprises a unique model radio network identifier that is associated to one or more hardware capabilities of the WCD, one or more software capabilities of the WCD, or both one or more hardware capabilities of the WCD and one or more software capabilities of the WCD. In one embodiment, the model radio network identifier is defined, by a registry, to be associated to the one or more hardware and/or software capabilities of the WCD, the registry associating different hardware and/or software capabilities to different model radio network identifiers.
In one embodiment, the method further comprises sending, to a RAN node, information about one or more hardware capabilities of the WCD related to execution of a ML model, one or more software capabilities of the WCD related to execution of a ML model, or both one or more hardware capabilities of the WCD related to execution of a ML model and one or more software capabilities of the WCD related to execution of a ML model, wherein receiving the compiled machine code version of the ML model comprises receiving one of the set of compiled machine code versions of the ML models from the server node responsive to sending the information to the RAN node. In one embodiment, the information sent to the RAN node comprises a unique model radio network identifier that is associated to one or more hardware capabilities of the WCD, one or more software capabilities of the WCD, or both one or more hardware capabilities of the WCD and one or more software capabilities of the WCD.
In one embodiment, the method further comprises sending, to a RAN node, information about one or more hardware capabilities of the WCD related to execution of a ML model, one or more software capabilities of the WCD related to execution of a ML model, or both one or more hardware capabilities of the WCD related to execution of a ML model and one or more software capabilities of the WCD related to execution of a ML model, wherein receiving the compiled machine code version of the ML model comprises receiving one of the set of compiled machine code versions of the ML models from the RAN node responsive to sending the information to the RAN node. In one embodiment, the information sent to the RAN node comprises a unique model radio network identifier that is associated to one or more hardware capabilities of the WCD, one or more software capabilities of the WCD, or both one or more hardware capabilities of the WCD and one or more software capabilities of the WCD.
Corresponding embodiments of a WCD are also disclosed. In one embodiment, a WCD comprises one or more transmitters, one or more receivers, and processing circuitry associated with the one or more transmitters and the one or more receivers. The processing circuitry is configured to cause the WCD to receive, from a server node, a compiled machine code version of a ML model, the compiled machine code version of the ML model being one of a set of compiled machine code versions of ML models, wherein the compiled machine code version of the ML model received by the WCD is precompiled for a hardware of the WCD. The processing circuitry is further configured to cause the WCD to perform one or more radio network operations using the compiled machine code version of the ML model.
Embodiments of a method performed by a RAN node are also disclosed. In one embodiment, a method performed by a RAN node comprises selecting a ML model for a particular WCD and causing delivery of one of a set of compiled machine code versions of the selected ML model from a ML model repository to the WCD.
In one embodiment, selecting the ML model for the WCD comprises selecting the ML model for the WCD based on one or more hardware capabilities of the WCD related to execution of a ML model, one or more software capabilities of the WCD related to execution of a ML model, or both one or more hardware capabilities of the WCD related to execution of a ML model and one or more software capabilities of the WCD related to execution of a ML model.
In one embodiment, causing delivery of the one of the set of compiled machine code versions of the selected ML model from the ML model repository to the WCD comprises sending a unique ML model ID associated to the selected ML model to the WCD.
In one embodiment, causing delivery of the compiled machine code version of the selected ML model from the ML model repository to the WCD comprises sending a unique ML model ID associated to the selected ML model to a server node as part of a request for the server node to push the one of the set of compiled machine code versions of the selected ML model from the ML model repository to the WCD.
In one embodiment, causing delivery of the compiled machine code version of the selected ML model from the ML model repository to the WCD comprises sending a unique ML model ID associated to the selected ML model to a server node as part of a request for the one of the set of compiled machine code versions of the selected ML model from the ML model repository, receiving the one of the set of compiled machine code versions of the selected ML model from the server node, and sending the one of the set of compiled machine code versions of the selected ML model to the WCD.
Corresponding embodiments of a RAN node are also disclosed. In one embodiment, a RAN node comprises processing circuitry configured to cause the RAN node to select a ML model for a particular WCD and cause delivery of one of a set of compiled machine code versions of the selected ML model from a ML model repository to the WCD. Embodiments of a method performed by a server node are also disclosed. In one embodiment, a method performed by a server node comprises receiving, either from a particular WCD or a RAN node, a request for a compiled machine code version of a ML model, the request comprising a unique ML model ID and additional information about one or more hardware capabilities of the WCD, one or more software capabilities of the WCD, or both one or more hardware capabilities of the WCD and one or more software capabilities of the WCD. The method further comprises selecting a compiled machine code version of the ML model from a set of compiled machine code versions of the ML model based on the unique ML model ID and the additional information and sending the selected compiled machine code version of the ML model either to the WCD or to the RAN node.
In one embodiment, the additional information comprises information about one or more capabilities of one or more chipsets of the WCD that enable execution of a ML model.
In one embodiment, the additional information comprises a unique model radio network identifier that is associated to one or more hardware capabilities of the WCD, one or more software capabilities of the WCD, or both one or more hardware capabilities of the WCD and one or more software capabilities of the WCD.
Corresponding embodiments of a server node are also disclosed. In one embodiment, a server node comprises processing circuitry configured to cause the server node to receive, either from a particular WCD or a RAN node, a request for a compiled machine code version of a ML model, the request comprising a unique ML model ID and additional information about one or more hardware capabilities of the WCD, one or more software capabilities of the WCD, or both one or more hardware capabilities of the WCD and one or more software capabilities of the WCD. The processing circuitry is further configured to cause the server node to select a compiled machine code version of the ML model from a set of compiled machine code versions of the ML model based on the unique ML model ID and the additional information and send the selected compiled machine code version of the ML model either to the WCD or to the RAN node.
In another embodiment, a method performed by a server node comprises obtaining a compiled machine code version of a ML model associated to a unique ML model ID and one or more hardware and/or software WCD capabilities related to ML model execution and storing the compiled machine code version of the ML model in association with the unique ML model ID and the one or more hardware and/or software WCD capabilities.
In one embodiment, the method further comprises receiving one or more parameters that define the ML model associated to the unique ML model ID, wherein obtaining the compiled machine code version of the ML model comprises compiling the ML model as defined by the one or more parameters based on the one or more hardware and/or software capabilities.
In one embodiment, obtaining the compiled machine code version of the ML model comprises receiving the compiled machine code version of the ML model from a WCD.
Corresponding embodiments of a server node are also disclosed. In one embodiment, a server node comprises processing circuitry configured to cause the server node to obtain a compiled machine code version of a ML model associated to a unique ML model ID and one or more hardware and/or software WCD capabilities related to ML model execution and store the compiled machine code version of the ML model in association with the unique ML model ID and the one or more hardware and/or software WCD capabilities.
Brief Description of the Drawings
The accompanying drawing figures incorporated in and forming a part of this specification illustrate several aspects of the disclosure, and together with the description serve to explain the principles of the disclosure.
Figure 1 illustrates an Open Radio Access Network (ORAN) architecture related to Artificial Intelligence (AI) / Machine Learning (ML) including a model compiling host responsible for compiling ML models to a binary format optimized for efficient execution of the ML model;
Figure 2 illustrates one example of a cellular communications system in which embodiments of the present disclosure may be implemented;
Figure 3 illustrates the operation of a wireless communication device (WCD), a Radio Access Network (RAN) node, and a server node in accordance with some embodiments of the present disclosure;
Figure 4 illustrates an example of inter-vendor operation using a common registry to collect Model Radio Network Identifiers (M-RNIs) for a M-RNI database in accordance with one embodiment of the present disclosure;
Figures 5, 6, and 7 illustrate example embodiments of the use of M-RNI and a registry;
Figure 8 illustrates two options for storing and compiling (e.g., new or updated) ML model(s) in accordance with some embodiments of the present disclosure;
Figures 9, 10, and 11 are schematic block diagrams of example embodiments of a network node; and
Figures 12 and 13 are schematic block diagrams of example embodiments of a WCD.
Detailed Description
The embodiments set forth below represent information to enable those skilled in the art to practice the embodiments and illustrate the best mode of practicing the embodiments. Upon reading the following description in light of the accompanying drawing figures, those skilled in the art will understand the concepts of the disclosure and will recognize applications of these concepts not particularly addressed herein. It should be understood that these concepts and applications fall within the scope of the disclosure.
Some of the embodiments contemplated herein will now be described more fully with reference to the accompanying drawings. Other embodiments, however, are contained within the scope of the subject matter disclosed herein; the disclosed subject matter should not be construed as limited to only the embodiments set forth herein. Rather, these embodiments are provided by way of example to convey the scope of the subject matter to those skilled in the art.
Generally, all terms used herein are to be interpreted according to their ordinary meaning in the relevant technical field, unless a different meaning is clearly given and/or is implied from the context in which it is used. All references to a/an/the element, apparatus, component, means, step, etc. are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, step, etc., unless explicitly stated otherwise. The steps of any methods disclosed herein do not have to be performed in the exact order disclosed, unless a step is explicitly described as following or preceding another step and/or where it is implicit that a step must follow or precede another step. Any feature of any of the embodiments disclosed herein may be applied to any other embodiment, wherever appropriate. Likewise, any advantage of any of the embodiments may apply to any other embodiments, and vice versa. Other objectives, features, and advantages of the enclosed embodiments will be apparent from the following description.
Radio Node: As used herein, a "radio node" is either a radio access node or a wireless communication device.
Radio Access Node: As used herein, a "radio access node" or "radio network node" or "radio access network node" is any node in a Radio Access Network (RAN) of a cellular communications network that operates to wirelessly transmit and/or receive signals. Some examples of a radio access node include, but are not limited to, a base station (e.g., a New Radio (NR) base station (gNB) in a Third Generation Partnership Project (3GPP) Fifth Generation (5G) NR network or an enhanced or evolved Node B (eNB) in a 3GPP Long Term Evolution (LTE) network), a high-power or macro base station, a low-power base station (e.g., a micro base station, a pico base station, a home eNB, or the like), a relay node, a network node that implements part of the functionality of a base station (e.g., a network node that implements a gNB Central Unit (gNB-CU) or a network node that implements a gNB Distributed Unit (gNB-DU)) or a network node that implements part of the functionality of some other type of radio access node.
Core Network Node: As used herein, a "core network node" is any type of node in a core network or any node that implements a core network function. Some examples of a core network node include, e.g., a Mobility Management Entity (MME), a Packet Data Network Gateway (P-GW), a Service Capability Exposure Function (SCEF), a Home Subscriber Server (HSS), or the like. Some other examples of a core network node include a node implementing an Access and Mobility Management Function (AMF), a User Plane Function (UPF), a Session Management Function (SMF), an Authentication Server Function (AUSF), a Network Slice Selection Function (NSSF), a Network Exposure Function (NEF), a Network Function (NF) Repository Function (NRF), a Policy Control Function (PCF), a Unified Data Management (UDM), or the like.
Communication Device: As used herein, a "communication device" is any type of device that has access to an access network. Some examples of a communication device include, but are not limited to: mobile phone, smart phone, sensor device, meter, vehicle, household appliance, medical appliance, media player, camera, or any type of consumer electronic, for instance, but not limited to, a television, radio, lighting arrangement, tablet computer, laptop, or Personal Computer (PC). The communication device may be a portable, hand-held, computer-comprised, or vehicle- mounted mobile device, enabled to communicate voice and/or data via a wireless or wireline connection.
Wireless Communication Device: One type of communication device is a wireless communication device, which may be any type of wireless device that has access to (i.e., is served by) a wireless network (e.g., a cellular network). Some examples of a wireless communication device include, but are not limited to: a User Equipment device (UE) in a 3GPP network, a Machine Type Communication (MTC) device, and an Internet of Things (IoT) device. Such wireless communication devices may be, or may be integrated into, a mobile phone, smart phone, sensor device, meter, vehicle, household appliance, medical appliance, media player, camera, or any type of consumer electronic, for instance, but not limited to, a television, radio, lighting arrangement, tablet computer, laptop, or PC. The wireless communication device may be a portable, hand-held, computer-comprised, or vehicle-mounted mobile device, enabled to communicate voice and/or data via a wireless connection.
Network Node: As used herein, a "network node" is any node that is either part of the RAN or the core network of a cellular communications network/system.
Machine Learning (ML) Model: As will be appreciated by one of ordinary skill in the art, an ML model (or Artificial Intelligence (AI) model) is a data driven model that applies ML or AI techniques to generate a set of outputs based on a set of inputs. Different ML models may use different architectures and/or different ML models may use the same architecture (e.g., the same neural network architecture) but have different parameters, weights, biases, or the like within that architecture.
Note that the description given herein focuses on a 3GPP cellular communications system and, as such, 3GPP terminology or terminology similar to 3GPP terminology is oftentimes used. However, the concepts disclosed herein are not limited to a 3GPP system.
Note that, in the description herein, reference may be made to the term "cell"; however, particularly with respect to 5G NR concepts, beams may be used instead of cells and, as such, it is important to note that the concepts described herein are equally applicable to both cells and beams.
There currently exist certain challenge(s). The downloadable Artificial Intelligence (AI) concept requires a Machine Learning (ML) model(s) to be signaled from the network to the UE. The following list outlines some problems with signaling ML models "over the air":
• The ML model will need to match the UE's chipset capabilities (e.g., the ML model will often need to run on special AI-accelerated hardware in the UE).
• The ML model, with tunable parameters possibly fixed, will need to be compiled down to an appropriate compiled machine code (binary) optimized for efficient execution of the ML model. This compiled machine code may also be referred to herein as a "binary" format to which the ML model is compiled. The process of compiling the ML model can be costly in time, compute, power, and memory resources, and, therefore, cannot be done at run time for a (near) physical layer application.
• The process of compiling an ML model at the UE requires the UE to maintain an appropriate software tool chain. It will be difficult to maintain such tool chains across multiple UE and network vendors.
• Errors occurring during compiling ML models at the UE can lead to undefined model behavior and, therefore, compromise network performance.
• The problem of signaling an ML model explicitly (that is, the architecture, weights, and biases) is difficult to solve. For example, the 3GPP signaling protocols will need to be designed in a way to handle arbitrary ML models in a backwards compatible manner.
• The tunable parameters of an ML model (e.g., weights and biases of a neural network) will likely be described in high-precision float resolution (e.g., needed for stochastic gradient descent algorithms). These parameters will need to be quantized before they can be signaled from the network to the UE. The performance of the ML model can be negatively affected by this quantization (a simple illustration of this quantization step is sketched after this list).
• The ML model can be defined by many parameters (e.g., weights, biases, and architecture) - possibly several hundred thousand or more.
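To make the quantization concern concrete, the following minimal sketch (not part of the disclosed solution) uniformly quantizes float32 weights to int8, as might be done before over-the-air signaling, and measures the error the dequantized model would inherit; all values are synthetic:

```python
# Sketch of the quantization issue: float32 weights quantized to int8 for
# signaling, then dequantized at the receiver. The uniform affine scheme
# below is illustrative, not a defined procedure.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.05, size=10_000).astype(np.float32)

# Uniform affine quantization to int8.
scale = np.abs(weights).max() / 127.0
q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)

# Dequantize at the receiver and measure the error the model would inherit.
dq = q.astype(np.float32) * scale
print("max abs error:", np.abs(weights - dq).max())
print("bytes: float32 =", weights.nbytes, ", int8 =", q.nbytes)
```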
Certain aspects of the present disclosure and their embodiments may provide solutions to the aforementioned or other challenges. In one embodiment, a UE ML model repository that hosts trained and compiled ML models for different UE chipsets (e.g., AI-accelerated hardware) is provided.
More specifically, in one embodiment, a procedure in which a ML model repository is utilized includes the following actions:
• [Optional] A wireless communication device (e.g., a UE) signals an ML model capability to the network, i.e., to a Radio Access Network (RAN) node.
• The network (i.e., the RAN node) selects an ML model that the wireless communication device is capable of running, and the network (i.e., the RAN node) signals a unique ML model Identifier (ID), which is associated to the selected ML model, to the wireless communication device.
• The wireless communication device downloads an appropriate compiled machine code (binary) version of the selected ML model from the UE ML model repository, which may be hosted by a server node. This compiled machine code may also be referred to herein as a "binary" format to which the selected ML model is compiled. In one embodiment, the server node is part of the RAN (e.g., hosted by the RAN node or another RAN node) or part of another network node (e.g., hosted by another network node) such as, e.g., an Operations and Management (OAM) node. The server node may alternatively be, or be part of, a Service Management and Orchestration (SMO), a near Real Time RAN Intelligent Controller (RIC), a Non-Real Time RIC, as defined in Open RAN (ORAN). As another alternative, the server node may be implemented as an external node such as, for example, a node hosted by a UE vendor.
• The wireless communication device uses the downloaded ML model in one or more radio network operations (RNOs).
In one embodiment, the network (e.g., a RAN node) and a wireless communication device (e.g., a UE) support the use of an ML model repository. The ML model repository hosts compiled ML models for different wireless communication device hardware. The network configures the wireless communication device to download an appropriately compiled ML model from the repository. The wireless communication device downloads the appropriately compiled ML model from the repository.
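The overall flow can be illustrated with a small in-memory simulation. This is a hedged sketch: the repository keying, function names, and capability strings below are illustrative assumptions, not 3GPP or ORAN message definitions:

```python
# Minimal end-to-end sketch of the download procedure, simulated in memory.

# ML model repository hosted by the server node: keyed by
# (model ID, capability), holding precompiled machine code (binary) blobs.
REPOSITORY = {
    ("csi-compress-v1", "chipset-A"): b"\x7fELF...binary-for-chipset-A",
    ("csi-compress-v1", "chipset-B"): b"\x7fELF...binary-for-chipset-B",
}

def ran_select_model(wcd_capability: str) -> str:
    """RAN node selects an ML model the WCD can run; returns its unique ID."""
    return "csi-compress-v1"

def server_lookup(model_id: str, wcd_capability: str) -> bytes:
    """Server node returns the precompiled binary matching ID + capability."""
    return REPOSITORY[(model_id, wcd_capability)]

# 1. WCD signals its ML model capability; 2. RAN node selects a model and
#    signals its unique ID; 3. WCD downloads the matching precompiled binary;
# 4. the WCD would then load the binary and run inference in a radio network
#    operation (not shown).
capability = "chipset-A"
model_id = ran_select_model(capability)
binary = server_lookup(model_id, capability)
assert binary.endswith(b"chipset-A")
```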
Certain embodiments may provide one or more of the following technical advantage(s). Embodiments of the proposed solution may have one or more of the following advantages over existing solutions that quantize and signal ML model architectures and parameters over the air:
• Wireless communication devices (WCDs) do not need to download and compile ML models, thus saving time and energy resources at the WCD.
• The WCD does not need to compile the model; hence, it can use the model in the RNOs sooner than if it first had to compile the model. This can, for example, lead to improved CSI compression (the WCD can directly use an ML-based approach).
• The software tool chains needed for compiling ML models for different WCD architectures are managed by the network and WCD vendors in a centralized manner. Individual WCDs are not required to maintain these tool chains.
• The ML model repository can be implemented with security protocols to, for example, avoid malicious code being executed in the network and/or WCD.
• Compiler optimizations can both reduce the compiled ML model size (reduce over-the-air signaling) and offer improved runtime (inference) performance on, for example, AI-accelerated hardware.
• The compiled ML model is not visible to the entity downloading it, in contrast to explicitly downloading an ML model following a defined protocol. Hence, using compiled ML models provides a level of protection against misuse (e.g., downloading a competitor's model) of proprietary models.
• There is no limitation on model architecture, as opposed to a protocol that accommodates signaling of a model with quantized parameters.
• ML models in the repository can be easily improved and updated.
Figure 2 illustrates one example of a cellular communications system 200 in which embodiments of the present disclosure may be implemented. In the embodiments described herein, the cellular communications system 200 is a 5G system (5GS) including a Next Generation RAN (NG-RAN) and a 5G Core (5GC); however, the present disclosure is not limited thereto. Embodiments of the present disclosure may be utilized in any type of wireless network or cellular communications system (e.g., an Evolved Packet System (EPS) including an Evolved Universal Terrestrial RAN (E-UTRAN) and an Evolved Packet Core (EPC)) in which downloadable ML models are desired. In this example, the RAN includes base stations 202-1 and 202-2, which in the 5GS include NR base stations (gNBs) and optionally next generation eNBs (ng-eNBs) (e.g., LTE RAN nodes connected to the 5GC), controlling corresponding (macro) cells 204-1 and 204-2. The base stations 202-1 and 202-2 are generally referred to herein collectively as base stations 202 and individually as base station 202. Likewise, the (macro) cells 204-1 and 204-2 are generally referred to herein collectively as (macro) cells 204 and individually as (macro) cell 204. The RAN may also include a number of low power nodes 206-1 through 206-4 controlling corresponding small cells 208-1 through 208-4. The low power nodes 206-1 through 206-4 can be small base stations (such as pico or femto base stations) or RRHs, or the like. Notably, while not illustrated, one or more of the small cells 208-1 through 208-4 may alternatively be provided by the base stations 202. The low power nodes 206-1 through 206-4 are generally referred to herein collectively as low power nodes 206 and individually as low power node 206. Likewise, the small cells 208-1 through 208-4 are generally referred to herein collectively as small cells 208 and individually as small cell 208. The cellular communications system 200 also includes a core network 210, which in the 5G System (5GS) is referred to as the 5GC. The base stations 202 (and optionally the low power nodes 206) are connected to the core network 210.
The base stations 202 and the low power nodes 206 provide service to wireless communication devices (WCDs) 212-1 through 212-5 in the corresponding cells 204 and 208. The WCDs 212-1 through 212-5 are generally referred to herein collectively as WCDs 212 and individually as WCD 212. In the following description, the WCDs 212 are oftentimes UEs, but the present disclosure is not limited thereto.
In the following description, the "RAN node" and "server node" may be any network entities such as OAM, NG-RAN node (e.g., a gNB, gNB-CU, or a gNB-DU), or the like. It can also comprise for example the SMO, the near Real Time RIC, and the Non-Real Time RIC defined in ORAN. The server node can also comprise an external cloud node, for example, a node hosted by a WCD vendor.
In relation to the AI/ML framework in ORAN, the WCD 212 is the inference host in the embodiments described herein, and its model capabilities could be transmitted to the ML compiling host in the "ML Inference host info" ORAN-message. Moreover, the model ID could be transmitted to the Model management host via the ML compiling host in the same info message.
Figure 3 illustrates the operation of a WCD 212, a RAN node 300, and a server node 302 in accordance with some embodiments of the present disclosure. Optional steps are represented by dashed lines/boxes. In one embodiment, the RAN (e.g., the RAN node 300 or another RAN node) hosts the server node 302, but the present disclosure is not limited thereto. Note that embodiments about how to compile and store ML models in the server node 302 are described in subsequent sections.
However, for the embodiment of Figure 3, it is assumed that the ML models have already been compiled and stored at the server node 302 (e.g., in a ML model repository). As illustrated, the process of Figure 3 includes the following steps:
• Step 304 (Optional): The WCD 212 (e.g., a UE) signals a ML model capability to the network, i.e., to the RAN node 300.
• Steps 306 and 308: The RAN node 300 selects an ML model that the WCD 212 is capable of running, and the RAN node 300 signals a unique ML model ID, which is associated to the selected ML model, to the WCD 212.
• Steps 310, 312, and 314: The WCD 212 downloads an appropriately compiled machine code version of the selected ML model from a ML model repository, which in this example is hosted by the server node 302. More specifically, in this example, the WCD 212 sends the ML model ID that it obtained from the RAN node 300 and optionally WCD information to the server node 302 (step 310).
The WCD information is information about hardware and/or software capabilities of the WCD 212 related to ML model execution or inference. This may include information that directly or indirectly indicates which ML models and/or which precompiled machine code version(s) of a ML model(s) the WCD 212 can execute. The server node 302 selects the precompiled ML model corresponding to the ML model ID and optionally the WCD information, e.g., from the ML model repository (step 312) and sends the precompiled ML model to the WCD 212 (step 314).
• Step 316: The WCD 212 uses the downloaded precompiled machine code version of the ML model in one or more radio network operations.
In step 304 of the process of Figure 3, the WCD 212 optionally signals a model capability to the network (i.e., to the RAN node 300 in the example of Figure 3). The intention of the capability is, for example, to inform the network of the WCD's ability to run different ML models.
In one embodiment, the model capability signaling (e.g., in step 304) includes explicit information on the WCD's hardware and software capabilities. For example, the model capability signaling may include a summary of the WCD's AI/ML-enabled chipset(s) and any compiler optimizations that it supports.
In another embodiment, the model capability is a unique ID, referred to herein as a M-RNI (Model Radio Network Identifier). In one embodiment, the M-RNI is (e.g., defined in the specifications, e.g., 3GPP specifications, as) an identifier of a specific length (e.g., a 32-bit field). The M-RNI is defined in a registry, e.g., WCD vendor A is assigned X different possible M-RNIs, where "X" here is a positive integer that is greater than or equal to 1. It is then up to the interoperability testing between the network vendor and WCD or chipset vendors to assign the hardware and software capabilities associated to each M-RNI. Assuming that the registry is accessible to different vendors that place products on the market, such a setup would allow inter-vendor operability using the same M-RNI in different networks in which the WCD 212 is operating. An example of such inter-vendor operation using a common registry to collect M-RNIs for a M-RNI database is illustrated in Figure 4. As illustrated in Figure 4, a registry 400 stores a M-RNI database that associates hardware and software capabilities to different M-RNIs. Server nodes 402 and 404 of different network vendors can then obtain the M-RNI database from the registry 400. To concretize, a similar setup is seen for Medium Access Control (MAC) addresses, where for the 48-bit MAC address, a 24-bit number "Organizationally Unique Identifier" (OUI) is used to identify vendors/organizations.
The OUIs are purchased from the Institute of Electrical and Electronics Engineers (IEEE) and are used by vendors across all markets on which they place products. Hence, such a registry is unique and the same between vendors / operator networks / markets.
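For illustration, an M-RNI and its registry could be handled as sketched below. The 32-bit size follows the text above, but the split into a 24-bit vendor prefix and an 8-bit device code merely mirrors the MAC/OUI analogy and is an assumption, as are the capability entries:

```python
# Illustrative M-RNI handling; the field split and registry contents are
# assumptions, not a defined format.

def make_m_rni(vendor_prefix: int, device_code: int) -> int:
    """Pack a vendor-assigned prefix and a device code into one 32-bit ID."""
    assert 0 <= vendor_prefix < (1 << 24) and 0 <= device_code < (1 << 8)
    return (vendor_prefix << 8) | device_code

def split_m_rni(m_rni: int) -> tuple[int, int]:
    return m_rni >> 8, m_rni & 0xFF

# Registry entries associate each M-RNI with hardware/software capabilities
# agreed during interoperability testing (contents are made up here).
M_RNI_DATABASE = {
    make_m_rni(0x00AA01, 0x01): {"ai_accelerator": "npu-x", "int8": True},
    make_m_rni(0x00AA01, 0x02): {"ai_accelerator": "npu-y", "int8": False},
}

m_rni = make_m_rni(0x00AA01, 0x01)
print(split_m_rni(m_rni), M_RNI_DATABASE[m_rni])
```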
Example embodiments of the use of M-RNI and a registry in a single vendor illustration are shown in Figures 5, 6, and 7. More specifically, Figure 5 illustrates an embodiment in which the model-ID is sent from the RAN node 300 to the WCD 212 and the WCD 212 performs a model request to the server node 302 for acquiring the ML model. The steps of the process of Figure 5 are as follows:
• Step 502 (Optional): The server node 302 obtains the M-RNI database from a registry 500.
• Step 504: The WCD 212 sends its M-RNI to the RAN node 300.
• Steps 506 and 508: The RAN node 300 selects an ML model that the WCD 212 is capable of running based on the M-RNI, and the RAN node 300 signals a unique ML model ID, which is associated to the selected ML model, to the WCD 212.
• Steps 510, 512, and 514: The WCD 212 downloads an appropriate compiled machine code (binary) version of the selected ML model from a UE ML model repository, which in this example is hosted by the server node 302. More specifically, in this example, the WCD 212 sends the ML model ID and its M-RNI to the server node 302 (step 510). The server node 302 selects the precompiled ML model corresponding to the ML model ID and the M-RNI of the WCD 212, e.g., from the ML model repository (step 512) and sends the precompiled ML model to the WCD 212 (step 514).
• Step 516: The WCD 212 uses the downloaded ML model in one or more radio network operations (RNOs).
Figure 6 illustrates an embodiment in which the RAN node 300 performs a request to the server node 302 by a model-ID and M-RNI, pushing the model to the WCD 212 from the server node 302. More specifically, the steps of the process of Figure 6 are as follows:
• Step 600 (Optional): The server node 302 obtains the M-RNI database from the registry 500.
• Step 602: The WCD 212 sends its M-RNI to the RAN node 300.
• Steps 604 and 606: The RAN node 300 selects an ML model that the WCD 212 is capable of running based on the M-RNI, and the RAN node 300 signals a unique ML model ID, which is associated to the selected ML model, and the M-RNI of the WCD 212 to the server node 302.
• Steps 608 and 610: The server node 302 pushes the precompiled/optimized "binary" version of the selected ML model to the WCD 212. More specifically, in this example, the server node 302 selects the precompiled ML model corresponding to the ML model ID and the M-RNI of the WCD 212, e.g., from the ML model repository (step 608) and sends the precompiled ML model to the WCD 212 (step 610).
• Step 612: The WCD 212 uses the downloaded ML model in one or more radio network operations (RNOs).
Figure 7 illustrates an embodiment in which the RAN node 300 performs a request to the server node 302 by a model-ID and M-RNI, receives the precompiled ML model from the server node 302, and sends the precompiled ML model to the WCD 212. More specifically, the steps of the process of Figure 7 are as follows:
• Step 700 (Optional): The server node 302 obtains the M-RNI database from the registry 500.
• Step 702: The WCD 212 sends its M-RNI to the RAN node 300.
• Steps 704 and 706: The RAN node 300 selects an ML model that the WCD 212 is capable of running based on the M-RNI and sends a request to the server node 302 including a unique ML model ID, which is associated to the selected ML model, and the M-RNI of the WCD 212.
• Steps 708 and 710: The server node 302 selects a corresponding compiled machine code (binary) version of the selected ML model based on the received ML model ID and M-RNI and sends the compiled machine code (binary) version of the selected ML model to the RAN node 300.
• Step 712: The RAN node 300 sends the compiled machine code (binary) version of the selected ML model to the WCD 212.
• Step 714: The WCD 212 uses the downloaded ML model in one or more radio network operations (RNOs).
As discussed above in relation to steps 306, 506, 604, and 704, the RAN node 300 selects an appropriate ML model and either configures the WCD 212 to download the corresponding pre-compiled model (e.g., for its chipset) from the ML model repository of the server node 302 (e.g., as in steps 308-314), causes the server node 302 to push the corresponding pre-compiled model to the WCD 212, or obtains the corresponding pre-compiled model from the server node 302 and sends it to the WCD 212. Note that, in case the model is compiled by the WCD 212, the RAN node 300 is, in one embodiment, informed by the WCD 212 upon completion of compiling the ML model. The RAN node 300 then, in one embodiment, schedules the WCD 212 to upload its model to the ML model repository (where it can then be reused by other WCDs).
As discussed above, a particular precompiled version of a ML model is uniquely identified by the model-ID and the WCD information / M-RNI. In one embodiment, a specific model ID refers to a specific model for a pre-defined functionality in the specification; for example, a ML model for compressing channel state information (see Example 1 above). Hence, it should be noted that a given functionality (e.g., compressing channel state information) can be defined using a range of model IDs, where the WCD 212 should use one of the identified range of models for inference. In an alternative implementation, the ML model-ID is split into separate fields. For example, one could define two fields: the first field specifies the functional block (e.g., CSI compression), and the second field specifies a specific ML model for that function (e.g., a specific ML model the UE should use for CSI compression).
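A hedged sketch of this two-field alternative is given below; the 16/16-bit field widths and the functional-block codes are illustrative assumptions:

```python
# Sketch of a two-field model ID: one field identifies the functional block,
# the other a specific model for that function.

FUNCTIONAL_BLOCKS = {0x0001: "CSI compression", 0x0002: "beam prediction"}

def encode_model_id(function_field: int, model_field: int) -> int:
    assert 0 <= function_field < (1 << 16) and 0 <= model_field < (1 << 16)
    return (function_field << 16) | model_field

def decode_model_id(model_id: int) -> tuple[str, int]:
    function_field, model_field = model_id >> 16, model_id & 0xFFFF
    return FUNCTIONAL_BLOCKS[function_field], model_field

model_id = encode_model_id(0x0001, 7)   # CSI compression, model number 7
print(decode_model_id(model_id))        # ('CSI compression', 7)
```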
In one embodiment, the RAN node 300 maintains a table of the best-suited model(s) for each of multiple WCD categories. When the RAN node 300 knows the category of an attaching WCD (e.g., via the provided WCD information or M-RNI), the RAN node 300 signals, to the WCD, the ML model ID that suits it best.
Note that conditions may change, and there may be a need to change the ML model that the WCD 212 uses. The WCD 212 indicates a change in availability of resources, and the RAN node 300 sends the new ML model ID to the WCD 212.
In one embodiment, updates to the ML models in the ML model repository are performed periodically (or aperiodically) to address model decay and data drift. The ML models in the repository are updated, possibly with a new release / version number.
Embodiments are also disclosed herein that relate to storing and compiling new or updated ML model(s). In this regard, Figure 8 illustrates two options for storing and compiling (e.g., new or updated) ML model(s) in accordance with some embodiments of the present disclosure. In the first option ("Option 1"), the ML model(s) is (are) compiled in the server node 302. In the second option ("Option 2"), the ML model(s) is (are) compiled in the WCD(s) 212 and provided to the server node 302 for storage in the repository.
More specifically, as illustrated in Figure 8, the WCD 212 and RAN node 300 perform RNO data collection (step 800). The RAN node 300 creates or updates a ML model and associated model ID based on the collected data (step 802). For Option 1, the RAN node 300 signals the ML model (i.e., the parameters of the ML model) and the associated model ID to the server node 302 (step 804), and the server node 302 compiles the ML model for one or more WCDs (e.g., one or more types of WCDs or one or more chipsets that may be used by WCDs) to provide corresponding compiled machine code (binary) versions of the ML model for different WCDs or different types of WCDs, which are then stored in the ML model repository in association with the model ID and the respective WCD information or M-RNIs (step 806). For Option 2, the RAN node 300 signals the ML model (i.e., the parameters of the ML model) and the associated model ID to the WCD 212 (step 808), and the WCD 212 compiles the ML model to provide a precompiled version of the ML model for the WCD 212 or for the WCD type or chipset of the WCD 212 (step 810). The WCD 212 then sends the compiled machine code (binary) version of the ML model and the associated model ID to the server node 302 (step 812), and the server node 302 stores the compiled machine code (binary) version of the ML model in association with the associated model ID and WCD information or M-RNI of the WCD 212 (step 814). For Option 2, the illustrated steps may be repeated for multiple WCDs of different types or having different AI/ML enabled chipsets to obtain different precompiled versions of the ML model for different types of WCDs/different WCD capabilities/different M-RNIs.
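For illustration, the following sketch shows a repository keyed by (model ID, M-RNI) that accepts binaries compiled by the server node (Option 1) or uploaded by WCDs (Option 2), with a simple version check reflecting the model-update embodiment above; all names and the versioning scheme are assumptions introduced here.

```python
# Hypothetical ML model repository keyed by (model ID, M-RNI).
# Names and version handling are illustrative assumptions only.
from typing import Dict, Optional, Tuple

class ModelRepository:
    def __init__(self) -> None:
        # (model_id, m_rni) -> (version, compiled machine code)
        self._store: Dict[Tuple[int, str], Tuple[int, bytes]] = {}

    def put(self, model_id: int, m_rni: str, binary: bytes, version: int = 1) -> None:
        """Store a compiled binary (from server-side compilation or a WCD upload),
        replacing any older version (e.g., updates for model decay/data drift)."""
        current = self._store.get((model_id, m_rni))
        if current is None or version > current[0]:
            self._store[(model_id, m_rni)] = (version, binary)

    def get(self, model_id: int, m_rni: str) -> Optional[bytes]:
        """Fetch the precompiled binary matching a WCD's model ID and M-RNI."""
        entry = self._store.get((model_id, m_rni))
        return entry[1] if entry else None
```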
Optionally, the server node 302 may, in some embodiments, notify the WCD 212 of an outdated ML model(s) (step 816). The WCD 212 may then delete the outdated ML model(s) (step 818).
The server node 302 is the entity that hosts a copy of the ML model repository. The server node 302 can, for example, comprise a baseband unit of a base station (e.g., a gNB-DU). The server node 302 can update the model repository upon a model update/creation in a RAN node.
In Option 1 in Figure 8, the server node 302 can, for example, receive information from WCDs 212 or the RAN node 300 comprising appropriate software tool chains for different WCD AI/ML-enabled chipsets. These software tool chains are used to pre-compile ML models for different WCD chipset architectures, for example when new model parameters are received from the RAN node 300.
In one embodiment, instead of signaling the model parameters to the server node 302 hosting the ML model repository, the RAN node 300 provides compiled versions of an ML model to the ML model repository (note that different WCD chipsets will likely need different compiled versions of the model).
In Option 2 of Figure 8, the model repository is directly updated with the ML model by the WCD(s) 212. Option 2 can, for example, be used when there is no pre-compiled model in the server node model repository for a certain WCD 212. The WCD 212 can supply a compiled model or an explicit description of the model and its parameters (e.g., weights and biases). In the latter case, the ML model is compiled and put in the repository by the network. In one embodiment, the server node 302 hosting the ML model repository compresses the compiled ML models, which further reduces the overhead when the WCD 212 downloads a compressed compiled model. However, the WCD 212 then needs to decompress the compiled model before using it, which may require assistance information indicating how the model was compressed (e.g., what compression method was used). A sketch follows.
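A minimal sketch of the compression embodiment, assuming zlib as the example compression method and a small assistance-information dictionary (both illustrative choices, not mandated by the disclosure):

```python
# Hypothetical compression of a compiled model, with the assistance
# information (compression method) the WCD needs in order to decompress it.
import zlib
from typing import Dict, Tuple

def compress_model(binary: bytes) -> Tuple[bytes, Dict[str, str]]:
    """Compress a compiled model; the metadata tells the WCD how to undo it."""
    return zlib.compress(binary, 9), {"method": "zlib"}

def decompress_model(blob: bytes, assistance: Dict[str, str]) -> bytes:
    """WCD-side decompression, dispatched on the signaled compression method."""
    if assistance.get("method") == "zlib":
        return zlib.decompress(blob)
    raise ValueError(f"unknown compression method: {assistance.get('method')}")
```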
Examples of operations executed by the WCD 212 with a ML model may comprise, in addition to the examples listed in the Introduction section, one or more operations in the group of:
- Power control in UL transmission
- Timing advance in UL transmission
- Link adaptation in UL transmission, such as selection of modulation and coding scheme
- Estimation of channel quality or other performance metrics, such as:
  o radio channel estimation in uplink and downlink,
  o channel quality indicator (CQI) estimation/selection,
  o signal to noise estimation for uplink and downlink,
  o signal to noise and interference estimation,
  o reference signal received power (RSRP) estimation,
  o reference signal received quality (RSRQ) estimation, etc.
- Information compression for uplink transmission
- Coverage estimation for secondary carrier
- Estimation of signal quality/strength degradation:
  o Beam-level
  o Cell-level
- Mobility related operations, such as cell reselection and handover trigger
- Energy saving operations
- Positioning using ML methods, for example a model that translates radio measurements into a geographical location
- Compression of radio measurements, such as efficient channel state information reporting, which can be used to improve beamforming operations or positioning estimation.

Figure 9 is a schematic block diagram of a network node 900 according to some embodiments of the present disclosure. Optional features are represented by dashed boxes. The network node 900 may be, for example, the RAN node 300, a network node that implements or hosts the server node 302, a network node that implements or hosts the ML model repository, or the like. As illustrated, the network node 900 includes a control system 902 that includes one or more processors 904 (e.g., Central Processing Units (CPUs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), and/or the like), memory 906, and a network interface 908. The one or more processors 904 are also referred to herein as processing circuitry. In addition, if the network node 900 is a RAN node (e.g., a base station 202, gNB, or network node that implements at least some of the functionality of the base station 202 or gNB), the network node 900 may include one or more radio units 910 that each includes one or more transmitters 912 and one or more receivers 914 coupled to one or more antennas 916. The radio units 910 may be referred to or be part of radio interface circuitry. In some embodiments, the radio unit(s) 910 is external to the control system 902 and connected to the control system 902 via, e.g., a wired connection (e.g., an optical cable). However, in some other embodiments, the radio unit(s) 910 and potentially the antenna(s) 916 are integrated together with the control system 902. The one or more processors 904 operate to provide one or more functions of the network node 900 as described herein (e.g., one or more functions of the RAN node 300 or one or more functions of the server node 302, as described herein). In some embodiments, the function(s) are implemented in software that is stored, e.g., in the memory 906 and executed by the one or more processors 904.
Figure 10 is a schematic block diagram that illustrates a virtualized embodiment of the network node 900 according to some embodiments of the present disclosure. Again, optional features are represented by dashed boxes. As used herein, a "virtualized" network node is an implementation of the network node 900 in which at least a portion of the functionality of the network node 900 is implemented as a virtual component(s) (e.g., via a virtual machine(s) executing on a physical processing node(s) in a network(s)). As illustrated, in this example, if the network node 900 is a radio access node, the network node 900 may include the control system 902 and/or the one or more radio units 910, as described above. The control system 902 may be connected to the radio unit(s) 910 via, for example, an optical cable or the like. The network node 900 includes one or more processing nodes 1000 coupled to or included as part of a network(s) 1002. If present, the control system 902 or the radio unit(s) are connected to the processing node(s) 1000 via the network 1002. Each processing node 1000 includes one or more processors 1004 (e.g., CPUs, ASICs, FPGAs, and/or the like), memory 1006, and a network interface 1008.
In this example, functions 1010 of the network node 900 described herein (e.g., one or more functions of the RAN node 300 or one or more functions of the server node 302, as described herein) are implemented at the one or more processing nodes 1000 or distributed across the one or more processing nodes 1000 and the control system 902 and/or the radio unit(s) 910 in any desired manner. In some particular embodiments, some or all of the functions 1010 of the network node 900 described herein are implemented as virtual components executed by one or more virtual machines implemented in a virtual environment(s) hosted by the processing node(s) 1000. As will be appreciated by one of ordinary skill in the art, additional signaling or communication between the processing node(s) 1000 and the control system 902 is used in order to carry out at least some of the desired functions 1010. Notably, in some embodiments, the control system 902 may not be included, in which case the radio unit(s) 910 communicate directly with the processing node(s) 1000 via an appropriate network interface(s).
In some embodiments, a computer program including instructions which, when executed by at least one processor, causes the at least one processor to carry out the functionality of the network node 900 or a node (e.g., a processing node 1000) implementing one or more of the functions 1010 of the network node 900 in a virtual environment according to any of the embodiments described herein is provided. In some embodiments, a carrier comprising the aforementioned computer program product is provided. The carrier is one of an electronic signal, an optical signal, a radio signal, or a computer readable storage medium (e.g., a non-transitory computer readable medium such as memory).
Figure 11 is a schematic block diagram of the network node 900 according to some other embodiments of the present disclosure. The network node 900 includes one or more modules 1100, each of which is implemented in software. The module(s) 1100 provide the functionality of the network node 900 described herein. This discussion is equally applicable to the processing node 1000 of Figure 10 where the modules 1100 may be implemented at one of the processing nodes 1000 or distributed across multiple processing nodes 1000 and/or distributed across the processing node(s) 1000 and the control system 902.

Figure 12 is a schematic block diagram of the WCD 212 (e.g., a UE) according to some embodiments of the present disclosure. As illustrated, the WCD 212 includes one or more processors 1202 (e.g., CPUs, ASICs, FPGAs, and/or the like), memory 1204, and one or more transceivers 1206 each including one or more transmitters 1208 and one or more receivers 1210 coupled to one or more antennas 1212. The transceiver(s) 1206 includes radio-front end circuitry connected to the antenna(s) 1212 that is configured to condition signals communicated between the antenna(s) 1212 and the processor(s) 1202, as will be appreciated by one of ordinary skill in the art. The processors 1202 are also referred to herein as processing circuitry. The transceivers 1206 are also referred to herein as radio circuitry. In some embodiments, the functionality of the WCD 212 (or UE) described above may be fully or partially implemented in software that is, e.g., stored in the memory 1204 and executed by the processor(s) 1202. Note that the WCD 212 may include additional components not illustrated in Figure 12 such as, e.g., one or more user interface components (e.g., an input/output interface including a display, buttons, a touch screen, a microphone, a speaker(s), and/or the like and/or any other components for allowing input of information into the WCD 212 and/or allowing output of information from the WCD 212), a power supply (e.g., a battery and associated power circuitry), etc.
In some embodiments, a computer program including instructions which, when executed by at least one processor, causes the at least one processor to carry out the functionality of the WCD 212 according to any of the embodiments described herein is provided. In some embodiments, a carrier comprising the aforementioned computer program product is provided. The carrier is one of an electronic signal, an optical signal, a radio signal, or a computer readable storage medium (e.g., a non-transitory computer readable medium such as memory).
Figure 13 is a schematic block diagram of the WCD 212 according to some other embodiments of the present disclosure. The WCD 212 includes one or more modules 1300, each of which is implemented in software. The module(s) 1300 provide the functionality of the WCD 212 (or UE) described herein.
Any appropriate steps, methods, features, functions, or benefits disclosed herein may be performed through one or more functional units or modules of one or more virtual apparatuses. Each virtual apparatus may comprise a number of these functional units. These functional units may be implemented via processing circuitry, which may include one or more microprocessors or microcontrollers, as well as other digital hardware, which may include Digital Signal Processors (DSPs), special-purpose digital logic, and the like. The processing circuitry may be configured to execute program code stored in memory, which may include one or several types of memory such as Read Only Memory (ROM), Random Access Memory (RAM), cache memory, flash memory devices, optical storage devices, etc. Program code stored in memory includes program instructions for executing one or more telecommunications and/or data communications protocols as well as instructions for carrying out one or more of the techniques described herein. In some implementations, the processing circuitry may be used to cause the respective functional unit to perform corresponding functions according to one or more embodiments of the present disclosure.
While processes in the figures may show a particular order of operations performed by certain embodiments of the present disclosure, it should be understood that such order is exemplary (e.g., alternative embodiments may perform the operations in a different order, combine certain operations, overlap certain operations, etc.).
Some example embodiments of the present disclosure are as follows:
Embodiment 1: A method performed by a wireless communication device, WCD, (212) comprising: receiving (314; 514; 610; 712), from a server node (302), a precompiled binary (i.e., compiled machine code (binary)) version of a ML model; and performing (316; 516; 612; 714) one or more radio network operations using the precompiled binary version of the ML model.
Embodiment 2: The method of embodiment 1 further comprising: receiving (308; 508) a unique machine learning, ML, model identity, ID, from a radio access network, RAN, node (300); wherein receiving (314; 514) the precompiled binary version of the ML model comprises downloading (314; 514) the precompiled binary version of the ML model associated to the unique ML model ID from the server node (302).
Embodiment 3: The method of embodiment 2 wherein downloading (314; 514) the precompiled binary version of the ML model associated to the unique ML model ID from the server node (302) comprises: sending (310; 510) the unique ML model ID and additional information to the server node (302); and receiving (314; 514) the precompiled binary version of the ML model associated to the unique ML model ID responsive to sending (310; 510) the unique ML model ID and additional information to the server node (302).
Embodiment 4: The method of embodiment 3 wherein the additional information comprises information about one or more capabilities of one or more chipsets of the WCD (212) that enable execution of a ML model.
Embodiment 5: The method of embodiment 3 wherein the additional information comprises an M-RNI that is associated to one or more hardware and/or software capabilities of the WCD (212).
Embodiment 6: The method of embodiment 1 further comprising: sending (602), to a radio access network, RAN, node (300), information about one or more hardware and/or software capabilities of the WCD (212) related to execution of a ML model; wherein receiving (610) the precompiled binary version of the ML model comprises receiving (610) the precompiled binary version of the ML model from a server node (302) responsive to sending (602) the information to the RAN node (300).
Embodiment 7: The method of embodiment 6 wherein the information comprises an M-RNI that is associated to one or more hardware and/or software capabilities of the WCD (212).
Embodiment 8: The method of embodiment 1 further comprising: sending (702), to a radio access network, RAN, node (300), information about one or more hardware and/or software capabilities of the WCD (212) related to execution of a ML model; wherein receiving (712) the precompiled binary version of the ML model comprises receiving (712) the precompiled binary version of the ML model from the RAN node (300) responsive to sending (702) the information to the RAN node (300).
Embodiment 9: The method of embodiment 8 wherein the information comprises an M-RNI that is associated to one or more hardware and/or software capabilities of the WCD (212).
Embodiment 10: A wireless communication device, WCD, (212) adapted to perform the method of any of embodiments 1 to 9.
Embodiment 11: A method performed by a radio access network, RAN, node (300) comprising: selecting (306; 506; 604; 704) a machine learning, ML, model for a particular wireless communication device, WCD, (212); and causing delivery of a precompiled binary (i.e., a compiled machine code (binary)) version of the selected ML model from a ML model repository to the WCD (212).

Embodiment 12: The method of embodiment 11 wherein selecting (306; 506; 604; 704) the ML model for the WCD (212) comprises selecting (306; 506; 604; 704) the ML model based on one or more hardware and/or software capabilities of the WCD (212) related to execution of a ML model.
Embodiment 13: The method of embodiment 11 or 12 wherein causing delivery of the precompiled binary version of the selected ML model from the ML model repository to the WCD (212) comprises sending (308; 508) a unique ML model identity, ID, associated to the selected ML model to the WCD (212).
Embodiment 14: The method of embodiment 11 or 12 wherein causing delivery of the precompiled binary version of the selected ML model from the ML model repository to the WCD (212) comprises sending (606) a unique ML model identity, ID, associated to the selected ML model to a server node (302) as part of a request for the server node (302) to push the precompiled binary version of the selected ML model from the ML model repository to the WCD (212).
Embodiment 15: The method of embodiment 11 or 12 wherein causing delivery of the precompiled binary version of the selected ML model from the ML model repository to the WCD (212) comprises: sending (706) a unique ML model identity, ID, associated to the selected ML model to a server node (302) as part of a request for the precompiled binary version of the selected ML model from the ML model repository; receiving (710) the precompiled binary version of the selected ML model from the server node (302); and sending (712) the precompiled binary version of the selected ML model to the WCD (212).
Embodiment 16: A radio access network, RAN, node (300) adapted to perform the method of any of embodiments 11 to 15.
Embodiment 17: A method performed by a server node (302) comprising: receiving (310; 510; 606; 706), either from a particular wireless communication device, WCD, (212) or a radio access network, RAN, node (300), a request for a precompiled binary (i.e., a compiled machine code (binary)) version of a machine learning, ML, model, the request comprising a unique ML model identity, ID, and additional information about one or more hardware and/or software capabilities of the WCD (212); selecting (312; 512; 608; 706) a precompiled binary version of the ML model based on the unique ML model ID and the additional information; and sending (314; 514; 610; 706) the selected precompiled binary version of the ML model either to the WCD (212) or to the RAN node (300).

Embodiment 18: The method of embodiment 17 wherein the additional information comprises information about one or more capabilities of one or more chipsets of the WCD (212) that enable execution of a ML model.
Embodiment 19: The method of embodiment 17 wherein the additional information comprises an M-RNI that is associated to one or more hardware and/or software capabilities of the WCD (212).
Embodiment 20: A server node (302) adapted to perform the method of any of embodiments 17 to 19.
Embodiment 21: A method performed by a server node (302) comprising: obtaining (806; 812) a precompiled binary (i.e., a compiled machine code (binary)) version of a ML model associated to a unique ML model identity, ID, and one or more hardware and/or software wireless communication device, WCD, capabilities related to ML model execution; and storing (806; 814) the precompiled binary version of the ML model in association with the unique ML model ID and the one or more hardware and/or software WCD capabilities.
Embodiment 22: The method of embodiment 21 further comprising: receiving (804) one or more parameters that define the ML model associated to the unique ML model ID; wherein obtaining (806) the precompiled binary version of the ML model comprises compiling (806) the ML model as defined by the one or more parameters based on the one or more hardware and/or software capabilities.
Embodiment 23: The method of embodiment 21 wherein obtaining (812) the precompiled binary version of the ML model comprises receiving (812) the precompiled binary version of the ML model from a WCD (212).
Embodiment 24: A server node (302) adapted to perform the method of any of embodiments 21 to 23.
Embodiment 25: A method performed by a wireless communication device, WCD, (212), the method comprising: receiving (808) one or more parameters that define a machine learning, ML, model associated to a unique ML model identity, ID; compiling (810) the ML model as defined by the one or more parameters based on one or more hardware and/or software capabilities of the WCD (212) to thereby provide a precompiled binary (i.e., a compiled machine code (binary)) version of the ML model; and sending (812), to a server node (302), the precompiled binary version of the ML model together with the unique ML model ID and information about or that indicates the one or more hardware and/or software capabilities of the WCD (212).
Embodiment 26: A wireless communication device, WCD, (212) adapted to perform the method of embodiment 25.

Those skilled in the art will recognize improvements and modifications to the embodiments of the present disclosure. All such improvements and modifications are considered within the scope of the concepts disclosed herein.

Claims

1. A method performed by a wireless communication device, WCD, (212) comprising: receiving (314; 514; 610; 712), from a server node (302), a compiled machine code version of a Machine Learning, ML, model, the compiled machine code version of the ML model being one of a set of compiled machine code versions of ML models, wherein the compiled machine code version of the ML model received by the WCD (212) is precompiled for a hardware of the WCD (212); and performing (316; 516; 612; 714) one or more radio network operations using the compiled machine code version of the ML model.
2. The method of claim 1 further comprising: receiving (308; 508) a unique ML model identity, ID, from a Radio Access Network, RAN, node (300); and sending (310; 510) the unique ML model ID to the server node (302); wherein receiving (314; 514) the compiled machine code version of the ML model comprises receiving (314; 514) one of the set of compiled machine code versions of the ML models associated to the unique ML model ID from the server node (302).
3. The method of claim 2 further comprising: sending (310; 510) additional information to the server node (302) in addition to the unique ML model ID; wherein receiving (314; 514) the compiled machine code version of the ML model comprises receiving (314; 514) one of the set of compiled machine code versions of the ML models associated to the unique ML model ID and the additional information.
4. The method of claim 3 wherein the additional information comprises information about one or more capabilities of one or more chipsets of the WCD (212) that enable execution of a ML model.
5. The method of claim 3 wherein the additional information comprises a unique model radio network identifier that is associated to one or more hardware capabilities of the WCD (212), one or more software capabilities of the WCD (212), or both one or more hardware capabilities of the WCD (212) and one or more software capabilities of the WCD (212).
6. The method of claim 5 wherein the model radio network identifier is defined, by a registry, to be associated to the one or more hardware and/or software capabilities of the WCD (212), the registry associating different hardware and/or software capabilities to different model radio network identifiers.
7. The method of claim 1 further comprising: sending (602), to a radio access network, RAN, node (300), information about one or more hardware capabilities of the WCD (212) related to execution of a ML model, one or more software capabilities of the WCD (212) related to execution of a ML model, or both one or more hardware capabilities of the WCD (212) related to execution of a ML model and one or more software capabilities of the WCD (212) related to execution of a ML model; wherein receiving (610) the compiled machine code version of the ML model comprises receiving (610) one of the set of compiled machine code versions of the ML models from the server node (302) responsive to sending (602) the information to the RAN node (300).
8. The method of claim 7 wherein the information sent to the RAN node (300) comprises a unique model radio network identifier that is associated to one or more hardware capabilities of the WCD (212), one or more software capabilities of the WCD (212), or both one or more hardware capabilities of the WCD (212) and one or more software capabilities of the WCD (212).
9. The method of claim 1 further comprising: sending (702), to a radio access network, RAN, node (300), information about one or more hardware capabilities of the WCD (212) related to execution of a ML model, one or more software capabilities of the WCD (212) related to execution of a ML model, or both one or more hardware capabilities of the WCD (212) related to execution of a ML model and one or more software capabilities of the WCD (212) related to execution of a ML model; wherein receiving (712) the compiled machine code version of the ML model comprises receiving (712) one of the set of compiled machine code versions of the ML models from the RAN node (300) responsive to sending (702) the information to the RAN node (300).
10. The method of claim 9 wherein the information sent to the RAN node (300) comprises a unique model radio network identifier that is associated to one or more hardware capabilities of the WCD (212), one or more software capabilities of the WCD (212), or both one or more hardware capabilities of the WCD (212) and one or more software capabilities of the WCD (212).
11. A wireless communication device, WCD, (212) comprising: one or more transmitters (1208); one or more receivers (1210); and processing circuitry (1202) associated with the one or more transmitters (1208) and the one or more receivers (1210), the processing circuitry (1202) configured to cause the WCD (212) to: receive (314; 514; 610; 712), from a server node (302), a compiled machine code version of a Machine Learning, ML, model, the compiled machine code version of the ML model being one of a set of compiled machine code versions of ML models, wherein the compiled machine code version of the ML model received by the WCD (212) is precompiled for a hardware of the WCD (212); and perform (316; 516; 612; 714) one or more radio network operations using the compiled machine code version of the ML model.
12. The WCD (212) of claim 11 wherein the processing circuitry (1202) is further configured to cause the WCD (212) to perform the method of any of claims 2 to 10.
13. A method performed by a radio access network, RAN, node (300) comprising: selecting (306; 506; 604; 704) a machine learning, ML, model for a particular wireless communication device, WCD, (212); and causing delivery of one of a set of compiled machine code versions of the selected ML model from a ML model repository to the WCD (212).
14. The method of claim 13 wherein selecting (306; 506; 604; 704) the ML model for the WCD (212) comprises selecting (306; 506; 604; 704) the ML model for the WCD (212) based on one or more hardware capabilities of the WCD (212) related to execution of a ML model, one or more software capabilities of the WCD (212) related to execution of a ML model, or both one or more hardware capabilities of the WCD (212) related to execution of a ML model and one or more software capabilities of the WCD (212) related to execution of a ML model.
15. The method of claim 13 or 14 wherein causing delivery of the one of the set of compiled machine code versions of the selected ML model from the ML model repository to the WCD (212) comprises sending (308; 508) a unique ML model identity, ID, associated to the selected ML model to the WCD (212).
16. The method of claim 13 or 14 wherein causing delivery of the compiled machine code version of the selected ML model from the ML model repository to the WCD (212) comprises sending (606) a unique ML model identity, ID, associated to the selected ML model to a server node (302) as part of a request for the server node (302) to push the one of the set of compiled machine code versions of the selected ML model from the ML model repository to the WCD (212).
17. The method of claim 13 or 14 wherein causing delivery of the compiled machine code version of the selected ML model from the ML model repository to the WCD (212) comprises: sending (706) a unique ML model identity, ID, associated to the selected ML model to a server node (302) as part of a request for the one of the set of compiled machine code versions of the selected ML model from the ML model repository; receiving (710) the one of the set of compiled machine code versions of the selected ML model from the server node (302); and sending (712) the one of the set of compiled machine code versions of the selected ML model to the WCD (212).
18. A radio access network, RAN, node (300) comprising processing circuitry (904; 1004) configured to cause the RAN node (300) to: select (306; 506; 604; 704) a machine learning, ML, model for a particular wireless communication device, WCD, (212); and cause delivery of one of a set of compiled machine code versions of the selected ML model from a ML model repository to the WCD (212).
19. The RAN node (300) of claim 18 wherein the processing circuitry (904; 1004) is further configured to cause the RAN node (300) to perform the method of any of claims 14 to 17.
20. A method performed by a server node (302) comprising: receiving (310; 510; 606; 706), either from a particular wireless communication device, WCD, (212) or a radio access network, RAN, node (300), a request for a compiled machine code version of a machine learning, ML, model, the request comprising a unique ML model identity, ID, and additional information about one or more hardware capabilities of the WCD (212), one or more software capabilities of the WCD (212), or both one or more hardware capabilities of the WCD (212) and one or more software capabilities of the WCD (212); selecting (312; 512; 608; 706) a compiled machine code version of the ML model from a set of compiled machine code versions of the ML model based on the unique ML model ID and the additional information; and sending (314; 514; 610; 706) the selected compiled machine code version of the ML model either to the WCD (212) or to the RAN node (300).
21. The method of claim 20 wherein the additional information comprises information about one or more capabilities of one or more chipsets of the WCD (212) that enable execution of a ML model.
22. The method of claim 20 wherein the additional information comprises a unique model radio network identifier that is associated to one or more hardware capabilities of the WCD (212), one or more software capabilities of the WCD (212), or both one or more hardware capabilities of the WCD (212) and one or more software capabilities of the WCD (212).
23. A server node (302) comprising processing circuitry configured to cause the server node (302) to: receive (310; 510; 606; 706), either from a particular wireless communication device, WCD, (212) or a radio access network, RAN, node (300), a request for a compiled machine code version of a machine learning, ML, model, the request comprising a unique ML model identity, ID, and additional information about one or more hardware capabilities of the WCD (212), one or more software capabilities of the WCD (212), or both one or more hardware capabilities of the WCD (212) and one or more software capabilities of the WCD (212); select (312; 512; 608; 706) a compiled machine code version of the ML model from a set of compiled machine code versions of the ML model based on the unique ML model ID and the additional information; and send (314; 514; 610; 706) the compiled machine code version of the ML model either to the WCD (212) or to the RAN node (300).
24. The server node (302) of claim 23 wherein the processing circuitry is further configured to cause the server node (302) to perform the method of any of claims 21 to 22.
25. A method performed by a server node (302) comprising: obtaining (806; 812) a compiled machine code version of a ML model associated to a unique ML model identity, ID, and one or more hardware and/or software wireless communication device, WCD, capabilities related to ML model execution; and storing (806; 814) the compiled machine code version of the ML model in association with the unique ML model ID and the one or more hardware and/or software WCD capabilities.
26. The method of claim 25 further comprising: receiving (804) one or more parameters that define the ML model associated to the unique ML model ID; wherein obtaining (806) the compiled machine code version of the ML model comprises compiling (806) the ML model as defined by the one or more parameters based on the one or more hardware and/or software capabilities.
27. The method of claim 25 wherein obtaining (812) the compiled machine code version of the ML model comprises receiving (812) the compiled machine code version of the ML model from a WCD (212).
28. A server node (302) comprising processing circuitry configured to cause the server node (302) to: obtain (806; 812) a compiled machine code version of a ML model associated to a unique ML model identity, ID, and one or more hardware and/or software wireless communication device, WCD, capabilities related to ML model execution; and store (806; 814) the compiled machine code version of the ML model in association with the unique ML model ID and the one or more hardware and/or software WCD capabilities.
29. The server node (302) of claim 28 wherein the processing circuitry is further configured to cause the server node (302) to perform the method of any of claims 26 to 27.
PCT/SE2022/050659 2021-07-01 2022-06-30 Enabling downloadable ai WO2023277780A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202163217642P 2021-07-01 2021-07-01
US63/217,642 2021-07-01

Publications (1)

Publication Number Publication Date
WO2023277780A1 (en) 2023-01-05



