CN117768075A - Method, terminal and network equipment for determining uplink channel - Google Patents

Publication number
CN117768075A
Authority
CN
China
Prior art keywords
channel
uplink channel
sta
determining
reinforcement learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211128492.3A
Other languages
Chinese (zh)
Inventor
王和俊
王滨后
徐芳
孙可欣
谢刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Haier Smart Technology R&D Co Ltd
Haier Smart Home Co Ltd
Original Assignee
Qingdao Haier Smart Technology R&D Co Ltd
Haier Smart Home Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Haier Smart Technology R&D Co Ltd, Haier Smart Home Co Ltd
Priority to CN202211128492.3A (CN117768075A)
Priority to PCT/CN2023/107293 (WO2024055739A1)
Publication of CN117768075A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B17/00 Monitoring; Testing
    • H04B17/30 Monitoring; Testing of propagation channels
    • H04B17/309 Measuring or estimating channel quality parameters
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B17/00 Monitoring; Testing
    • H04B17/30 Monitoring; Testing of propagation channels
    • H04B17/382 Monitoring; Testing of propagation channels for resource allocation, admission control or handover
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B17/00 Monitoring; Testing
    • H04B17/30 Monitoring; Testing of propagation channels
    • H04B17/391 Modelling the propagation channel
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L5/00 Arrangements affording multiple use of the transmission path
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Physics & Mathematics (AREA)
  • Electromagnetism (AREA)
  • Quality & Reliability (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The application relates to the technical field of wireless communication and discloses a method for determining an uplink channel. The method is applied to a terminal device (STA) and comprises: receiving a trigger instruction, the trigger instruction being used to trigger the STA to sense the channel state; if an idle channel exists, determining an uplink channel according to the performance information of the idle channel and accessing the uplink channel for data transmission; and if no idle channel exists, sensing the channel state again after a set back-off duration. When the CSI is unknown (for example, under communication mismatch) or incomplete, the STA itself completes the selection of the uplink channel, which can substantially improve throughput, maximize channel utilization, reduce the probability of collisions between channels, and improve the spectral efficiency of the system. The application also discloses a terminal and a network device.

Description

Method, terminal and network equipment for determining uplink channel
Technical Field
The present invention relates to the field of wireless communications technologies, and for example, to a method, a terminal, and a network device for determining an uplink channel.
Background
Currently, cooperative scheduling among multiple APs is mainly performed by C-OFDMA (Coordinated Orthogonal Frequency-Division Multiple Access) and CBF (Coordinated Beamforming).
In C-OFDMA, the APs (Wireless Access Points) coordinate to share OFDMA (Orthogonal Frequency-Division Multiple Access) resources among all STAs (stations), and make different STAs use orthogonal time and frequency resources to avoid RU (Resource Unit) collisions. On the one hand, allocating an appropriate RU to an STA requires adequate CSI (Channel State Information) or adequate channel estimation. On the other hand, C-OFDMA relies on orthogonal channels to suppress interference and transmit correctly, but in the case of mismatch the AP cannot guarantee that the channels are orthogonal and therefore cannot suppress interference in this way. In addition, ensuring the orthogonality of the channels reduces channel utilization. Under CBF, the channel information fed back to the AP by the STA is incomplete, so it is difficult for the AP to estimate the channel effectively.
In addition, channel estimation in WiFi 6 is done in the HE-SIG-B field, which is limited in length, so complete channel estimation cannot be achieved. Multiple transmissions are required to obtain the full CSI, which greatly reduces the efficiency of the channel.
Therefore, both existing schemes, CBF and C-OFDMA, require complete CSI to achieve channel sharing. In the case of communication mismatch, however, it is difficult for an AP to accurately know the channel conditions used by other APs and hence to provide a channel estimate; that is, neither scheme is feasible under communication mismatch.
Disclosure of Invention
The following presents a simplified summary in order to provide a basic understanding of some aspects of the disclosed embodiments. This summary is not an extensive overview, and is intended to neither identify key/critical elements nor delineate the scope of such embodiments, but is intended as a prelude to the more detailed description that follows.
The embodiments of the disclosure provide a method, a terminal and a network device for determining an uplink channel, so that channel allocation can be completed when the CSI is unknown or incomplete, thereby improving system throughput and channel utilization.
In some embodiments, the method for determining an uplink channel is applied to a terminal device STA, and includes:
receiving a trigger instruction; the trigger instruction is used for triggering the STA to sense the channel state;
if an idle channel exists, determining an uplink channel according to the performance information of the idle channel, and accessing the uplink channel for data transmission;
if no idle channel exists, sensing the channel state again after a set back-off duration.
Optionally, the accessing the uplink channel for data transmission includes:
after the data transmission succeeds, updating the performance information of the corresponding channel according to the transmission result, and receiving a new data packet arrival indication;
after the data transmission fails, updating the performance information of the corresponding channel according to the transmission result, and re-executing the data transmission instruction.
Optionally, the determining the uplink channel according to the performance information of the idle channel includes:
acquiring performance information of an idle channel;
and determining an idle channel with optimal capability of successfully transmitting the data packet as an uplink channel.
Optionally, the determining of the uplink channel includes:
constructing an uplink channel selection model based on reinforcement learning;
inputting channel state information and performance information into an uplink channel selection model based on reinforcement learning for training, and obtaining network average throughput;
and under the condition that the average throughput of the network reaches the maximum value, determining an uplink channel according to the output of the uplink channel selection model based on reinforcement learning.
Optionally, training of the reinforcement learning-based uplink channel selection model includes:
taking the channel state information and the performance information as the state set S for reinforcement learning, where S consists of two components:
the set of channel perceived weights of the STAs on the channels, whose element is the perceived weight of the kth STA on the mth channel at time t; and
the set of data packet transmission weights of the STAs on the channels, whose element is the data packet transmission weight of the kth STA on the mth channel at time t;
inputting the state set S into the uplink channel selection model based on reinforcement learning for training to obtain an action set A = {f_1, f_2, …, f_M}, which represents the set of actions by which an STA selects an uplink channel among the M idle channels;
determining, from the state set, a reward parameter that indicates the instant reward for transmission by the kth STA on the mth channel;
training the reinforcement learning-based uplink channel selection model according to the reward parameter R_t, and taking the channel corresponding to the system action that maximizes R_t as the uplink channel.
Optionally, the establishing of the uplink channel selection model based on reinforcement learning includes:
expressing the average throughput of the network at time t, C_t, in terms of the total number of STAs, N, and the signal-to-interference-plus-noise ratio of the kth STA at time t.
Optionally, training the reinforcement learning-based uplink channel selection model according to the reward parameter R_t includes:
applying an update rule for reinforcement learning in which Q_t represents the Q value of the current state and Q_{t+1} represents the Q value of the next state; α represents the learning rate of reinforcement learning, with a value in (0, 1); β represents the importance of historical rewards, with a value in (0, 1); the reward term represents the instant reward; and max Q_t(S', A') represents the maximum Q value over all possible action strategies at the next time.
In some embodiments, the method for determining an uplink channel is applied to an access point AP, and includes:
sending a trigger instruction; the triggering instruction is used for triggering the STA to sense the channel state;
receiving data transmitted by the STA through an uplink channel; the uplink channel is determined by the STA according to performance information of the idle channel.
In some embodiments, a terminal device is provided, including a processor and a memory, where the memory is configured to store a computer program, and the processor is configured to invoke and run the program stored in the memory, to perform a method for determining an uplink channel as described above.
In some embodiments, a network device is provided that includes a processor and a communication interface for communicating with other network devices; the processor is configured to run a set of programs to cause the network device to implement the method for determining an uplink channel as described above.
The method, the terminal and the network device for determining the uplink channel provided by the embodiment of the disclosure can realize the following technical effects:
The channel state is sensed by the terminal STA, an uplink channel is determined among the idle channels according to channel performance, and data transmission is carried out. Therefore, when the CSI is unknown (for example, under communication mismatch) or incomplete, the STA itself completes the selection of the uplink channel, which can substantially improve throughput, maximize channel utilization, reduce the probability of collisions between channels, and improve the spectral efficiency of the system.
The foregoing general description and the following description are exemplary and explanatory only and are not restrictive of the application.
Drawings
One or more embodiments are illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like reference numerals refer to similar elements, and in which:
Fig. 1 is a schematic illustration of an environmental system of an embodiment of the present disclosure;
Fig. 2 is a flow chart of a method for determining an uplink channel according to an embodiment of the present disclosure;
Fig. 3 is a flow chart of another method for determining an uplink channel provided by an embodiment of the present disclosure;
Fig. 4 is a schematic diagram of a training process of an uplink channel selection model based on reinforcement learning in an embodiment of the present disclosure;
Fig. 5 is a flow chart of another method for determining an uplink channel provided by an embodiment of the present disclosure;
Fig. 6 is a flow chart of another method for determining an uplink channel provided by an embodiment of the present disclosure;
Fig. 7 is a schematic illustration of an application of an embodiment of the present disclosure;
Fig. 8 is a schematic diagram of a terminal device provided in an embodiment of the present disclosure;
Fig. 9 is a schematic diagram of a network device provided in an embodiment of the present disclosure.
Detailed Description
So that the features and technical content of the embodiments of the present disclosure can be understood in more detail, the embodiments are described below with reference to the accompanying drawings, which are not intended to limit the embodiments of the disclosure. In the following technical description, for purposes of explanation, numerous details are set forth in order to provide a thorough understanding of the disclosed embodiments. However, one or more embodiments may still be practiced without these details. In other instances, well-known structures and devices may be shown in simplified form in order to simplify the drawings.
The terms first, second and the like in the description and in the claims of the embodiments of the disclosure and in the above-described figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate in order to describe embodiments of the present disclosure. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion.
The term "plurality" means two or more, unless otherwise indicated.
In the embodiments of the present disclosure, the character "/" indicates that the objects before and after it are in an "or" relationship. For example, A/B represents: A or B.
The term "and/or" describes an association between objects and indicates that three relationships may exist. For example, A and/or B represents: A, or B, or A and B.
The term "corresponding" may refer to an association or binding relationship; that A corresponds to B means that there is an association or binding relationship between A and B.
In the embodiment of the disclosure, the AP represents a wireless access point, and may be a router, a gateway or a combined router gateway.
The STA denotes a user terminal and may be a mobile terminal or station that connects to an AP via a communication connection function to gain access to AP system resources (e.g., a network). It may be a cellular telephone, a cordless telephone, a Session Initiation Protocol (SIP) phone, a Wireless Local Loop (WLL) station, a Personal Digital Assistant (PDA) device, a handheld device with wireless communication capabilities, a computing device or other processing device connected to a wireless modem, an in-vehicle device, a wearable device, a terminal device in a next generation communication system such as an NR network, or a terminal device in a future evolved Public Land Mobile Network (PLMN), etc. It may also be a mobile phone, a tablet (Pad), a computer with a wireless transceiving function, a Virtual Reality (VR) terminal, an Augmented Reality (AR) terminal, a wireless terminal in industrial control, a wireless terminal in self-driving, a wireless terminal in remote medical care, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, or a wireless terminal in a smart home, etc.
Fig. 1 shows a schematic diagram of an environmental system provided by an embodiment of the present disclosure.
As shown in connection with fig. 1, the environment system includes a plurality of APs and a plurality of STAs.
In the above systems, each AP may access one or more STAs; each STA may also access one or more APs.
For example, one AP may be connected to one STA and establish at least two channels, as with STA1 and AP1 in Fig. 1; an AP may also be connected to two STAs and establish at least two channels, as with STA2 and AP3 in Fig. 1.
Each STA may acquire channel sensing information and data transmission information for one or more channels between itself and the accessed AP. The STAs can also sense and acquire the interference power that other APs in the group impose on them.
In the prior art, although one STA may connect to a plurality of APs and establish a plurality of duplex channels with them, each channel needs to carry both uplink and downlink data. Before data is sent on a channel, the STA and the AP perform carrier sense multiple access (CSMA)/enhanced distributed channel access (EDCA) back-off. After the data is sent, an air interface collision may still occur; if it does, the transmission fails and retransmission is needed. When data is to be transmitted in both the uplink and the downlink of a channel, if uplink data is being transmitted, the downlink must wait for the uplink transmission to complete before transmitting; if downlink data is being transmitted, the uplink must wait for the downlink transmission to complete. Therefore, uplink and downlink data may collide and the latency of data transmission may increase, which affects the channel utilization and data throughput of the system.
The uplink channel in this embodiment is responsible for transmitting information data from the STA to the AP.
The idle channel refers to an unoccupied channel among the accessed multi-channels. In general, the idle channel may be determined among the accessed channels by searching for a channel that transmits an idle signal, or searching for a no-carrier channel.
By building the above network architecture, and through information interaction and data processing between the AP and the STA, channel allocation is completed based on the method for determining the uplink channel even when the channel state is difficult to estimate, thereby improving system throughput and channel utilization.
In some cases, the above-mentioned environmental system may further include other network entities such as a network controller, a mobility management entity, and the embodiment of the present application is not limited thereto.
Based on the above environmental system, the embodiments of the present disclosure provide a method for determining an uplink channel, so that an STA can determine the uplink channel in the allocated resources and access the uplink data.
As shown in fig. 2, the method is applied to a terminal device STA, and includes:
Step S201, the STA receives a trigger instruction; the trigger instruction is used for triggering the STA to sense the channel state.
Here, the trigger instruction informs the STA that a data packet has arrived, so that the STA can sense the channel state in order to transmit the data packet.
Step S202, if an idle channel exists, the STA determines an uplink channel according to the performance information of the idle channel, and accesses the uplink channel to perform data transmission.
Step S203, if no idle channel exists, the STA senses the channel state again after a set back-off duration.
Thus, when a data packet arrives, the STA starts to sense the channel state. If no idle channel is sensed, the STA backs off and continues sensing; if idle channels are sensed, it determines an uplink channel among them according to their performance information and performs data transmission. When the CSI is unknown (for example, under communication mismatch) or incomplete, the STA itself completes the selection of the uplink channel, which can substantially improve throughput, maximize channel utilization, reduce the probability of collisions between channels, and improve the spectral efficiency of the system.
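The sense, select and back-off flow of steps S201 to S203 can be sketched as follows. This is only an illustrative sketch: the function name, the `performance` score dictionary, the `max_attempts` cap and the injectable `sleep` callable are all assumptions introduced for the example and are not part of the described method, which does not state a retry limit.

```python
import time
from typing import Callable, Dict, List, Optional


def select_uplink_channel(
    sense_idle_channels: Callable[[], List[int]],
    performance: Dict[int, float],
    backoff_s: float = 0.01,
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Return the idle channel with the best performance score, or None.

    All names here are hypothetical; `performance` maps a channel id to a
    score where higher means better.
    """
    for _ in range(max_attempts):
        idle = sense_idle_channels()      # S201: sense the channel state
        if idle:                          # S202: at least one idle channel
            return max(idle, key=lambda ch: performance.get(ch, 0.0))
        sleep(backoff_s)                  # S203: back off, then sense again
    return None                           # gave up after max_attempts (assumed cap)
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; a caller would normally leave the default.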
Optionally, determining the uplink channel according to the performance information of the idle channel includes:
acquiring performance information of an idle channel;
and determining an idle channel with optimal capability of successfully transmitting the data packet as an uplink channel.
Here, the ability to successfully transmit a data packet may be determined by historical transmission success rates of the data packet and/or channel perceived weights.
For example, the channel with the highest historical transmission success rate of the data packet is determined as the uplink channel.
Thus, the historical transmission success rate can be determined from the historical transmission data in the performance information of one or more idle channels. The higher the historical transmission success rate, the lower the likelihood of channel collisions or other events that degrade transmission quality.
For another example, the channel with the highest channel perceived weight is determined as the uplink channel.
The channel perceived weight, which represents channel quality, can generally be calculated by the STA on the corresponding channel using a spectrum sensing algorithm.
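As a minimal sketch of the two selection criteria just described, the example below ranks idle channels either by historical transmission success rate or by channel perceived weight. The `ChannelStats` record and the `best_idle_channel` helper are hypothetical names chosen for illustration; how the perceived weight is actually computed is not specified here and is simply taken as a given number.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ChannelStats:
    """Hypothetical per-channel performance record kept by the STA."""
    channel_id: int
    successes: int = 0
    attempts: int = 0
    perceived_weight: float = 0.0

    @property
    def success_rate(self) -> float:
        # A channel with no history is scored 0.0 (an assumption).
        return self.successes / self.attempts if self.attempts else 0.0


def best_idle_channel(
    idle: Iterable[ChannelStats], by: str = "success_rate"
) -> Optional[ChannelStats]:
    """Pick the idle channel that maximizes the chosen criterion."""
    candidates = list(idle)
    if not candidates:
        return None
    return max(candidates, key=lambda c: getattr(c, by))
```

Calling `best_idle_channel(stats, by="perceived_weight")` switches to the second criterion.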
In recent years, research based on reinforcement learning has become increasingly widespread. Reinforcement learning is an online learning algorithm in which an agent interacts with the external environment through a reward mechanism and adjusts its own behavior according to the reward values obtained from the environment, so that the agent learns and adapts to the external environment and is driven to select the behavior that yields the maximum reward. This ability to learn and adapt to the external environment can be applied to channel selection between the STA and the AP: acting as the agent, the STA learns the changing channel state and finally selects, from the idle channels, the one with the best capability of successfully transmitting data packets as the uplink channel, thereby reducing channel state scanning overhead and improving the channel detection probability. This serves the goals of improving throughput, maximizing channel utilization, reducing the probability of collisions between channels, and improving the spectral efficiency of the system.
The above-described embodiments will be described below with reference to specific embodiments.
As shown in fig. 3, an embodiment of the present disclosure provides a method for determining an uplink channel, which is applied to the STA in fig. 1 to determine the uplink channel between the STA and the AP by means of reinforcement learning data processing. The method comprises the following steps:
Step S301, the STA receives a trigger instruction; the trigger instruction is used for triggering the STA to sense the channel state.
Step S302, under the condition that idle channels exist, an uplink channel selection model based on reinforcement learning is constructed according to the network average throughput optimization problem.
If no idle channel exists, the STA perceives the channel state again after the set back-off time length.
Here, the constructed reinforcement learning-based uplink channel selection model includes a state set, an action set, and a reward function.
Optionally, the establishing of the uplink channel selection model based on reinforcement learning includes:
expressing the average throughput of the network at time t, C_t, in terms of the total number of STAs, N, and the signal-to-interference-plus-noise ratio of the kth STA at time t, where the signal-to-interference-plus-noise ratio is the ratio of the signal power to the sum of the interference and noise power in the system.
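The expression for C_t is not reproduced in this text. As an illustration only, one form consistent with the quantities named above would average a Shannon-type rate over the N STAs. The logarithmic rate, the unit bandwidth and the symbol \gamma_k^t for the signal-to-interference-plus-noise ratio of the kth STA at time t are all assumptions made for this sketch, not the stated formula:

```latex
C_t = \frac{1}{N} \sum_{k=1}^{N} \log_2\!\left(1 + \gamma_k^{t}\right)
```

Under this assumed form, raising any STA's signal-to-interference-plus-noise ratio raises C_t, which is the property the channel selection model relies on when it targets maximum average throughput.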
In this way, the channel information between the STA and the AP is used for establishing the uplink channel selection model, so that when the uplink channel is selected, the states of the channels can be combined, so that the throughput of the system can meet the requirement.
In step S303, with the maximum average throughput of the network as a target, channel state information and performance information are input into an uplink channel selection model based on reinforcement learning for training, and the average throughput of the network is obtained.
Optionally, the training of the reinforcement learning-based uplink channel selection model includes:
taking the channel state information and the performance information as the state set S for reinforcement learning, where S consists of two components:
the set of channel perceived weights of the STAs on the channels, whose element is the perceived weight of the kth STA on the mth channel at time t; and
the set of data packet transmission weights of the STAs on the channels, whose element is the data packet transmission weight of the kth STA on the mth channel at time t;
inputting the state set S into the uplink channel selection model based on reinforcement learning for training to obtain an action set A = {f_1, f_2, …, f_M}, which represents the set of actions by which an STA selects an uplink channel among the M idle channels;
determining, from the state set, a reward parameter that represents the instant reward for transmission by the kth STA on the mth channel at time t;
training the uplink channel selection model based on reinforcement learning according to the reward parameter R_t, and taking the channel corresponding to the system action that maximizes R_t as the uplink channel.
Here, the reward parameter R_t represents the average of the perceived weight and the channel transmission weight of the uplink channel selected at time t.
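The reward described above, an average of the two weights for the selected channel, can be sketched as below. Storing the weights as K-by-M nested lists indexed by STA k and channel m, and the helper name `instant_reward`, are assumptions made for the example.

```python
from typing import Sequence


def instant_reward(
    perceived: Sequence[Sequence[float]],
    transmission: Sequence[Sequence[float]],
    k: int,
    m: int,
) -> float:
    """Mean of the perceived weight and the transmission weight for
    STA k on channel m (hypothetical helper)."""
    return 0.5 * (perceived[k][m] + transmission[k][m])


# Example weights for 2 STAs and 2 channels (made-up numbers).
perceived = [[0.5, 1.0], [0.25, 0.75]]
transmission = [[1.0, 0.5], [0.75, 0.25]]
print(instant_reward(perceived, transmission, k=0, m=1))  # prints 0.75
```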
Further, training the reinforcement learning-based uplink channel selection model according to the reward parameter R_t includes applying an update rule for reinforcement learning in which Q_t represents the Q value of the current state and Q_{t+1} represents the Q value of the next state; α represents the learning rate of reinforcement learning, with a value in (0, 1); β represents the importance of historical rewards, with a value in (0, 1); the reward term represents the instant reward; and max Q_t(S', A') represents the maximum Q value over all possible action strategies at the next time.
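The quantities defined above match those of the standard tabular Q-learning update. As a hedged reconstruction (the exact expression is not reproduced in this text, so the form below is an assumption based on those definitions, with R_t standing for the instant reward), the rule can be written as:

```latex
Q_{t+1}(S, A) = (1 - \alpha)\, Q_t(S, A) + \alpha \left[ R_t + \beta \max_{A'} Q_t(S', A') \right]
```

Here α close to 1 makes the new estimate dominate, and β close to 0 makes the agent weigh the instant reward almost exclusively.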
A schematic diagram of reinforcement learning training in an embodiment of the present disclosure is shown in fig. 4 to illustrate the above steps.
The reinforcement learning in this embodiment uses the Q-Learning algorithm. The agent perceives the environment by performing actions in it and obtaining rewards, thereby learning a mapping strategy from states to actions that maximizes the reward value.
In Fig. 4, the STA performs data processing as the reinforcement learning agent. According to the mutual interference information among the APs received by the STA and the idle condition of the channels, the uplink channel is selected reasonably and effectively using the reinforcement learning algorithm. Through continuous interaction between the agent STA and the environment, feedback is obtained from the environment and the action of the agent STA changes accordingly, so that the uplink channel selection action is adjusted.
Specifically, the STA first acquires the mutual interference information between APs and the channel idle condition as the channel performance information and state information S_0. In state S_0, the agent STA takes action A_0 in the environment as its channel selection decision and feeds it back to the APs in the environment. Here, the action taken by the STA may be selected according to a greedy policy.
After the agent STA makes a channel selection decision, it accesses the selected uplink channel and transmits data. A reward parameter R_1 is determined from the system throughput and fed back to the STA, together with the next state S_1 containing the mutual interference information between the APs and the channel idle condition. After receiving the reward parameter R_1 and the environment state S_1, the STA updates the Q value table according to the reinforcement learning update rule and takes action A_1 on the environment as its uplink channel selection decision. After receiving action A_1, the environment changes from state S_1 to S_2 and feeds back the reward parameter R_2. That is, the STA obtains reward parameter R_2 and state S_2, updates the Q value table, and takes action A_2; it then obtains reward parameter R_3 and state S_3, updates the Q value table, and takes action A_3. The loop continues until the system throughput is maximized, that is, until the reward parameter R_t is maximized. In this way, interference is reduced and throughput is improved.
By updating the Q value table, each channel in the table has a Q value representing the transmission quality of that channel. When a data packet arrives, the STA starts to sense for idle channels. If no idle channel is sensed, the STA backs off and continues sensing; if an idle channel is sensed, the STA learns the uplink channel selection strategy using the Q-Learning mechanism. The Q-Learning process includes: determining the current action A_{t+1} according to the previous state S_t, then updating the state to S_{t+1} and feeding back a reward R_t. Through learning, the STA selects the channel with the best transmission quality from the idle channels for transmission, where transmission quality is measured by the success rate of historical data packet transmissions. The Q values in the Q value table are updated according to the reward parameter. Sorting the channels by Q value then yields a ranked list of channel transmission quality. After receiving the data packet arrival information, the STA can take an action through Q-Learning according to a greedy decision strategy, that is, select from the idle channels with probability ε, and finally determine the uplink channel.
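The per-channel Q value table and the ε-based selection described above can be sketched as follows. This is a simplified illustration under stated assumptions: the Q table is indexed by channel only (a single-state simplification of the S and A sets), exploration picks a random idle channel with probability ε, and the class name `ChannelQTable` and its defaults are hypothetical.

```python
import random
from typing import Dict, List, Optional


class ChannelQTable:
    """Hypothetical per-channel Q table with epsilon-greedy selection."""

    def __init__(self, channels: List[int], alpha: float = 0.1,
                 beta: float = 0.9, epsilon: float = 0.1,
                 rng: Optional[random.Random] = None) -> None:
        self.q: Dict[int, float] = {ch: 0.0 for ch in channels}
        self.alpha, self.beta, self.epsilon = alpha, beta, epsilon
        self.rng = rng or random.Random()

    def select(self, idle: List[int]) -> Optional[int]:
        """Choose an uplink channel among the idle ones."""
        if not idle:
            return None                                  # caller should back off
        if self.rng.random() < self.epsilon:
            return self.rng.choice(idle)                 # explore
        return max(idle, key=lambda ch: self.q[ch])      # exploit best Q value

    def update(self, channel: int, reward: float, next_idle: List[int]) -> None:
        """Standard Q-learning update (assumed form) for the used channel."""
        best_next = max((self.q[ch] for ch in next_idle), default=0.0)
        target = reward + self.beta * best_next
        self.q[channel] = (1 - self.alpha) * self.q[channel] + self.alpha * target

    def ranked(self) -> List[int]:
        """Channels sorted by Q value, best transmission quality first."""
        return sorted(self.q, key=self.q.get, reverse=True)
```

A typical loop would call `select` on the sensed idle channels, transmit, compute a reward from the outcome, and then call `update` with the idle set sensed next.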
Step S304, when the average throughput of the network reaches the maximum value, the uplink channel is determined according to the output of the uplink channel selection model based on reinforcement learning.
When the reward parameter R_t reaches its maximum value, the corresponding system action A_t is taken as the optimal policy, which determines the uplink channel selection action.
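As an illustration of step S304, the sketch below computes an average network throughput from per-STA SINR values and picks the action with the largest recorded reward. The Shannon-type expression (linear-scale SINR, bandwidth normalized to 1) is an assumption made for this example, and both function names are hypothetical.

```python
import math


def average_throughput(sinrs):
    """Average per-STA throughput in bit/s/Hz (assumed Shannon form).

    sinrs holds linear-scale SINR values, one per STA.
    """
    if not sinrs:
        raise ValueError("at least one SINR value is required")
    return sum(math.log2(1.0 + s) for s in sinrs) / len(sinrs)


def best_action(history):
    """Return the action whose recorded reward is largest.

    history is a list of (action, reward) pairs collected during training.
    """
    action, _ = max(history, key=lambda pair: pair[1])
    return action
```

For example, two STAs with SINR values 1.0 and 3.0 give an average of (1 + 2) / 2 = 1.5 bit/s/Hz under this assumed form.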
Step S305, accessing the uplink channel for data transmission.
In this way, with the reinforcement learning-based uplink channel selection model, the terminal STA makes a decision by sensing the channel state and the number of idle channels, selects the channel with the highest quality for data transmission, and the reward is fed back while the next state is updated. The uplink channel is determined among the idle channels according to channel performance, and data is transmitted on it. Therefore, when the CSI is unknown (for example, because of communication mismatch) or incomplete, the STA completes the uplink channel selection itself, which can yield a larger throughput improvement, maximize channel utilization, reduce the probability of collisions among channels, and improve the spectral efficiency of the system.
Fig. 5 shows a method for determining an uplink channel to illustrate the steps of sensing channel conditions when a packet arrives and selecting a channel to be accessed using reinforcement learning to complete an uplink transmission to an AP.
As shown in fig. 5, an embodiment of the present disclosure provides a method for determining an uplink channel, which is applied to the STA in fig. 1 to determine the uplink channel between the STA and the AP by means of reinforcement learning data processing. The method comprises the following steps:
step S501, the STA receives a trigger instruction; the trigger instruction includes a packet arrival indication.
In step S502, the STA senses whether there is an idle channel.
In step S503, if there is no idle channel, the data packet backs off, and the channel state is sensed again after the set backoff duration. The set backoff duration follows a random distribution with mean λ.
In step S504, if there is an idle channel, the reinforcement learning-based uplink channel selection model uses the Q-learning algorithm to output a channel selection decision, and the selected channel is taken as the uplink channel.
Step S505, the uplink channel is accessed for data transmission, and the action set and the reward parameter of step S504 are updated based on the selected channel action and the change in system throughput after the selection.
Step S506, after the data transmission succeeds, the information required by the Q-learning algorithm in step S504 is updated according to the transmission result: the state set in step S504 is updated, and the flow returns to step S501 to receive a new packet arrival indication.
Step S507, after the data transmission fails, the information required by the Q-learning algorithm in step S504 is updated according to the transmission result: the state set in step S504 is updated, and the flow returns to step S502 to re-execute the data transmission.
In this way, with the reinforcement learning-based uplink channel selection model, the terminal STA makes a decision by sensing the channel state and the number of idle channels, selects the channel with the highest quality for data transmission, and the reward is fed back while the next state is updated. The environment state continues to be updated after the decision is made, and both outcomes of a packet transmission are fed into the Q-learning process. After a successful transmission, the environment state is updated, the data transmission ends, and the STA waits for a new data packet to arrive before entering the next round of data transmission. After a failed transmission, the environment state is updated and a retransmission mechanism is entered, in which the channel is sensed again so that the data packet can be transmitted. Therefore, when the CSI is unknown (for example, because of communication mismatch) or incomplete, the STA completes the uplink channel selection itself, which can yield a larger throughput improvement, maximize channel utilization, reduce the probability of collisions among channels, and improve the spectral efficiency of the system.
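The flow of steps S502 to S507 for a single packet can be sketched as below. Several details are assumptions made only for this illustration: the backoff is drawn from an exponential distribution (the text only specifies a random distribution with mean λ), the Q update is simplified to a running average of a 0/1 success reward, and `max_attempts` is a hypothetical cap that the described method does not have.

```python
import random


def backoff_duration(mean, rng):
    """Draw a backoff time; an exponential distribution is assumed here."""
    return rng.expovariate(1.0 / mean)


def handle_packet(sense, transmit, q_table, mean_backoff, rng,
                  epsilon=0.1, alpha=0.5, max_attempts=10):
    """Run steps S502-S507 for one packet.

    sense() returns the list of idle channels; transmit(ch) returns True on
    success. Returns (channel_or_None, total_backoff_time).
    """
    waited = 0.0
    for _ in range(max_attempts):
        idle = sense()                                   # S502
        if not idle:                                     # S503: back off
            waited += backoff_duration(mean_backoff, rng)
            continue
        if rng.random() < epsilon:                       # S504: explore
            ch = rng.choice(idle)
        else:                                            # S504: exploit
            ch = max(idle, key=lambda c: q_table[c])
        ok = transmit(ch)                                # S505
        reward = 1.0 if ok else 0.0
        q_table[ch] += alpha * (reward - q_table[ch])    # S506 / S507
        if ok:
            return ch, waited                            # back to S501
    return None, waited
```

With the 0/1 reward, each Q value tracks a smoothed success rate of past transmissions on that channel, which matches the transmission quality measure used in the description.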
Fig. 6 shows a method for determining an uplink channel, which is applied to an AP in the environment system shown in fig. 1, and includes:
step S601, the AP sends a trigger instruction; the trigger instruction is used for triggering the STA to sense the channel state.
Here, the AP sends a trigger instruction to the STA to obtain the data buffer information fed back by the STA and to trigger the STA to sense the channel state so that data can be transmitted. The AP may send a Buffer Status Report Poll (BSRP) frame to cause the STA to send a Buffer Status Report (BSR) frame.
Step S602, the AP receives data transmitted by the STA through an uplink channel; the uplink channel is determined by the STA based on the performance information of the idle channel.
After receiving the data transmitted by the STA through the uplink channel, the AP also sends an acknowledgement (ACK) to the STA to indicate that the uploaded data has been received.
Therefore, when the CSI is unknown (for example, because of communication mismatch) or incomplete, the STA completes the uplink channel selection itself, which can yield a larger throughput improvement, maximize channel utilization, reduce the probability of collisions among channels, and improve the spectral efficiency of the system.
Fig. 7 shows an application diagram of a method for determining an uplink channel.
In this practical application, the method for determining an uplink channel includes the steps of:
step S701, the AP sends a BSRP to the STA, requesting the data buffer information of the STA;
step S702, STA sends BSR to AP, and feeds back data buffer information;
in step S703, the STA senses the current state of all channels. If multiple idle channels are sensed, it enters the Q-learning process and selects the idle channel with the best capability of successfully transmitting the data packet as the uplink channel. If no channel is idle, the data packet backs off for a period of time, where the backoff time follows a random distribution with mean λ.
In step S704, the STA accesses the uplink channel and transmits data.
In step S705, the AP receives the data transmitted by the STA and sends an ACK to the STA to indicate reception.
In this way, the terminal STA senses the channel state, determines the uplink channel among the idle channels according to channel performance, and transmits data. Therefore, when the CSI is unknown (for example, because of communication mismatch) or incomplete, the STA completes the uplink channel selection itself, which can yield a larger throughput improvement, maximize channel utilization, reduce the probability of collisions among channels, reduce the influence of multi-AP interference on data transmission, and improve the spectral efficiency of the system.
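The message sequence of steps S701 to S705 can be sketched as a small simulation. The `Frame` class, its fields, and `run_exchange` are hypothetical names introduced for illustration only; they do not model real 802.11 frame formats, and the `quality` mapping stands in for the learned Q values.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    kind: str          # "BSRP", "BSR", "DATA" or "ACK"
    sender: str        # "AP" or "STA"
    payload: int = 0   # buffered bytes for BSR, data bytes for DATA


def run_exchange(buffered_bytes, idle_channels, quality):
    """Return the frame log and the channel used for one uplink exchange.

    quality maps a channel to its estimated transmission success rate.
    """
    log = [Frame("BSRP", "AP")]                          # S701
    log.append(Frame("BSR", "STA", buffered_bytes))      # S702
    if not idle_channels:                                # S703: no idle channel
        return log, None                                 # the STA backs off
    channel = max(idle_channels, key=lambda ch: quality[ch])
    log.append(Frame("DATA", "STA", buffered_bytes))     # S704
    log.append(Frame("ACK", "AP"))                       # S705
    return log, channel
```

When no channel is idle, the exchange stops after the BSR and the caller is expected to retry after the backoff time.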
As shown in connection with fig. 8, an embodiment of the present disclosure provides a terminal device including a processor 800 and a memory 801. The memory 801 is used for storing a computer program, and the processor 800 is used for calling and running the program stored in the memory, and executing the method for determining an uplink channel as described above.
Optionally, the device further comprises a communication interface 802 and a bus 803. The communication interface 802 is used to communicate with other network devices; the processor 800, communication interface 802, and memory 801 may communicate with each other via a bus 803.
As shown in connection with fig. 9, an embodiment of the present disclosure provides a network device comprising a processor 900 and a memory 901. The memory 901 is used for storing a computer program, and the processor 900 is used for calling and running the program stored in the memory, and executing the method for determining an uplink channel as described above.
Optionally, the device further comprises a communication interface 902 and a bus 903. The communication interface 902 is used to communicate with other network devices; the processor 900, communication interface 902, and memory 901 may communicate with each other via bus 903.
Further, the logic instructions in the memory 901 may be implemented in the form of a software functional unit and may be stored in a computer readable storage medium when sold or used as a separate product.
The memory 901 is a computer readable storage medium, and may be used to store a software program, a computer executable program, and program instructions/modules corresponding to the methods in the embodiments of the present disclosure. The processor 900 executes functional applications and data processing by executing program instructions/modules stored in the memory 901, i.e., implements the method for determining an uplink channel in the above-described embodiment.
The memory 901 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, at least one application program required for functions; the storage data area may store data created according to the use of the terminal device, etc. Further, the memory 901 may include a high-speed random access memory, and may also include a nonvolatile memory.
Embodiments of the present disclosure provide a computer readable storage medium storing computer executable instructions configured to perform the above-described method for determining an uplink channel.
The disclosed embodiments provide a computer program product comprising a computer program stored on a computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the above-described method for determining an uplink channel.
The computer readable storage medium may be a transitory computer readable storage medium or a non-transitory computer readable storage medium.
Embodiments of the present disclosure may be embodied in a software product stored on a storage medium, including one or more instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods according to embodiments of the present disclosure. The aforementioned storage medium may be a non-transitory storage medium, including any of a variety of media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc, or it may be a transitory storage medium.
The above description and the drawings illustrate embodiments of the disclosure sufficiently to enable those skilled in the art to practice them. Other embodiments may involve structural, logical, electrical, process, and other changes. The embodiments represent only possible variations. Individual components and functions are optional unless explicitly required, and the sequence of operations may vary. Portions and features of some embodiments may be included in, or substituted for, those of others. Moreover, the terminology used in the present application is for the purpose of describing embodiments only and is not intended to limit the claims. As used in the description of the embodiments and the claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Similarly, the term "and/or" as used in this application is meant to encompass any and all possible combinations of one or more of the associated listed items. Furthermore, when used in this application, the terms "comprises," "comprising," and/or "includes," and variations thereof, mean that the stated features, integers, steps, operations, elements, and/or components are present, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, or apparatus comprising that element. In this document, each embodiment may be described with emphasis on its differences from the other embodiments, and the same or similar parts of the various embodiments may be referred to each other. For the methods, products, and the like disclosed in the embodiments, where they correspond to the method sections disclosed in the embodiments, reference may be made to the description of the method sections for the relevant details.
Those of skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. The skilled artisan may use different methods for each particular application to achieve the described functionality, but such implementation should not be considered to be beyond the scope of the embodiments of the present disclosure. It will be clearly understood by those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
In the embodiments disclosed herein, the disclosed methods, articles of manufacture (including but not limited to devices, apparatuses, etc.) may be practiced in other ways. For example, the apparatus embodiments described above are merely illustrative, and for example, the division of the units may be merely a logical function division, and there may be additional divisions when actually implemented, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. In addition, the coupling or direct coupling or communication connection shown or discussed with each other may be through some interface, device or unit indirect coupling or communication connection, which may be in electrical, mechanical or other form. The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to implement the present embodiment. In addition, each functional unit in the embodiments of the present disclosure may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. In the description corresponding to the flowcharts and block diagrams in the figures, operations or steps corresponding to different blocks may also occur in different orders than that disclosed in the description, and sometimes no specific order exists between different operations or steps. For example, two consecutive operations or steps may actually be performed substantially in parallel, they may sometimes be performed in reverse order, which may be dependent on the functions involved. Each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Claims (10)

1. A method for determining an uplink channel, applied to a terminal device STA, the method comprising:
receiving a trigger instruction; the triggering instruction is used for triggering the STA to sense the channel state;
if the idle channel exists, determining an uplink channel according to the performance information of the idle channel, and accessing the uplink channel for data transmission;
if no idle channel exists, the channel state is perceived again after the set back-off time length.
2. The method of claim 1, wherein the accessing the uplink channel for data transmission comprises:
after successful data transmission, the performance information of the corresponding channel is updated according to the transmission result, and a new data packet arrival indication is received;
and after the data transmission fails, updating the performance information of the corresponding channel according to the transmission result, and re-executing the data transmission instruction.
3. The method of claim 1, wherein the determining the uplink channel based on the performance information of the idle channel comprises:
acquiring performance information of an idle channel;
and determining an idle channel with optimal capability of successfully transmitting the data packet as an uplink channel.
4. A method according to claim 3, wherein the determining of the uplink channel comprises:
constructing an uplink channel selection model based on reinforcement learning;
inputting channel state information and performance information into an uplink channel selection model based on reinforcement learning for training, and obtaining network average throughput;
and under the condition that the average throughput of the network reaches the maximum value, determining an uplink channel according to the output of the uplink channel selection model based on reinforcement learning.
5. The method of claim 4, wherein training the reinforcement learning based uplink channel selection model comprises:
taking the channel state information and the performance information as the state set S for reinforcement learning,
wherein the state set comprises a set of channel sensing weights of the STAs on the channels, each element of which represents the sensing weight of the k-th STA on the m-th channel at time t;
and a set of data packet transmission weights of the STAs on the channels, each element of which represents the data packet transmission weight of the k-th STA on the m-th channel at time t;
inputting the state set S into the reinforcement learning-based uplink channel selection model for training to obtain an action set A = {f_1, f_2, …, f_M}, which represents the set of actions by which an STA selects an uplink channel among the M idle channels;
determining a reward parameter R_t according to the state set, the reward parameter indicating the instant reward for transmission by the k-th STA on the m-th channel;
training the reinforcement learning-based uplink channel selection model according to the reward parameter R_t, and taking the channel corresponding to the system action at which the reward parameter R_t reaches its maximum as the uplink channel.
6. The method of claim 5, wherein the establishing of the reinforcement learning based uplink channel selection model comprises:
wherein C_t represents the average throughput of the network at time t; N represents the total number of STAs; and SINR_t^k represents the signal-to-interference-plus-noise ratio of the k-th STA at time t.
7. The method according to claim 5, wherein training the reinforcement learning-based uplink channel selection model according to the reward parameter R_t comprises:
using the following as the update rule for reinforcement learning:
wherein Q_t represents the Q value in the current state, and Q_{t+1} represents the Q value in the next state; α represents the learning rate of reinforcement learning, with a value in (0, 1); β represents the importance of historical rewards, with a value in (0, 1); the reward term represents the instant reward; and max Q_t(S', A') represents the maximum Q value over all possible action policies at the next time.
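For reference, a standard Q-learning update consistent with the symbols defined in claim 7 can be written as below. This is a sketch under the assumption that the claimed rule follows the textbook form, with R_t standing for the instant reward; the formula as filed may differ.

```latex
Q_{t+1}(S, A) = (1 - \alpha)\, Q_t(S, A)
              + \alpha \left[ R_t + \beta \max_{A'} Q_t(S', A') \right],
\qquad \alpha \in (0, 1),\ \beta \in (0, 1)
```

Here S and A are the current state and action, and S' and A' are the state and candidate actions at the next time, so a larger β gives more weight to the value already learned for the next state.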
8. A method for determining an uplink channel, applied to an access point AP, comprising:
sending a trigger instruction; the triggering instruction is used for triggering the STA to sense the channel state;
receiving data transmitted by the STA through an uplink channel; the uplink channel is determined by the STA according to performance information of the idle channel.
9. A terminal device comprising a processor and a memory for storing a computer program, the processor being configured to invoke and run the program stored in the memory to perform the method for determining an uplink channel according to any one of claims 1 to 7.
10. A network device comprising a processor and a communication interface for communicating with other network devices; the processor is configured to run a set of programs to cause the network device to implement the method for determining an uplink channel as claimed in claim 8.
CN202211128492.3A 2022-09-16 2022-09-16 Method, terminal and network equipment for determining uplink channel Pending CN117768075A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202211128492.3A CN117768075A (en) 2022-09-16 2022-09-16 Method, terminal and network equipment for determining uplink channel
PCT/CN2023/107293 WO2024055739A1 (en) 2022-09-16 2023-07-13 Method for determining uplink channel, and terminal and network device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211128492.3A CN117768075A (en) 2022-09-16 2022-09-16 Method, terminal and network equipment for determining uplink channel

Publications (1)

Publication Number Publication Date
CN117768075A true CN117768075A (en) 2024-03-26

Family

ID=90274172

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211128492.3A Pending CN117768075A (en) 2022-09-16 2022-09-16 Method, terminal and network equipment for determining uplink channel

Country Status (2)

Country Link
CN (1) CN117768075A (en)
WO (1) WO2024055739A1 (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101257714B (en) * 2008-04-08 2011-03-09 浙江大学 Across layer self-adapting paralleling channel allocating method of cognized radio system
WO2017036258A1 (en) * 2015-09-02 2017-03-09 华为技术有限公司 Contention access method, contention access device, base station and contention access system
EP3892054B1 (en) * 2019-02-21 2024-04-24 Google LLC User-equipment-coordination set for a wireless network using an unlicensed frequency band
CN111342920B (en) * 2020-01-10 2021-11-02 重庆邮电大学 Channel selection method based on Q learning
CN113225832B (en) * 2020-02-05 2023-02-24 维沃移动通信有限公司 Data transmission method and device of unauthorized frequency band and communication equipment

Also Published As

Publication number Publication date
WO2024055739A1 (en) 2024-03-21


Legal Events

Date Code Title Description
PB01 Publication