CN105007582B - Controlled Radio Network System dynamic resource allocation method based on POMDP - Google Patents


Publication number
CN105007582B
Authority
CN
China
Prior art keywords
user
base station
antennas
access
cell
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201510271561.XA
Other languages
Chinese (zh)
Other versions
CN105007582A (en)
Inventor
***
李萌
闫玉玮
孙恩昌
司鹏搏
杨睿哲
孙艳华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology
Priority to CN201510271561.XA
Publication of CN105007582A
Application granted
Publication of CN105007582B
Expired - Fee Related
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W16/00: Network planning, e.g. coverage or traffic planning tools; Network deployment, e.g. resource partitioning or cells structures
    • H04W16/02: Resource partitioning among network components, e.g. reuse partitioning
    • H04W16/10: Dynamic resource partitioning
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W72/00: Local resource management
    • H04W72/50: Allocation or scheduling criteria for wireless resources
    • H04W72/54: Allocation or scheduling criteria for wireless resources based on quality criteria
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W72/00: Local resource management
    • H04W72/50: Allocation or scheduling criteria for wireless resources
    • H04W72/54: Allocation or scheduling criteria for wireless resources based on quality criteria
    • H04W72/542: Allocation or scheduling criteria for wireless resources based on quality criteria using measured or perceived quality
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/70: Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

A dynamic resource allocation method for a POMDP-based controlled wireless network system, belonging to the field of controlled wireless networks and communication resource allocation. A state transition matrix of the number of cell users accessing the base station and a feedback observation matrix are constructed; the users' belief state probability and the transmission rate obtained by users for each number of base station antennas turned on are calculated; the number of antennas to turn on and the optimal number of accessing users that maximize the system gain are then decided, which completes the optimal resource allocation within the cell. For a cell containing multiple users and one multi-antenna base station, the invention takes maximizing the received power of users in the cell and minimizing the bit error rate of user data transmission as its respective targets, so as to obtain the maximum system gain, and determines the final number of base station antennas turned on and the number of users accessing the base station. The invention overcomes problems such as excessive energy consumption, heavy load on the cell base station, and a mismatch between the number of antennas turned on and the number of accessing users. It has certain advantages in improving user received power, interference resistance and overall system gain.

Description

Dynamic resource allocation method of controlled wireless network system based on POMDP
Technical Field
The invention relates to a dynamic resource allocation method for a controlled wireless communication network system based on a Partially Observable Markov Decision Process (POMDP). The POMDP method is used to design a selection scheme that benefits resource allocation in a wireless communication network. The invention belongs to the fields of controlled wireless networks and communication resource allocation.
Background
Mobile communication has developed rapidly in recent decades, and users' demands on the quality of service of wireless communication networks keep rising. This has driven wireless communication systems to evolve through 2G, 3G, B3G and 4G toward 5G, and the network is also changing from a voice-dominated network to a high-speed data-dominated one. Meanwhile, mobile multimedia services require ever more bandwidth, and "broadband" has become a development trend of mobile communication technology. Currently, three main factors affect the Quality of Service (QoS) of a wireless communication network. First, the wireless mobile communication network is highly dynamic: frequent handovers caused by random changes in user position, together with the changing network topology, make the data transmission rate and connectivity unstable. Second, because of channel fading in the wireless communication network and the limited power or energy of mobile terminals, a large proportion of the power received by cell users from the base station is lost. Third, channel fading between the base station and the user, the number of antennas turned on at the base station and at the user, the user's signal-to-noise ratio and similar factors strongly affect the bit error rate of data transmission, and thus the reliability of data link transmission.
Over the years, the industry has continuously optimized and improved the design algorithms of wireless communication networks and has proposed many methods for improving network service quality, which has advanced wireless network design. However, problems such as network power loss and data transmission reliability have never been thoroughly solved, so design and deployment based on the traditional wireless communication network architecture and layered communication protocols cannot resolve these contradictions more effectively.
In the field of control engineering, feedback is the most basic control strategy. It is the core of a closed-loop control system and plays an important role in controlling and adjusting the state of every node in the system. Since it was first proposed, the feedback strategy has been applied widely and deeply in closed-loop control of industrial systems, information theory, channel coding and other fields. With feedback, a control system gains self-adjusting, self-adapting and self-stabilizing capabilities, and its performance indexes improve across the board. Meanwhile, research on Wireless Network Control Systems (WNCS) has attracted close attention from researchers at home and abroad. Professor L. Litz and Dr. A. Chamaken of the University of Kaiserslautern, Germany, proposed embedding a wireless communication network into an industrial control system and designed a system architecture, control algorithm, wireless network architecture and communication protocol that meet the performance requirements of the control system. This improves information processing and system control among the sensors, controllers and actuators, and enables prediction and optimization of the industrial control system. M. D. Di Benedetto and other scholars at the University of L'Aquila studied WNCS design in depth. They proposed a cost function that first maps control system parameters such as noise, coding, modulation scheme and system power onto the wireless network and then selects a suitable wireless network type, which improves the robustness and flexibility of the control system.
A Partially Observable Markov Decision Process (POMDP) converts a non-Markov problem into a Markov problem by introducing a belief state space. Its defining feature is that the state of the system cannot be observed directly and is only partially known. It is used to model systems for which only incomplete state information is available and to make decisions from the current incomplete state information so as to obtain the maximum gain. Such a state transition model fits wireless communication network scenarios well, where part of the state information is not fully known and must be observed in order to obtain the optimal resource allocation.
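To make the belief state idea concrete, the following is a minimal Python sketch of the textbook POMDP belief update for one fixed action: predict with the transition matrix, weight by the likelihood of the received observation, then normalize. It is an illustration only, not the patent's own implementation. The function name `belief_update` and all numeric values are hypothetical.

```python
def belief_update(belief, transition, observation, obs):
    """Textbook Bayesian belief update for one fixed action.

    belief      : list, belief[i] = Pr(state = i) before the update
    transition  : transition[i][j] = Pr(next state = j | state = i)
    observation : observation[j][l] = Pr(observation = l | next state = j)
    obs         : index of the observation actually received
    """
    n = len(belief)
    # Prediction step: sum over current states of transition * belief.
    predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    # Correction step: weight each predicted state by the observation likelihood.
    unnormalized = [observation[j][obs] * predicted[j] for j in range(n)]
    # The normalizer is Pr(obs | belief); eta is its reciprocal.
    evidence = sum(unnormalized)
    if evidence == 0:
        raise ValueError("observation has zero probability under this belief")
    return [u / evidence for u in unnormalized]


# Hypothetical two-state example (values chosen only for illustration).
S = [[0.7, 0.3],
     [0.4, 0.6]]
O = [[0.9, 0.1],
     [0.2, 0.8]]
b0 = [0.5, 0.5]
b1 = belief_update(b0, S, O, obs=0)
```

Because the updated belief depends only on the previous belief, the action and the new observation, the belief itself behaves as a Markov state, which is what allows the partially observed problem to be treated as a Markov one.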
In summary, the main objective of the present invention is to introduce a control feedback optimization strategy and apply the POMDP model to the controlled wireless communication network system. Given a state transition probability matrix formed from the cell user access numbers and an observation probability matrix formed from the fed-back network QoS indexes (user received power and user transmission bit error rate), the method predicts the optimal number of accessing cell users at the next moment from the belief state of the cell user access number at a given moment and the gain corresponding to each number of base station antennas turned on. At the same time, the number of cell base station antennas to turn on at that moment is decided according to the maximum gain, which finally achieves the optimal allocation of base station antennas and user access in the cell.
Disclosure of Invention
The invention addresses optimal resource allocation in a cell communication network in which one multi-antenna base station and multiple users are present. Taking the dynamic allocation of the number of accessing users and the number of antennas turned on in the cell at each moment as the optimization target, it applies a POMDP model and a control feedback strategy to obtain the optimal allocation strategy for base station antenna activation and user access. The method solves the problem of how to select and determine the optimal resource allocation when a base station with multiple antennas and multiple communication users exist in a cell network, and obtains the maximum gain of the cell wireless communication network system through that allocation.
The scene model of the cell environment to which the invention applies is shown in Fig. 1.
The flow chart of the system's operating principle in the technical scheme of the invention is shown in Fig. 2.
The comparison of user received power for the system of the invention is shown in Fig. 3.
The comparison of bit error rate for the system of the invention is shown in Fig. 4.
The comparison of average gain under different conditions in the cell is shown in Fig. 5.
The comparison of the number of users accessing the cell and the number of antennas turned on by the base station is shown in Fig. 6.
The cell environment scene model is shown in Fig. 1. The dynamic resource allocation method of the POMDP-based controlled wireless network system is characterized in that: a communication cell contains a base station with N antennas and M single-antenna users; once the state transition probability matrix of the cell user access number and the observation matrix of the fed-back network QoS indexes (user received power and user transmission bit error rate) are known, the number of base station antennas to turn on that maximizes the gain at the current moment and the optimal cell user access number at the next moment are obtained from the belief state (BS) probability of the user access number at that moment. The method is implemented by the following steps in sequence:
step (1), initializing the system, which includes the following according to actual conditions:
The cell contains $M$ single-antenna users. The number of users that need to access the base station at a given moment is represented as $s_1, s_2, \dots, s_m, \dots, s_M$, where $s_m$ indicates that $m$ users access the base station. The base station has $N$ antennas, and the number of antennas turned on is represented as $T_1, T_2, \dots, T_n, \dots, T_N$, where $T_n$ indicates that the base station turns on $n$ antennas. The transmission bandwidth between the base station and each user is $B$, the channel fading coefficients are all $h_{S,D}$, the base station transmission power is $P_{total}$, all transmitting antennas are identical with per-antenna transmission power $P_{tr} = P_{total}/N$, and the system noise power is $\sigma$;
step (2), constructing the state transition matrix of the number of users accessing the base station: the transition probability matrix of the user access number in the cell is determined according to the number of antennas turned on by the base station. When the number of antennas turned on is $T_n$, the cell user access number transition probability matrix $S_n$ can be expressed as:
$$S_n = \begin{bmatrix} p_{11} & \cdots & p_{1M} \\ \vdots & \ddots & \vdots \\ p_{M1} & \cdots & p_{MM} \end{bmatrix}$$
where $s_i$ denotes that the number of users accessing the base station at the current moment is $i$ ($1 \le i \le M$), $s'_j$ denotes that the number of users accessing the base station at the next moment is $j$ ($1 \le j \le M$), and $p_{ij}$ is the probability that the number of users accessing the base station changes from $i$ to $j$, calculated as follows:
when the number of antennas turned on by the base station is $T_n$ and the number of users accessing the base station shifts from $i$ to $j$ a total of $B$ times ($B \le A$), the probability $p_{ij}$ is expressed as:
step (3), constructing the feedback observation matrix: following the feedback control strategy, an observation matrix is determined for the feedback QoS targets to be optimized by the system, namely user received power and user transmission bit error rate. The specific steps are:
step (3.1), when the number of antennas turned on is $T_n$ and the number of users accessing the base station is $m$, the user received power is calculated and expressed as:
wherein the transmission power of the base station is $P_{total}$, the transmission power of each antenna can be expressed as $P_{tr} = P_{total}/N$, $l_n$ is the distance between the base station and the user, $H_n$ is the user antenna height, and $h_{S,D}$ is the path fading coefficient;
step (3.2), when the number of antennas turned on is $T_n$ and the number of users accessing the base station is $m$, the user transmission bit error rate is calculated:
the bit error rate of data sent by the base station and received by users in the cell depends on the number of accessing users, the number of antennas turned on by the base station and the path loss, so the user received bit error rate can be expressed as:
step (3.3), when the number of antennas turned on by the base station is $T_n$, the system feedback observation probability matrix $O_n$ is calculated from the user received power and the user transmission bit error rate, and can be expressed as:
wherein $o_1$ indicates that the influence of user received power is considered, $o_2$ indicates that the influence of user bit error rate is considered, $pp_{m1}$ is the probability that the user received power threshold $\alpha$ is met when the user access number is $m$, and $pp_{m2}$ is the probability that the user bit error rate threshold $\beta$ is met when the user access number is $m$,
the thresholds $\alpha$ and $\beta$ satisfy:
$pp_{m1}$ and $pp_{m2}$ are calculated as follows:
wherein $\delta$ and $\varepsilon$ satisfy:
$0 < \delta \le 1$, $0 < \varepsilon \le 1$
step (4), after the user number transition matrix and the feedback observation matrix have been constructed in steps (1) to (3), the optimal resource allocation at each moment is calculated. The user belief state (BS) probability is calculated by the following formula: from the state transition matrix $S_n$ and feedback observation matrix $O_n$ corresponding to the antenna number $T_n$, and the user BS value $b(s)_{nlm}$ at the previous moment $k-1$, the value $b'(s')_{nlm}$ at the current moment $k$ is calculated:
$$b'(s'_m)_{nlm} = \eta \, O_n(o_l \mid s'_m, T_n) \sum_{s_m} S_n(s'_m \mid s_m, T_n) \, b(s_m)_{nlm}$$
$$\eta = 1 / \Pr(o \mid b, T)$$
wherein $S_n(s'_m \mid s_m, T_n)$ is the probability that the user access number transfers from $s_m$ to $s'_m$ when the number of antennas turned on at the base station is $T_n$, $O_n(o_l \mid s'_m, T_n)$ is the probability corresponding to the $l$-th ($l = 1$ or $l = 2$) feedback QoS optimization target when the number of antennas turned on is $T_n$ and the number of accessing users is $s'_m$, $\eta$ is an intermediate variable, and the initial values $b(s)_{nlm}$ at the first moment are set as:
after each value is calculated and substituted into $b_{nlm}$, the matrix $b'(s')_{nlm}$ is obtained for the number of antennas turned on at the base station from $T_1$ to $T_N$, for the two feedback QoS optimization targets, and for the number of accessing users from 1 to $M$:
step (5), calculating the system transmission rate $c_{nm}$ when the number of antennas turned on by the base station is $T_n$ and the number of accessing users is $s_m$, that is, the data transmission rate $C$ obtained for each case:
wherein,
step (6), according to $b'(s')_{nlm}$ obtained in step (4) and the data transmission rate $C$ obtained in step (5), calculating the system gain obtained as the number of base station antennas turned on ranges from $T_1$ to $T_N$ when the system considers the $l$-th ($l = 1$ or $l = 2$) feedback QoS optimization target
wherein $R_{nm} = c_{nm} \cdot b_{nlm}$
step (7), determining the optimization target:
step (7.1), for the $l$-th feedback QoS optimization target, the gain is expressed as:
that is, the $T_n$ corresponding to the maximum $R_{nm}$ is selected; it is the number of base station antennas that should be turned on at the current moment $k$ when the $l$-th ($l = 1$ or $l = 2$) feedback QoS optimization target is considered, and the corresponding $s_m$ is the initial state $b(s)_{nlm}$ of the cell user access number at the next moment $k+1$;
step (7.2), when user received power and data transmission bit error rate are considered together, the maximum gain is expressed as:
wherein $\gamma$ and $\lambda$ are the weight coefficients corresponding to the two feedback QoS optimization targets, respectively, and satisfy:
The advantage of the invention is that, in a communication cell with one multi-antenna base station and multiple users, it accounts for changes in the number of users accessing the base station and combines them with user received power and data transmission bit error rate, so that the number of base station antennas turned on and the number of users in the cell reach the optimal resource allocation. In addition, considering the feedback QoS optimization targets further refines the direction of optimization and improves system performance. Simulation experiments are used to investigate how the dynamic resource allocation method of the POMDP-based controlled wireless network system affects base station antenna activation and the number of accessing users in the cell.
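Steps (6) and (7) above can be sketched in Python as follows. The sketch takes the rate matrix and the belief matrices as given inputs, because the rate formula is not reproduced in this text. The weighted combination `gamma * R1 + lam * R2` with `gamma + lam == 1` is an assumption made for illustration, since the patent's combined expression and weight constraint are also not reproduced here. All function names and numbers are hypothetical.

```python
def gain_matrix(rate, belief):
    """Step (6): element-wise gain R[n][m] = c[n][m] * b[n][m]."""
    return [[c * b for c, b in zip(rate_row, belief_row)]
            for rate_row, belief_row in zip(rate, belief)]


def weighted_gain(gain_power, gain_ber, gamma, lam):
    """ASSUMPTION: combined gain = gamma * R1 + lam * R2, with gamma + lam = 1."""
    if abs(gamma + lam - 1.0) > 1e-9:
        raise ValueError("this sketch assumes gamma + lam == 1")
    return [[gamma * p + lam * e for p, e in zip(p_row, e_row)]
            for p_row, e_row in zip(gain_power, gain_ber)]


def best_choice(gain):
    """Step (7.1): return (max gain, antennas on, users accessed), counted from 1."""
    return max((value, n + 1, m + 1)
               for n, row in enumerate(gain)
               for m, value in enumerate(row))


# Hypothetical inputs: rows index the number of antennas turned on (1..N),
# columns index the number of accessing users (1..M).
rate = [[1.0, 2.0],
        [1.5, 3.0]]
belief_power = [[0.6, 0.4],   # belief under the received-power target (l = 1)
                [0.3, 0.7]]
belief_ber = [[0.5, 0.5],     # belief under the bit-error-rate target (l = 2)
              [0.8, 0.2]]

gain_power = gain_matrix(rate, belief_power)
gain_ber = gain_matrix(rate, belief_ber)
best_power = best_choice(gain_power)
best_ber = best_choice(gain_ber)
best_combined = best_choice(weighted_gain(gain_power, gain_ber, 0.5, 0.5))
```

With these toy numbers the two single-target choices disagree on the user access number, and the weighted gain arbitrates between them. This is the role the text assigns to the weight coefficients.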
Drawings
Fig. 1 shows a communication cell model including a schematic structure of a base station and users.
Fig. 2 is a flow chart of a design of a dynamic resource allocation method for a POMDP-based controlled wireless network system.
Fig. 3 is a graph comparing the received power of users in a cell. One curve represents the method of the present invention, and the other represents a method that does not consider user received power.
Fig. 4 is a graph comparing the bit error rate of user data transmission in a cell. One curve represents the method of the present invention, and the other represents a method that does not consider the user data transmission bit error rate.
Fig. 5 is a graph comparing the average gain under different conditions in a cell. The curves represent, respectively, the case without a feedback optimization strategy, the case considering only user received power, the case considering only the user data transmission bit error rate, and the method of the invention.
Fig. 6 is a graph comparing the number of users accessing a cell with the number of antennas turned on by a base station. One curve shows the case in which 8 users need to access the cell base station under the method of the invention, and the other shows the case in which 8 users need to access the cell base station without the feedback optimization strategy.
Detailed Description
The following describes the technical solution of the dynamic resource allocation method of the controlled wireless network system with reference to the accompanying drawings and embodiments.
The flow chart of the method of the invention is shown in figure 2, and comprises the following steps:
Step 1, system initialization: setting the number of base station antennas and the number of users in the cell, and setting the base station transmission power and the path fading coefficient.
Step 2, repeating the observation many times to determine the state transition probability matrix of the number of cell users connected to the base station.
Step 3, setting the user received power threshold α and the user data transmission bit error rate threshold β according to actual requirements, calculating the probability of meeting each of the two feedback QoS targets, and constructing the feedback observation matrix.
Step 4, calculating the user belief state probability b'(s') at the current moment from the user number transition matrix, the feedback observation matrix and the user belief state (BS) probability at the previous moment.
Step 5, calculating the transmission rate C for every user access number under every number of antennas turned on.
Step 6, from the obtained b'(s') and data transmission rate C, calculating the system gain corresponding to each number of base station antennas turned on when the system considers user received power or user data transmission bit error rate.
Step 7, selecting the maximum system gain. The corresponding number of base station antennas turned on and user access number are the values for the next moment that give the cell the optimal resource allocation when user received power or user data transmission bit error rate is considered. When user received power and user data transmission bit error rate are considered together, the maximum system gain can be obtained:
The simulation of the invention was implemented on a PC by programming in MATLAB. MATLAB is a high-level matrix language with control statements, functions, data structures, input and output, and object-oriented programming features, and it bundles a large collection of computational algorithms. It provides more than 600 mathematical functions used in engineering, so the various calculations required by users can be implemented conveniently.
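As an illustration of Step 2 (determining the transition matrix by repeated observation), here is a minimal Python sketch, not the MATLAB code used by the authors. It assumes the transition probability is the relative frequency of observed transitions, that is, the number of observed moves from i to j divided by the total number of observed moves out of i. The patent's own formula for the probability is not reproduced in this text, so this estimator, the uniform fallback for unobserved states, and the sample history are all assumptions.

```python
def estimate_transition_matrix(history, num_states):
    """Estimate a row-stochastic transition matrix from an observed sequence.

    history    : list of user access numbers over time, each in 1..num_states
    num_states : M, the maximum number of accessing users
    ASSUMPTION: p_ij = (count of i -> j) / (count of all transitions out of i).
    """
    counts = [[0] * num_states for _ in range(num_states)]
    for current, following in zip(history, history[1:]):
        counts[current - 1][following - 1] += 1

    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            # State never observed: fall back to a uniform row (assumption).
            matrix.append([1.0 / num_states] * num_states)
        else:
            matrix.append([count / total for count in row])
    return matrix


# Hypothetical observation history of the user access number, M = 3.
history = [1, 2, 2, 3, 2, 1, 2, 3, 3]
P = estimate_transition_matrix(history, 3)
```

In the method described above, one such matrix would be estimated for each number of antennas turned on, using only the observations collected while that many antennas were active.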
Fig. 3 compares the received power of users in a cell. As Fig. 3 shows, for every number of antennas turned on by the base station, the user received power under the method of the present invention is always better than in the case that does not consider received power feedback. When the base station turns on 3 antennas, the user received power under the method reaches 72.5 W, whereas the method without feedback optimization reaches only 62.5 W. It can be concluded that user received power is related to the number of antennas turned on by the base station and generally increases with it, but the user received power obtained with the present invention is always better than that of the method without feedback optimization.
Fig. 4 compares the bit error rate of user data transmission in a cell. As Fig. 4 shows, the bit error rate of user data transmission in the cell changes with the number of antennas turned on by the base station. When the base station turns on 4 antennas, the user data transmission bit error rate under the method is only 6.30%, whereas under the method without feedback optimization it is 9.62%. Moreover, for the same transmission bit error rate, the method needs significantly fewer base station antennas than the method without feedback optimization, which saves energy. For example, when the transmission bit error rate must reach about 10%, the base station needs to turn on only 1 antenna with the method of the present invention, but 4 antennas with the method without feedback optimization.
Fig. 6 compares the number of users accessing the cell with the number of antennas turned on by the base station. As shown in Fig. 6, when the base station turns on the same number of antennas, the method of the present invention allows significantly more users to access the cell than the method without feedback optimization. At a given moment, when 8 users need to access the base station in a cell, the base station needs to turn on only 4 antennas with the method of the invention, but 6 antennas with the method without feedback optimization. Therefore, for the same number of users, the method effectively reduces the number of antennas the base station turns on; in other words, for the same number of antennas turned on, the cell base station can serve more users.
To compare the method of the present invention with existing feedback optimization targets, simulation experiments were also run on the system gain of the different methods. Fig. 5 compares the system gain of the method of the present invention with that of existing methods based on different feedback optimization targets. As Fig. 5 shows, for any number of antennas turned on by the base station, the maximum system gain is obtained when the scheme of the present invention considers user received power and user data transmission bit error rate together. The system gain obtained by considering only the power feedback strategy in the method is better than the system gain obtained by considering only the user reception power, but the system gains obtained in all three cases are better than that of the method without feedback optimization, which further shows that the system can obtain a greater gain by using the present invention.

Claims (1)

1. A dynamic resource allocation method for a POMDP-based controlled wireless network system, characterized in that: a communication cell contains a base station with N antennas and M single-antenna users; once the state transition probability matrix of the cell user access number and the observation matrix of the fed-back network QoS indexes, namely user received power and data transmission bit error rate, are known, the number of base station antennas to turn on that maximizes the gain at the current moment and the optimal cell user access number at the next moment are obtained from the belief state probability of the user access number at that moment, and the method is implemented by the following steps in sequence:
step (1), initializing the system, which includes the following according to actual conditions:
the cell contains $M$ single-antenna users, and the number of users that need to access the base station at a given moment is represented as $s_1, s_2, \dots, s_m, \dots, s_M$, where $s_m$ indicates that $m$ users access the base station; the base station has $N$ antennas, and the number of antennas turned on is represented as $T_1, T_2, \dots, T_n, \dots, T_N$, where $T_n$ indicates that the base station turns on $n$ antennas; the transmission bandwidth between the base station and each user is $B$, the channel fading coefficients are all $h_{S,D}$, the base station transmission power is $P_{total}$, all transmitting antennas are identical with per-antenna transmission power $P_{tr} = P_{total}/N$, and the system noise power is $\sigma$;
step (2), constructing the state transition matrix of the number of users accessing the base station: the transition probability matrix of the user access number in the cell is determined according to the number of antennas turned on by the base station, wherein when the number of antennas turned on is $T_n$, the cell user access number transition probability matrix $S_n$ is expressed as:
$$S_n = \begin{bmatrix} p_{11} & \cdots & p_{1M} \\ \vdots & \ddots & \vdots \\ p_{M1} & \cdots & p_{MM} \end{bmatrix}$$
where $s_i$ denotes that the number of users accessing the base station at the current moment is $i$, with $1 \le i \le M$, $s'_j$ denotes that the number of users accessing the base station at the next moment is $j$, with $1 \le j \le M$, and $p_{ij}$ is the probability that the number of users accessing the base station changes from $i$ to $j$, calculated as follows:
when the number of antennas turned on by the base station is $T_n$ and the number of users accessing the base station shifts from $i$ to $j$ a total of $B$ times ($B \le A$), the probability $p_{ij}$ is expressed as:
Step (3), constructing the feedback observation matrix: according to the feedback control strategy, the observation matrix is determined for the feedback QoS targets that the system is to optimize, namely the user received power and the user transmission error rate, specifically as follows:
Step (3.1), when the number of antennas turned on is T_n and the number of users accessing the base station is m, the user received power is calculated and expressed as:
where the base station transmit power is P_total, the transmit power of each antenna is P_tr = P_total/N, l_n is the distance between the base station and the user, H_n is the user antenna height, and h_{S,D} is the path fading coefficient;
Step (3.2), when the number of antennas turned on is T_n and the number of users accessing the base station is m, the user transmission error rate is calculated:
the bit error rate of the data sent by the base station and received by users in the cell depends on the number of users accessing the cell, the number of antennas turned on by the base station, and the path loss, so the bit error rate at the user is expressed as:
Step (3.3), when the number of antennas turned on by the base station is T_n, the system feedback observation probability matrix O_n is calculated from the user received power and the user transmission error rate and is expressed as:
where o_1 indicates that the influence of the user received power is considered, o_2 indicates that the influence of the data transmission error rate is considered, pp_m1 is the probability that the threshold α on the user received power is met when the user access number is m, pp_m2 is the probability that the threshold β on the user error rate is met when the user access number is m, and the thresholds α and β respectively satisfy:
pp_m1 and pp_m2 are calculated as follows:
where δ and ε satisfy:
0 < δ ≤ 1, 0 < ε ≤ 1
Step (4), after the user-number transition matrix and the feedback observation matrix have been constructed in steps (1) to (3), the optimal resource allocation at each moment is calculated; the user belief-state probability is computed by the following formula, that is, from the state transition matrix S_n and the feedback observation matrix O_n corresponding to the number of antennas turned on T_n, together with the user belief-state value b(s)_nlm at the previous moment k-1, the value b_nlm at the current moment k is calculated:
η = 1/Pr(o | b, T)
where S_n(s'_m | s_m, T_n) is the probability that the user access number shifts from s_m to s'_m when the number of antennas turned on by the base station is T_n; O_n(o_l | s'_m, T_n) is the probability corresponding to the l-th feedback QoS optimization target, l = 1 or l = 2, when the number of antennas turned on by the base station is T_n and the number of access users is s'_m; η is an intermediate variable; and the initial value of b(s)_nlm at the first moment is set as:
after each is calculated, the results are substituted into b_nlm, giving the b'(s')_nlm matrices corresponding to the two feedback QoS optimization targets for base station antenna turn-on numbers from 1 to T_N and user access numbers from 1 to M:
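The belief update described in step (4) can be sketched in code. The sketch below assumes the standard POMDP form b'(s') = η · O_n(o_l | s', T_n) · Σ_s S_n(s' | s, T_n) · b(s), which is consistent with the symbols defined above and with η = 1/Pr(o | b, T), but the exact formula of the claim is not reproduced in this text. The function name `belief_update` and all numeric values are hypothetical and serve only as an illustration.

```python
def belief_update(belief, transition, observation, obs_index):
    """One belief update for a fixed antenna count T_n (assumed standard POMDP form).

    belief[i]         : prior probability that i+1 users access the base station
    transition[i][j]  : S_n(s'_j | s_i, T_n); each row should sum to 1
    observation[j][l] : O_n(o_l | s'_j, T_n) for QoS target l (0 or 1)
    obs_index         : index l of the QoS target being considered
    """
    m = len(belief)
    # Prediction: propagate the prior belief through the transition matrix.
    predicted = [sum(transition[i][j] * belief[i] for i in range(m))
                 for j in range(m)]
    # Correction: weight each state by the observation probability.
    unnormalized = [observation[j][obs_index] * predicted[j] for j in range(m)]
    evidence = sum(unnormalized)  # Pr(o | b, T)
    if evidence == 0.0:
        raise ValueError("observation has zero probability under this belief")
    eta = 1.0 / evidence  # the intermediate variable eta of step (4)
    return [eta * u for u in unnormalized]


# Hypothetical example with M = 3 users and a uniform initial belief.
transition = [[0.7, 0.2, 0.1],
              [0.3, 0.4, 0.3],
              [0.1, 0.2, 0.7]]
observation = [[0.9, 0.8],
               [0.6, 0.5],
               [0.3, 0.2]]
prior = [1.0 / 3.0] * 3
posterior = belief_update(prior, transition, observation, obs_index=0)
print(posterior)
```

Running the update once per antenna count n and per QoS target l would fill the b'(s')_nlm matrices mentioned in step (4).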
Step (5), calculating the system transmission rate c_nm when the number of antennas turned on by the base station is T_n and the number of access users is s_m, that is, the data transmission rate C obtained for each case:
wherein,
Step (6), according to b'(s')_nlm obtained in step (4) and the data transmission rate C obtained in step (5), the system benefit obtained when the l-th feedback QoS optimization target is considered, l = 1 or l = 2, and the number of base station antennas turned on ranges from 1 to T_n is calculated:
where R_nm = c_nm · b_nlm
And (7) determining an optimization target:
step (7.1), corresponding to the first feedback QoS optimization target method, determining the profitExpressed as:
namely to selectMiddle maximum R nm Corresponding T n That is, when the current time k is, the number of base station antennas to be turned on when the ith feedback QoS optimization target method is considered correspondingly, i =1 or i =2, and corresponding s m I.e. the initial state b(s) of the cell user access number at the next time k +1 nlm
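Steps (6) and (7.1) amount to an element-wise product followed by a search for the largest entry. The following is a minimal sketch under that reading: it takes the rate matrix c_nm and the belief matrix b_nlm for one QoS target as given inputs (the rate formula of step (5) is not reproduced in this text), and the function name `best_allocation` and the sample numbers are hypothetical.

```python
def best_allocation(rate, belief):
    """Return (n, m, revenue) maximizing R_nm = c_nm * b_nlm for one QoS target l.

    rate[n][m]   : c_nm, rate with n+1 antennas on and m+1 users connected
    belief[n][m] : b_nlm for the chosen target l
    Indices in the result are 0-based.
    """
    best = None
    for n, (rate_row, belief_row) in enumerate(zip(rate, belief)):
        for m, (c, b) in enumerate(zip(rate_row, belief_row)):
            revenue = c * b  # R_nm as defined in step (6)
            if best is None or revenue > best[2]:
                best = (n, m, revenue)
    return best


# Hypothetical example: N = 2 antennas, M = 3 users.
rate = [[1.0, 1.8, 2.4],
        [1.5, 2.6, 3.3]]
belief = [[0.5, 0.3, 0.2],
          [0.2, 0.5, 0.3]]
n, m, revenue = best_allocation(rate, belief)
print(f"turn on {n + 1} antennas; expected access number {m + 1}; R = {revenue:.2f}")
```

The selected row gives the number of antennas to turn on at moment k, and the selected column gives the user access number used to initialize the belief at moment k+1.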
Step (7.2), the user received power and the data transmission error rate are considered jointly, and the maximum benefit is expressed as:
where γ and λ are the weight coefficients corresponding to the two feedback QoS optimization targets, respectively, and satisfy:
if the user received power is given priority, γ > λ; if the data transmission error rate is given priority, γ < λ.
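The joint criterion of step (7.2) can be illustrated as a weighted combination of the two benefit matrices. The combined formula and the constraint on γ and λ are not reproduced in this text, so the sketch below makes two loudly stated assumptions: the weights are non-negative and sum to 1, and the combination is the element-wise sum γ·R¹_nm + λ·R²_nm. The function name `combined_allocation` and the sample data are hypothetical.

```python
def combined_allocation(revenue_power, revenue_error, gamma, lam):
    """Pick (n, m, value) maximizing gamma * R1_nm + lam * R2_nm.

    revenue_power[n][m] : benefit under QoS target 1 (user received power)
    revenue_error[n][m] : benefit under QoS target 2 (transmission error rate)
    ASSUMPTION: gamma and lam are non-negative and sum to 1.
    Indices in the result are 0-based.
    """
    if gamma < 0 or lam < 0 or abs(gamma + lam - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1 (assumed)")
    best = None
    for n, (row_p, row_e) in enumerate(zip(revenue_power, revenue_error)):
        for m, (rp, re) in enumerate(zip(row_p, row_e)):
            value = gamma * rp + lam * re  # assumed weighted-sum combination
            if best is None or value > best[2]:
                best = (n, m, value)
    return best


# Hypothetical benefit matrices for N = 2 antennas and M = 3 users.
revenue_power = [[0.50, 0.54, 0.48],
                 [0.30, 1.30, 0.99]]
revenue_error = [[0.90, 0.40, 0.20],
                 [1.60, 0.60, 0.30]]

# gamma > lam: received power is given priority.
print(combined_allocation(revenue_power, revenue_error, 0.7, 0.3))
# gamma < lam: transmission error rate is given priority.
print(combined_allocation(revenue_power, revenue_error, 0.3, 0.7))
```

With these sample numbers the chosen user access number changes when the priority is switched, which shows how the weights steer the allocation.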
CN201510271561.XA 2015-05-25 2015-05-25 Controlled Radio Network System dynamic resource allocation method based on POMDP Expired - Fee Related CN105007582B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510271561.XA CN105007582B (en) 2015-05-25 2015-05-25 Controlled Radio Network System dynamic resource allocation method based on POMDP

Publications (2)

Publication Number Publication Date
CN105007582A CN105007582A (en) 2015-10-28
CN105007582B true CN105007582B (en) 2018-03-16

Family

ID=54380061

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510271561.XA Expired - Fee Related CN105007582B (en) 2015-05-25 2015-05-25 Controlled Radio Network System dynamic resource allocation method based on POMDP

Country Status (1)

Country Link
CN (1) CN105007582B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105827294B (en) * 2016-04-27 2019-05-21 东南大学 A kind of method of uplink extensive MIMO combined optimization antenna for base station number and user emission power
CN107493583B (en) * 2017-06-29 2020-09-25 南京邮电大学 Price perception user scheduling algorithm based on multi-slope online game

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101398914A (en) * 2008-11-10 2009-04-01 南京大学 Preprocess method of partially observable Markov decision process based on points
CN102946643A (en) * 2012-10-24 2013-02-27 复旦大学 Orthogonal frequency division multiple access (OFDMA) resource scheduling method adopting cross-layer feedback information
CN103188813A (en) * 2013-02-25 2013-07-03 南京邮电大学 Distributed resource scheduling method combined with dynamic management of control information

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102098784B (en) * 2009-12-14 2014-06-04 华为技术有限公司 Resource allocation method and equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Stability analysis of networked systems with similar dynamics;Rene Schuh ET AL.;《Control Conference (ECC), 2013 European》;20131202;第4359-4364页 *
部分可观测马尔可夫决策过程算法综述;桂林等;《***工程与电子技术》;20080615;第30卷(第6期);第1058-1064页 *

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20180316