CN105007582B - Controlled Radio Network System dynamic resource allocation method based on POMDP - Google Patents
- Publication number
- CN105007582B (application number CN201510271561.XA)
- Authority
- CN
- China
- Prior art keywords
- user
- base station
- antennas
- access
- cell
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W16/00—Network planning, e.g. coverage or traffic planning tools; Network deployment, e.g. resource partitioning or cells structures
- H04W16/02—Resource partitioning among network components, e.g. reuse partitioning
- H04W16/10—Dynamic resource partitioning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W72/00—Local resource management
- H04W72/50—Allocation or scheduling criteria for wireless resources
- H04W72/54—Allocation or scheduling criteria for wireless resources based on quality criteria
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W72/00—Local resource management
- H04W72/50—Allocation or scheduling criteria for wireless resources
- H04W72/54—Allocation or scheduling criteria for wireless resources based on quality criteria
- H04W72/542—Allocation or scheduling criteria for wireless resources based on quality criteria using measured or perceived quality
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Landscapes
- Engineering & Computer Science (AREA)
- Quality & Reliability (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Mobile Radio Communication Systems (AREA)
Abstract
A dynamic resource allocation method for a controlled wireless network system based on POMDP, belonging to the field of controlled wireless networks and communication resource allocation. By constructing a state transition matrix of the number of cell users accessing the base station and a feedback observation matrix, the method calculates the user belief state probability and the transmission rate that users obtain for each number of base station antennas turned on. From these it decides the number of base station antennas to turn on and the optimal number of accessing users at which the system obtains the maximum benefit, completing the optimal resource allocation in the cell. For a cell containing multiple users and one multi-antenna base station, the invention takes maximizing the received power of users in the cell and minimizing the bit error rate of user data transmission as its respective targets, so as to obtain the maximum system benefit, and determines the final number of base station antennas turned on and the number of users accessing the base station. The invention overcomes problems such as excessive energy consumption, heavy cell base station load, and mismatch between the number of antennas turned on and the number of accessing users. It offers advantages in improving user received power, interference resistance, and overall system benefit.
Description
Technical Field
The invention relates to a dynamic resource allocation method of a controlled wireless communication network system based on a Partially Observable Markov Decision Process (POMDP). A selection scheme beneficial to the resource allocation of a wireless communication network is designed by the POMDP method, belonging to the related fields of controlled wireless networks and communication resource allocation research.
Background
Mobile communication has developed rapidly in recent decades, and user demand for quality of service in wireless communication networks keeps increasing. This has driven wireless communication systems to evolve from 2G and 3G through B3G and 4G toward 5G, and the network is changing from a voice-dominated network to a high-speed-data-dominated network. Meanwhile, mobile multimedia services demand more and more bandwidth, and "broadband" has become a development trend of mobile communication technology. Currently, three main factors affect the Quality of Service (QoS) of a wireless communication network. First, the wireless mobile communication network is highly dynamic: frequent handovers caused by random changes in user position, together with a changeable network topology, make the data transmission rate and connectivity unstable. Second, because of channel fading in the wireless network and the limited power or energy of mobile terminals, a large proportion of the power that cell users receive from the base station is lost. Third, channel fading between the base station and the user, the number of antennas turned on by the base station and the user, and the user's signal-to-noise ratio all strongly affect the bit error rate of data transmission, and therefore the reliability of data link transmission.
For many years the industry has continuously optimized and improved the design algorithms of wireless communication networks and has proposed many methods for improving network quality of service, which has advanced wireless network design. Nevertheless, problems such as network power loss and data transmission reliability have never been thoroughly solved, so design and deployment based on the traditional wireless communication network architecture and layered communication protocol system cannot resolve these contradictions effectively.
In the field of control engineering, the feedback control strategy is the most basic control method. It forms the core of a closed-loop control system and plays an important role in controlling and adjusting the state of every node in the system. Since it was first proposed, the feedback strategy has been applied widely and deeply in closed-loop control of industrial systems, information theory, channel coding, and other fields. With a feedback strategy, a control system gains self-adjusting, self-adapting, and self-stabilizing capabilities, and its performance indexes improve comprehensively. Meanwhile, research on Wireless Network Control Systems (WNCS) has attracted considerable attention from researchers in China and abroad. Professor L. Litz and Dr. A. Chamaken of the University of Kaiserslautern, Germany, proposed embedding a wireless communication network into an industrial control system and designed a system architecture, control algorithm, wireless network architecture, and communication protocol that meet the performance requirements of the control system. This improves information processing and system control among the sensors, controllers, and actuators, and enables prediction and optimization of the industrial control system. M. D. Di Benedetto and other scholars at the University of L'Aquila studied WNCS design in depth and proposed a cost function that first maps control system parameters such as noise, coding, modulation scheme, and system power into the wireless network and then selects a suitable wireless network type, thereby improving the robustness and flexibility of the control system.
The Partially Observable Markov Decision Process (POMDP) introduces a belief state space to convert a non-Markov-chain problem into a Markov-chain problem. Its defining characteristic is that the state of the system cannot be observed directly and is only partially known. A POMDP models a system for which only incomplete state information is available and makes decisions from the current incomplete information so as to obtain the maximum benefit. This state transition model fits wireless communication network scenarios well, because part of the state information there is not completely known and must be observed in order to obtain the optimal resource allocation.
In summary, the main objective of the present invention is to introduce a control feedback optimization strategy and apply the POMDP model to the controlled wireless communication network system. The method uses a state transition probability matrix formed from the given cell user access numbers and an observation probability matrix formed from the fed-back network QoS indexes (user received power and user transmission error rate). Based on the belief state of the cell user access number at a given time and the gain corresponding to each number of antennas turned on by the base station, it predicts the optimal cell user access number at the next time. At the same time, it determines from the maximum benefit the number of antennas the cell base station should turn on at the current time, finally achieving the optimal allocation of base station antennas and user access in the cell.
Disclosure of Invention
The invention concerns optimal resource allocation in a cell communication network that contains one multi-antenna base station and multiple users. Taking the dynamic allocation of the number of accessing users and the number of antennas turned on in the cell at each time as the optimization target, the invention applies a POMDP model and a control feedback strategy to obtain the optimal resource allocation strategy for antenna activation at the cell base station and for accessing users. It solves the problem of how to select and determine the optimal resource allocation when a base station with multiple antennas and multiple communication users exist in a cell network, and through that allocation it obtains the maximum benefit of the cell wireless communication network system.
Fig. 1 shows the scene model of the cell environment to which the invention applies.
Fig. 2 shows the flow chart of the system operating principle in the technical scheme of the invention.
Fig. 3 shows the comparison of user received power in the system of the invention.
Fig. 4 shows the comparison of the bit error rate in the system of the invention.
Fig. 5 shows the comparison of average profit under different conditions in the cell for the system of the invention.
Fig. 6 shows the comparison of the number of users accessing the cell and the number of antennas turned on by the base station in the system of the invention.
With the cell environment scene model shown in fig. 1, the dynamic resource allocation method of the POMDP-based controlled wireless network system is characterized as follows. A communication cell contains one base station with N antennas and M single-antenna users. Given the state transition probability matrix of the cell user access number and the observation matrix of the fed-back network QoS indexes (user received power and user transmission error rate), the method uses the belief state (BS) probability of the user access number at a given time to obtain the number of base station antennas to turn on that maximizes the benefit at that time, and the optimal cell user access number at the next time. The method is implemented by the following steps in sequence:
step (1), initializing the system according to actual conditions:
the cell contains M single-antenna users, and the number of users needing to access the base station at a given time is represented as $s_1, s_2, \dots, s_m, \dots, s_M$, where $s_m$ indicates that m users access the base station. The base station has N antennas, and the number of antennas turned on is represented as $T_1, T_2, \dots, T_n, \dots, T_N$, where $T_n$ indicates that the base station turns on n antennas. The transmission bandwidth between the base station and each user is B, the channel fading coefficients are all $h_{S,D}$, the base station transmission power is $P_{total}$, all transmitting antennas are identical with per-antenna transmission power $P_{tr} = P_{total}/N$, and the system noise power is denoted σ;
step (2), constructing the state transition matrix of the number of users accessing the base station: the transition probability matrix of the user access number in the cell is determined according to the number of antennas turned on by the base station. When the number of antennas turned on by the base station is $T_n$, the cell user access number transition probability matrix $S_n$ can be expressed as:
where $s_i$ indicates that the number of users accessing the base station at the current time is i ($1 \le i \le M$), $s'_j$ indicates that the number of users accessing the base station at the next time is j ($1 \le j \le M$), and $p_{ij}$ is the probability that the number of users accessing the base station changes from i to j, calculated as follows:
when the number of antennas turned on by the base station is $T_n$ and the number of users accessing the base station shifts from i to j a total of B times (B ≤ A), the probability $p_{ij}$ is expressed as:
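The formulas for $S_n$ and $p_{ij}$ are not reproduced in this text. As a minimal sketch of one reading consistent with the definitions above, $p_{ij}$ can be estimated as a relative frequency B/A, where B counts the observed transitions from i to j. Treating A as the total number of observed transitions leaving state i is an assumption, as are the function name `estimate_transition_matrix`, the uniform fallback for unobserved states, and the sample observation sequence.

```python
def estimate_transition_matrix(access_counts, num_users):
    """Estimate S_n from a sequence of observed user access numbers (1..M).

    Assumption: p_ij = B / A, where B counts observed i -> j transitions and
    A counts all observed transitions that leave state i.
    """
    counts = [[0] * num_users for _ in range(num_users)]
    for current, nxt in zip(access_counts, access_counts[1:]):
        counts[current - 1][nxt - 1] += 1  # states are 1-based in the text

    matrix = []
    for row in counts:
        total = sum(row)  # A for this row
        if total == 0:
            # State never observed: fall back to a uniform row (assumption).
            matrix.append([1.0 / num_users] * num_users)
        else:
            matrix.append([b / total for b in row])
    return matrix


# Hypothetical observations for one antenna setting T_n, with M = 3 users.
observed = [1, 2, 2, 3, 2, 1, 2, 3, 3]
S_n = estimate_transition_matrix(observed, num_users=3)
```

Each row of the result sums to 1, so row i can be read directly as the distribution of the next access number given that i users are currently connected. One such matrix would be estimated per antenna setting $T_n$.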
step (3), constructing a feedback observation matrix: according to a feedback control strategy, aiming at a feedback QoS target to be optimized by a system, namely user receiving power and a user transmission error rate, an observation matrix is determined, and the method specifically comprises the following steps:
step (3.1), when the number of antennas turned on is $T_n$ and the number of users accessing the base station is m, the user received power is calculated and expressed as:
where the base station transmission power is $P_{total}$, the transmission power of each antenna can be expressed as $P_{tr} = P_{total}/N$, $l_n$ is the distance between the base station and the user, $H_n$ is the user antenna height, and $h_{S,D}$ is the path fading coefficient;
step (3.2), when the number of antennas turned on is $T_n$ and the number of users accessing the base station is m, the user transmission error rate is calculated:
the bit error rate of data sent by the base station and received by users in the cell depends on the number of accessing users, the number of antennas turned on by the base station, and the path loss, so the user received bit error rate can be expressed as:
step (3.3), when the number of antennas turned on by the base station is $T_n$, the system feedback observation probability matrix $O_n$ is calculated from the user received power and the user transmission error rate, and can be expressed as:
where $o_1$ denotes considering the influence of the user received power, $o_2$ denotes considering the influence of the user bit error rate, $pp_{m1}$ denotes the probability that the user received power threshold α is met when the number of accessing users is m, and $pp_{m2}$ denotes the probability that the user error rate threshold β is met when the number of accessing users is m,
the threshold values α and β satisfy:
$pp_{m1}$ and $pp_{m2}$ are calculated as follows:
wherein δ and ε satisfy:
0<δ≤1,0<ε≤1
step (4), after the user number transition matrix and the feedback observation matrix have been constructed in steps (1) to (3), the optimal resource allocation at each time is calculated. The user belief state (BS) probability is calculated according to the following formula; that is, from the state transition matrix $S_n$ and the feedback observation matrix $O_n$ corresponding to the antenna number $T_n$, together with the user BS value $b(s)_{nlm}$ at the previous time k-1, the value $b'(s')_{nlm}$ at the current time k is calculated:
$\eta = 1/\Pr(o \mid b, T)$
where $S_n(s'_m \mid s_m, T_n)$ denotes the probability that the user access number transfers from $s_m$ to $s'_m$ when the number of antennas turned on at the base station is $T_n$; $O_n(o_l \mid s'_m, T_n)$ denotes the probability, when the number of antennas turned on at the base station is $T_n$, that the access number $s'_m$ corresponds to the l-th (l = 1 or l = 2) feedback QoS optimization target; η is an intermediate variable; and the initial $b(s)_{nlm}$ values at the first time are set as:
after each is calculated and substituted into the $b_{nlm}$ values, the $b'(s')_{nlm}$ matrix is obtained for numbers of antennas turned on at the base station from 1 to $T_N$ and numbers of access users from 1 to M, for each of the two feedback QoS optimization targets:
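The belief update formula itself is not reproduced in this text. The definitions above match the standard POMDP belief update, $b'(s') = \eta\, O_n(o_l \mid s', T_n) \sum_{s} S_n(s' \mid s, T_n)\, b(s)$ with $\eta = 1/\Pr(o \mid b, T)$, so the following is a minimal sketch under the assumption that the patent uses this standard form. The function name `belief_update` and all numeric values are hypothetical.

```python
def belief_update(belief, transition, obs_likelihood):
    """One POMDP belief update for a fixed antenna number T_n and observation o_l.

    belief[i]         -- b(s_i) at time k-1
    transition[i][j]  -- S_n(s'_j | s_i, T_n)
    obs_likelihood[j] -- O_n(o_l | s'_j, T_n)
    """
    num_states = len(belief)
    unnormalized = []
    for j in range(num_states):
        # Prediction step: probability of reaching s'_j before seeing o_l.
        predicted = sum(transition[i][j] * belief[i] for i in range(num_states))
        # Correction step: weight by how likely o_l is in state s'_j.
        unnormalized.append(obs_likelihood[j] * predicted)

    evidence = sum(unnormalized)  # Pr(o | b, T)
    if evidence == 0:
        raise ValueError("observation has zero probability under this belief")
    eta = 1.0 / evidence
    return [eta * value for value in unnormalized]


# Hypothetical example with M = 2 access states.
prior = [0.5, 0.5]
S_n = [[0.9, 0.1], [0.2, 0.8]]
O_col = [0.7, 0.4]  # column of O_n for one feedback target l
posterior = belief_update(prior, S_n, O_col)
```

Repeating this update for every antenna number and for both feedback targets would fill the $b'(s')_{nlm}$ matrix described in step (4).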
step (5), calculating the system transmission rate $c_{nm}$ when the number of antennas turned on by the base station is $T_n$ and the number of access users is $s_m$, that is, the data transmission rate C obtained for each case:
wherein,
step (6), according to the $b'(s')_{nlm}$ obtained in step (4) and the data transmission rate C obtained in step (5), calculating the system benefit obtained when the system considers the l-th (l = 1 or l = 2) feedback QoS optimization target and the number of base station antennas turned on ranges from 1 to $T_n$
where $R_{nm} = c_{nm} \cdot b_{nlm}$;
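As a minimal sketch of the element-wise product $R_{nm} = c_{nm} \cdot b_{nlm}$ for a single feedback target l, the following builds the benefit table and locates its largest entry. Rows index the antenna numbers $T_n$ and columns index the access numbers $s_m$. The helper names `benefit_matrix` and `best_allocation` and all numbers are hypothetical.

```python
def benefit_matrix(rates, beliefs):
    """R[n][m] = c[n][m] * b[n][m] for one feedback QoS target l."""
    return [
        [c * b for c, b in zip(rate_row, belief_row)]
        for rate_row, belief_row in zip(rates, beliefs)
    ]


def best_allocation(benefits):
    """Return (antennas_on, users_accessing, benefit) for the largest R[n][m]."""
    return max(
        (
            (n + 1, m + 1, value)  # convert to the 1-based indices of the text
            for n, row in enumerate(benefits)
            for m, value in enumerate(row)
        ),
        key=lambda item: item[2],
    )


# Hypothetical values: 2 antenna settings x 3 access numbers.
C = [[1.0, 1.8, 2.4], [1.5, 2.6, 3.3]]
B = [[0.5, 0.3, 0.2], [0.2, 0.5, 0.3]]
R = benefit_matrix(C, B)
antennas_on, users, value = best_allocation(R)
```

Weighting each rate by the belief that the corresponding access number actually occurs means a high rate in an unlikely state contributes little to the decision.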
step (7), determining the optimization target:
step (7.1), for the l-th feedback QoS optimization target, the benefit is determined, expressed as:
that is, the $T_n$ corresponding to the maximum $R_{nm}$ is selected as the number of base station antennas that should be turned on at the current time k when the l-th (l = 1 or l = 2) feedback QoS optimization target is considered, and the corresponding $s_m$ gives the initial state $b(s)_{nlm}$ of the cell user access number at the next time k+1;
step (7.2), the user received power and the data transmission error rate are considered together, and the maximum benefit is expressed as:
where γ and λ are the weight coefficients corresponding to the two feedback QoS optimization targets, respectively, and satisfy:
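The combined-benefit formula and the constraint on γ and λ are not reproduced in this text. One plausible reading, shown here purely as an assumption, is a weighted sum of the two per-target benefit tables with non-negative weights satisfying γ + λ = 1, followed by the same maximum selection as in step (7.1). The function name `combined_best` and the numbers are hypothetical.

```python
def combined_best(benefit_power, benefit_ber, gamma, lam):
    """Pick (antennas_on, users_accessing, benefit) maximizing gamma*R1 + lam*R2.

    Assumption: the weights are non-negative and sum to one.
    """
    if gamma < 0 or lam < 0 or abs(gamma + lam - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1 (assumed)")

    best = None
    for n, (row1, row2) in enumerate(zip(benefit_power, benefit_ber)):
        for m, (r1, r2) in enumerate(zip(row1, row2)):
            value = gamma * r1 + lam * r2
            if best is None or value > best[2]:
                best = (n + 1, m + 1, value)  # 1-based, as in the text
    return best


# Hypothetical per-target benefit tables (2 antenna settings x 2 access numbers).
R1 = [[0.50, 0.54], [0.30, 1.30]]  # target 1: user received power
R2 = [[0.90, 0.40], [0.60, 0.80]]  # target 2: transmission error rate
antennas_on, users, value = combined_best(R1, R2, gamma=0.5, lam=0.5)
```

Setting γ = 1 or λ = 1 reduces this sketch to the single-target selection of step (7.1).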
The advantages of the invention are as follows. In a communication cell with a multi-antenna base station and multiple users, the number of base station antennas turned on and the number of users in the cell reach the optimal resource allocation by taking into account changes in the number of users accessing the base station, together with the user received power and the data transmission error rate. In addition, considering the feedback QoS optimization targets further refines the optimization direction and improves system performance. Simulation experiments were used to examine how the POMDP-based dynamic resource allocation method affects base station antenna activation and the number of access users in a cell.
Drawings
Fig. 1 shows a communication cell model including a schematic structure of a base station and users.
Fig. 2 is a flow chart of a design of a dynamic resource allocation method for a POMDP-based controlled wireless network system.
Fig. 3 is a graph comparing the received power of users in a cell. One curve represents the method of the present invention; the other represents the method that does not consider user received power.
Fig. 4 is a comparison graph of the bit error rate of user data transmission in a cell. One curve represents the method of the present invention; the other represents the method that does not consider the user data transmission error rate.
Fig. 5 is a graph comparing average profit under different conditions in a cell. The curves represent, respectively, the case without a feedback optimization strategy, the case considering only user received power, the case considering only the user data transmission error rate, and the method of the invention.
Fig. 6 is a diagram comparing the number of users accessing a cell with the number of antennas turned on by a base station. One curve shows the case where 8 users need to access the cell base station under the method of the invention; the other shows the case where 8 users need to access the cell base station without considering the feedback optimization strategy.
Detailed Description
The following describes the technical solution of the dynamic resource allocation method of the controlled wireless network system with reference to the accompanying drawings and embodiments.
The flow chart of the method of the invention is shown in figure 2, and comprises the following steps:
Step 1, system initialization: set the number of base station antennas and the number of users in the cell, and set the base station transmission power and the path fading coefficient.
Step 2, repeat observations multiple times to determine the state transition probability matrix of the number of cell users connected to the base station.
Step 3, set the user received power threshold α and the user data transmission error rate threshold β according to actual requirements, calculate the probability of meeting each of the two feedback QoS targets, and construct the feedback observation matrix.
Step 4, calculate the user belief state probability b'(s') at the current time from the user number transition matrix, the feedback observation matrix, and the user belief state (BS) probability at the previous time.
Step 5, calculate the transmission rate C for each user access number under each number of antennas turned on.
Step 6, from the obtained b'(s') and the obtained data transmission rate C, calculate the system benefit corresponding to each number of base station antennas turned on when the system considers the user received power or the user data transmission error rate.
Step 7, select the maximum system benefit. The corresponding number of base station antennas turned on and user access number are the base station antenna number and the next-time user access number that give the cell the optimal resource allocation when the user received power or the user data transmission error rate is considered. When the user received power and the user data transmission error rate are considered together, the maximum benefit of the system can be obtained:
The invention is simulated on a PC with programs written in MATLAB. MATLAB is a high-level matrix language that includes control statements, functions, data structures, input and output, and object-oriented programming features, together with a large collection of computational algorithms. It provides more than 600 mathematical functions used in engineering and conveniently implements the calculations that users require.
Fig. 3 is a diagram comparing the received power of users in a cell. As can be seen from fig. 3, for every number of antennas turned on by the base station, the user received power under the method of the present invention is always better than in the case that does not consider received power feedback. When the base station turns on 3 antennas, the user received power under the method of the invention reaches 72.5 W, whereas without feedback optimization it is only 62.5 W. It can be concluded that the user received power is related to the number of antennas turned on by the base station and generally increases with that number, but the user received power obtained with the present invention is always better than that of the method without feedback optimization.
Fig. 4 is a comparison diagram of the bit error rate of user data transmission in a cell. As can be seen from fig. 4, the bit error rate of user data transmission in the cell changes with the number of antennas turned on by the base station. When the base station turns on 4 antennas, the user data transmission error rate under the method of the invention is only 6.30%, whereas without feedback optimization it is 9.62%. Moreover, for the same transmission error rate, the method significantly reduces the number of base station antennas needed compared with the method without feedback optimization, thereby saving energy. For example, when the transmission error rate must reach about 10%, the base station needs to turn on only 1 antenna with the method of the present invention but 4 antennas with the method without feedback optimization.
Fig. 6 is a comparison graph of the number of users accessing the cell and the number of antennas turned on by the base station. As shown in fig. 6, when the base station turns on the same number of antennas, the method of the present invention can significantly increase the number of users accessing the cell compared with the method without feedback optimization. At a given time, when 8 users need to access the base station in a cell, the base station needs to turn on only 4 antennas with the method of the invention, whereas it needs to turn on 6 antennas with the method without feedback optimization. Therefore, for the same number of users, the method effectively reduces the number of antennas the base station turns on; in other words, for the same number of antennas turned on, the cell base station can serve more users.
To compare the prior-art feedback optimization targets with the method of the present invention, simulation experiments were also performed on the system gains of the different methods. Fig. 5 is a graph comparing the system gains of the method of the present invention and of prior-art methods based on different feedback optimization targets. As can be seen from fig. 5, for any number of antennas turned on by the base station, the scheme of the present invention obtains the maximum system gain when the user received power and the user data transmission error rate are considered together. The system gain obtained by considering only the power feedback strategy in the method is better than the system gain obtained by considering only the user reception power, and the system gains obtained in all three cases are better than that of the method without feedback optimization, which further shows that the system can obtain a greater gain by using the present invention.
Claims (1)
1. The dynamic resource allocation method of the controlled wireless network system based on the POMDP is characterized in that: in a certain communication cell, a base station with N antennas and users with M single antennas are included, after a state transition probability matrix of the cell user access number and an observation matrix of a QoS index for feeding back the network user receiving power and the data transmission error rate are known, the base station antenna opening number with the maximum profit at the moment and the cell user optimal access number at the next moment are obtained according to the credibility state probability of the user access number at a certain moment, and the method is specifically realized by the following steps in sequence:
step (1), initializing a system, wherein the method comprises the following steps according to actual conditions:
m single-antenna users are contained in a cell, and the number of users needing to access a base station at a certain moment is represented as s 1 ,s 2 ,…,s m ,…,s M ,s m It shows that there are m users accessing the base station, and at the same time, the base station contains one N antennas, the number of the open antennas is T 1 ,T 2 ,…,T n ,…,T N ,T n Indicating that the base station turns on n antennas; between base station and each userHas a transmission bandwidth of B and channel fading coefficients of h S,D The base station transmission power is P total Each transmitting antenna is the same and corresponds to the transmitting power P of each antenna tr =P total The system noise power is expressed as σ;
step (2), constructing the state transition matrix of the number of users accessing the base station: the transition probability matrix of the user access number in the cell is determined by the number of antennas turned on by the base station; when the number of antennas turned on is T_n, the transition probability matrix S_n of the cell user access number is expressed as:
where s_i indicates that the number of users accessing the base station at the current moment is i, with 1 ≤ i ≤ M, s'_j indicates that the number of users accessing the base station at the next moment is j, with 1 ≤ j ≤ M, and p_ij is the probability that the number of users accessing the base station changes from i to j, calculated as follows:
when the number of antennas turned on by the base station is T_n and the number of users accessing the base station shifts from i to j, total B (B ≤ A), the probability p_ij is expressed as:
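Whatever expression is used for p_ij, each matrix S_n has to be row-stochastic: every entry is non-negative and every row sums to 1, because from i accessing users the system must move to some j. A minimal sketch of that sanity check follows; the helper names and the placeholder weights are hypothetical and are not the claimed formula for p_ij.

```python
from typing import List

Matrix = List[List[float]]


def normalize_rows(weights: Matrix) -> Matrix:
    """Turn non-negative row weights into transition probabilities."""
    result = []
    for row in weights:
        total = sum(row)
        if total <= 0 or any(w < 0 for w in row):
            raise ValueError("each row needs non-negative weights with a positive sum")
        result.append([w / total for w in row])
    return result


def is_row_stochastic(matrix: Matrix, tol: float = 1e-9) -> bool:
    """Check that every row is a valid probability distribution."""
    for row in matrix:
        if any(p < 0 for p in row):
            return False
        if abs(sum(row) - 1.0) > tol:
            return False
    return True


# Placeholder weights for M = 2 and one antenna setting T_n (illustrative only).
S_n = normalize_rows([[1.0, 1.0], [3.0, 1.0]])
```

Running the check once per antenna setting T_n catches a mis-specified matrix before it is used in the belief update of step (4).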
step (3), constructing the feedback observation matrix: following the feedback control strategy, the observation matrix is determined for the feedback QoS targets that the system optimizes, namely the user received power and the user transmission error rate, as follows:
step (3.1), when the number of antennas turned on is T_n and the number of users accessing the base station is m, the user received power is calculated as:
where the base-station transmission power is P_total, the transmission power of each antenna is P_tr = P_total/N, l_n is the distance between the base station and the user, H_n is the user antenna height, and h_{S,D} is the path fading coefficient;
step (3.2), when the number of antennas turned on is T_n and the number of users accessing the base station is m, the user transmission error rate is calculated:
the bit error rate of data sent by the base station and received by users in the cell depends on the number of accessing users, the number of antennas turned on, and the path loss, so the bit error rate at the user is expressed as:
step (3.3), when the number of antennas turned on by the base station is T_n, the feedback observation probability matrix O_n of the system is calculated from the user received power and the user transmission error rate, and is expressed as:
where o_1 denotes the observation that accounts for the user received power, o_2 denotes the observation that accounts for the data transmission error rate, pp_m1 is the probability that the threshold α on the user received power is met when the user access number is m, and pp_m2 is the probability that the threshold β on the user error rate is met when the user access number is m; the thresholds α and β respectively satisfy:
pp_m1 and pp_m2 are calculated as follows:
where δ and ε satisfy:
0 < δ ≤ 1, 0 < ε ≤ 1
step (4), after the user-number transition matrix and the feedback observation matrix have been constructed in steps (1) to (3), the optimal resource allocation at each moment is calculated; the user belief-state probability is computed as follows: from the state transition matrix S_n and the feedback observation matrix O_n corresponding to the number of antennas turned on T_n, and from the user belief-state value b(s)_nlm at the previous moment k-1, the value b_nlm at the current moment k is calculated:
η = 1/Pr(o|b,T)
where S_n(s'_m | s_m, T_n) is the probability that the user access number transfers from s_m to s'_m when the number of antennas turned on by the base station is T_n, O_n(o_l | s'_m, T_n) is the probability of the observation corresponding to the l-th feedback QoS optimization target (l = 1 or l = 2) when the number of antennas turned on is T_n and the number of accessing users is s'_m, η is an intermediate variable, and the initial values of b(s)_nlm at the first moment are set as:
after each is calculated, it is substituted into b_nlm to obtain, for the number of base-station antennas turned on from T_1 to T_N and the number of accessing users from 1 to M, the b'(s')_nlm matrix corresponding to the two feedback QoS optimization targets:
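The symbols defined in step (4) match the standard POMDP belief update, b'(s') = η · O_n(o_l | s', T_n) · Σ_s S_n(s' | s, T_n) · b(s), with η = 1/Pr(o | b, T). A minimal sketch of that update for one fixed antenna setting T_n and one fixed observation o_l is given below; it assumes the standard form applies here, and the function name, argument layout, and numbers are illustrative.

```python
from typing import List


def belief_update(belief: List[float],
                  transition: List[List[float]],
                  obs_prob: List[float]) -> List[float]:
    """One belief update step for a fixed action T_n and observation o_l.

    belief[s]         : b(s) at moment k-1
    transition[s][s2] : S_n(s2 | s, T_n)
    obs_prob[s2]      : O_n(o_l | s2, T_n)
    """
    m = len(belief)
    # Predict with the transition matrix, then weight by the observation.
    unnormalized = [
        obs_prob[s2] * sum(transition[s][s2] * belief[s] for s in range(m))
        for s2 in range(m)
    ]
    evidence = sum(unnormalized)  # Pr(o | b, T)
    if evidence <= 0:
        raise ValueError("observation has zero probability under this belief")
    eta = 1.0 / evidence          # eta = 1 / Pr(o | b, T)
    return [eta * value for value in unnormalized]


# Illustrative two-state example (M = 2).
new_belief = belief_update(
    belief=[0.5, 0.5],
    transition=[[0.9, 0.1], [0.2, 0.8]],
    obs_prob=[0.8, 0.4],
)
```

Repeating this call for every T_n and for l = 1, 2 would fill the b'(s')_nlm entries described above.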
step (5), calculating the system transmission rate c_nm when the number of antennas turned on by the base station is T_n and the number of accessing users is s_m, which gives the data transmission rate C for each case:
wherein,
step (6), from b'(s')_nlm obtained in step (4) and the data transmission rate C obtained in step (5), calculating the system profit obtained when the system considers the l-th feedback QoS optimization target (l = 1 or l = 2) and the number of base-station antennas turned on ranges from T_1 to T_n:
wherein R_nm = c_nm · b_nlm;
step (7), determining the optimization target:
step (7.1), for the l-th feedback QoS optimization target, the profit is determined and expressed as:
that is, the maximum R_nm is selected; the corresponding T_n is the number of base-station antennas to turn on at the current moment k when the l-th feedback QoS optimization target is considered (l = 1 or l = 2), and the corresponding s_m gives the initial state b(s)_nlm of the cell user access number at the next moment k+1;
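The selection in step (7.1) amounts to an argmax over the profit matrix. A minimal sketch, assuming the matrix is an N-by-M nested list and returning 1-based indices so that they read directly as T_n and s_m; the function name and tie-breaking rule (first maximum wins) are assumptions.

```python
from typing import List, Tuple


def best_allocation(profit: List[List[float]]) -> Tuple[int, int, float]:
    """Return (n, m, R_nm) for the largest entry, with 1-based n and m."""
    best = None
    for n_idx, row in enumerate(profit):
        for m_idx, value in enumerate(row):
            # Strict comparison keeps the first maximum on ties.
            if best is None or value > best[2]:
                best = (n_idx + 1, m_idx + 1, value)
    if best is None:
        raise ValueError("profit matrix is empty")
    return best


# With these illustrative profits, turning on 2 antennas is chosen,
# and s_2 seeds the user-access state for moment k+1.
n_star, m_star, r_star = best_allocation([[0.5, 1.0], [0.75, 3.0]])
```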
step (7.2), the maximum profit that jointly considers the user received power and the data transmission error rate is expressed as:
where γ and λ are the weight coefficients corresponding to the two feedback QoS optimization targets, respectively, and satisfy:
if the user received power is prioritized, γ > λ; if the data transmission error rate is prioritized, γ < λ.
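How the weights shift the decision can be illustrated with a weighted sum of the two per-target profit matrices. This sketch makes two assumptions that are not confirmed by the text: that the joint profit is γ·R¹_nm + λ·R²_nm, and that the weights are non-negative with γ + λ = 1. The function names and numbers are illustrative.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def combined_profit(profit_power: Matrix, profit_ber: Matrix,
                    gamma: float, lam: float) -> Matrix:
    """Weighted sum gamma * R1 + lam * R2 (assumed form of the joint profit)."""
    # Assumed constraint: non-negative weights that sum to 1.
    if gamma < 0 or lam < 0 or abs(gamma + lam - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return [
        [gamma * p + lam * q for p, q in zip(p_row, q_row)]
        for p_row, q_row in zip(profit_power, profit_ber)
    ]


def argmax_2d(matrix: Matrix) -> Tuple[int, int]:
    """1-based (n, m) position of the largest entry; first maximum wins."""
    best_n, best_m, best_value = 0, 0, float("-inf")
    for n_idx, row in enumerate(matrix):
        for m_idx, value in enumerate(row):
            if value > best_value:
                best_n, best_m, best_value = n_idx + 1, m_idx + 1, value
    return best_n, best_m


R_power = [[1.0, 4.0]]  # profits under the received-power target (l = 1)
R_ber = [[4.0, 1.0]]    # profits under the error-rate target (l = 2)

power_first = argmax_2d(combined_profit(R_power, R_ber, gamma=0.75, lam=0.25))
ber_first = argmax_2d(combined_profit(R_power, R_ber, gamma=0.25, lam=0.75))
```

In this toy case the chosen user state flips between the two weightings, which is the behaviour the γ > λ and γ < λ conditions describe.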
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510271561.XA CN105007582B (en) | 2015-05-25 | 2015-05-25 | Controlled Radio Network System dynamic resource allocation method based on POMDP |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105007582A (en) | 2015-10-28 |
CN105007582B (en) | 2018-03-16 |
Family
ID=54380061
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510271561.XA Expired - Fee Related CN105007582B (en) | 2015-05-25 | 2015-05-25 | Controlled Radio Network System dynamic resource allocation method based on POMDP |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105007582B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105827294B (en) * | 2016-04-27 | 2019-05-21 | 东南大学 | A kind of method of uplink extensive MIMO combined optimization antenna for base station number and user emission power |
CN107493583B (en) * | 2017-06-29 | 2020-09-25 | 南京邮电大学 | Price perception user scheduling algorithm based on multi-slope online game |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101398914A (en) * | 2008-11-10 | 2009-04-01 | 南京大学 | Preprocess method of partially observable Markov decision process based on points |
CN102946643A (en) * | 2012-10-24 | 2013-02-27 | 复旦大学 | Orthogonal frequency division multiple access (OFDMA) resource scheduling method adopting cross-layer feedback information |
CN103188813A (en) * | 2013-02-25 | 2013-07-03 | 南京邮电大学 | Distributed resource scheduling method combined with dynamic management of control information |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102098784B (en) * | 2009-12-14 | 2014-06-04 | 华为技术有限公司 | Resource allocation method and equipment |
Non-Patent Citations (2)
Title |
---|
Stability analysis of networked systems with similar dynamics; Rene Schuh et al.; 《Control Conference (ECC), 2013 European》; 20131202; pp. 4359-4364 * |
A survey of algorithms for partially observable Markov decision processes; Gui Lin et al.; 《***工程与电子技术》; 20080615; vol. 30, no. 6; pp. 1058-1064 * |
Also Published As
Publication number | Publication date |
---|---|
CN105007582A (en) | 2015-10-28 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| C06 | Publication | |
| PB01 | Publication | |
| C10 | Entry into substantive examination | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |
| GR01 | Patent grant | |
| CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20180316 |