CN109660375A - A kind of highly reliable adaptive MAC layer scheduling method - Google Patents
A kind of highly reliable adaptive MAC layer scheduling method
- Publication number
- CN109660375A (application CN201710946487.6A)
- Authority
- CN
- China
- Prior art keywords
- probability
- duty ratio
- feedback
- node
- strategy
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/145—Network analysis or design involving simulating, designing, planning or modelling of a network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/04—Network management architectures or arrangements
- H04L41/044—Network management architectures or arrangements comprising hierarchical management structures
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W24/00—Supervisory, monitoring or testing arrangements
- H04W24/06—Testing, supervising or monitoring using simulated traffic
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W84/00—Network topologies
- H04W84/18—Self-organising networks, e.g. ad-hoc networks or sensor networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Mobile Radio Communication Systems (AREA)
Abstract
The invention discloses a highly reliable adaptive MAC layer scheduling method. It mainly addresses the large energy consumption caused by idle listening at cluster-head nodes in wireless sensor networks. The method includes: building a model of the wireless sensor network; generating a particular frame format in which queue occupancy and delay are embedded in the frame control field; initializing the action set, the selection-probability set, and the feedback set; having the coordinator interact with the surrounding environment via a learning automaton and update its actions and state; dividing the whole learning process into three phases (an initial phase, an exploratory phase, and a greedy phase), each with a corresponding search strategy; evaluating the effect of each action's interaction with the environment and updating the feedback and selection-probability sets; and choosing, based on the feedback set, the parameters that determine the duty cycle, thereby realizing adaptive MAC layer scheduling. Embodiments of the invention let a node adaptively adjust its duty cycle during operation and minimize power consumption, and have broad application prospects.
Description
Technical field
The invention belongs to the field of wireless sensor network technology, and in particular relates to a highly reliable adaptive MAC layer scheduling method.
Background technique
Wireless sensor network (WSN) nodes are usually battery powered, and in many deployment environments replacing or recharging batteries is expensive or even infeasible. Low power consumption is therefore considered the most important metric for wireless sensor network communication protocols. In particular, a node does not know when other nodes will produce data, so even in the idle state its transceiver stays continuously in receive mode. Idle listening is considered one of the main sources of wasted energy.
Currently, the widely adopted IEEE 802.15.4 standard defines several different types of nodes. A full-function device (FFD), also called a beacon-enabled device, can operate as a personal area network coordinator, cluster head, or router; a reduced-function device (RFD), also called a non-beacon device, can only operate as a terminal device. When an FFD serves as a cluster head, it cannot predict when the other sensor nodes will send it data, so it must stay in receive mode at all times to collect all incoming information, which rapidly exhausts its energy. To overcome this problem, the standard defines a beacon-enabled mode, in which beacon frames transmitted by the coordinator allow terminal devices to synchronize with it. All devices can then sleep between coordinated transmissions, which helps reduce idle listening and extends the network lifetime.
In recent years many duty-cycle adjustment algorithms have been proposed in response to this problem. One modifies the reserved frame control field in the MAC frame header so that it carries information such as a node's transmit-queue occupancy and end-to-end delay, which is then used to select the duty cycle. Another scheme applies reinforcement learning, with the main goal of finding the optimal duty cycle: it adjusts the SMAC protocol's sleep time in a WSN environment, taking the number of frames queued for transmission as the state and the reserved active time as the action. However, this means a large number of state-action pairs must be stored, which is impractical on memory-constrained wireless sensor nodes. More recently, an extension of the standard CAP based on a busy tone issued by a device at the end of the CAP has been proposed: a busy tone is sent only when a device fails to send all of its data frames, and if any device still has real-time data in its transmit queue at the end of the CAP, the CAP is extended. However, these extensions do not conform to the standard and require the superframe structure to be modified.
Summary of the invention
Embodiments of the present invention provide a highly reliable adaptive MAC layer scheduling method that adaptively adjusts the duty cycle during operation without human intervention, so as to minimize power consumption while balancing the probability of successful data delivery against the application's delay constraints.
In order to achieve the above objectives, an embodiment of the invention provides a highly reliable adaptive MAC layer scheduling method, applied to the coordinator unit in a wireless sensor network. The method includes:
Building a model from the wireless sensor network environment. The environment model is represented by the triple E = (α, β, p), where α denotes the action set that the node's learning automaton takes as input, which in the present invention is the node's set of candidate duty cycles, and β denotes the feedback signal output by the interaction with the environment after the node selects a suitable duty cycle.
Specifically, the environment can be divided into the P-model and the Q-model according to the type of the β values: in the P-model the feedback signal is Boolean (0 or 1); in the Q-model it is a continuous random variable in [0, 1]. Because its control model is easy to use, the present invention adopts the P-model. p = {p1, p2, ..., pr} denotes a set of reward/penalty probabilities, and each learning automaton action αi has a corresponding pi.
The node generates a specific frame structure format, embedding parameters such as the queue occupancy and the queueing delay in the reserved bits of the frame control field.
Specifically, to avoid introducing any additional overhead, each terminal device embeds the queue occupancy O and the queueing delay D in the frame control structure of every data frame it transmits, using the 3 reserved bits of the frame control field shown in Fig. 3.
It should be noted that each sender uses two bits to represent a queue occupancy oi with 4 different levels, while the queueing delay di is divided into 2 levels.
The coordinator (FFD) performs traffic estimation and generates a traffic-adaptive set of duty cycles.
It should be noted that the present invention assumes the wireless sensor network has a star topology, with the coordinator collecting the data sent by the terminal devices. Each coordinator estimates the incoming traffic by computing the idle listening, the packet accumulation in the terminal devices' transmit queues, and the delay.
The coordinator initializes its action set, action selection-probability set, and feedback set.
Specifically, a learning automaton is a probability-based learning tool: it selects an action through a stochastic action-probability vector Pi(t). The action-probability vector is the core member of a learning automaton and must therefore be kept up to date at all times.
It should be noted that, to prevent losing large amounts of data in the initial phase when the wireless sensor network's data flow is heavy, the initial action is chosen to be the largest duty cycle, i.e. the coordinator stays in the receive state, and its corresponding action selection probability is set to 1, ensuring that the coordinator can collect more information about the network early on.
The coordinator (FFD) interacts with the surrounding environment using the learning automaton (LA) method.
Specifically, a variable-structure learning automaton can be represented by the triple LA = (α, β, p), where α = {α1, α2, ..., αr} denotes the automaton's action set, β = {β1, β2, ..., βr} denotes the set of feedback signals given by the environment, and p = {p1, p2, ..., pr} denotes the action-probability set, satisfying Σi pi(n) = 1, where pi(n) denotes the action probability corresponding to αi after the n-th round of learning.
Selecting an exploration strategy: different periods use different exploration strategies.
Specifically, the exploration strategy is divided into 3 phases: an initial phase, an exploratory phase, and a greedy phase.
In the initial phase, every action in the set is explored deterministically with a cyclic search strategy: the node starts by selecting the highest duty cycle and slowly decreases it down to the minimum duty cycle, which ensures that every duty cycle in the set is tried.
In the exploratory phase, the automaton randomly explores actions with a higher duty cycle than the one currently selected; an increase in the selection probability indicates an increased reward. Otherwise, if the reward stays unchanged or declines, it randomly explores lower duty-cycle actions.
In the greedy phase, after the exploratory strategy has been learning for some time, the node's knowledge of the environment is nearly complete, so it can start selecting actions autonomously.
Evaluating the effect of the action's interaction with the environment on data transmission, and updating the feedback set and the action selection-probability set.
Specifically, the coordinator updates the reward of each beacon interval using the feedback received from the senders during the last active duration.
Selecting an action: based on the feedback set, the BO and SO standard parameters that determine the duty cycle are selected, realizing adaptive MAC scheduling. After the action value is selected, the BO and SO standard parameters that determine the duty cycle are adjusted.
In order to achieve the above objectives, an embodiment of the invention also provides a highly reliable adaptive MAC layer scheduling device, applied to the coordinator unit in a wireless sensor network. The device includes:
A generation unit, which generates the specific frame control structure format and embeds parameters such as the queue occupancy and the queueing delay in the reserved bits of the frame control field;
A transmission unit, by which each sensor node sends its own status, in the generated frame format, to the other sensor nodes;
A receiving unit, which receives the data frames sent by each sensor node after it accesses the channel, each data frame containing at least parameters such as the queue occupancy and the queueing delay;
An assessment unit, which evaluates the selection probability of the transmission action according to those parameters and the working state of the coordinator;
An autonomous learning unit, by which the node updates its own action set, action selection-probability set, and feedback set using the learning automaton method;
A policy selection unit, which determines which phase the automaton is in and applies the corresponding strategy: a cyclic exploration strategy in the initial phase, a randomized strategy in the exploratory phase, and a greedy strategy in the final greedy phase;
An adaptive adjustment unit, which, after an action is selected, adjusts the parameters BO and SO based on the feedback set and the action set, completing the adaptive MAC scheduling.
Brief description of the drawings
In order to explain the embodiments of the invention or the technical solutions in the prior art more clearly, the drawings needed for the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from them without any creative effort.
Fig. 1 is a flow diagram of the highly reliable adaptive MAC layer scheduling method provided in an embodiment of the present invention;
Fig. 2 is a model schematic of the learning automaton provided in an embodiment of the present invention;
Fig. 3 is a structural schematic of the frame control format provided in an embodiment of the present invention;
Fig. 4 is a structural schematic of the highly reliable adaptive MAC layer scheduling device provided in an embodiment of the present invention;
Fig. 5 is a schematic of transmission collisions between MAC layer scheduling nodes provided in an embodiment of the present invention.
Specific embodiment
The technical solutions in the embodiments of the present invention will now be described clearly and completely in conjunction with the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention, without creative effort, fall within the protection scope of the present invention.
The technical solution of the present invention is described below with reference to the drawings.
The highly reliable adaptive MAC layer scheduling method comprises the following steps:
S101: A model of the wireless sensor network is established.
Specifically, the wireless sensor network environment model is represented by the triple E = (α, β, p), where α denotes the action set that the node's learning automaton takes as input, which in the present invention is the node's set of candidate duty cycles, and β denotes the feedback signal output by the interaction with the environment after the node selects a suitable duty cycle.
Specifically, the environment can be divided into the P-model and the Q-model according to the type of the β values: in the P-model the feedback signal is Boolean (0 or 1); in the Q-model it is a continuous random variable in [0, 1], suited to real control domains. The P-model is widely used in wireless sensor network research because its control model is easy to use. p = {p1, p2, ..., pr} denotes a set of reward/penalty probabilities, and each learning automaton action αi has a corresponding pi. In the present invention, we use the P-model to model the wireless sensor network environment.
S102: The node generates a specific frame structure format, embedding parameters such as the queue occupancy and the queueing delay in the reserved bits of the frame control field.
Specifically, to avoid introducing any additional overhead, each terminal device embeds the queue occupancy O and the queueing delay D in the frame control structure of every data frame it transmits, using the 3 reserved bits of the frame control field shown in Fig. 3.
It should be noted that each sender uses two bits to represent a queue occupancy oi with 4 different levels, while the queueing delay di is divided into 2 levels. From this information, the coordinator can estimate the queue occupancy O and the queueing delay D. The queue occupancy O is defined as follows: if any node reaches or exceeds the maximum number of frames its queue can store, O is set equal to 1; otherwise it equals the average queue occupancy over the inactive period, i.e. the time of highest occupancy during packet accumulation in the CAP. Representing the queue occupancy O with 2 bits not only saves space but also reduces the fluctuation range of the value.
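The 3-bit embedding described above, two bits for the occupancy level oi and one bit for the delay flag di, can be sketched as a pack/unpack pair. This is a minimal sketch in Python; placing the occupancy level in the low two bits of the reserved field is an assumption for illustration, since the exact bit positions of Fig. 3 are not reproduced in this text:

```python
def pack_reserved_bits(occupancy_level: int, delay_flag: int) -> int:
    """Pack the 2-bit queue-occupancy level (0-3) and the 1-bit
    queueing-delay flag into the 3 reserved bits of the frame
    control field. Bit layout (occupancy in the low two bits,
    delay in the third bit) is an illustrative assumption."""
    if not 0 <= occupancy_level <= 3:
        raise ValueError("occupancy level must fit in 2 bits")
    if delay_flag not in (0, 1):
        raise ValueError("delay flag must fit in 1 bit")
    return (delay_flag << 2) | occupancy_level


def unpack_reserved_bits(bits: int) -> tuple[int, int]:
    """Inverse of pack_reserved_bits: recover (occupancy_level, delay_flag)."""
    return bits & 0b11, (bits >> 2) & 0b1
```

Because only 3 bits are used, the embedding adds no overhead to the data frame, matching the design goal stated above.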
S103: The coordinator (FFD) performs traffic estimation and generates a traffic-adaptive set of duty cycles.
It should be noted that the present invention assumes the wireless sensor network has a star topology, with the coordinator collecting the data sent by the terminal devices. Each coordinator estimates the incoming traffic by computing the idle listening, the packet accumulation in the terminal devices' transmit queues, and the delay. The expression for the idle listening IL is as follows:
IL=1.0-SFu (2)
where SFu denotes the superframe utilization, the ratio between the superframe time occupied by the terminal devices and the total time available for data communication, defined in terms of the following quantities: SD is the superframe duration, Tb is the time the coordinator spends on beacon transmission, Tc is the time the channel is busy due to frame collisions, and Tr is the time spent on data reception.
Illustratively, see Fig. 5. In type 1 (C1), the considered sender node (node A) finishes its transmission first, while the other node's transmission continues. In type 2 (C2), sender A completes its transmission after a collision has occurred. Finally, in type 3 (C3), both nodes finish transmitting at the same time. To detect C1 and C2, A or B can monitor the channel for the other's transmission, provided they are within range of each other; a sender therefore concludes that a collision occurred when, after transmitting, it senses a busy channel and receives no acknowledgement frame. To detect C3, on the other hand, the receiver perceives an increase in received energy above its CCA threshold without being synchronized to a start-of-frame delimiter.
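Equation (2) above, IL = 1.0 - SFu, can be illustrated in code. This sketch assumes the superframe utilization SFu is the ratio (Tb + Tc + Tr) / SD, since the patent's formula for SFu is not reproduced in this text; treat the function names as illustrative:

```python
def superframe_utilization(sd: float, tb: float, tc: float, tr: float) -> float:
    """SFu: fraction of the superframe duration SD occupied by beacon
    transmission (Tb), collision-busy time (Tc), and data reception (Tr).
    Treating SFu = (Tb + Tc + Tr) / SD is an assumption."""
    return (tb + tc + tr) / sd


def idle_listening(sd: float, tb: float, tc: float, tr: float) -> float:
    """Equation (2): IL = 1.0 - SFu."""
    return 1.0 - superframe_utilization(sd, tb, tc, tr)
```

For example, a superframe of 100 time units with 10 units of beacon time, 5 units lost to collisions, and 25 units of reception yields an idle-listening fraction of 0.6.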
S104: The coordinator initializes its action set, action selection-probability set, and feedback set.
Specifically, a learning automaton is a probability-based learning tool: it selects an action through a stochastic action-probability vector Pi(t). The action-probability vector is the core member of a learning automaton and must therefore be kept up to date at all times. The action-probability vector of learning automaton Ai is expressed as follows:
where Pi(t) denotes the probability that node ni selects a given duty cycle at time t. In the present invention, this probability is set to the expected value of the aggregate feedback return of the corresponding duty cycle, defined as follows:
It should be noted that, to prevent losing large amounts of data in the initial phase when the wireless sensor network's data flow is heavy, the initial action is chosen to be the maximum duty cycle, i.e. the coordinator stays in the receive state, and its corresponding action selection probability is set to 1, ensuring that the coordinator can collect more information about the network early on.
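The initialization above can be sketched in Python. The function name and the concrete representation of the three sets (sorted action list, probability list, feedback list) are illustrative assumptions:

```python
def init_learning_automaton(duty_cycles):
    """Initialize the action set, selection-probability set, and
    feedback set for a coordinator. The action set is the candidate
    duty cycles sorted from highest to lowest; all probability mass
    is placed on the maximum duty cycle so that no data is lost
    while the network's traffic is still unknown."""
    actions = sorted(duty_cycles, reverse=True)
    probs = [1.0 if i == 0 else 0.0 for i in range(len(actions))]
    feedback = [0.0] * len(actions)
    return actions, probs, feedback
```

Starting fully biased toward the maximum duty cycle mirrors the text: the coordinator begins in an always-receiving state and only later shifts probability toward lower duty cycles as feedback accumulates.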
S105: The coordinator (FFD) interacts with the surrounding environment using the learning automaton (LA) method.
It should be noted that a variable-structure learning automaton can be represented by the triple LA = (α, β, p), where α = {α1, α2, ..., αr} denotes the automaton's action set, β = {β1, β2, ..., βr} denotes the set of feedback signals given by the environment, and p = {p1, p2, ..., pr} denotes the action-probability set, satisfying Σi pi(n) = 1, where pi(n) denotes the action probability corresponding to αi after the n-th round of learning. The probabilities satisfy the update formula p(n+1) = T(α(n), β(n), p(n)), where T denotes the learning algorithm. The general learning mechanism of a learning automaton is defined as follows:
where a(n) and b(n) are the weight coefficients of the linear functions gi and hi, which can be defined as linear functions or constants depending on the concrete application. Using the P environment model, the feedback signal takes the value 0 or 1; when the feedback signal is 0, the environment issues a reward signal, and the corresponding probability update is expressed as follows:
When the feedback signal is 1, the corresponding probability update is expressed as follows:
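The reward and penalty updates referenced above (whose concrete formulas are not reproduced in this text) follow the general shape of a linear reward-penalty automaton. The sketch below is a standard L_RP update written under that assumption; the learning constants a and b are illustrative, not values from the patent:

```python
def update_probabilities(p, chosen, beta, a=0.1, b=0.1):
    """One round of a linear reward-penalty (L_RP) automaton update.
    beta = 0 rewards the chosen action (mass moves toward it);
    beta = 1 penalises it (mass moves away, spread over the others).
    Both branches keep the probability vector normalised."""
    r = len(p)
    q = list(p)
    if beta == 0:
        for j in range(r):
            q[j] = p[j] + a * (1 - p[j]) if j == chosen else (1 - a) * p[j]
    else:
        for j in range(r):
            q[j] = (1 - b) * p[j] if j == chosen else b / (r - 1) + (1 - b) * p[j]
    return q
```

Both branches preserve Σi pi = 1, consistent with the constraint on the action-probability set stated above.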
S106: Selecting an exploration strategy: different periods use different exploration strategies.
It should be noted that, although an action selection probability is available here, relying solely on it can make the coordinator's adjustments sluggish and unable to reflect the environment in time. In the present invention, the action selection probability is therefore used as one measured parameter, assisted by an exploration strategy, so that the coordinator is more sensitive to changes in the surrounding environment.
Specifically, the exploration strategy is divided into 3 phases: an initial phase, an exploratory phase, and a greedy phase.
In the initial phase, every action in the set is explored deterministically with a cyclic search strategy: the node starts by selecting the highest duty cycle and slowly decreases it down to the minimum duty cycle, which ensures that every duty cycle in the set is tried, i.e. the learning automaton's action set is fully enumerated.
In the exploratory phase, once all actions have been selected, we use the following strategy: the automaton randomly explores actions with a higher duty cycle than the one currently selected; an increase in the selection probability indicates an increased reward. Otherwise, if the reward stays unchanged or declines, it randomly explores lower duty-cycle actions.
In the greedy phase, after the exploratory strategy has been learning for some time, the node's knowledge of the environment is nearly complete, so it can start selecting actions autonomously, using the following strategy: the greedy strategy selects the action with the best P value in the subset of actions with lower action values; in other words, it selects a higher duty cycle than the one selected at the previous moment. When several actions in the selected subset have the same P value, the action with the lowest duty cycle (highest action value) is selected. This means we select the best action with the lowest duty cycle: if its reward is equal to or lower than the reward received in the previous phase, this shows it selects a better P value. Hence, under stable conditions, the minimum duty cycle is preferred. Once an action has been selected, if the new action value differs from that of the previous phase, the node's exploration probability is increased.
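The three-phase selection logic above can be sketched as follows. The neighbourhood rule used for the exploratory phase (stepping one index up or down from the last tried action) is a simplified assumption, since the patent's strategy formulas are not reproduced here:

```python
import random


def choose_action(phase, actions, probs, cursor):
    """Pick the index of the next duty-cycle action under the
    three-phase strategy. `actions` is sorted from highest to
    lowest duty cycle; `cursor` is the index tried last round."""
    if phase == "initial":
        # Deterministic cyclic sweep: high duty cycle down to low.
        return (cursor + 1) % len(actions)
    if phase == "exploratory":
        # Randomly probe a neighbouring action (higher or lower
        # duty cycle than the current one) -- simplified assumption.
        lo, hi = max(cursor - 1, 0), min(cursor + 1, len(actions) - 1)
        return random.choice([lo, hi])
    # Greedy phase: exploit the best selection probability learned so far.
    return max(range(len(actions)), key=lambda i: probs[i])
```

The sweep in the initial phase guarantees the action set is fully enumerated before any probability-driven choice is trusted, matching the cyclic search strategy described above.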
S107: The effect of the action's interaction with the environment on data transmission is evaluated, and the feedback set and the action selection-probability set are updated.
It should be noted that the coordinator updates the reward of each beacon interval using the feedback received from the senders during the last active duration. The reward function is defined as follows:
where β is expressed as a combination of penalty (negative) values for the performance of the duty cycle selected in this phase. From the above formula, the best reward is zero (no penalty), because it indicates no idle listening and no transmit-queue overflow.
Specifically, the reward is based on a comparison between the queue occupancy O and the threshold Omax. If the queue occupancy exceeds the upper threshold Omax, the reward signal is negative (-1); this means that the larger Omax is set, the more often the device is eventually forced to discard packets, and so the lower the obtained reward. The choice of Omax expresses the coordinator's sensitivity to frame loss, and this parameter can be configured according to the application's reliability requirements; under normal conditions it can be set to 0.8. If the queue occupancy O is below the threshold Omax, the feedback signal is defined as a negative value equal to the amount of idle listening, since idle listening is one of the main sources of energy consumption; the lower it is, the better. Only when the idle listening is zero and the queue occupancy O indicates no data-frame loss is the maximum reward of zero (no penalty) reached. This means the target of an optimal trade-off between bandwidth utilization and energy consumption has been achieved.
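The reward just described, a flat -1 when O exceeds Omax and otherwise a penalty equal to the idle-listening amount, with 0 as the best possible value, can be sketched directly (the function signature is an illustrative assumption):

```python
def reward(queue_occupancy: float, idle_listening: float, o_max: float = 0.8) -> float:
    """Penalty-style reward for one beacon interval; the best value is 0.
    queue_occupancy > o_max signals likely frame loss -> flat -1;
    otherwise the penalty equals the idle-listening fraction, so
    zero idle listening with no overflow yields the maximum reward."""
    if queue_occupancy > o_max:
        return -1.0
    return -idle_listening
```

The default o_max = 0.8 follows the "under normal conditions" value given in the text; an application with stricter reliability requirements would lower it.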
S108: An action is selected: based on the feedback set, the BO and SO standard parameters that determine the duty cycle are selected, realizing adaptive MAC scheduling.
After the action value is selected, the BO and SO standard parameters that determine the duty cycle are adjusted. The adjustment is defined as follows:
BO=max (4, | A | → (BI-SD) < δ) (13)
SO ← max (0, BO- αt) (14)
It should be noted that this selection is based on the delay experienced by the data frames, and that the parameter values BO and SO are embedded in the beacon frame broadcast to the terminal devices, for synchronization.
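Equation (14) and the resulting duty cycle can be illustrated as below. The relation duty cycle = 2^(SO - BO) is the standard IEEE 802.15.4 definition; treating the chosen action value αt as the amount subtracted from BO is an assumption consistent with equation (14):

```python
def adjust_superframe(bo: int, action_value: int):
    """Derive the superframe order SO from the chosen action value,
    per equation (14): SO = max(0, BO - action_value). The resulting
    IEEE 802.15.4 duty cycle is 2 ** (SO - BO); larger action values
    therefore mean lower duty cycles. Assumes bo >= 4 already holds."""
    so = max(0, bo - action_value)
    duty_cycle = 2 ** (so - bo)
    return so, duty_cycle
```

For example, with BO = 6 an action value of 2 gives SO = 4 and a 25% duty cycle, while a very large action value saturates at SO = 0, the minimum duty cycle for that BO.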
It should be noted that, in this document, relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another, without necessarily requiring or implying any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. In the absence of further restrictions, an element qualified by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes it.
Claims (9)
1. A highly reliable adaptive MAC layer scheduling method, characterized by comprising the following steps:
The first step carries out model foundation according to wireless sensor network environment, learning automaton method is applied to wireless sensing
Among the environment of device network, three bit array E=(α, β, p) of sensor network environment model are indicated, wherein α={ α1,
α2..., αnIndicate node duty ratio set, β={ β1, β2..., βmIndicate the feedback letter that node and environmental interaction export
Number, p={ p1, p2..., pnIndicate a series of rewards and punishments probability;
Second step: the node generates a specific frame structure format, embedding parameters such as the queue occupancy and the queueing delay in the reserved bits of the frame control field; specifically, each terminal device embeds the queue occupancy O and the queueing delay D in the frame control structure of every data frame it transmits, using the 3 reserved bits of the frame control field shown in Fig. 3;
Third step: the coordinator (FFD) performs traffic estimation and generates a traffic-adaptive set of duty cycles; each coordinator estimates the incoming traffic from the idle listening, packet accumulation, and delay in the terminal devices' transmit queues;
Fourth step: the coordinator initializes its action set, action selection-probability set, and feedback set;
Fifth step: the coordinator (FFD) interacts with the surrounding environment using the learning automaton (LA) method, adopting the P environment model with a feedback signal value of 0 or -1; if the feedback signal is 0, the probability is defined as follows:
If the feedback signal is -1, the probability is defined as follows:
Sixth step: an exploration strategy is selected: different periods use different exploration strategies, the whole learning process being divided into three phases; the initial phase uses a cyclic search strategy, the exploratory phase uses a random search strategy, and the greedy phase uses a greedy strategy;
Seventh step: the effect of the action's interaction with the environment on data transmission is evaluated, and the feedback set and action selection-probability set are updated;
Eighth step: an action is selected: based on the feedback set, the BO and SO standard parameters that determine the duty cycle are selected and the optimal duty cycle is chosen, where the BO parameter is defined as follows:
BO=max (4, | A | → (BI-SD) < δ) (3)
2. The highly reliable adaptive MAC layer scheduling method according to claim 1, characterized in that, for the establishment of the network environment model: specifically, the wireless sensor network environment model is represented by the triple E = (α, β, p), where α = {α1, α2, ..., αn} denotes the finite action set that the node's learning automaton takes as input, which in the present invention is the node's set of duty cycles; β = {β1, β2, ..., βm} denotes the feedback signal output after the node selects a suitable duty cycle and interacts with the environment; and p = {p1, p2, ..., pn} denotes a set of reward/penalty probabilities, each penalty probability pi being related to the given input variable αi. Based on the feedback signal β, the environment can be divided into 3 types: P-type, Q-type, and S-type. The present invention uses the P-model to model the wireless sensor network environment, with a Boolean feedback signal (0 or 1), i.e. β is described only by binary 0 and 1.
Here, αi (αi ∈ α) denotes the action selected by the learning automaton, p(t) denotes the probability vector at time t, Preward denotes the reward factor, and Ppenalty denotes the penalty factor; these two factors determine, respectively, whether the action probability is increased or decreased. The update of the probability vector P(t) is defined as follows:
If the action is rewarded by the random environment, the update of the action-probability vector P(t) is defined as follows:
3. The highly reliable adaptive MAC layer scheduling method according to claim 1, characterized in that each node generates a specific frame structure format. In particular, each terminal device embeds its queue occupancy O and queuing delay D in the frame control structure of every data frame it transmits, using the 3 reserved bits of the frame control field as shown in Figure 3.
Specifically, each sending node uses two bits to indicate its queue occupancy oi at one of 4 different levels, and the queuing delay di is divided into 2 levels. From this information, the coordinator can estimate the queue occupancy O and the queuing delay D. The queue occupancy O is defined as follows:
It should be noted that, through the information carried in the 3 reserved bits, the coordinator can estimate the queue occupancy O and the queuing delay D. If a node device reaches or exceeds the maximum number of frames that can be stored in its queue, O is equal to 1; otherwise, it is equal to the average queue occupancy indicated by the first message received in the CAP in which packets accumulate, i.e. the period during which the inactive-period queue occupancy is highest.
It should also be noted that the queuing-delay bit Di of each terminal device i indicates a comparison between the current beacon interval BI and the minimum of the defined delay threshold Dth: if BI is less than this minimum, the queuing-delay bit Di is '0'; otherwise it is '1'. The coordinator marks the delay as the maximum delay reported by the node devices. This is done to guarantee that any node can still transmit data even when its queuing delay is above the threshold.
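The claim above describes packing a 2-bit occupancy level and a 1-bit delay flag into the 3 reserved bits of the frame control field; a minimal sketch follows. The exact bit positions within the reserved field are not specified in the text, so the layout below (bits 0-1 for occupancy, bit 2 for the delay flag) is an assumption.

```python
def pack_reserved_bits(queue_level, delay_flag):
    """Pack the 2-bit queue-occupancy level (0..3) and the 1-bit
    queuing-delay flag into a 3-bit value for the reserved bits of
    the frame-control field.  Bit layout is assumed:
      bits 0-1 = occupancy level, bit 2 = delay flag."""
    assert 0 <= queue_level <= 3 and delay_flag in (0, 1)
    return (delay_flag << 2) | queue_level


def unpack_reserved_bits(bits):
    """Coordinator side: recover (occupancy level, delay flag)."""
    return bits & 0b11, (bits >> 2) & 0b1
```

A terminal device would call `pack_reserved_bits` when building each data frame, and the coordinator would call `unpack_reserved_bits` on reception to feed its traffic estimation.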
4. The highly reliable adaptive MAC layer scheduling method according to claim 1, characterized in that the coordinator (FFD) performs traffic estimation. Specifically, each coordinator estimates the incoming traffic by computing the idle listening, packet accumulation and delay of the terminal-device transmit queues. The expression for the idle listening IL is as follows:
IL=1.0-SFu (7)
where SFu denotes the superframe utilization, i.e. the ratio between the time the terminal devices occupy the superframe and the total time available for data communication, defined as:
where SD is the superframe duration, Tb is the time the coordinator spends on beacon transmission, Tc is the time the channel is busy due to frame collisions, Tr is the time spent on data reception, and Ts is defined as follows:
Ts=TCCA+TDATA+TIFS+TACK (9)
where TCCA denotes the clear-channel-assessment time in each data-frame transmission, TDATA denotes the data transmission time, TIFS denotes the inter-frame space, and TACK denotes the acknowledgement reception time.
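Equations (7) and (9) can be sketched directly; equation (8) for SFu is not reproduced in the text, so the sketch below assumes SFu is the occupied superframe time (beacon + collision + receive + successful-transmission time) over the superframe duration SD, per the verbal description above. Function names are illustrative only.

```python
def per_frame_time(t_cca, t_data, t_ifs, t_ack):
    # Equation (9): time consumed by one successful data-frame exchange
    # (CCA + data + inter-frame space + ACK).
    return t_cca + t_data + t_ifs + t_ack


def superframe_utilization(t_b, t_c, t_r, t_s_total, sd):
    # Assumed form of equation (8): occupied time over superframe duration.
    # t_s_total is the sum of per_frame_time over all frames in the superframe.
    return (t_b + t_c + t_r + t_s_total) / sd


def idle_listening(sf_utilization):
    # Equation (7): IL = 1.0 - SFu.
    return 1.0 - sf_utilization
```

With these three pieces a coordinator can turn its per-superframe timing counters into the idle-listening figure used for traffic estimation.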
5. The highly reliable adaptive MAC layer scheduling method according to claim 1, characterized in that the action set, the action-selection probability set and the feedback set are initialized. Specifically, a learning automaton is a probability-based learning tool that selects an action through a stochastic action probability vector Pi(t). The action probability vector is the main member of the learning automaton and must therefore be kept up to date at all times. The action probability vector of learning automaton Ai is expressed as follows:
where Pi(t) denotes the probability that node ni selects a certain duty cycle at time t; in the present invention, this probability is expressed as the expected value of the overall feedback return corresponding to that duty cycle:
It should be noted that the initially selected action is the largest duty cycle, i.e. the coordinator is always in the receiving state, and the corresponding action-selection probability is 1; this guarantees that the coordinator can collect more information about the network in the early period.
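The initialization described above (largest duty cycle pre-selected with probability 1) can be sketched as follows; the function name and the representation of duty cycles as plain floats are illustrative assumptions.

```python
def init_learning_automaton(duty_cycles):
    """Initialize the action set, the action-selection probability set and
    the feedback set.  The largest duty cycle is pre-selected with
    probability 1, so the coordinator stays in the receiving state early
    on and can gather more information about the network."""
    actions = sorted(duty_cycles, reverse=True)   # action set α, highest first
    probs = [0.0] * len(actions)                  # action-selection probabilities
    probs[0] = 1.0                                # largest duty cycle gets probability 1
    feedback = [0.0] * len(actions)               # feedback set β, initially empty
    return actions, probs, feedback
```

After the first few interactions with the environment, the probability vector would be updated away from this degenerate distribution by the automaton's reward/penalty rule.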
6. The highly reliable adaptive MAC layer scheduling method according to claim 1, characterized in that the coordinator (FFD) interacts with the surrounding environment using the learning automaton (LA) method. Specifically, the learning automaton model can be represented by the triple LA = (α, β, p), where α = {α1, α2, ..., αr} denotes the action set of the learning automaton, β = {β1, β2, ..., βr} denotes the set of feedback signals given by the environment, and p = {p1, p2, ..., pr} denotes the action probability set, satisfying Σi pi(n) = 1, where pi(n) denotes the action probability corresponding to αi in the n-th round of learning. The probabilities satisfy the update formula p(n+1) = T(α(n), β(n), p(n)).
Specifically, the P-model is used, so the feedback signal takes the value 0 or 1; when the feedback signal is 0, the environment rewards the action. When the feedback signal takes 0 or 1, the corresponding probability updates are respectively as follows:
It should be noted that, while adjusting the duty cycle with the learning automaton method, the node continuously receives a feedback β from the environment. The total received feedback can be understood as the sum of the immediate feedback and the discounted future feedback, as follows:
where γ is the discount factor, γ ∈ [0, 1], representing the weight given to future feedback.
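The discounted total feedback described above can be sketched as a simple geometric weighting of future feedback values; the function name and the list representation of future feedback are illustrative assumptions.

```python
def discounted_feedback(immediate, future, gamma=0.9):
    """Total received feedback = immediate feedback plus the discounted
    sum of future feedback values, with discount factor gamma in [0, 1].
    future is the sequence of feedback values at steps t+1, t+2, ..."""
    total = immediate
    for k, beta in enumerate(future, start=1):
        total += (gamma ** k) * beta   # each later feedback weighted by gamma^k
    return total
```

With gamma = 0 only the immediate feedback counts; with gamma close to 1 the node weights future feedback almost as heavily as the present one.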
7. The highly reliable adaptive MAC layer scheduling method according to claim 1, characterized in that different exploration strategies are selected in different periods. In particular, the overall exploration strategy is divided into 3 stages: the initial stage, the exploratory stage and the greedy stage.
In the initial stage, the cyclic search strategy explores every action in the set in a deterministic manner: the node first selects the highest duty cycle and then slowly reduces it until the minimum duty cycle is reached. This ensures that every duty cycle in the set is tried, i.e. that the action set of the learning automaton has been fully enumerated.
In the exploratory stage, once all actions have been selected, actions with a duty cycle higher than the one selected in the current stage are explored at random. If the corresponding βti in the feedback set β increases, the duty cycle represented by action αi is better; otherwise, if the feedback set β remains unchanged or the corresponding βti decreases, actions with a lower duty cycle are explored at random. The strategy is as follows:
In the greedy stage, after the exploratory strategy has been learning for a period of time, the node's knowledge of the environment is essentially complete, and the greedy strategy is used to find the optimal action value. If the corresponding βti in the feedback set β is higher than that of the previous stage, the traffic has increased, and the greedy strategy selects the action subset with the lower action value, i.e. a higher duty cycle; if the corresponding βti in the feedback set β is lower than or equal to that of the previous stage, the greedy strategy selects the action subset with the higher action value, i.e. a lower duty cycle. Therefore, under stable conditions, the minimum duty cycle is preferred. If the duty cycle selected in the next stage differs from that of the current stage, the search probability is increased; otherwise, the learning and exploration probability is reduced, to avoid oscillation when the best action is selected. The strategy is as follows:
Here β represents the combined penalty (negative) value of the performance of the duty cycle selected in that stage. From formula (16) it can be concluded that the best reward is zero (no penalty), because it indicates that there is no idle listening and the transmit queue does not overflow.
It should be noted that if the new action is equal to the last selected action, the learning and exploration rate is reduced, to avoid oscillation around the optimal action.
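The three-stage selection rule described above can be sketched as follows. The exact probability expressions of the strategies are given only as figures in the source, so this sketch mirrors just the direction of each rule (cycle through everything, then move toward higher or lower duty cycles depending on feedback, then stay greedy); the function name and stage labels are assumptions.

```python
import random

def choose_action(stage, actions, last_idx, feedback_improved):
    """Stage-dependent exploration over the duty-cycle action set.
    actions is assumed sorted from highest to lowest duty cycle.

    - "initial": cyclic search, visiting every action once in order
    - "explore": move randomly toward a higher duty cycle (smaller index)
      if the feedback improved, toward a lower one otherwise
    - "greedy" : keep the current best action
    """
    n = len(actions)
    if stage == "initial":
        return (last_idx + 1) % n                  # deterministic cycle
    if stage == "explore":
        if feedback_improved and last_idx > 0:
            return random.randrange(0, last_idx)       # try a higher duty cycle
        if not feedback_improved and last_idx < n - 1:
            return random.randrange(last_idx + 1, n)   # try a lower duty cycle
        return last_idx                                # boundary: nothing to explore
    return last_idx                                # greedy: stay on current best
```

In a real scheduler the stage would advance once the action set is enumerated and again once the feedback estimates stabilize, with the exploration rate reduced whenever the same action is re-selected.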
8. The highly reliable adaptive MAC layer scheduling method according to claim 1, characterized in that an action is selected, and the BO and SO standard parameters that determine the duty cycle are chosen based on the feedback set, realizing adaptive MAC scheduling. In particular, after the action value is selected, the BO and SO standard parameters that determine the duty cycle are adjusted. The adjustment formulas are defined as follows:
BO = max(4, |A| → (BI - SD) < δ)   (17)
SO ← max(0, BO - αt)   (18)
It should be noted that this selection is based on the delay experienced by the data frames. The parameter values BO and SO are embedded in the beacon frame broadcast to the terminal devices in order to synchronize them.
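A rough sketch of the adjustment in equations (17)-(18) follows. The BO rule as printed is partially garbled, so the sketch assumes BO keeps a candidate value, floored at 4, only while the inactive period BI - SD stays below the threshold δ, and that αt is the exponent offset of the selected action; SO is then BO reduced by αt, floored at 0. The function name and these readings are assumptions, not the patent's definitive rule.

```python
def adjust_bo_so(selected_action, candidate_bo, bi, sd, delta):
    """Sketch of equations (17)-(18), under the assumptions stated above.

    selected_action : exponent offset alpha_t of the chosen action
    candidate_bo    : proposed beacon order before flooring
    bi, sd          : beacon interval and superframe duration
    delta           : threshold on the inactive period BI - SD
    """
    # Equation (17): floor BO at 4; fall back to 4 when the inactive
    # period exceeds the threshold.
    bo = max(4, candidate_bo) if (bi - sd) < delta else 4
    # Equation (18): SO <- max(0, BO - alpha_t).
    so = max(0, bo - selected_action)
    return bo, so
```

The resulting BO and SO would then be written into the broadcast beacon frame so that all terminal devices adopt the new duty cycle together.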
9. The highly reliable adaptive MAC layer scheduling method according to claim 1, characterized in that the device comprises:
a generation unit, which generates the specific frame control structure format and embeds parameters such as the queue occupancy and queuing delay in the reserved bits of the frame control field;
a transmission unit, through which each sensor node sends its own status to the other sensor nodes according to the generated frame format;
a receiving unit, for receiving the data frames sent by each sensor node after accessing the channel, the data frames including at least parameters such as the queue occupancy and queuing delay;
an assessment unit, which evaluates the selection probability of the transmission action according to these parameters and the working state of the coordinator;
an autonomous learning unit, through which the node updates its own action set, action-selection probability set and feedback set using the learning automaton method;
a policy selection unit, which judges which stage the node is in and adopts the corresponding strategy: the cyclic exploration strategy in the initial stage, the random strategy in the exploratory stage, and the greedy strategy in the final greedy stage;
an adaptive adjustment unit, which, after an action is selected, adjusts the parameters BO and SO based on the feedback set and the action set, completing adaptive MAC scheduling.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710946487.6A CN109660375B (en) | 2017-10-11 | 2017-10-11 | High-reliability self-adaptive MAC (media Access control) layer scheduling method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109660375A true CN109660375A (en) | 2019-04-19 |
CN109660375B CN109660375B (en) | 2020-10-02 |
Family
ID=66108497
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710946487.6A Active CN109660375B (en) | 2017-10-11 | 2017-10-11 | High-reliability self-adaptive MAC (media Access control) layer scheduling method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109660375B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110856264A (en) * | 2019-11-08 | 2020-02-28 | 山东大学 | Distributed scheduling method for optimizing information age in sensor network |
CN111542070A (en) * | 2020-04-17 | 2020-08-14 | 上海海事大学 | Efficient multi-constraint deployment method for industrial wireless sensor network |
CN114666880A (en) * | 2022-03-16 | 2022-06-24 | 中南大学 | Method for reducing end-to-end delay in delay-sensitive wireless sensor network |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103260229A (en) * | 2013-06-04 | 2013-08-21 | 东北林业大学 | Wireless sensor network MAC protocol based on forecast and feedback |
Non-Patent Citations (2)
Title |
---|
CHEN HAO等: "Traffic Adaptive Duty Cycle MAC Protocol for Wireless Sensor Networks", 《IEEE》 * |
范清峰等: "无线传感器网络自适应MAC协议", 《计算机工程与应用》 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||