CN109660375A - A kind of highly reliable adaptive MAC layer scheduling method - Google Patents


Info

Publication number
CN109660375A
CN109660375A (application CN201710946487.6A)
Authority
CN
China
Prior art keywords
probability
duty ratio
feedback
node
strategy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710946487.6A
Other languages
Chinese (zh)
Other versions
CN109660375B (en)
Inventor
刘元安
张洪光
王怡浩
范文浩
吴帆
谢刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications
Priority to CN201710946487.6A priority Critical patent/CN109660375B/en
Publication of CN109660375A publication Critical patent/CN109660375A/en
Application granted granted Critical
Publication of CN109660375B publication Critical patent/CN109660375B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14Network analysis or design
    • H04L41/145Network analysis or design involving simulating, designing, planning or modelling of a network
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/04Network management architectures or arrangements
    • H04L41/044Network management architectures or arrangements comprising hierarchical management structures
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W24/00Supervisory, monitoring or testing arrangements
    • H04W24/06Testing, supervising or monitoring using simulated traffic
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W84/00Network topologies
    • H04W84/18Self-organising networks, e.g. ad-hoc networks or sensor networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00Reducing energy consumption in communication networks
    • Y02D30/70Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention discloses a highly reliable adaptive MAC-layer scheduling method. It mainly addresses the large energy consumption that idle listening causes at cluster-head nodes in wireless sensor networks. The method includes: building a model of the wireless sensor network; generating a particular frame format in which queue occupancy and delay are embedded in the frame control field; initializing the action set, the selection-probability set, and the feedback set; letting the coordinator interact with its environment through a learning automaton, updating its actions and state; dividing the whole learning process into three stages (initial, exploratory, and greedy), each with its own search strategy; evaluating the effect of each action's interaction with the environment and updating the feedback and selection-probability sets; and choosing, based on the feedback set, the parameters that determine the duty ratio, thereby realizing adaptive MAC-layer scheduling. Embodiments of the invention adaptively adjust the duty ratio while the node is running and minimize power consumption, and thus have wide application prospects.

Description

A kind of highly reliable adaptive MAC layer scheduling method
Technical field
The invention belongs to the field of wireless sensor network technology, and in particular relates to a highly reliable adaptive MAC-layer scheduling method.
Background technique
Wireless sensor network (WSN) nodes are usually battery powered, and in many deployment environments replacing batteries or recharging them is expensive or even infeasible. Low power consumption is therefore considered the most important metric for wireless sensor network communication protocols. In particular, a node does not know when other nodes will produce data, so even in the idle state its transceiver stays continuously in receive mode. Idle listening is thus considered one of the main sources of wasted energy.
Currently, the widely adopted IEEE 802.15.4 standard defines several different node types: a full-function device (FFD), also called a beacon-enabled device, can operate as a personal area network coordinator, cluster head, or router; a reduced-function device (RFD), also called a non-beacon device, can only operate as a terminal device. When an FFD serves as a cluster head, it cannot predict when the other sensor nodes will send it data, so it must stay in receive mode at all times to collect all incoming information, which rapidly exhausts its energy. To overcome this problem, the standard defines a beacon-enabled mode, in which beacon frames transmitted by the coordinator allow the terminal devices to synchronize. All devices can then sleep between coordinated transmissions, which reduces idle listening and extends network lifetime.
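In beacon-enabled mode the duty ratio is set by the two standard parameters BO (beacon order) and SO (superframe order). A minimal sketch of the standard relationships, which are IEEE 802.15.4 facts rather than something specific to this patent:

```python
# Standard IEEE 802.15.4 beacon-enabled timing:
#   BI = aBaseSuperframeDuration * 2**BO   (beacon interval)
#   SD = aBaseSuperframeDuration * 2**SO   (active superframe duration)
# with 0 <= SO <= BO <= 14; the duty cycle is SD / BI = 2**(SO - BO).

A_BASE_SUPERFRAME_DURATION = 960  # symbols

def beacon_interval(bo: int) -> int:
    """Beacon interval BI in symbols."""
    return A_BASE_SUPERFRAME_DURATION * (2 ** bo)

def superframe_duration(so: int) -> int:
    """Active (superframe) duration SD in symbols."""
    return A_BASE_SUPERFRAME_DURATION * (2 ** so)

def duty_cycle(bo: int, so: int) -> float:
    """Fraction of each beacon interval the coordinator is awake."""
    assert 0 <= so <= bo <= 14
    return superframe_duration(so) / beacon_interval(bo)

# e.g. BO=6, SO=4: the coordinator is active 25% of the time
```

Lowering SO relative to BO lengthens the sleep period, which is exactly the knob the scheduling method below adjusts.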
In recent years many duty-cycle adjustment algorithms have been proposed in response. One approach modifies the reserved frame control field in the MAC frame header so that the coordinator can collect each node's transmit-queue occupancy and end-to-end delay and use them to select the duty ratio. Another scheme applies reinforcement learning, with the main goal of finding the optimal duty ratio: it adjusts the sleep time of the S-MAC protocol in a WSN environment, taking the number of frames waiting for transmission as the state and the reserved active time as the action. However, this requires storing a large number of state-action pairs, which is impractical on memory-constrained sensor nodes. More recently, an extension of the standard CAP based on a busy tone issued by a device at the end of the CAP has been proposed: a busy tone is sent only when a device fails to transmit all of its data frames, and if any device still holds real-time data in its transmit queue at the end of the CAP, the CAP is extended. These extensions, however, do not conform to the standard and require modifying the superframe structure.
Summary of the invention
Embodiments of the invention provide a highly reliable adaptive MAC-layer scheduling method that adjusts the duty ratio during operation without human intervention, minimizing power consumption while balancing the probability of successful data delivery against the application's delay constraints.
To achieve the above objectives, an embodiment of the invention provides a highly reliable adaptive MAC-layer scheduling method, applied to a coordinator unit in a wireless sensor network. The method includes:
A model is built from the wireless sensor network environment. The environment model is represented by the triple E = (α, β, p), where α denotes the action set that the automatic learning takes as input (in this invention, the set of candidate duty ratios of a node), and β denotes the feedback signal output by the interaction with the environment after the node selects a suitable duty ratio.
Specifically, the environment can be divided into a P-model and a Q-model according to the type of β: in the P-model the feedback signal is Boolean (0 or 1); in the Q-model it is a continuous random variable in [0, 1]. The P-model is adopted in this invention because its control model is easy to use. p = {p1, p2, ..., pr} denotes a set of reward and penalty probabilities, and each automaton action αi has a corresponding pi.
The node generates a specific frame structure format, embedding parameters such as queue occupancy and queueing delay in the reserved bits of the frame control field.
Specifically, to avoid introducing any additional overhead, each terminal device embeds its queue occupancy O and queueing delay D in the frame control structure of every data frame it transmits; this information is embedded using 3 reserved bits of the frame control field as shown in Fig. 3.
It should be noted that each sender uses two bits to indicate its queue occupancy oi at 4 different levels, while the queueing delay di is divided into 2 levels.
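A minimal sketch of this embedding: a 2-bit occupancy level and a 1-bit delay flag packed into three reserved bits of the 16-bit frame control field. The concrete bit positions (7-9) are an assumption for illustration; the text only says "3 reserved bits".

```python
# Pack o_i (0..3, queue occupancy level) and d_i (0 or 1, delay flag)
# into three reserved bits of the frame control field.
# The bit positions are an assumption, not taken from the patent.

RESERVED_SHIFT = 7                      # assumed position of the reserved bits
RESERVED_MASK = 0b111 << RESERVED_SHIFT

def embed_status(frame_control: int, occupancy_level: int, delay_flag: int) -> int:
    """Pack o_i and d_i into the reserved bits of the frame control field."""
    assert 0 <= occupancy_level <= 3 and delay_flag in (0, 1)
    status = (delay_flag << 2) | occupancy_level
    return (frame_control & ~RESERVED_MASK) | (status << RESERVED_SHIFT)

def extract_status(frame_control: int) -> tuple:
    """Recover (occupancy_level, delay_flag) at the coordinator."""
    status = (frame_control >> RESERVED_SHIFT) & 0b111
    return (status & 0b11, (status >> 2) & 0b1)

fc = embed_status(0x8861, occupancy_level=2, delay_flag=1)
assert extract_status(fc) == (2, 1)
```

Because the bits are already present in every data frame header, the coordinator obtains the status piggybacked at zero extra cost, as the text emphasizes.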
The coordinator (FFD) performs traffic estimation and generates a traffic-adaptive set of duty ratios.
It should be noted that the invention assumes the wireless sensor network has a star topology, with the coordinator collecting the data sent by the terminal devices. Each coordinator estimates the incoming traffic by computing the idle listening, the packet accumulation in the terminal devices' transmit queues, and the delay.
The coordinator initializes its action set, action-selection-probability set, and feedback set.
Specifically, a learning automaton is a probability-based learning tool: it selects an activity through a stochastic activity-probability vector Pi(t). The activity-probability vector is the main member of the learning automaton and must therefore be kept up to date.
It should be noted that, to prevent massive data loss in the initial stage when the wireless sensor network carries very heavy traffic, the initial action is chosen as the largest duty cycle, i.e., the coordinator stays in the receive state at all times, and the corresponding action-selection probability is set to 1, ensuring that the coordinator can collect more information about the network early on.
The coordinator (FFD) interacts with its environment using the learning automaton (LA) method.
Specifically, a variable-structure learning automaton can be represented by the triple LA = (α, β, p), where α = {α1, α2, ..., αr} is the automaton's action set, β = {β1, β2, ..., βr} is the set of feedback signals given by the environment, and p = {p1, p2, ..., pr} is the action-probability set, satisfying p1(n) + p2(n) + ... + pr(n) = 1, where pi(n) denotes the action probability corresponding to αi after the n-th round of the learning process.
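A minimal skeleton of this variable-structure automaton, assuming illustrative duty-ratio values (the patent does not list concrete ones): actions are candidate duty ratios and p is the action-probability vector summing to 1.

```python
import random

# Variable-structure learning automaton LA = (alpha, beta, p):
# the action set is a list of candidate duty ratios (values assumed here),
# and p is the action-probability vector with sum(p) == 1.

actions = [1.0, 0.5, 0.25, 0.125]          # candidate duty ratios (assumed)
p = [1.0 / len(actions)] * len(actions)    # uniform initial probabilities

def select_action(p, rng=random.random):
    """Sample an action index from the probability vector p."""
    r, acc = rng(), 0.0
    for i, pi in enumerate(p):
        acc += pi
        if r < acc:
            return i
    return len(p) - 1                      # guard against rounding error

assert abs(sum(p) - 1.0) < 1e-9
i = select_action(p)
assert 0 <= i < len(actions)
```

The feedback β then drives updates of p, as described in the detailed steps below.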
An exploration strategy is selected: different periods use different exploration strategies.
Specifically, the overall exploration strategy is divided into 3 stages: the initial stage, the exploratory stage, and the greedy stage.
In the initial stage, all actions in the set are explored deterministically with a cyclic search strategy: the node starts from the highest duty ratio and gradually lowers it down to the minimum duty cycle, which ensures that every duty ratio in the set is tried.
In the exploratory stage, the automaton randomly explores actions with a higher duty ratio than the current one; an increase in the selection probability indicates that the reward has increased. Otherwise, if the reward stays flat or declines, it randomly explores lower-duty-ratio actions.
In the greedy stage, after the exploration strategies have been applied for some time, the node's knowledge of the environment is essentially complete and it can begin selecting actions autonomously.
The effect of each action's interaction with the environment on data transmission is evaluated, and the feedback set and action-selection-probability set are updated.
Specifically, the coordinator updates the reward for each beacon interval using the feedback received from the senders during the last active period.
An action is selected: based on the feedback set, the standard parameters BO and SO that determine the duty ratio are chosen, realizing adaptive MAC scheduling.
After the action value is selected, the standard parameters BO and SO that determine the duty ratio are adjusted.
To achieve the above objectives, an embodiment of the invention also provides a highly reliable adaptive MAC-layer scheduling device, applied to a coordinator unit in a wireless sensor network. The device includes:
a generation unit, which generates the specific frame control structure format, embedding parameters such as queue occupancy and queueing delay in the reserved bits of the frame control field;
a transmission unit, through which each sensor node sends its own status according to the generated frame format;
a receiving unit, for receiving the data frames sent by each sensor node after it accesses the channel, the data frames including at least parameters such as queue occupancy and queueing delay;
an assessment unit, which evaluates the action-selection probability according to the parameters transmitted by the sensor nodes and the working state of the coordinator;
an autonomous learning unit, in which the node updates its own action set, action-selection-probability set, and feedback set using the learning automaton method;
a policy selection unit, which judges which stage the automaton is in and applies the corresponding strategy: the cyclic exploration strategy in the initial stage, the random strategy in the exploratory stage, and the greedy strategy in the final greedy stage;
an adaptive adjustment unit, which, after an action is selected, adjusts the parameters BO and SO based on the feedback set and action set, completing the adaptive MAC scheduling.
Brief description of the drawings
To explain the technical solutions in the embodiments of the invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow diagram of the highly reliable adaptive MAC-layer scheduling method provided by an embodiment of the invention;
Fig. 2 is a model schematic of the learning automaton provided by an embodiment of the invention;
Fig. 3 is a structural schematic of the frame control format provided by an embodiment of the invention;
Fig. 4 is a structural schematic of the highly reliable adaptive MAC-layer scheduling device provided by an embodiment of the invention;
Fig. 5 is a schematic of transmission collisions between MAC-layer scheduled nodes provided by an embodiment of the invention.
Specific embodiment
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the invention without creative effort fall within the protection scope of the invention.
The technical solution of the invention is explained with reference to the drawings.
The highly reliable adaptive MAC-layer scheduling method comprises the following steps:
S101: build a model of the wireless sensor network.
Specifically, the wireless sensor network environment model is represented by the triple E = (α, β, p), where α denotes the action set input to the automatic learning (in this invention, the set of candidate duty ratios of a node), and β denotes the feedback signal output by the interaction with the environment after the node selects a suitable duty ratio.
Specifically, the environment can be divided into a P-model and a Q-model according to the type of β: in the P-model the feedback signal is Boolean (0 or 1); in the Q-model it is a continuous random variable in [0, 1], suited to practical control fields. The P-model is widely used in wireless sensor network research because its control model is easy to use. p = {p1, p2, ..., pr} denotes a set of reward and penalty probabilities, and each automaton action αi has a corresponding pi. In this invention, the P-model is used to build the model of the wireless sensor network environment.
S102: the node generates the specific frame structure format, embedding parameters such as queue occupancy and queueing delay in the reserved bits of the frame control field.
Specifically, to avoid introducing any additional overhead, each terminal device embeds its queue occupancy O and queueing delay D in the frame control structure of every data frame it transmits; this information is embedded using 3 reserved bits of the frame control field as shown in Fig. 3.
It should be noted that each sender uses two bits to indicate its queue occupancy oi at 4 different levels, while the queueing delay di is divided into 2 levels. From this information the coordinator can estimate the queue occupancy O and queueing delay D. The queue occupancy O is defined as follows:
If any node reaches or exceeds the maximum number of frames its queue can store, O equals 1. Otherwise, O equals the average queuing occupancy over the inactive period, i.e., the time of highest occupancy due to packet accumulation before the CAP. Representing the queue occupancy O with 2 bits not only saves space but also reduces the fluctuation range of the value.
S103: the coordinator (FFD) performs traffic estimation and generates a traffic-adaptive set of duty ratios.
It should be noted that the invention assumes the wireless sensor network has a star topology, with the coordinator collecting the data sent by the terminal devices. Each coordinator estimates the incoming traffic by computing the idle listening, the packet accumulation in the terminal devices' transmit queues, and the delay. The idle listening IL is expressed as follows:
IL = 1.0 - SFu    (2)
where SFu denotes the superframe utilization, the ratio between the time the terminal devices occupy the superframe and the total time that can be used for data communication, defined as:
where SD is the superframe duration, Tb is the time the coordinator spends on beacon transmission, Tc is the time the channel is busy due to frame collisions, and Tr is the time used for data reception.
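A minimal sketch of the idle-listening estimate of Eq. (2). The exact formula for SFu is not reproduced in this text; here it is assumed to be the occupied time (beacon Tb + collisions Tc + reception Tr) divided by the superframe duration SD, consistent with the variables listed.

```python
# Idle listening per Eq. (2): IL = 1.0 - SFu.
# The concrete form of SFu below is an assumption based on the
# variables named in the text (Tb, Tc, Tr, SD).

def superframe_utilization(tb: float, tc: float, tr: float, sd: float) -> float:
    """Assumed SFu: fraction of the superframe the coordinator spent busy."""
    return (tb + tc + tr) / sd

def idle_listening(tb: float, tc: float, tr: float, sd: float) -> float:
    """IL = 1.0 - SFu, per Eq. (2)."""
    return 1.0 - superframe_utilization(tb, tc, tr, sd)

# e.g. 1 ms beacon, no collisions, 5 ms reception in a 15.36 ms superframe
il = idle_listening(tb=1.0, tc=0.0, tr=5.0, sd=15.36)
assert 0.0 <= il <= 1.0
```

A high IL means the coordinator was awake but mostly hearing nothing, which is exactly the waste the duty-cycle adaptation targets.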
Illustratively, referring to Fig. 5: in type 1 (C1), the considered sender node (node A) finishes its transmission first while the other node's transmission continues. In type 2 (C2), sender A completes its transmission after a collision has occurred. Finally, in type 3 (C3), both nodes finish transmitting at the same time. To detect C1 and C2, A or B can monitor the channel for other transmissions if they are within range of each other; a sender therefore concludes that a collision occurred if, listening to the channel after its transmission, it detects a busy channel and receives no acknowledgement frame. To detect C3, on the other hand, the receiver perceives an increase in received energy above its CCA threshold that is not synchronized with a start-of-frame delimiter.
S104: the coordinator initializes its action set, action-selection-probability set, and feedback set.
Specifically, a learning automaton is a probability-based learning tool that selects an activity through a stochastic activity-probability vector Pi(t); the activity-probability vector is the main member of the learning automaton and must be kept up to date. The activity-probability vector of learning automaton Ai is expressed as follows:
where Pi(t) denotes the probability that node ni selects a given duty ratio at time t. In this invention, the probability is set to the expected value of the overall feedback return of the corresponding duty ratio, defined as follows:
It should be noted that, to prevent massive data loss in the initial stage when the wireless sensor network carries very heavy traffic, the initial action is chosen as the largest duty ratio (zero sleep time), i.e., the coordinator stays in the receive state at all times, and the corresponding action-selection probability is set to 1, ensuring that the coordinator can collect more information about the network early on.
S105: the coordinator (FFD) interacts with its environment using the learning automaton (LA) method.
It should be noted that a variable-structure learning automaton can be represented by the triple LA = (α, β, p), where α = {α1, α2, ..., αr} is the automaton's action set, β = {β1, β2, ..., βr} is the set of feedback signals given by the environment, and p = {p1, p2, ..., pr} is the action-probability set, satisfying p1(n) + p2(n) + ... + pr(n) = 1, where pi(n) denotes the action probability corresponding to αi after the n-th round of the learning process. The probabilities obey the update rule p(n+1) = T(α(n), β(n), p(n)), where T denotes the learning algorithm; the general learning mechanism of the automaton is defined as follows:
where a(n) and b(n) are the weight coefficients of the linear functions gi and hi, and may be defined as linear functions or constants depending on the concrete application. Using the P-environment model, the feedback signal takes the value 0 or 1; when the feedback signal takes 0, the environment issues a reward signal. When the feedback signal for the selected action αi takes 0, the corresponding probability update is expressed as follows:
pi(n+1) = pi(n) + a(n)[1 - pi(n)], and pj(n+1) = [1 - a(n)] pj(n) for j ≠ i.
When the feedback signal takes 1, the corresponding probability update is expressed as follows:
pi(n+1) = [1 - b(n)] pi(n), and pj(n+1) = b(n)/(r - 1) + [1 - b(n)] pj(n) for j ≠ i.
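A minimal sketch of the linear reward-penalty update this step describes. The parametrization (constant step sizes a and b, and this exact redistribution across the other actions) is the textbook learning-automaton form and is an assumption here, since the patent's own formulas are not reproduced in this text.

```python
# Linear reward-penalty update for a P-model learning automaton.
# Step sizes a (reward) and b (penalty) are assumed constants.

def update_probabilities(p, chosen, beta, a=0.1, b=0.1):
    """Update action probabilities after action `chosen` received
    feedback `beta` (P-model: beta == 0 reward, beta == 1 penalty)."""
    r = len(p)
    q = list(p)
    if beta == 0:                                   # rewarded
        for j in range(r):
            q[j] = (1 - a) * p[j]
        q[chosen] = p[chosen] + a * (1 - p[chosen])
    else:                                           # penalized
        for j in range(r):
            q[j] = b / (r - 1) + (1 - b) * p[j]
        q[chosen] = (1 - b) * p[chosen]
    return q

p = [0.25, 0.25, 0.25, 0.25]
p = update_probabilities(p, chosen=1, beta=0)       # reward action 1
assert abs(sum(p) - 1.0) < 1e-9 and p[1] > 0.25
```

Note that both branches preserve the sum of probabilities at 1, so the vector remains a valid distribution after every round.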
S106: select an exploration strategy: different periods use different exploration strategies.
It should be noted that although action-selection probabilities are available here, relying solely on them can make the coordinator adjust too slowly and fail to reflect the environment in time. In this invention, the action-selection probability is therefore used as one measured parameter, assisted by an explicit exploration strategy, so that the coordinator is more sensitive to changes in its surroundings.
Specifically, the overall exploration strategy is divided into 3 stages: the initial stage, the exploratory stage, and the greedy stage.
In the initial stage, all actions in the set are explored deterministically with a cyclic search strategy: the node starts from the highest duty ratio and gradually lowers it down to the minimum duty cycle. This ensures that every duty ratio in the set is tried, i.e., the learning automaton's action set is fully enumerated.
In the exploratory stage, once all actions have been selected, the following strategy is used:
Specifically, this strategy randomly explores actions with a higher duty ratio than the current one; an increase in the selection probability indicates that the reward has increased. Otherwise, if the reward stays flat or declines, it randomly explores lower-duty-ratio actions.
In the greedy stage, after the exploration strategies have run for some time, the node's knowledge of the environment is essentially complete and it can begin selecting actions autonomously, at which point the following strategy is used:
Specifically, the greedy strategy selects the action with the best P value among the subset of actions with lower action values, i.e., a higher duty ratio than the one selected at the previous moment. When several actions in the selected subset share the same P value, the action with the lowest duty cycle (highest action value) is selected. This means the best action with the lowest duty cycle is selected whenever its reward equals or falls below the reward received in the previous stage, indicating a better P value; under stable conditions the minimum duty cycle is therefore preferred. Once an action has been selected, if the new action value differs from that of the previous stage, the node's exploration probability is increased.
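The three-stage schedule can be sketched as follows. The phase lengths and the exact rule in each phase are assumptions that approximate the description above (cyclic sweep, then probability-guided random exploration, then greedy selection); the patent's own criteria for switching stages are not given numerically.

```python
import random

# Three-phase exploration schedule (phase boundaries are assumptions).
# Action index 0 = highest duty ratio, num_actions-1 = lowest,
# matching the high-to-low sweep described in the text.

INITIAL_ROUNDS, EXPLORE_ROUNDS = 4, 20   # assumed sweeps per phase

def choose_action(n, num_actions, p, last, reward_improved, rng=random):
    """Pick an action index for learning round n."""
    if n < INITIAL_ROUNDS * num_actions:          # initial: cyclic sweep
        return n % num_actions
    if n < EXPLORE_ROUNDS * num_actions:          # exploratory phase
        if reward_improved and last > 0:          # try a higher duty ratio
            return rng.randrange(0, last)
        if last < num_actions - 1:                # try a lower duty ratio
            return rng.randrange(last + 1, num_actions)
        return last
    # greedy phase: pick the action with the best probability value
    return max(range(num_actions), key=lambda i: p[i])

a = choose_action(0, 4, [0.25] * 4, last=1, reward_improved=True)
assert a == 0                                     # sweep starts at the top
```

The greedy branch here breaks ties toward the lowest index that `max` returns; the patent's preference for the lowest duty cycle among equal P values would need an explicit tie-break in a full implementation.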
S107: evaluate the effect of the action's interaction with the environment on data transmission, and update the feedback set and action-selection-probability set.
It should be noted that the coordinator updates the reward for each beacon interval using the feedback received from the senders during the last active period. The reward function is defined as follows:
where β expresses the performance of the duty ratio selected in that stage as a combined penalty (negative) value. As the formula shows, the best reward is zero (no penalty), since it indicates no idle listening and no overflow of the transmit queue.
Specifically, the reward is based on a comparison between the queue occupancy O and a threshold Omax. If the queue occupancy exceeds the upper threshold Omax, the reward signal is negative (-1); the larger Omax is set, the more often devices are eventually forced to discard packets, and the lower the obtained reward. The choice of Omax expresses the coordinator's sensitivity to frame loss; the parameter can be configured according to the application's reliability requirements and can be set to 0.8 under normal conditions. If the queue occupancy O is below the threshold Omax, the feedback signal is defined as a negative value equal to the amount of idle listening, since idle listening is one of the main sources of energy consumption and should be as low as possible. Only when idle listening is zero and the queue occupancy O indicates no data-frame loss is the maximum reward of zero (no penalty) reached, which realizes the target of an optimal trade-off between bandwidth utilization and energy consumption.
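A minimal sketch of this reward rule as just described: a hard -1 penalty when occupancy exceeds Omax, otherwise a penalty equal to the idle-listening fraction, with 0 as the best attainable reward. The exact functional form in the patent's (unreproduced) formula may differ.

```python
# Reward/feedback beta in [-1, 0]:
#   queue occupancy above O_max  -> -1  (risk of dropped frames)
#   otherwise                    -> -IL (idle-listening fraction)
# 0 means no idle listening and no overflow risk: the optimal trade-off.

O_MAX = 0.8   # threshold value suggested in the text

def reward(queue_occupancy: float, idle_listening: float) -> float:
    """Feedback for the last beacon interval's duty-ratio choice."""
    if queue_occupancy > O_MAX:
        return -1.0
    return -idle_listening

assert reward(0.9, 0.0) == -1.0   # overflow risk dominates
assert reward(0.5, 0.3) == -0.3   # penalized by idle listening
assert reward(0.5, 0.0) == 0.0    # best case: no penalty
```

This shape makes the two failure modes comparable on one scale: too low a duty ratio drives occupancy up toward the -1 penalty, while too high a duty ratio drives the idle-listening penalty up.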
S108: select an action: based on the feedback set, choose the standard parameters BO and SO that determine the duty ratio, realizing adaptive MAC scheduling.
After the action value is selected, the standard parameters BO and SO that determine the duty ratio are adjusted. The adjustment is defined as follows:
BO=max (4, | A | → (BI-SD) < δ) (13)
SO ← max (0, BO- αt) (14)
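A minimal sketch of applying Eq. (14): the selected action value αt lowers SO relative to BO, and the resulting duty ratio follows the standard 2**(SO - BO) relation of IEEE 802.15.4. Treating αt as a small integer offset is an assumption; Eq. (13)'s clamping of BO is not reproduced here.

```python
# Eq. (14): SO <- max(0, BO - alpha_t).
# The duty ratio 2**(SO - BO) is the standard IEEE 802.15.4 relation;
# interpreting the action value alpha_t as an integer offset is assumed.

def adjust_so(bo: int, action_value: int) -> int:
    """Apply SO <- max(0, BO - alpha_t)."""
    return max(0, bo - action_value)

def resulting_duty_ratio(bo: int, action_value: int) -> float:
    """Duty ratio implied by the adjusted SO."""
    so = adjust_so(bo, action_value)
    return 2.0 ** (so - bo)

assert adjust_so(6, 2) == 4
assert resulting_duty_ratio(6, 2) == 0.25
assert adjust_so(3, 10) == 0          # clamped at zero sleep order
```

Because SO is clamped at 0, even a very aggressive action value cannot push the active period below the minimum superframe, which matches the max(0, ...) guard in Eq. (14).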
It should be noted that this selection is based on the delay experienced by the data frames; the parameter values BO and SO are embedded in the beacon frame broadcast to the terminal devices so that they can synchronize.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes it.

Claims (9)

1. A highly reliable adaptive MAC-layer scheduling method, characterized by comprising the following steps:
a first step of building a model from the wireless sensor network environment, applying the learning automaton method to the wireless sensor network environment; the sensor network environment model is represented by the triple E = (α, β, p), where α = {α1, α2, ..., αn} denotes the node's set of duty ratios, β = {β1, β2, ..., βm} denotes the feedback signal output by the node's interaction with the environment, and p = {p1, p2, ..., pn} denotes a set of reward and penalty probabilities;
a second step in which the node generates the specific frame structure format, embedding parameters such as queue occupancy and queueing delay in the reserved bits of the frame control field; specifically, each terminal device embeds its queue occupancy O and queueing delay D in the frame control structure of every data frame it transmits, this information being embedded using 3 reserved bits of the frame control field as shown in Fig. 3;
a third step in which the coordinator (FFD) performs traffic estimation and generates a traffic-adaptive set of duty ratios, each coordinator estimating the incoming traffic through the idle listening, packet accumulation, and delay in the terminal devices' transmit queues;
a fourth step in which the coordinator initializes its action set, action-selection-probability set, and feedback set;
a fifth step in which the coordinator (FFD) interacts with its environment using the learning automaton (LA) method, adopting the P-environment model with feedback signal values 0 or -1; if the feedback signal takes 0, the probability is defined as follows:
if the feedback signal takes -1, the probability is defined as follows:
a sixth step of selecting an exploration strategy: different periods select different exploration strategies, the whole learning process being divided into three stages, with the initial stage using a cyclic search strategy, the exploratory stage using a random search strategy, and the greedy stage using a greedy strategy;
a seventh step of evaluating the effect of the action's interaction with the environment on data transmission and updating the feedback set and action-selection-probability set;
an eighth step of selecting an action: based on the feedback set, selecting the standard parameters BO and SO that determine the duty ratio and thereby the optimal duty ratio, where the BO parameter is defined as follows:
BO=max (4, | A | → (BI-SD) < δ) (3)
2. the highly reliable adaptive MAC layer scheduling method of one kind according to claim 1, which is characterized in that the net Network environmental model establish, specifically, wireless sensor network environment model by three-dimensional array E=(α, β, p) indicate, wherein α= {α1, α2..., αnIndicate that node learns the limited action collection inputted automatically, the duty ratio collection of node is indicated in the present invention It closes;β={ β1, β2..., βmIndicate the feedback signal exported after node selects suitable duty ratio with environmental interaction, p= {p1, p2..., pnIndicate a series of rewards and punishments probability, each probability penalty piAll and given input variable αiIt is related.
Based on the feedback signal β, environments can be divided into three types: P-model, Q-model, and S-model. The present invention uses the P-model to model the wireless sensor network environment, so the feedback signal is Boolean, i.e., β is described only by the binary values 0 and 1.
Here α_i (α_i ∈ α) denotes the action selected by the learning automaton, p(t) denotes the probability vector at time t, P_reward denotes the reward factor, and P_penalty denotes the penalty factor; these two factors determine, respectively, how much an action's probability is increased or decreased. If the action is rewarded by the random environment, the activity probability vector P(t) is updated as follows:

p_i(t+1) = p_i(t) + P_reward(1 - p_i(t)) for the selected action α_i, and p_j(t+1) = (1 - P_reward)p_j(t) for j ≠ i; if it is penalized, p_i(t+1) = (1 - P_penalty)p_i(t), and p_j(t+1) = P_penalty/(n-1) + (1 - P_penalty)p_j(t) for j ≠ i.
3. The highly reliable adaptive MAC layer scheduling method according to claim 1, wherein the node generates a specific frame structure format; in particular, each terminal device embeds the queue occupancy O and the queueing delay D in the frame control structure of every data frame it sends, using the 3 reserved bits of the frame control field as shown in Figure 3.
Specifically, each sending node uses two bits to indicate its queue occupancy o_i at one of 4 levels, and one bit to indicate its queueing delay d_i at one of 2 levels. From the information carried in these 3 reserved bits, the coordinator can estimate the queue occupancy O and the queueing delay D. The queue occupancy O is determined as follows: if any node device meets or exceeds the maximum number of frames its queue can store, O equals 1; otherwise, O equals the average queuing occupancy of the first messages received in the CAP of the packet-accumulation period, i.e., the time at which the inactive-period queue occupancy is highest.
It should be noted that the queueing delay bit D_i of each terminal device i represents a comparison of the current beacon interval BI against the minimum of the defined delay threshold D_th: if BI is less than this minimum, the queueing delay bit D_i is '0'; otherwise it is '1'. The coordinator marks the delay as the maximum delay reported by the node devices. This is done to guarantee that any node can still transmit data when its queueing delay exceeds the threshold.
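The 3-reserved-bit layout described above (2 bits for a 4-level queue occupancy, 1 bit for the delay flag) can be sketched as a pair of pack/unpack helpers; the bit ordering chosen here (occupancy in the high two bits) is an assumption for illustration, since the claims do not fix it:

```python
def encode_status_bits(queue_level, delay_flag):
    """Pack queue occupancy (2 bits, 4 levels) and queueing delay (1 bit)
    into the 3 reserved bits of the frame-control field.
    queue_level in 0..3; delay_flag 0 or 1 (1 = delay above threshold)."""
    assert 0 <= queue_level <= 3 and delay_flag in (0, 1)
    return (queue_level << 1) | delay_flag

def decode_status_bits(bits):
    """Recover (queue_level, delay_flag) from the 3-bit field."""
    return (bits >> 1) & 0b11, bits & 0b1
```

The coordinator side would call `decode_status_bits` on every received data frame to accumulate the per-node o_i and d_i values used in the estimates of O and D.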
4. The highly reliable adaptive MAC layer scheduling method according to claim 1, wherein the coordinator (FFD) performs traffic estimation; specifically, each coordinator estimates incoming traffic by computing the idle listening, packet accumulation, and delay in the terminal devices' transmit queues. The idle listening IL is expressed as follows:

IL = 1.0 - SF_u    (7)
where SF_u denotes the superframe utilization, i.e., the ratio between the superframe time occupied by terminal devices and the total time available for data communication, defined as:

SF_u = (T_b + T_c + T_r + T_s) / SD    (8)
where SD is the superframe duration, T_b is the time the coordinator spends on beacon transmission, T_c is the time a device senses a busy channel due to frame collisions, T_r is the time spent on data reception, and T_s is defined as follows:

T_s = T_CCA + T_DATA + T_IFS + T_ACK    (9)

where T_CCA denotes the channel-assessment time in each frame transmission, T_DATA denotes the data transmission time, T_IFS denotes the interframe space, and T_ACK denotes the acknowledgement reception time.
5. The highly reliable adaptive MAC layer scheduling method according to claim 1, wherein the node initializes its action set, action-selection probability set, and feedback set. Specifically, a learning automaton is a probability-based learning tool that selects activities through the stochastic activity probability vector P_i(t); the activity probability vector is the main member of the learning automaton, so it must be kept up to date. The activity probability vector of learning automaton A_i is expressed as P_i(t), which denotes the probability that node n_i selects a given duty ratio at time t; in the present invention, this probability is expressed as the expected value of the total discounted feedback return of the corresponding duty ratio.
It should be noted that the initial action is chosen to be the largest duty ratio, i.e., the coordinator is always in the receive state, and the corresponding action-selection probability is 1; this guarantees that the coordinator can collect more network information in the early phase.
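The initialization described above concentrates all probability mass on the largest duty ratio; a one-function sketch (function name and list representation are illustrative):

```python
def init_action_probabilities(duty_ratios):
    """Initial action-selection probabilities: the largest duty ratio starts
    with probability 1, so the coordinator stays in the receive state and
    gathers network information during the early phase, as the claim states."""
    largest = max(range(len(duty_ratios)), key=lambda i: duty_ratios[i])
    return [1.0 if i == largest else 0.0 for i in range(len(duty_ratios))]
```

From this degenerate distribution, the reward/penalty updates of formulas (1) and (2) gradually spread probability onto the duty ratios that match the observed traffic.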
6. The highly reliable adaptive MAC layer scheduling method according to claim 1, wherein the coordinator (FFD) interacts with the surrounding environment using the learning automaton (LA) method. Specifically, the learning automaton model can be represented by the triple LA = (α, β, p), where α = {α_1, α_2, ..., α_r} denotes the action set of the learning automaton, β = {β_1, β_2, ..., β_r} denotes the feedback signal set given by the environment, and p = {p_1, p_2, ..., p_r} denotes the action probability set, satisfying Σ_i p_i(n) = 1, where p_i(n) denotes the action probability corresponding to α_i in the n-th round of learning. The probabilities satisfy the update formula p(n+1) = T(α(n), β(n), p(n)).
Specifically, the P-model environment is used and the feedback signal takes the value 0 or -1; when the feedback signal is 0, the environment issues a reward signal. When the feedback signal takes 0 or -1, the corresponding probability updates are as given in formulas (1) and (2), respectively.
It should be noted that while adjusting the duty ratio with the learning automaton method, the node continually receives a feedback β from the environment; the total feedback received can be understood as the immediate feedback plus the discounted future feedback, as follows:

R_t = β_t + γβ_(t+1) + γ²β_(t+2) + ... = Σ_(k=0..∞) γ^k β_(t+k)

where γ is the discount factor, γ ∈ [0, 1], representing the weight given to future feedback.
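The discounted total feedback over a finite window of observed feedback values can be computed directly; a minimal sketch, with the finite truncation of the infinite sum being the only assumption:

```python
def discounted_return(feedbacks, gamma):
    """Total feedback as immediate plus discounted future feedback:
    R = sum over k of gamma^k * beta_k, with gamma in [0, 1].
    feedbacks is the observed sequence beta_t, beta_(t+1), ... (finite window)."""
    return sum((gamma ** k) * b for k, b in enumerate(feedbacks))
```

With the P-model feedback values 0 (reward) and -1 (penalty), the return is always non-positive, and a return of 0 corresponds to the best case the claims describe: no idle listening and no queue overflow.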
7. The highly reliable adaptive MAC layer scheduling method according to claim 1, wherein different periods select different exploration strategies; in particular, the whole learning process is divided into 3 stages: the initial stage, the exploration stage, and the greedy stage.
In the initial stage, a cyclic search strategy explores every action in the set in a deterministic fashion: the node first selects the highest duty ratio and then decreases it gradually down to the minimum duty ratio. This ensures that the entire duty-ratio set is tried, i.e., the learning automaton's action set is fully enumerated.
In the exploration stage, once all actions have been selected, actions with duty ratios higher than the one selected in the current stage are explored randomly; if the corresponding β_i^t in the feedback set β increases, the duty ratio represented by action α_i is better. Otherwise, if the feedback set β remains unchanged or the corresponding β_i^t decreases, actions with lower duty ratios are explored randomly. The strategy is as follows:
In the greedy stage, after the exploratory strategy has been learning for some time, the node's knowledge of the environment is broadly accurate, and the greedy strategy is used to find the optimal action value. If the corresponding β_i^t in the feedback set β is higher than that of the previous stage, traffic has increased, and the greedy strategy selects the action subset with lower action values, i.e., higher duty ratios; if the corresponding β_i^t is lower than or equal to that of the previous stage, the greedy strategy selects the action subset with higher action values, i.e., lower duty ratios. Therefore, under stable conditions the minimum duty ratio is preferred; if the duty ratio selected in the next stage differs from that of this stage, the search probability is increased, otherwise the learning and exploration probability is reduced, so as to avoid oscillation when the best action is generated. The strategy is as follows:
Here β is expressed as the combination of the penalty (negative) values reflecting the performance of the duty-ratio selection in that stage. From formula (16) it can be concluded that the best reward is zero (no penalty), because it indicates no idle listening and a non-overflowing transmit queue.
It should be noted that if the new action is equal to the last selected action, the learning and exploration rate is reduced, so as to avoid oscillation in the selection of the optimal action.
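The three-stage policy above can be sketched as a single selection function. The shape below is a hypothetical reading of the claims: actions are indexed from the highest duty ratio (index 0) down to the lowest, and `feedback_improved` stands in for the comparison of β_i^t against the previous stage; none of these names or conventions appear in the patent itself:

```python
import random

def choose_action(stage, actions, last_idx, feedback_improved):
    """Sketch of the three-stage exploration policy.
    stage            -- 'initial' | 'explore' | 'greedy'
    actions          -- duty ratios sorted from highest (index 0) to lowest
    last_idx         -- index chosen in the previous round
    feedback_improved -- True if the feedback for the last action improved
    """
    if stage == 'initial':
        # cyclic search: sweep deterministically from highest duty ratio down
        return min(last_idx + 1, len(actions) - 1)
    if stage == 'explore':
        if feedback_improved and last_idx > 0:
            # feedback rose: randomly try a higher duty ratio
            return random.randint(0, last_idx - 1)
        # feedback unchanged or fell: randomly try a lower duty ratio
        return random.randint(last_idx, len(actions) - 1)
    # greedy stage: keep the current action while feedback holds, else step down
    if feedback_improved:
        return last_idx
    return min(last_idx + 1, len(actions) - 1)
```

In a real scheduler the `stage` variable would advance once the cyclic sweep completes and again once the action probabilities stabilize, matching the claim's transition from enumeration to random exploration to greedy exploitation.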
8. The highly reliable adaptive MAC layer scheduling method according to claim 1, wherein an action is selected and the BO and SO standard parameters that determine the duty ratio are chosen based on the feedback set, realizing adaptive MAC scheduling; in particular, after the action value is selected, the BO and SO standard parameters that determine the duty ratio are adjusted. The adjustment formulas are defined as follows:
BO = max(4, |A|), subject to (BI - SD) < δ    (17)

SO ← max(0, BO - α_t)    (18)
It should be noted that this selection is based on the delay experienced by data frames; the parameter values BO and SO are embedded in the beacon frames broadcast to the terminal devices, so that they can synchronize.
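A sketch of the BO/SO adjustment, under one hedged reading of formulas (17)-(18): BO starts at max(4, |A|) and is lowered while the beacon-interval/superframe gap BI - SD (computed from the IEEE 802.15.4 relations BI = aBaseSuperframeDuration·2^BO and SD = aBaseSuperframeDuration·2^SO) violates the δ constraint. The constant, the loop, and the function name are assumptions for illustration:

```python
ABASE = 960  # aBaseSuperframeDuration in symbols (IEEE 802.15.4)

def adjust_bo_so(n_actions, alpha_t, delta):
    """Hedged reading of eqs. (17)-(18):
    BO = max(4, |A|) constrained so that BI - SD stays below delta,
    SO = max(0, BO - alpha_t), with alpha_t the selected action value."""
    bo = max(4, n_actions)
    so = max(0, bo - alpha_t)
    # shrink BO while the inactive gap BI - SD breaches the delta bound
    while bo > 4 and ABASE * (2 ** bo - 2 ** so) >= delta:
        bo -= 1
        so = max(0, bo - alpha_t)
    return bo, so
```

The resulting (BO, SO) pair is what the coordinator would embed in the broadcast beacon frame so that all terminal devices adopt the same superframe timing.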
9. The highly reliable adaptive MAC layer scheduling method according to claim 1, wherein the corresponding device comprises:
a generation unit, which generates the specific frame control structure format and embeds parameters such as the queue occupancy and queueing delay in the reserved bits of the frame control field;
a transmission unit, by which each sensor node sends its own status to the other sensor nodes according to the generated frame format;
a receiving unit, for receiving the data frames sent by each sensor node after accessing the channel, the data frames including at least parameters such as the queue occupancy and queueing delay;
an assessment unit, which evaluates the selection probability of the transmission action according to these parameters and the working states of the coordinator and sensor nodes;
an autonomous learning unit, by which the node updates its action set, action-selection probability set, and feedback set using the learning automaton method;
a policy selection unit, which judges which stage the process is in and adopts the corresponding strategy: the cyclic exploration strategy in the initial stage, the random strategy in the exploration stage, and the greedy strategy in the final greedy stage;
an adaptive adjustment unit, which, after an action is selected, adjusts the parameters BO and SO based on the feedback set and action set, completing adaptive MAC scheduling.
CN201710946487.6A 2017-10-11 2017-10-11 High-reliability self-adaptive MAC (media Access control) layer scheduling method Active CN109660375B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710946487.6A CN109660375B (en) 2017-10-11 2017-10-11 High-reliability self-adaptive MAC (media Access control) layer scheduling method


Publications (2)

Publication Number Publication Date
CN109660375A true CN109660375A (en) 2019-04-19
CN109660375B CN109660375B (en) 2020-10-02

Family

ID=66108497

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710946487.6A Active CN109660375B (en) 2017-10-11 2017-10-11 High-reliability self-adaptive MAC (media Access control) layer scheduling method

Country Status (1)

Country Link
CN (1) CN109660375B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110856264A (en) * 2019-11-08 2020-02-28 山东大学 Distributed scheduling method for optimizing information age in sensor network
CN111542070A (en) * 2020-04-17 2020-08-14 上海海事大学 Efficient multi-constraint deployment method for industrial wireless sensor network
CN114666880A (en) * 2022-03-16 2022-06-24 中南大学 Method for reducing end-to-end delay in delay-sensitive wireless sensor network

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103260229A (en) * 2013-06-04 2013-08-21 东北林业大学 Wireless sensor network MAC protocol based on forecast and feedback


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CHEN HAO et al.: "Traffic Adaptive Duty Cycle MAC Protocol for Wireless Sensor Networks", IEEE *
FAN Qingfeng et al.: "Adaptive MAC Protocol for Wireless Sensor Networks", Computer Engineering and Applications *



Similar Documents

Publication Publication Date Title
Sherazi et al. A comprehensive review on energy harvesting MAC protocols in WSNs: Challenges and tradeoffs
JP6498818B2 (en) Communication system, access network node and method for optimizing energy consumed in a communication network
de Paz Alberola et al. Duty cycle learning algorithm (DCLA) for IEEE 802.15. 4 beacon-enabled wireless sensor networks
CN102026329B (en) Wireless communication network and self-adaptive route selecting communication method thereof
Juang et al. Energy-efficient computing for wildlife tracking: Design tradeoffs and early experiences with ZebraNet
CN1898900B (en) Supporter, client and corresponding method for power saving in a wireless local area network
JP6266773B2 (en) Communication system and method for determining optimal duty cycle to minimize overall energy consumption
Pinto et al. An approach to implement data fusion techniques in wireless sensor networks using genetic machine learning algorithms
Magistretti et al. A mobile delay-tolerant approach to long-term energy-efficient underwater sensor networking
Sarang et al. A QoS MAC protocol for prioritized data in energy harvesting wireless sensor networks
CN108712760B (en) High-throughput relay selection method based on random Learning Automata and fuzzy algorithmic approach
CN102740365B (en) Single-stream bulk data acquisition method suitable for wireless sensor network
CN109660375A (en) A kind of highly reliable adaptive MAC layer scheduling method
CN110167054A (en) A kind of QoS CR- LDP method towards the optimization of edge calculations node energy
Noh et al. Low-Latency Geographic Routing for Asynchronous Energy-Harvesting WSNs.
Gholamzadeh et al. Concepts for designing low power wireless sensor network
CN111278161B (en) WLAN protocol design and optimization method based on energy collection and deep reinforcement learning
Hong et al. ROSS: Receiver oriented sleep scheduling for underwater sensor networks
Koutsandria et al. Wake-up radio-based data forwarding for green wireless networks
Pal et al. A distributed power control and routing scheme for rechargeable sensor networks
Kinoshita et al. A data gathering scheme for environmental energy-based wireless sensor networks
CN114449629B (en) Wireless multi-hop network channel resource optimization method driven by edge intelligence
Rucco et al. A bird's eye view on reinforcement learning approaches for power management in WSNs
Afroz et al. QX-MAC: Improving QoS and Energy Performance of IoT-based WSNs using Q-Learning
Sah et al. TDMA policy to optimize resource utilization in Wireless Sensor Networks using reinforcement learning for ambient environment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant