CN105094751B - A memory management method for parallel processing of streaming data - Google Patents

A memory management method for parallel processing of streaming data

Info

Publication number
CN105094751B
CN105094751B (Application CN201510427494.6A)
Authority
CN
China
Prior art keywords
memory
consumer
stream data
producer
threshold
Prior art date
Legal status
Active
Application number
CN201510427494.6A
Other languages
Chinese (zh)
Other versions
CN105094751A (en)
Inventor
彭群
张广兴
谢高岗
Current Assignee
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS filed Critical Institute of Computing Technology of CAS
Priority to CN201510427494.6A
Publication of CN105094751A
Application granted
Publication of CN105094751B
Active legal status
Anticipated expiration

Landscapes

  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The present invention provides a memory management method for parallel processing of streaming data. The pipeline architecture for parallel streaming-data processing includes multiple groups of producers and consumers that establish one-to-one peer relationships, and each producer and each consumer in a peer relationship has a private local cache pool. The memory management method for parallel processing of streaming data comprises: 1) during parallel processing of streaming data, each producer requests memory from its private local cache pool, and when the producer's local cache pool is empty, a swap operation is triggered to obtain memory blocks, the swap operation exchanging the producer's local cache pool with the local cache pool of its peer consumer; 2) during parallel processing of streaming data, each consumer releases memory to its own local cache pool. The present invention can effectively reduce memory operation overhead and improve memory utilization.

Description

A memory management method for parallel processing of streaming data
Technical field
The present invention relates to the technical field of parallel processing of streaming data, and specifically to a memory management method for parallel processing of streaming data.
Background technology
Streaming data processing includes network packet processing (hereinafter, packet processing), video stream processing, text processing, message processing, and so on. With the rapid development of network infrastructure and the Internet industry, both network scale and service complexity keep growing. Traditional streaming-data processing techniques can no longer meet the performance requirements of high-speed networks, and parallel streaming-data processing based on multi-core processors has become the new trend. Taking packet processing as an example, Fig. 1 shows a generic multi-core parallel packet-processing architecture. Referring to Fig. 1, in this architecture a processing thread is bound to each processor core, and all processing threads together form a parallel pipeline. Input packets are processed and dispatched stage by stage, starting from the first pipeline stage and finally arriving at the last pipeline stage. In practice, both the number of pipeline stages and the number of threads per stage are adjustable.
On the other hand, in the field of parallel processing, reducing the contention among multiple processing threads that access shared resources simultaneously has always been a key problem, and parallel streaming-data processing, represented by parallel packet processing, is no exception. In parallel packet processing the most common shared resource is memory: each packet to be processed is allocated a block of memory to store related data, and this block is released once processing completes. In a multi-core parallel processing environment, multiple threads need to request and release memory at the same time, which easily leads to sustained contention for memory resources; the more processing threads there are, the fiercer the contention and the larger the resulting overhead. A high-throughput packet-processing system has to process millions of packets per second, which brings millions of memory operations. An efficient memory management method is therefore crucial to the performance of a multi-core parallel packet-processing system.
In the prior art, common memory management methods fall into two broad classes, globally-shared memory management and thread-local-caching memory management, which are introduced separately below.
1. Globally-shared memory management
Early memory allocators were all of the globally-shared type, such as the allocator implemented by Doug Lea in earlier versions of the glibc library. Under globally-shared memory management, the allocator first pre-allocates a global memory region from the heap managed by the operating system and organizes it with some data structure, such as segregated free lists. When multiple threads allocate and release memory simultaneously, every operation generally has to be synchronized with a mutual-exclusion lock in order to avoid resource conflicts (such as returning the same memory block to different threads) or corruption of internal data structures (such as scrambled pointers in the memory-block linked lists), which introduces memory-contention overhead. In multi-core parallel packet processing the number of threads is usually large, which easily causes fierce resource contention and very large contention overhead. Globally-shared allocators are therefore clearly unsuitable for parallel streaming-data processing scenarios represented by multi-core parallel packet processing.
2. Thread-local-caching memory management
To reduce the contention overhead of globally-shared memory management in multi-threaded environments, allocators optimized for multithreading with thread-local caches have gradually become mainstream, for example the Hoard allocator, the open-source TCMalloc allocator, and the ptmalloc memory management module in newer versions of glibc. The basic principle of these schemes is to add thread-local caches: besides maintaining a global heap, the allocator also maintains a separate local cache for each thread of the application.
When the application requests memory, the allocator first checks the local cache associated with the requesting thread; if the cache is not empty, memory is taken directly from it, otherwise memory is taken from the global heap. Because memory is taken from the thread's own local cache, no locking is normally required. Under this scheme, as long as most memory requests are served by the local caches, allocation performance can be very high.
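As background, the following is a minimal sketch of this check-local-first principle under assumed names; it is not the actual code of Hoard, TCMalloc or ptmalloc, the global heap is modelled simply as a locked wrapper around malloc/free, and all blocks are assumed to have one fixed size as in the packet scenario.

```c
#include <pthread.h>
#include <stdlib.h>

#define LOCAL_CAP 64

typedef struct {                        /* hypothetical per-thread block cache */
    void *blocks[LOCAL_CAP];
    int   count;
} local_cache_t;

static _Thread_local local_cache_t tls_cache;   /* one cache per thread       */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;

static void *global_heap_alloc(size_t size)     /* stand-in for the global heap */
{
    pthread_mutex_lock(&global_lock);           /* contention happens here      */
    void *blk = malloc(size);
    pthread_mutex_unlock(&global_lock);
    return blk;
}

void *cached_alloc(size_t size)
{
    if (tls_cache.count > 0)                    /* local-cache hit: no lock     */
        return tls_cache.blocks[--tls_cache.count];
    return global_heap_alloc(size);             /* miss: locked global path     */
}

void cached_free(void *blk)
{
    if (tls_cache.count < LOCAL_CAP) {          /* release into own local cache */
        tls_cache.blocks[tls_cache.count++] = blk;
        return;
    }
    pthread_mutex_lock(&global_lock);           /* overflow: back to global heap */
    free(blk);
    pthread_mutex_unlock(&global_lock);
}
```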
When the application releases memory, the allocator likewise tries to release the memory block into a thread-local cache, so that the write stays local and locking is avoided, and the refilled cache is ready for subsequent memory requests. Unlike allocation, release has two different schemes: the first releases the block back to the thread cache to which it belonged when it was originally allocated, and the second releases the block directly into the local cache of the current thread.
Thread-local-caching memory management can effectively improve performance in ordinary multi-threaded environments. In practice, however, it has been found that for parallel streaming-data processing scenarios represented by multi-core parallel packet processing, traditional thread-local caching schemes still leave considerable room for improvement.
Content of the invention
Therefore, the object of the present invention is to provide a memory management solution particularly suitable for parallel processing of streaming data.
According to an aspect of the present invention, a memory management method for parallel processing of streaming data is provided, comprising the following steps:
1) according to the architecture of the streaming-data parallel-processing pipeline, establishing one-to-one peer relationships between producers and consumers, and establishing a private local cache pool for each producer that has established a peer relationship and a private local cache pool for each consumer that has established a peer relationship;
2) during parallel processing of streaming data, each producer requesting memory from its private local cache pool and, when the producer's local cache pool is empty, triggering a swap operation to obtain memory blocks, the swap operation exchanging the producer's local cache pool with the local cache pool of its peer consumer;
3) during parallel processing of streaming data, each consumer releasing memory to its own local cache pool. It should be noted that the execution order of steps 2) and 3) above may be exchanged; neither necessarily comes first.
Wherein, in step 2), the swap operation is implemented by exchanging the pointers or object references of the local cache pools of the peer producer and consumer.
Wherein, step 1) further comprises: establishing a global memory pool.
Wherein, step 2) further comprises: after the swap operation is triggered, if the number of memory blocks in the peer consumer's cache pool is below a first threshold, stopping the swap operation and having the producer obtain memory blocks directly from the global memory pool; if the number of memory blocks in the peer consumer's cache pool reaches the first threshold, continuing to execute the swap operation.
Wherein, step 3) further comprises: when a consumer releases memory to its thread-local cache pool, if the number of memory blocks in the consumer's local cache pool exceeds a second threshold, the consumer stops releasing memory to its local cache pool and instead releases memory to the global memory pool; if the number of memory blocks in the consumer's local cache pool does not exceed the second threshold, the consumer continues releasing memory to its local cache pool; the second threshold is larger than the first threshold.
Wherein, step 2) comprises the sub-steps:
21) when a producer starts first-stage pipeline processing on a batch of streaming data, triggering the memory request for this batch of streaming data;
22) the producer determining whether its own local cache pool is empty; if so, continuing with step 23), and if not, executing step 25);
23) determining whether the stock of the peer consumer's local cache pool exceeds the first threshold; if so, executing step 24), and if not, executing step 26);
24) exchanging the thread-local cache pool of the producer with the thread-local cache pool of its peer consumer;
25) the producer obtaining the required memory blocks from its local cache pool;
26) the producer obtaining memory blocks from the global memory pool.
Wherein, step 3) comprises the sub-steps:
31) a consumer, upon completing last-stage processing of a batch of streaming data, triggering the memory release for this batch of streaming data;
32) the consumer determining whether the stock of its own local cache pool is below the second threshold; if so, executing step 33), otherwise executing step 34);
33) releasing the memory into the consumer's own local cache pool;
34) releasing the memory directly into the global memory pool.
Wherein, step 2) further comprises: each producer dynamically adjusting its own first threshold.
Wherein, in step 2), the method of dynamically adjusting the first threshold is as follows: each producer records the swap-operation trigger frequency F1 and the frequency F2 of direct accesses to the global memory pool; when C1*F1 > C2*F2 the producer's first threshold is raised, and when C1*F1 < C2*F2 the producer's first threshold is lowered, where C1 denotes the cost of a single swap operation and C2 denotes the cost of a single access to the global memory pool.
Wherein, step 3) further comprises: consumers dynamically adjusting the second threshold.
Wherein, the method of dynamically adjusting the second threshold is as follows: when the number of memory blocks in the global memory pool exceeds a preset global memory threshold, the second threshold is raised.
Compared with the prior art, the present invention has the following technical effects:
1. The present invention can effectively reduce the memory operation overhead of parallel streaming-data processing.
2. The present invention can improve the memory utilization of parallel streaming-data processing.
3. The present invention is particularly suitable for application scenarios with many pipeline stages, high parallelism and complex pipeline structures.
Brief description of the drawings
Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings, in which:
Fig. 1 shows a generic multi-core parallel packet-processing architecture;
Fig. 2 shows a schematic diagram of the one-way circular memory flow in a pipeline that uses the traditional thread-local caching scheme with memory released into the local cache;
Fig. 3 shows the multi-point caching and peer-exchange memory management architecture of the present embodiment;
Fig. 4 shows a sub-flow diagram of step 2 of the memory management method for parallel processing of streaming data according to a preferred embodiment of the present invention;
Fig. 5 shows a sub-flow diagram of step 3 of the memory management method for parallel processing of streaming data according to a preferred embodiment of the present invention.
Embodiment
The inventors have studied multi-core parallel packet processing in depth and found that directly applying existing general-purpose thread-local-caching memory management methods to pipeline-based multi-core parallel packet processing easily causes problems such as resource waste and contention overhead.
On the one hand, in a pipeline, memory is usually requested by the first-stage threads and released by the last-stage threads, while the intermediate-stage threads merely use the memory requested by the first-stage threads, for example copying the raw packet into it or reading and writing header fields, and no memory needs to be requested additionally for each intermediate-stage thread. This is the "producer-consumer" memory usage pattern of packet processing. The traditional thread-local caching scheme, however, still maintains a separate local cache for every thread, which results in wasted resources.
On the other hand, under the "producer-consumer" memory usage pattern of the pipeline, the measures used in traditional thread-local caching schemes to reduce contention overhead may fail. As mentioned above, the allocation procedure of thread-local-caching memory management is always the same, but there are two different schemes for releasing into the cache. To comprehensively assess the contention overhead of this method under multi-core parallel packet processing, the two schemes are analyzed separately below.
1) If memory release uses the scheme of releasing back to the original cache, each thread's cache area may be accessed simultaneously by its associated thread and by one or more other threads holding memory blocks that belong to that cache area, so every allocation and release operation has to take a lock. This sharply increases the contention overhead.
2) If memory release uses the scheme of releasing into the local cache, locking is needed only for the allocations and releases that go to the global heap. Under the "producer-consumer" pattern, however, this method causes the local caches to fail. With release-to-local-cache, memory always travels from the producer's cache to the consumer's cache and is then released from the consumer's cache into the global memory block, forming a one-way flow from producer cache to consumer cache and on to the global memory block. This drains the producer threads' caches; once the producer caches are drained and the consumer caches are full, all memory requests and releases have to be served by accessing the global memory block, and eventually a one-way circular flow forms among the global memory block, the producer caches and the consumer caches. Fig. 2 shows a schematic diagram of this one-way circular memory flow in a pipeline that uses the traditional thread-local caching scheme with memory released into the local cache. In this state, all memory requests and releases are served through the global memory block, which increases locked accesses to the global memory block and sharply increases the contention overhead.
Based on the above analysis, according to one embodiment of the present invention, a memory management method for parallel processing of streaming data is provided. The method can be applied to various parallel streaming-data processing scenarios such as parallel packet processing, parallel video-stream processing, parallel text processing and parallel message processing.
In this embodiment, the memory management method for parallel processing of streaming data comprises the following steps:
Step 1: according to the architecture of the streaming-data parallel-processing pipeline, establish a memory management structure based on multi-point caching and peer exchange.
In this embodiment, the pipeline uses the conventional "producer-consumer" memory usage pattern, in which the threads of the first pipeline stage are called "producers" and the threads of the last pipeline stage are called "consumers". In the multi-point-caching and peer-exchange memory management system, equal numbers of producers and consumers are mapped one to one, forming "peer" relationships with each other. For example, the consumer corresponding to a producer may be called that producer's peer consumer, and the producer corresponding to a consumer may be called that consumer's peer producer.
Fig. 3 shows the multi-point caching and peer-exchange memory management architecture of this embodiment. In this architecture, a globally shared global memory pool is established for the pipeline, and a private local cache pool is established for each producer and each consumer in a "peer" relationship. The global memory pool, the producers' thread-local cache pools and the consumers' thread-local cache pools are all managed as self-built memory pools, so that the organization and use of memory can be managed by the application itself. Initially, the global memory pool and the producers' cache pools are filled with a certain number of memory blocks requested in advance from the operating system, while the consumers' cache pools remain empty.
It should be noted that when the number of first-stage threads (i.e. producers) in the pipeline architecture equals the number of last-stage threads (i.e. consumers), one-to-one mappings between producers and consumers can be established directly. When the number of producers differs from the number of consumers, an equal number of producers and consumers are chosen and mapped one to one, and the remaining producers or consumers request or release memory directly from the global memory pool and no longer take part in the subsequent steps 2, 3 and 4.
Memory management generally has two implementation approaches. One is dynamic memory allocation, i.e. requesting and releasing memory directly with system library functions (such as the malloc and free functions of the C library); the other is a self-built memory pool. Although dynamic allocation is simple to use (the application itself does not need to care about it), the system library functions must be sufficiently general, so their performance and efficiency are often not high. A packet-processing system has very demanding performance requirements, and because of network-protocol constraints the memory block size required by raw packets is usually fairly fixed, so this embodiment uses the self-built memory pool approach. On the one hand, memory-pool operations are much simpler than general-purpose allocation, so their cost is very small; on the other hand, because the memory pool hands out blocks of a fixed size, no external fragmentation is produced and memory utilization is relatively high.
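The following is a minimal sketch, under assumed names and sizes, of such a self-built fixed-size memory pool and of the initial layout described above (a pre-filled global pool, pre-filled producer pools, initially empty consumer pools); it is an illustration, not the patent's reference implementation.

```c
#include <pthread.h>
#include <stdlib.h>

#define BLOCK_SIZE  2048                 /* assumed fixed packet-buffer size  */
#define POOL_CAP    1024                 /* capacity of each local pool       */
#define GLOBAL_CAP  8192                 /* blocks pre-allocated globally     */
#define NUM_PEERS   4                    /* producer/consumer peer pairs      */

typedef struct {                         /* a pool is a stack of block ptrs   */
    void           *blocks[GLOBAL_CAP];
    int             count;
    pthread_mutex_t lock;                /* only the global pool is contended */
} mem_pool_t;

static mem_pool_t global_pool;
static mem_pool_t producer_pool[NUM_PEERS];   /* private: no locking needed   */
static mem_pool_t consumer_pool[NUM_PEERS];   /* private: no locking needed   */

static void *pool_pop(mem_pool_t *p)     /* O(1): no search, no splitting     */
{
    return p->count > 0 ? p->blocks[--p->count] : NULL;
}

static void pool_push(mem_pool_t *p, void *blk)
{
    p->blocks[p->count++] = blk;
}

void pools_init(void)
{
    pthread_mutex_init(&global_pool.lock, NULL);
    for (int i = 0; i < GLOBAL_CAP; i++)          /* pre-allocate fixed blocks */
        pool_push(&global_pool, malloc(BLOCK_SIZE));

    for (int i = 0; i < NUM_PEERS; i++) {
        for (int j = 0; j < POOL_CAP; j++)        /* producer pools start full */
            pool_push(&producer_pool[i], pool_pop(&global_pool));
        /* consumer pools are deliberately left empty at start-up */
    }
}
```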
Step 2: during parallel processing of streaming data, each producer requests memory from its own thread-local cache pool. If the producer's thread-local cache pool is empty, a swap operation is triggered to obtain memory blocks. The swap operation exchanges the producer's thread-local cache pool with the thread-local cache pool of its peer consumer. The swap operation allows producers and consumers to keep operating on their respective cache pools without having to access the global memory pool, so cache resources are recycled without going through the global memory pool. In the implementation, the swap operation only needs to exchange the pointers (or object references) of the respective local cache pools and requires no lock, so its cost is very small. Moreover, since a swap is triggered only when the producer's thread-local cache pool is empty, the cost amortized over each memory operation is tiny.
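The swap itself can be sketched as a plain exchange of two pool pointers (assumed names; coordination between the two peer threads is not shown, the embodiment stating that no lock is needed because only the producer triggers the swap):

```c
typedef struct mem_pool mem_pool_t;          /* opaque local cache pool type  */

struct peer_pair {                           /* state of one producer/consumer pair */
    mem_pool_t *producer_pool;               /* private pool of the producer        */
    mem_pool_t *consumer_pool;               /* private pool of its peer consumer   */
};

static void swap_pools(struct peer_pair *pp)
{
    mem_pool_t *tmp   = pp->producer_pool;   /* currently empty producer pool      */
    pp->producer_pool = pp->consumer_pool;   /* producer takes the filled pool     */
    pp->consumer_pool = tmp;                 /* peer consumer gets the empty pool  */
}
```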
Step 3: during parallel processing of streaming data, each consumer releases memory to its own thread-local cache pool.
Through the above design of multi-point cached allocation and release, most memory requests and releases only need to access the thread's private local cache and require no lock, so performance is very high. On the other hand, compared with the traditional thread-local caching scheme, this method does not need to establish cache pools for unrelated threads (such as intermediate pipeline threads or management threads), and the consumers' cache pools are not filled initially, so memory utilization is further improved.
Furthermore, when parallel streaming-data processing is implemented, some thread pipelines may have load-balancing scheduling mechanisms, which may cause the speed at which a producer requests memory to differ from the speed at which its peer consumer releases memory. As a result, the number of memory blocks exchanged each time between some producer-consumer pairs may gradually decrease, which in turn makes swap operations more frequent, and a producer's cache pool and its peer consumer's cache pool may even both become empty, leaving nothing to swap. To avoid this situation, the memory management method of another embodiment of the present invention adds a balancing mechanism, which includes: A. when a producer's cache pool is empty, if the peer consumer's cache pool holds fewer blocks than a first threshold (also called the low threshold), the stock at the peer end is considered insufficient and the swap is not worthwhile, so no peer swap is performed and the producer instead obtains memory blocks directly from the global pool; B. when a consumer releases memory, if the stock of its local cache pool exceeds a second threshold (also called the high threshold), the local cache pool is considered to occupy too much of the current memory resources, so memory is no longer released into the local cache pool but directly into the global memory pool.
Fig. 4 shows a sub-flow diagram of step 2 of the memory management method for parallel processing of streaming data according to a preferred embodiment of the present invention. Referring to Fig. 4, in the memory management method of this preferred embodiment, step 2 comprises the following sub-steps (a code sketch of this request path is given after step 27):
Step 21: when a producer starts first-stage pipeline processing on a batch of streaming data, the memory request for this batch of streaming data is triggered.
Step 22: the producer determines whether its own local cache pool (i.e. the producer's local cache pool) is empty; if so, it continues with step 23, and if not, it executes step 25.
Step 23: determine whether the stock of the peer consumer's local cache pool exceeds the first threshold; if so, execute step 24, and if not, execute step 26.
Step 24: perform the swap operation, exchanging the thread-local cache pool of the producer with the thread-local cache pool of its peer consumer. The swap is completed by exchanging the pointers (or object references) of the respective local cache pools.
Step 25: the producer obtains the required memory blocks from its local cache pool, then goes to step 27.
Step 26: the producer obtains memory blocks from the global memory pool, then goes to step 27.
Step 27: this memory request is complete; return.
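A sketch of this request path (steps 21 to 27) under assumed names: pool_pop is the O(1) private-pool operation from the earlier sketch, global_pool_pop is the only call assumed to take a lock, and the pair's two pool pointers are assumed to live in state visible to both peer threads, with inter-thread coordination omitted for brevity.

```c
typedef struct {
    void **blocks;                 /* stack of free block pointers            */
    int    count;
} pool_t;

typedef struct {
    pool_t *local;                 /* producer's private local cache pool     */
    pool_t *peer;                  /* local cache pool of its peer consumer   */
    int     low_threshold;         /* first threshold (minimum peer stock)    */
} producer_t;

extern void *pool_pop(pool_t *p);        /* assumed: lock-free private pop    */
extern void *global_pool_pop(void);      /* assumed: locked global-pool pop   */

void *producer_request_block(producer_t *prod)
{
    /* step 22 -> step 25: local pool not empty, take a block directly */
    if (prod->local->count > 0)
        return pool_pop(prod->local);

    /* step 23 -> step 24: peer stock exceeds the first threshold, swap */
    if (prod->peer->count > prod->low_threshold) {
        pool_t *tmp = prod->local;           /* exchange the two pool pointers */
        prod->local = prod->peer;
        prod->peer  = tmp;
        return pool_pop(prod->local);        /* then step 25 on the new pool   */
    }

    /* step 23 -> step 26: peer stock too low, fall back to the global pool */
    return global_pool_pop();
}
```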
Fig. 5 shows a sub-flow diagram of step 3 of the memory management method for parallel processing of streaming data according to a preferred embodiment of the present invention. Referring to Fig. 5, in the memory management method of this preferred embodiment, step 3 comprises the following sub-steps (a code sketch of this release path is given after step 35):
Step 31: a consumer, upon completing last-stage processing of a batch of streaming data, triggers the memory release for this batch of streaming data.
Step 32: the consumer determines whether the stock of its own local cache pool (i.e. the consumer's local cache pool) is below the second threshold; if so, it executes step 33, otherwise it executes step 34.
Step 33: the memory is released into the consumer's own local cache pool, then go to step 35.
Step 34: the memory is released directly into the global memory pool, then go to step 35.
Step 35: this memory release is complete; return.
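A matching sketch of the release path (steps 31 to 35), under the same assumed names as the request-path sketch above:

```c
typedef struct {
    void **blocks;                 /* stack of free block pointers             */
    int    count;
} pool_t;

typedef struct {
    pool_t *local;                 /* consumer's private local cache pool      */
    int     high_threshold;        /* second threshold (maximum local stock)   */
} consumer_t;

extern void pool_push(pool_t *p, void *blk);   /* assumed: lock-free push      */
extern void global_pool_push(void *blk);       /* assumed: locked global push  */

void consumer_release_block(consumer_t *cons, void *blk)
{
    /* step 32 -> step 33: local stock below the second threshold, keep it local */
    if (cons->local->count < cons->high_threshold) {
        pool_push(cons->local, blk);
        return;
    }
    /* step 32 -> step 34: local pool already holds too many blocks, release globally */
    global_pool_push(blk);
}
```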
Furthermore, in a preferred embodiment, in order to further reduce memory operation overhead, the first threshold and the second threshold are also adjusted dynamically.
As mentioned above, the first threshold is a low threshold: the smaller this threshold, the more swap operations occur (because the swap condition is easier to satisfy) and the fewer balancing adjustments are triggered; conversely, the larger the first threshold, the fewer swap operations and the more balancing adjustments. Similarly, the second threshold is a high threshold: the smaller this threshold, the more swap operations occur (because the average amount exchanged per swap decreases, so the producer's local cache pool runs out more easily) and the higher the memory utilization (because fewer memory blocks sit idle in the local cache pools); conversely, the larger the second threshold, the fewer swap operations and the lower the memory utilization.
Both swap operations and balancing adjustments incur some overhead, and under different scenarios and operating states the sizes of the two thresholds affect these two classes of overhead differently. Dynamically adjusting the thresholds based on historical feedback therefore makes it possible to iteratively reduce whichever overhead currently dominates, minimizing the overall overhead and further improving memory usage efficiency.
Based on the above analysis, this embodiment provides a method for dynamically adjusting the first threshold and the second threshold, comprising the following steps (a code sketch follows step 46):
41) assess or measure in advance the cost of a single swap operation (denoted C1) and the cost of accessing the global cache pool during a balancing adjustment (denoted C2);
42) each thread maintains its own threshold (producers maintain the low threshold, consumers maintain the high threshold), so that adjustments are personalized according to the thread's own operating state;
43) each thread additionally maintains two dynamic values, the swap-operation trigger frequency (denoted F1) and the balancing-adjustment trigger frequency (denoted F2), as run-time historical feedback;
44) when C1*F1 > C2*F2, the overhead of swap operations currently dominates, so the low threshold can be raised to reduce swap overhead;
45) when C1*F1 < C2*F2, the overhead of balancing adjustments currently dominates, so the low threshold can be lowered to reduce the balancing overhead;
46) when the global memory pool has ample memory resources (which can be judged against a preset global memory threshold), the high threshold can be raised to reduce swap overhead, i.e. memory is traded for performance.
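A sketch of this adjustment rule under assumed names; the step size and the clamping are illustrative choices, and only the comparison of C1*F1 against C2*F2 and the global-pool check come from the steps above.

```c
#define THRESHOLD_STEP 8                 /* assumed adjustment granularity     */

typedef struct {
    double c1;                           /* measured cost of one swap           */
    double c2;                           /* measured cost of one global access  */
    double f1;                           /* swap-operation trigger frequency    */
    double f2;                           /* balancing-adjustment frequency      */
    int    low_threshold;                /* first threshold, per producer       */
} producer_stats_t;

/* Steps 44) and 45): each producer nudges its own low threshold. */
void producer_adjust_low_threshold(producer_stats_t *s)
{
    if (s->c1 * s->f1 > s->c2 * s->f2)          /* swap overhead dominates      */
        s->low_threshold += THRESHOLD_STEP;     /* raise it: swap less often    */
    else if (s->c1 * s->f1 < s->c2 * s->f2)     /* balancing overhead dominates */
        s->low_threshold -= THRESHOLD_STEP;     /* lower it: swap more often    */
    if (s->low_threshold < 0)
        s->low_threshold = 0;
}

/* Step 46): a consumer raises its high threshold while global memory is ample. */
void consumer_adjust_high_threshold(int *high_threshold,
                                    int  global_free_blocks,
                                    int  global_threshold)
{
    if (global_free_blocks > global_threshold)  /* trade memory for performance */
        *high_threshold += THRESHOLD_STEP;
}
```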
Experiments show that the present invention can effectively reduce the memory operation overhead of parallel streaming-data processing and improve memory utilization. Moreover, the present invention is particularly suitable for application scenarios with many pipeline stages and complex pipeline structures.
The foregoing is merely an illustrative embodiment of the present invention and is not intended to limit its scope. Any equivalent variations, modifications and combinations made by those skilled in the art without departing from the concept and principles of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. A memory management method for parallel processing of streaming data, the parallel processing of streaming data being realized on a pipeline architecture, the pipeline architecture comprising multiple groups of producers and consumers that establish one-to-one peer relationships, each producer in a peer relationship having a private local cache pool and each consumer in a peer relationship also having a private local cache pool, the memory management method for parallel processing of streaming data comprising the following steps:
1) during parallel processing of streaming data, each producer requesting memory from its private local cache pool and, when the producer's local cache pool is empty, triggering a swap operation to obtain memory blocks, the swap operation exchanging the pointer or object reference of the producer's local cache pool with the pointer or object reference of the local cache pool of its peer consumer;
2) during parallel processing of streaming data, each consumer releasing memory to its own local cache pool.
2. The memory management method for parallel processing of streaming data according to claim 1, characterized in that in step 1) the swap operation is implemented by exchanging the pointers or object references of the local cache pools of the peer producer and consumer.
3. The memory management method for parallel processing of streaming data according to claim 1, characterized in that step 1) further comprises: after the swap operation is triggered, if the number of memory blocks in the peer consumer's cache pool is below a first threshold, stopping the swap operation and having the producer obtain memory blocks directly from a global memory pool; if the number of memory blocks in the peer consumer's cache pool reaches the first threshold, continuing to execute the swap operation.
4. The memory management method for parallel processing of streaming data according to claim 3, characterized in that step 2) further comprises: when a consumer releases memory to its thread-local cache pool, if the number of memory blocks in the consumer's local cache pool exceeds a second threshold, the consumer stops releasing memory to its local cache pool and instead releases memory to the global memory pool; if the number of memory blocks in the consumer's local cache pool does not exceed the second threshold, the consumer continues releasing memory to its local cache pool; the second threshold is larger than the first threshold.
5. The memory management method for parallel processing of streaming data according to claim 3, characterized in that step 1) comprises the sub-steps:
11) when a producer starts first-stage pipeline processing on a batch of streaming data, triggering the memory request for this batch of streaming data;
12) the producer determining whether its own local cache pool is empty; if so, continuing with step 13), and if not, executing step 15);
13) determining whether the stock of the peer consumer's local cache pool exceeds the first threshold; if so, executing step 14), and if not, executing step 16);
14) exchanging the thread-local cache pool of the producer with the thread-local cache pool of its peer consumer;
15) the producer obtaining the required memory blocks from its local cache pool;
16) the producer obtaining memory blocks from the global memory pool.
6. The memory management method for parallel processing of streaming data according to claim 4, characterized in that step 2) comprises the sub-steps:
21) a consumer, upon completing last-stage processing of a batch of streaming data, triggering the memory release for this batch of streaming data;
22) the consumer determining whether the stock of its own local cache pool is below the second threshold; if so, executing step 23), otherwise executing step 24);
23) releasing the memory into the consumer's own local cache pool;
24) releasing the memory directly into the global memory pool.
7. The memory management method for parallel processing of streaming data according to claim 3, characterized in that step 1) further comprises: each producer dynamically adjusting its own first threshold.
8. The memory management method for parallel processing of streaming data according to claim 7, characterized in that in step 1) the method of dynamically adjusting the first threshold is as follows: each producer records the swap-operation trigger frequency F1 and the frequency F2 of direct accesses to the global memory pool; when C1*F1 > C2*F2 the producer's first threshold is raised, and when C1*F1 < C2*F2 the producer's first threshold is lowered, where C1 denotes the cost of a single swap operation and C2 denotes the cost of a single access to the global memory pool.
9. The memory management method for parallel processing of streaming data according to claim 4, characterized in that step 2) further comprises: consumers dynamically adjusting the second threshold.
10. The memory management method for parallel processing of streaming data according to claim 9, characterized in that the method of dynamically adjusting the second threshold is as follows: when the number of memory blocks in the global memory pool exceeds a preset global memory threshold, the second threshold is raised.
CN201510427494.6A 2015-07-20 2015-07-20 A memory management method for parallel processing of streaming data Active CN105094751B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510427494.6A CN105094751B (en) 2015-07-20 2015-07-20 A memory management method for parallel processing of streaming data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510427494.6A CN105094751B (en) 2015-07-20 2015-07-20 A memory management method for parallel processing of streaming data

Publications (2)

Publication Number Publication Date
CN105094751A CN105094751A (en) 2015-11-25
CN105094751B (en) 2018-01-09

Family

ID=54575299

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510427494.6A Active CN105094751B (en) 2015-07-20 2015-07-20 A memory management method for parallel processing of streaming data

Country Status (1)

Country Link
CN (1) CN105094751B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108595259B (en) * 2017-03-16 2021-08-20 哈尔滨英赛克信息技术有限公司 Memory pool management method based on global management
CN107864391B (en) * 2017-09-19 2020-03-13 北京小鸟科技股份有限公司 Video stream cache distribution method and device
CN110062038A (en) * 2019-04-09 2019-07-26 网宿科技股份有限公司 A kind of data transmission scheduling method and system
CN110162401A (en) * 2019-05-24 2019-08-23 广州中望龙腾软件股份有限公司 The parallel read method of DWG file, electronic equipment and storage medium
CN110633233A (en) * 2019-06-28 2019-12-31 中国船舶重工集团公司第七0七研究所 DMA data transmission processing method based on assembly line
DE102019215292A1 (en) * 2019-10-04 2021-04-08 Robert Bosch Gmbh Data structure, storage means and device
CN112650693B (en) * 2020-12-30 2024-06-11 上海创功通讯技术有限公司 Static memory management method and device
CN114928579B (en) * 2021-02-01 2024-04-09 腾讯科技(深圳)有限公司 Data processing method, device, computer equipment and storage medium
CN113778688B (en) * 2021-09-17 2024-06-14 Oppo广东移动通信有限公司 Memory management system, memory management method, and memory management device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102567107A (en) * 2011-10-31 2012-07-11 广东电网公司电力科学研究院 Highly-concurrent real-time memory resource management and scheduling method
CN102662761A (en) * 2012-03-27 2012-09-12 福建星网锐捷网络有限公司 Method and device for scheduling memory pool in multi-core central processing unit system
CN102761489A (en) * 2012-07-17 2012-10-31 中国科学技术大学苏州研究院 Inter-core communication method realizing data packet zero-copying based on pipelining mode

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7840752B2 (en) * 2006-10-30 2010-11-23 Microsoft Corporation Dynamic database memory management policies

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102567107A (en) * 2011-10-31 2012-07-11 广东电网公司电力科学研究院 Highly-concurrent real-time memory resource management and scheduling method
CN102662761A (en) * 2012-03-27 2012-09-12 福建星网锐捷网络有限公司 Method and device for scheduling memory pool in multi-core central processing unit system
CN102761489A (en) * 2012-07-17 2012-10-31 中国科学技术大学苏州研究院 Inter-core communication method realizing data packet zero-copying based on pipelining mode

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Multi-threaded memory management technology for real-time processing of network data; Wang Dongbin et al.; Proceedings of the National Symposium on Network and Information Security Technology (Vol. I); 2007-07-01; pp. 263-268 *
Efficient memory management for structure data layout optimization; Yan Jianian et al.; Journal of Tsinghua University (Science and Technology); 2011-01-15; Vol. 51, No. 1; pp. 68-72 *

Also Published As

Publication number Publication date
CN105094751A (en) 2015-11-25

Similar Documents

Publication Publication Date Title
CN105094751B (en) A memory management method for parallel processing of streaming data
Khorasani et al. Scalable simd-efficient graph processing on gpus
CN101799773B (en) Memory access method of parallel computing
CN103914399B (en) Disk buffering method and device in a kind of concurrent computational system
Jin et al. Bar: An efficient data locality driven task scheduling algorithm for cloud computing
CN103577158B (en) Data processing method and device
US8997109B2 (en) Apparatus and method for managing data stream distributed parallel processing service
WO2015058695A1 (en) Memory resource optimization method and apparatus
US20120174117A1 (en) Memory-aware scheduling for numa architectures
US20100083259A1 (en) Directing data units to a core supporting tasks
CN104679594B (en) A kind of middleware distributed computing method
CN109471872A (en) Handle the method and device of high concurrent inquiry request
CN106201720B (en) Virtual symmetric multi-processors virtual machine creation method, data processing method and system
CN108509280B (en) Distributed computing cluster locality scheduling method based on push model
Wang et al. Hybrid pulling/pushing for i/o-efficient distributed and iterative graph computing
JP7205033B2 (en) Cache allocation method and device, storage medium, electronic device
CN103970602A (en) Data flow program scheduling method oriented to multi-core processor X86
CN108073457B (en) Layered resource management method, device and system of super-fusion infrastructure
CN110990154A (en) Big data application optimization method and device and storage medium
CN118012788B (en) Data processor, data processing method, electronic device, and storage medium
CN116401055A (en) Resource efficiency optimization-oriented server non-perception computing workflow arrangement method
CN113672391A (en) Parallel computing task scheduling method and system based on Kubernetes
CN111404818A (en) Routing protocol optimization method for general multi-core network processor
Sontakke et al. Optimization of hadoop mapreduce model in cloud computing environment
CN109828842A (en) A kind of high-performance data acquisition engine method based on DPDK technological development

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant