US20100088325A1 - Streaming Queries - Google Patents
- Publication number
- US20100088325A1 (application US12/246,509)
- Authority
- US
- United States
- Prior art keywords
- operator
- recursive
- stream
- query
- time
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24568—Data stream processing; Continuous queries
Definitions
- Computers are very effective at storing large amounts of data, such as in a database.
- techniques have been refined for establishing computational options, such as accessing or querying the stored data, viewing the data, modifying the data, etc.
- the data can be thought of as relatively static, so techniques such as database querying tend not to be well suited to time-sensitive scenarios, such as those involving real-time or near real-time data.
- a database query technique designed to retrieve a definition of a word from a dictionary database need not be time sensitive since the data is statically stored in the database.
- a temperature sensor may be configured to periodically output a time-stamped signal corresponding to a sensed temperature. When viewed collectively this output can be thought of as a stream of data or a data stream.
- database querying techniques are not generally applicable in the data stream scenarios. Instead, stream processing techniques have been developed for use with data streams.
- Stream processing techniques offer much more limited computational options than those available in traditional database scenarios. Stated another way, only a very small set of computations can presently be performed with stream processing. The present concepts introduce new stream processing techniques that greatly increase the set of computations that can be accomplished with stream processing.
- the described implementations relate to recursive streaming queries.
- One method or technique processes a recursive streaming query through a query graph.
- the technique also detects when output produced by executing the query graph advances to a specific point.
- Another implementation is manifested as a method that processes at least one input stream associated with a recursive streaming query.
- the technique also advances time for the recursive streaming query to a specific point when at least one input stream has advanced to the specific point and recursive computations on the input stream are complete to the point.
- FIGS. 1-5 show exemplary graphs for processing recursive streaming queries in accordance with some implementations of the present concepts.
- FIG. 6 is a flowchart of exemplary recursive streaming query processing techniques in accordance with some implementations of the present concepts.
- a data stream or streaming data can be thought of as events or notifications that are generated in real-time or near real-time.
- an event can be thought of as including event data or payload and a timestamp.
- Processing recursive streaming queries can entail the use of one or more recursions.
- a recursion can be thought of as a function that is defined in terms of itself so that it can involve potentially infinite or unbounded computation.
- computation resources are reserved for specific events until the resources are no longer needed.
- the present implementations offer solutions for detecting when recursive processing is completed up to a specific point in time. Thus, the recursion may remain infinite, but the present techniques can identify specific time periods for which the recursive processing of streaming queries is complete. Computation resources can then be freed up to the specific point in time.
- FIG. 1 illustrates an exemplary recursive streaming query processing method generally at 100 .
- the method processes at least one input stream (i.e., streaming data 102 ) associated with a recursive streaming query at 104 .
- the method also advances time for the recursive streaming query to a specific point when two conditions are met. First, the one input stream has advanced to the specific point and second, recursive computations on the input stream are complete to the point.
- streaming data 102 is emitted from a temperature sensor 108 and processed on a query graph 110 .
- the temperature sensor is offered as a simple example of a source of streaming data and the skilled artisan should recognize many other sources, some examples of which are described below in relation to FIGS. 2-4 .
- a single data stream 102 is input into query graph 110 .
- Other examples where multiple data streams are input into a query graph are described below in relation to FIGS. 2-5 .
- a recursive streaming query relating to streaming data 102 can be performed on query graph 110 such as by performing a recursive step 112 via a recursive loop 114 .
- a recursive streaming query based on streaming data 102 can, in some instances, be characterized as infinite or running forever. However, portions of the recursive streaming query can be executed on recursive loop 114 to generate an output 116 .
- the present implementations can detect when output 116 has advanced to a specific point in time.
- the present implementations can detect when the query graph 110 has advanced to a specific point in time as portions of the recursive query are completed. This can also be thought of as detecting forward time progress. Stated another way, the technique can detect when a region of the query graph upstream of a certain point, such as point 118 , has completed processing, including recursive processing, relative to a specific point in time. The technique can cause the query graph to issue a notice from point 118 that computations upstream from that point have advanced to the specific point in time.
- FIG. 1 introduces the concept that query graph 110 can process a recursive streaming query.
- FIG. 2 introduces examples of components that can accomplish the computations associated with processing a recursive streaming query.
- FIG. 2 shows a query graph 210 that includes a plurality of operators 212 for processing a recursive search query from two input streams 214 ( 1 ), 214 ( 2 ).
- query graph 210 includes six operators 212 ( 1 ), 212 ( 2 ), 212 ( 3 ), 212 ( 4 ), 212 ( 5 ), and 212 ( 6 ).
- the term “operator” 212 is used in that the operators “operate”, or perform computations, upon the streaming data responsive to the recursive search query to generate an output from the graph at 216 .
- an operator can receive one or more inputs and process the inputs according to a set of conditions. If the conditions are satisfied, then the operator can generate an output that can be delivered to one or more other operators.
- operator 212 ( 1 ) can be termed a “project” operator; operator 212 ( 2 ) can be termed a “union” operator; operator 212 ( 3 ) can be termed a “join” operator; operator 212 ( 4 ) can be termed a “select” operator; operator 212 ( 5 ) is another project operator; and operator 212 ( 6 ) can be termed a flying fixed-point (FFP) operator.
- query graph 210 can be viewed as being defined by its operators since a number, type, and/or arrangement of operators can be adapted to specific recursive search queries. So, a query is achieved by operating on one or more input streams with the selected operators to generate an output.
- Input data streams 214 ( 1 ) and 214 ( 2 ) describe a changing graph, composed of nodes and edges.
- 214 ( 2 ) describes the (possibly changing) nodes in the changing graph, while 214 ( 1 ) describes the changing set of edges between nodes.
- 214 ( 1 ) and 214 ( 2 ) can be thought of as defining a dynamic input graph that is operated on by query graph 210 .
- the graph is dynamic in that the input streams can change over time.
- the present concepts can be applied to many interesting streaming graph-search problems, such as finding a minimum path to a destination on a road network from a changing location and given changing traffic conditions.
- Another potential application can be regular expression matching over streams.
- Another application can be any form of looping where the process cannot bound the number of iterations at the time the recursive streaming query is created.
- Query graph 210 offers an example of how streaming query results are computed recursively through an example query.
- the example query is a graph reachability query.
- query graph 210 provides two input data streams 214 ( 1 ) and 214 ( 2 ).
- Data stream 214 ( 1 ) relates to edges and data stream 214 ( 2 ) relates to source nodes.
- the plan is a directed graph of streaming versions of relational operators, where each arrow in the diagram is a data stream, and is labeled with the schema of the events traveling along the data stream. Assume for discussion purposes that all stream events are tagged with the application time Vs at which the event becomes valid.
- the data streams can be interpreted as describing a changing relation. Since the present discussion assumes a single window of infinite duration, the contents of the relation at any time t can be all of the events with Vs ≤ t. Operators 212 ( 1 )- 212 ( 6 ) then output event streams that describe the changing view computed over the changing input according to the relational semantics of individual operators.
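- as a minimal sketch of this interpretation, the Python below computes the contents of the relation at a time t from a list of events. The Event class, the relation_at function, and the sample node names are illustrative assumptions and are not part of the described implementation; the comparison is read as Vs ≤ t.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List


@dataclass(frozen=True)
class Event:
    vs: int       # application time Vs at which the event becomes valid
    payload: Any  # event data


def relation_at(events: Iterable[Event], t: int) -> List[Any]:
    """Payloads of every event with Vs <= t (single window of infinite duration)."""
    return [e.payload for e in events if e.vs <= t]


# Hypothetical sample input: two nodes inserted at time 1, one at time 3.
nodes = [Event(1, "n1"), Event(1, "n2"), Event(3, "n3")]
print(relation_at(nodes, 2))
```

- because the window never closes in this sketch, a payload stays in the relation at every later time once its Vs has been reached.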
- the present configuration utilizes an FFP operator 212 ( 6 ).
- the FFP operator offers a means to achieve recursion.
- the FFP generates a multicast output 220 that is forwarded to a conventional, non-recursive output indicated at 220 ( 1 ), as well as to one of its descendants in the operator graph.
- output 220 ( 2 ) recursively loops back to union operator 212 ( 2 ) thereby forming a recursive loop 222 .
- the result can be thought of as a form of recursion which terminates when a fixed point is reached.
- FIG. 3 shows a graph 310 that can be used as input for input stream 214 ( 2 ) of FIG. 2 .
- FIG. 3 illustrates nodes 302 ( 1 ), 302 ( 2 ), 302 ( 3 ), and 302 ( 4 ).
- Individual nodes 302 ( 1 )- 302 ( 4 ) are labeled with both the node name as well as the valid time for the node insertion event.
- the graph also illustrates edges 304 ( 1 ), 304 ( 2 ), 304 ( 3 ), and 304 ( 4 ) with accompanying valid times of their edge insertion events.
- nodes 302 ( 1 )- 302 ( 4 ) are what would flow in on the nodes input 214 ( 2 ).
- edges 304 ( 1 )- 304 ( 4 ) are what flow in on the edges input 214 ( 1 ).
- Time 1: the technique receives four input events on the nodes data stream 214 ( 2 ), which correspond to nodes n 1 , n 2 , n 3 , and n 4 (i.e., 302 ( 1 )- 302 ( 4 )).
- an event can be thought of as a payload and a timestamp. So for instance, node 302 ( 1 ) with a timestamp of 1 is an event. Note that the projection above the nodes stream produces the following 4 events:
- these events then travel through the union operator 212 ( 2 ) and lodge in the join operator's right join synopsis as indicated at 224 . Since there is no input on the left side of the join operator 212 ( 3 ), the process has reached a fixed point. (If it was desired to limit the set of nodes considered as source nodes for reachability, the technique could limit the nodes stream 214 ( 2 ) to only those nodes.)
- Time 2: the technique can receive one event in the edges data stream 214 ( 1 ). This edge travels up to the join operator 212 ( 3 ), which then lodges it in its left synopsis at 226 .
- the event is:
- the select operator 212 ( 4 ) checks if there is a cycle by seeing if the path described above already includes the destination in the new, derived path. This determination is made by checking the 1st bit, since the technique is following the path to n 1 . Since this bit is not set, the event reaches the project operator 212 ( 5 ), which removes unneeded columns and sets the appropriate bit in bv. The result is:
- the technique now reaches the FFP operator 212 ( 6 ), which both outputs the result from the query graph at 220 ( 1 ), and inserts it into the union operator 212 ( 2 ) below the join operator 212 ( 3 ) via output 220 ( 2 ).
- the join operator 212 ( 3 ) then lodges the event in the right synopsis at 224 , but is unable to join it to anything in its left synopsis at 226 .
- the technique has now reached a fixed point.
- Time 3: The technique receives two events in the edges data stream 214 ( 1 ). These events travel up to the join operator 212 ( 3 ) and lodge in its left synopsis 226 . The events are:
- join operator 212 ( 3 ) produces:
- the technique checks for cycles using select operator 212 ( 4 ). Unlike previous times, this time, the technique finds a cycle. The second event has already visited n 3 . The technique therefore does not pass this event through to the next round of recursion and only continues with the first and third events. After projection, these become:
- Time 4: The technique receives an event in the edges data stream 214 ( 1 ). This edge lodges in the join operator's left synopsis 226 , and is:
- join operator 212 ( 3 ) then produces:
- the query avoided infinite loops by maintaining a careful notion of progress in the form of the visited bit vector.
- This notion of progress can be a key to proving that a particular recursive query terminates with the correct answer, and is discussed formally in the formalism section below.
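- the walkthrough above can be approximated by the following Python sketch, which runs the project, join, select, project, and feedback steps over a static set of nodes and edges. It is a batch simplification under stated assumptions: the tuple layout (source, destination, bv), the function names, and the sample edges are hypothetical, and timestamps and synopses are omitted.

```python
from collections import deque


def bit(node: int) -> int:
    """Bit in the visited bit vector bv for node number `node` (1-based)."""
    return 1 << (node - 1)


def reachable_paths(nodes, edges):
    # Project over the nodes stream: each node is a path to itself.
    pending = deque((n, n, bit(n)) for n in nodes)
    derived = []  # paths that an FFP-style operator would output
    while pending:  # an empty queue corresponds to reaching a fixed point
        src, dst, bv = pending.popleft()
        for a, b in edges:
            if a != dst:      # join: the edge must start where the path ends
                continue
            if bv & bit(b):   # select: destination already visited, so a cycle
                continue
            new_path = (src, b, bv | bit(b))  # project: set the visited bit
            derived.append(new_path)          # non-recursive output
            pending.append(new_path)          # recursive output, back to the union
    return derived


# Hypothetical three-node cycle n1 -> n2 -> n3 -> n1.
paths = reachable_paths([1, 2, 3], [(1, 2), (2, 3), (3, 1)])
print(sorted((s, d) for s, d, _ in paths))
```

- the loop terminates because every derived path sets one more bit in bv, and bv can hold at most one bit per node.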
- a stream R is a potentially unbounded sequence e 1 , e 2 , . . . of events.
- a payload will typically be a relational tuple (i.e., an ordered sequence of data values), but might be something else, such as a punctuation pattern.
- the technique utilizes a notion of conformance of a payload p to a schema RR. In other words, a stream R conforms to schema RR if the payload of every event in R conforms to RR.
- the choice of control parameters varies from system to system.
- Some of the alternative implementations can include a single control parameter that contains a sequence number assigned at the inputs to a query.
- Another example can include a control parameter that indicates what the event represents (regular tuple, punctuation, end of stream), and a second control parameter giving a timestamp supplied by the stream source.
- Another example can include a control parameter indicating whether the event represents a positive tuple (insertion) or negative tuple (deletion).
- Still a further example can include a pair of control parameters defining a time interval over which the payload is valid.
- any prefix P of R can be reconstituted into a linear sequence r 1 , r 2 , . . . , rm of snapshots over RR.
- Each snapshot is just a finite relation over RR. It is useful to think of how each additional event modifies the reconstitution.
- the technique can treat an event <sn, p> as adding a new snapshot to the list that adds p to the previous snapshot. That is, it extends r 1 , r 2 , . . . , rsn-1 to r 1 , r 2 , . . .
- the technique can view snapshots as being indexed by timestamps, and an event <s, e; p> as inserting p into any snapshot rtk in rt 1 , rt 2 , . . . , rtm where s ≤ tk < e, plus possibly adding a snapshot re to the end of the list if e>tm.
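- a minimal sketch of reconstituting timestamp-indexed snapshots from interval events follows. It assumes half-open validity intervals (closed at the start, open at the end) and a fixed list of snapshot timestamps; the function name and the sample events are hypothetical, and the step that appends a new snapshot at the end of the list is left out for brevity.

```python
from typing import Dict, Iterable, List, Set, Tuple

IntervalEvent = Tuple[int, int, str]  # (s, e, payload)


def reconstitute(events: Iterable[IntervalEvent],
                 timestamps: List[int]) -> Dict[int, Set[str]]:
    """Build one snapshot (a finite set of payloads) per timestamp."""
    snapshots: Dict[int, Set[str]] = {t: set() for t in timestamps}
    for s, e, payload in events:
        for t in timestamps:
            if s <= t < e:  # payload is valid in [s, e)
                snapshots[t].add(payload)
    return snapshots


# Hypothetical prefix of a stream: "a" valid on [1, 3), "b" valid on [2, 5).
prefix = [(1, 3, "a"), (2, 5, "b")]
print(reconstitute(prefix, [1, 2, 3, 4]))
```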
- Some implementations can treat a stream R as representing a potentially infinite list r 1 , r 2 , . . . that is the limit for the reconstitution as the technique takes longer and longer prefixes of R.
- This sequence can be thought of as the canonical history of R, and the intent of applying a function f to R can be considered to be a stream S whose canonical history is f(r 1 ), f(r 2 ), . . . .
- R does not necessarily converge to a well-defined canonical history in the limit, because new events might continue to update a particular snapshot indefinitely.
- some implementations can require that a stream make progress, meaning that for each snapshot ri, there comes a point in the stream where ri no longer changes.
- stream R progresses if for any index j, there is a point after which for any event e, stable(e) ⁇ j. At that point, snapshot rj is stabilized—it will no longer change. If R progresses, then every snapshot eventually stabilizes, and the canonical history is well defined. In this case, the technique can use R@i to denote snapshot ri in the canonical history of R.
- snapshots in a reconstitution or canonical history need not be indexed by sequential integers. Any strictly increasing sequence can work and some implementations can use timestamps in the sequel.
- stream R explicitly progresses if for any index j, there is some event e in R that is punctuation at i, where i>j.
- “normal” events can serve as punctuations.
- these implementations can utilize specific punctuation events (flagged as such with a control parameter). It can be assumed that stream operators produce explicitly progressing output given explicitly progressing inputs. Thus, the stream operator can propagate punctuation appropriately.
- FFP operator 212 ( 6 ) can also have speculative punctuation, which is similar to regular punctuation, but does not actually guarantee stream progress.
- the following discussion will refer to non-speculative punctuation as definite punctuation for purposes of distinguishing the two.
- the discussion below uses dp(i) to denote a definite punctuation event at index i, and sp(i) to denote a speculative punctuation event at index i.
- CEDR: complex event detection and response
- The CEDR system further refines application time into occurrence time and valid time, thereby providing a tri-temporal model of occurrence time, valid time, and system time.
- a temporal stream model is used to characterize streams, engine operator semantics, and consistency levels for handling out-of-order or invalidated data.
- the tritemporal model is employed.
- the temporal model employed herein, however, is simplified in the sense of modeling valid time and system time (occurrence time is omitted). For the purposes of this description, this is sufficient, since only these two notions of time are necessary to understand the disclosed speculative output and consistency levels.
- a CEDR data stream is modeled as a time varying relation. For most operators, an interpretation is used that a data stream models a series of updates on the history of a table, in contrast to conventional work which models the physical table updates themselves.
- a stream is modeled as an append-only relation.
- Each tuple in the relation is an event, and has a logical ID and a payload.
- Each tuple also has a validity interval, which indicates the range of time when the payload is in the underlying table. Similar to the convention in temporal databases, the interval is closed at the beginning, and open at the end. Valid start and end times are denoted as Vs and Ve, respectively.
- C: CEDR (or system) time
- CEDR has the ability to introduce the history of new payloads with insert events. Since these insert events model the history of the associated payload, both valid start and valid end times are provided. In addition, CEDR streams can also shrink the lifetime of payloads using retraction events. These retractions can reduce the associated valid end times, but are not permitted to change the associated valid start times. Retraction events provide new valid end times, and are uniquely associated with the payloads whose lifetimes are being reduced. A full retraction is a retraction where the new valid end time is equal to the valid start time. Further details about CEDR technologies can be obtained from U.S.
- there are four control parameters: EventType, VStart, VEnd, and VNewEnd. Snapshots in CEDR are indexed by timestamps.
- the EventType can be Insert, Retract, CTI or EOS.
- For Insert, VStart and VEnd indicate the range of snapshot indices over which the payload is valid. That is, the payload belongs to all snapshots in that range. Note that the interval is closed at the beginning and open at the end, so the payload is not in the snapshot associated with VEnd.
- For Retract, all of VStart, VEnd and the payload should match a previously seen event e, and VNewEnd, where VStart ≤ VNewEnd < VEnd, effectively specifies a new VEnd for e.
- a CTI (current time increment) event is a (definite) punctuation at index VStart.
- EOS stands for “End of Stream”, and is only issued if a stream is ceasing output.
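- the event types above can be illustrated with a small Python sketch that applies Insert and Retract events and then reads off one snapshot. The CedrEvent class, the snapshot_at function, and the choice to identify an event by its payload and VStart are assumptions made for illustration; CTI and EOS events carry no payload here and are simply skipped.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set, Tuple


@dataclass(frozen=True)
class CedrEvent:
    event_type: str                  # "Insert", "Retract", "CTI", or "EOS"
    v_start: int = 0
    v_end: int = 0
    payload: Optional[str] = None
    v_new_end: Optional[int] = None  # only meaningful for Retract


def snapshot_at(events: Iterable[CedrEvent], t: int) -> Set[str]:
    """Payloads whose current validity interval [VStart, VEnd) contains t."""
    lifetimes: Dict[Tuple[Optional[str], int], int] = {}
    for ev in events:
        key = (ev.payload, ev.v_start)
        if ev.event_type == "Insert":
            lifetimes[key] = ev.v_end
        elif ev.event_type == "Retract":
            # VStart, VEnd and payload must match a previously seen event.
            if lifetimes.get(key) == ev.v_end:
                lifetimes[key] = ev.v_new_end
        # CTI and EOS do not change any lifetime in this sketch.
    return {p for (p, s), e in lifetimes.items() if s <= t < e}


# Hypothetical stream: "a" is inserted on [1, 10) and later shortened to [1, 4).
stream = [
    CedrEvent("Insert", v_start=1, v_end=10, payload="a"),
    CedrEvent("Insert", v_start=2, v_end=6, payload="b"),
    CedrEvent("Retract", v_start=1, v_end=10, payload="a", v_new_end=4),
    CedrEvent("CTI", v_start=3),
]
print(snapshot_at(stream, 5))
```

- a full retraction (VNewEnd equal to VStart) leaves an empty interval, so the payload drops out of every snapshot.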
- the present techniques can view a relational query Q over which a fixed point can be computed as having two relational parameters, r and s, designated as Q(r, s).
- Parameter r can name an external input (and can be generalized to a set of relations).
- Parameter s can name the recursion parameter, which represents data headed around the recursive loop.
- the technique can specify that tuple t has level i if it appears in Qi(r).
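- for a static relation r, the roles of r and s can be sketched with a naive iteration that feeds the result of Q back in as the recursion parameter until nothing changes. The sketch assumes that the first round uses an empty s and that each later round uses the previous result, and it records the first round in which each tuple appears as an illustrative reading of its level; the function names and the sample query are hypothetical.

```python
def fixed_point(q, r):
    """Iterate s -> q(r, s) from the empty relation until s stops changing."""
    s = frozenset()
    levels = {}  # tuple -> first round in which it appeared
    i = 0
    while True:
        nxt = frozenset(q(r, s))
        for t in nxt:
            levels.setdefault(t, i)
        if nxt == s:  # nothing new came around the loop
            return s, levels
        s = nxt
        i += 1


def extend_paths(r, s):
    """Sample query: the edges in r plus every path in s extended by one edge."""
    return set(r) | {(a, d) for (a, b) in s for (c, d) in r if b == c}


edges = {(1, 2), (2, 3), (3, 4)}
closure, levels = fixed_point(extend_paths, edges)
print(sorted(closure))
```

- the iteration is only guaranteed to stop when the query cannot keep producing new tuples forever; the sample query is bounded by the finite number of node pairs.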
- the fixed point of Q on r is
- One potential goal for recursive queries over a stream R is to compute the fixed point of each snapshot in the canonical history of R. That is, given progressing stream R and Query Q, it can be desirable to produce a progressing stream S such that, for every index i,
- Expressing Q with appropriate algebraic operators can allow FFP operators to be used with a target query Q(r, s).
- the algebraic operators are appropriate if they express Q with algebraic operators such that they behave appropriately with regard to speculative punctuation.
- a streaming operator G can be considered to be speculation-friendly if the following three conditions hold.
- G does not block on definite punctuation.
- G is forward moving.
- G speculates correctly if given a speculative punctuation sp(i) in one input stream, and that every other input stream is explicitly progressing, G will eventually emit speculative punctuation sp(j) where j ⁇ i. Moreover, if it turns out that sp(i) actually holds (that is, G receives no later event e with stable(e) ⁇ i), then sp(j) actually holds (G will emit no event d with stable(d) ⁇ j). Also, if G has previously emitted a definite punctuation dp(k), then j ⁇ min(i, k). This last condition says that G doesn't “back up” from previously emitted definite punctuation. In practice, it will always turn out that i>k, so j>k.
- G will typically track definite punctuation on its other inputs.
- typically, i = j.
- an alternative implementation (termed the probing approach) is described below where j is sometimes less than i.
- In relation to the second condition, where G does not block on definite punctuation, G is presumed to produce explicitly progressing output on explicitly progressing input. This method further instructs operators to emit output in the absence of any particular definite punctuation. Such a G must output the same collection of non-punctuation events on any two input streams with the same non-punctuation events. Any monotonic operator has a non-blocking implementation. (Handling non-monotonic operators by being able to revise previous outputs is discussed below under the heading “FFP in CEDR”).
- the third condition, where G is forward moving, utilizes an input event e. If input event e for G contributes to output event d, then the technique specifies that stable(e) ≤ stable(d). In practice, it is unlikely that an operator G could arbitrarily shift events backward in time without violating the first condition.
- O, Ir and Is are essentially “ports” of this query tree, where O connects to an output stream, Ir connects to an external input stream R, and Is will be for recursive input.
- the FFP operator can also be viewed as having ports: FFP(I, OE, OR). Here I connects to an input stream, OE connects to the external output stream, and OR connects to the recursive output stream.
- the technique can apply FFP to T and R to make the following connections:
- OE will connect either directly to a client, or to the input of a downstream operator.
- This discussion denotes this arrangement of operators by FFP(R, T).
- When FFP, T and R are connected in this manner, a recursive loop is created that passes from OR to Is to O to I.
- FIG. 4 shows the recursive loop in this reachability query as a dashed line 402 .
- FIG. 4 retains the operators 212 ( 1 )- 212 ( 6 ) and data streams 214 ( 1 )- 214 ( 2 ) introduced in relation to FIG. 2 and these components will not be reintroduced here for sake of brevity.
- Q and hence T, has two external input streams, one for nodes and one for edges.
- Q can be thought of as a conceptual entity.
- Q will be represented as a tree of operators, T.
- T can consist of the operators 212 ( 1 )- 212 ( 5 ), but not 212 ( 6 ), which is the FFP operator.
- this technique views the FFP operator as operating in phases, iterating over segments of its input separated by speculative punctuations. (These phases in general will be different from the levels of recursion defined earlier.)
- the discussion assumes that at startup, the FFP operator 212 ( 6 ) emits a speculative punctuation sp(tmin) on OR at 404 , where tmin is known to be before the stable points of all events on all external input streams.
- a segment of input for FFP operator 212 ( 6 ) is a maximal sequence of events e 1 , e 2 , . . . , em, sp(t) received on I, where none of the ei's is a speculative punctuation.
- e 1 must either be the first event on I, or be preceded immediately by a speculative punctuation.
- the present techniques can allow that a segment can have e 1 , e 2 , . . . , em be the empty list.
- the constant c can be chosen as the minimal possible time interval, sometimes called a chronon.
- FFP operator 212 ( 6 ) may only ever have one speculative punctuation circulating on the recursive loop at a time. Its strategy is to keep circulating a speculative punctuation sp(t) until it determines that the punctuation is valid; it then converts it to a definite punctuation and starts speculating at a later point. The next section will present conditions under which such speculation must always eventually succeed.
- T(O, Ir, Is) be a query tree for a strongly convergent query Q(r, s). If T uses speculation-friendly operators and R is an explicitly progressing stream, then FFP(R, T) outputs an explicitly progressing stream S ⁇ Q*(R).
- a proof is provided below in two main parts.
- the first part establishes that S is a fixed-point stream for R under Q.
- the second part shows that S is explicitly progressing.
- This proof is provided for discussion purposes in relation to specific implementations. Other implementations can achieve recursive data stream processing without relying on the absolute assertion expressed in this proof.
- FFP will always see the end of a segment (that is, the next speculative punctuation). After FFP emits any events on OR in step F 2 , it will necessarily emit a speculative punctuation on OR in step F 3 . a or F 3 . b . Because every operator on the recursive loop is speculation-friendly, each must eventually pass on the speculative punctuation until it gets back to I. Now consider segment e 1 , e 2 , . . . , em, sp(t) that satisfies the if-statement in step F 3 . a . When e 1 , e 2 , . . .
- a speculative punctuation sp(t) can only be recirculated a finite number of times by step F 3 . b before step F 3 . a applies. Since the input of FFP progresses, as shown in the first part of the proof, there must eventually be a segment where e 1 , e 2 , . . . , em all have stable points after t. Further, each time the technique uses step F 3 . a , it increases the index for the speculative punctuation by at least c. Thus, the technique must eventually speculate at some index v ⁇ u.
- CEDR events may already contain definite punctuations called CTIs (current time increment). These punctuations come with a timestamp t.
- CEDR operators, except “align”, do not need these events to unblock output, since they do not have to block in the first place. Rather, they can produce speculative results incorporating all the received events, and correct these results later if necessary using retractions.
- specCTI: speculative CTI
- the described strategy can require that SpecCTIs loop through the recursion unchanged. This restriction may force the SpecCTI to become lodged until another operator input catches up or until a SpecCTI may be safely emitted for the requested time.
- S 1 is easily upheld for unitary or unary operators. After unblocking any necessary output, thus possibly producing speculative output, the technique simply allows the SpecCTI through.
- Binary operators are a bit trickier. Assuming that one branch is in the recursive loop, the technique lodges the SpecCTI in the binary operator until it receives, from the non-recursive child, a definite CTI with timestamp greater than or equal to the SpecCTI timestamp. This delay ensures that all input from the non-recursive side that could influence the output states prior to the SpecCTI has been absorbed and emitted by the operator before emitting the SpecCTI.
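- the lodging behavior of a binary operator can be sketched as a small gate that holds a SpecCTI until the non-recursive input has produced a definite CTI at or beyond the same timestamp. The class and method names are hypothetical, and data events are omitted so that only the punctuation handling is shown.

```python
class BinaryOperatorGate:
    """Holds a SpecCTI from the recursive input until it is safe to emit."""

    def __init__(self):
        self.lodged = None               # timestamp of the held SpecCTI, if any
        self.other_cti = float("-inf")   # latest definite CTI from the non-recursive child

    def on_spec_cti(self, t):
        """A SpecCTI with timestamp t arrives on the recursive input."""
        self.lodged = t
        return self._release()

    def on_definite_cti(self, t):
        """A definite CTI with timestamp t arrives from the non-recursive child."""
        self.other_cti = max(self.other_cti, t)
        return self._release()

    def _release(self):
        # Emit only when the non-recursive side has caught up to the SpecCTI.
        if self.lodged is not None and self.other_cti >= self.lodged:
            t, self.lodged = self.lodged, None
            return [("SpecCTI", t)]
        return []


gate = BinaryOperatorGate()
print(gate.on_spec_cti(5))      # held: no definite CTI yet
print(gate.on_definite_cti(3))  # still held: 3 is before 5
print(gate.on_definite_cti(5))  # released
```

- a single slot is enough in this sketch because only one speculative punctuation is taken to circulate on the recursive loop at a time.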
- S 3 is also trivially upheld by all operators in the CEDR algebra except AlterLifetime, which is the only operator in the algebra which can emit an event that includes valid times outside the range of the input event which generated it. Since AlterLifetime is used for windowing in the employed algebra, the technique could require that all windowing be done on inputs outside the recursive loop.
- FFP operator augments the CEDR multicast operator to handle specCTIs.
- This implementation tracks the “high water mark” of the stabilization point for all Insert and Retract events, and uses this value to speculate with. Its handling of specCTIs and other events follows the algorithm given in Section 3.4, except it performs steps F 1 -F 3 on the fly. To test the if-condition in step F 3 . a , it remembers the timestamp in the currently circulating specCTI, and sets a flag if it sees an earlier event before the specCTI returns. Thus, this implementation uses a fixed amount of state.
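- the bookkeeping described above can be sketched in Python as follows. Steps F1-F3 are not reproduced in this text, so the sketch rests on assumptions: an event counts as "earlier" when its stabilization point is at or before the circulating specCTI, the next speculation point is the larger of the old point plus c and the high water mark, and the class name, port labels, and tuple shapes are hypothetical.

```python
class FlyingFixedPoint:
    """Multicast with one circulating specCTI and a fixed amount of state."""

    def __init__(self, t_min, c=1):
        self.c = c                 # minimal time step between speculation points
        self.spec = t_min          # timestamp of the circulating specCTI
        self.high_water = t_min    # largest stabilization point seen so far
        self.saw_earlier = False   # set if an event undercuts the specCTI

    def start(self):
        """Initial speculative punctuation sent around the recursive loop."""
        return [("OR", ("specCTI", self.spec))]

    def on_event(self, stable, payload):
        """Insert or Retract event with stabilization point `stable`."""
        self.high_water = max(self.high_water, stable)
        if stable <= self.spec:
            self.saw_earlier = True
        ev = ("event", stable, payload)
        return [("OE", ev), ("OR", ev)]  # external output and recursive output

    def on_spec_cti(self, t):
        """The circulating specCTI returns on the input port."""
        if self.saw_earlier:
            self.saw_earlier = False
            return [("OR", ("specCTI", t))]  # not yet valid: recirculate
        self.spec = max(t + self.c, self.high_water)
        # Speculation held: emit a definite CTI and speculate at a later point.
        return [("OE", ("CTI", t)), ("OR", ("specCTI", self.spec))]


ffp = FlyingFixedPoint(t_min=0)
print(ffp.start())
print(ffp.on_event(3, "x"))
print(ffp.on_spec_cti(0))
```

- the state is limited to three scalars and one flag regardless of how many events pass through, which matches the fixed-state property noted above.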
- the CEDR stream processing system uses operators that inherently speculate very aggressively by issuing full or partial retractions for previous events in the input stream. Using this mechanism, operators are free to speculate as aggressively as producing, at any given time, all output under the assumption that the input received so far is all the input. Speculation may then be throttled back using the Align operator, and permanence of output may be forced by the finalize operator for the purpose of managing state in the absence of frequent-enough CTIs.
- Some of the described implementations of FFP can handle the Retract events that are sometimes issued by speculative operators, by virtue of starting with operator implementations that handle Retracts.
- some techniques associate windows with data. More specifically, some of these techniques can associate with every event, an interval (as opposed to other systems, which use a single timestamp). This interval is actually the time during which a particular payload, associated with the event, is in the snapshots being modeled by the stream. This treatment has the effect of assigning payloads to windows, such that the valid time interval of the event determines the output times during which a windowed version of any operator includes the payload.
- the AlterLifetime operator can be used to explicitly set these windows.
- FIG. 5 shows the resulting plan in the form of query graph 510.
- query graph 510 includes seven operators 512(1)-512(7).
- Operators 512(1) and 512(2) are join operators; operator 512(3) is a project operator; operator 512(4) is a union operator; operator 512(5) is a multicast operator; operator 512(6) is another project operator; and operator 512(7) is an FFP operator.
- Two data streams serve as sample input to query graph 510: the first input stream is in the form of state machine 514(1), while the second input stream is in the form of symbols 514(2).
- An example of input data of the state machine 514(1) is evidenced generally at 516.
- An example of input data of the symbols 514(2) is evidenced generally at 518.
- Output from the query graph 510 is evidenced generally at 520.
- the state machine is given as a streaming input, and may, in theory, change over time.
- the plan is actually a streaming program for executing arbitrary, evolving automata.
- the particular automaton executed here searches for the pattern AB*A.
- the query can output all discovered event sequences that constitute partial and complete patterns, and their associated states in the automaton.
- S: the starting state
- F: the final state
- the state machine input 514(1) is described using a set of transitions such that each transition absorbs an accompanying input.
- the symbols input 514(2) is a description of the sequence in which an attempt is made to find patterns. Each event has a sequence number, and a symbol, which may match a symbol in the automaton transition table.
- the state machine input 514(1) is loaded into a right join synopsis 522 of the lower join 512(1).
- join 512(1) finds all transitions which can be made using this symbol, and passes these transitions to the join 512(2) above at left synopsis 526, which looks for partial patterns which have ended in the starting state of one of the activated transitions, and which sequentially precede the new symbol. For all such matches, the technique has found a new (partial or complete) pattern, which is output and recursively inserted back into a right synopsis 528 of the upper join 512(2).
- the technique creates a seed start state on each input symbol and recursively inserts it into the right join synopsis 528 of the upper join 512(2).
- the input sequence is: ‘ABBA’.
- the technique should output the following patterns and their associated end sequence IDs:
- the above description offers systems and techniques for processing recursive streaming queries.
- the description further defines how query graphs utilized in the processing can be updated to specific points in time even while the recursive streaming query may remain ongoing.
- the above described techniques/methods and systems can be implemented on any type of networkable computing device(s) as should be recognized by the skilled artisan.
- FIG. 6 illustrates a flowchart of a method or technique 600 that is consistent with at least some implementations of the present concepts.
- the order in which the technique 600 is described is not intended to be construed as a limitation, and any number of the described blocks can be combined in any order to implement the technique, or an alternate technique.
- the technique can be implemented in any suitable hardware, software, firmware, or combination thereof, such that a computing device can implement the technique.
- the technique is stored on a computer-readable storage media as a set of instructions such that execution by a computing device causes the computing device to perform the technique.
- the technique processes a recursive streaming query through a query graph at block 602 .
- Query graphs consist of operators connected to one another via streams. Non-limiting examples of operators and potential arrangements of operators in a query graph are detailed above in relation to FIGS. 2-5 .
- the technique detects when output produced by executing the query graph advances to a specific point at block 604 .
- One implementation involves circulating speculative CTIs through a recursive loop of the query graph to detect when the output has advanced to the specific point. Examples of this and other exemplary techniques are described above.
- the present concepts can be employed in implementations that are sufficiently expressive to attack both graph-walking queries and regular-expression pattern matching.
- for pattern matching, the associated query plan is actually linear in the number of transitions of the finite automaton that detects the pattern, resulting in a highly efficient algorithm.
- Even further expressiveness is achieved in CEDR by speculating when necessary to ensure disorder tolerance. This allows operators such as aggregation and difference to be used in recursive loops, which is useful for expressing branch and bound execution strategies.
- Detecting forward time progress is relatively straightforward with the addition of speculative CTIs, which function similarly to regular CTIs.
- the above discussion includes two implementations: a blocking speculative-CTI strategy based on high water marks and a non-blocking version based on probing.
Abstract
The described implementations relate to recursive streaming queries. One technique processes a recursive streaming query through a query graph. The technique also detects when output produced by executing the query graph advances to a specific point.
Description
- Computers are very effective at storing large amounts of data, such as in a database. Over the last half century or so, techniques have been refined for establishing computational options, such as accessing or querying the stored data, viewing the data, modifying the data, etc. In these scenarios, the data can be thought of as relatively static and so the techniques, such as database querying techniques, tend not to be very applicable to time sensitive scenarios, such as those involving real-time or near real-time. For instance, a database query technique designed to retrieve a definition of a word from a dictionary database need not be time sensitive since the data is statically stored in the database.
- In contrast, other scenarios tend to involve streaming data in real-time or near real-time. For instance, a temperature sensor may be configured to periodically output a time-stamped signal corresponding to a sensed temperature. When viewed collectively this output can be thought of as a stream of data or a data stream. The above mentioned database querying techniques are not generally applicable in the data stream scenarios. Instead, stream processing techniques have been developed for use with data streams.
- Stream processing techniques offer much more limited computational options than those available in traditional database scenarios. Stated another way, a very small set of computations can presently be performed with stream processing. The present concepts introduce new stream processing techniques that greatly increase the set of computations that can be accomplished with stream processing.
- The described implementations relate to recursive streaming queries. One method or technique processes a recursive streaming query through a query graph. The technique also detects when output produced by executing the query graph advances to a specific point.
- Another implementation is manifested as a method that processes at least one input stream associated with a recursive streaming query. The technique also advances time for the recursive streaming query to a specific point when at least one input stream has advanced to the specific point and recursive computations on the input stream are complete to the point.
- The above listed examples are intended to provide a quick reference to aid the reader and are not intended to define the scope of the concepts described herein.
- The accompanying drawings illustrate implementations of the concepts conveyed in the present application. Features of the illustrated implementations can be more readily understood by reference to the following description taken in conjunction with the accompanying drawings. Like reference numbers in the various drawings are used wherever feasible to indicate like elements. Further, the left-most numeral of each reference number conveys the Figure and associated discussion where the reference number is first introduced.
- FIGS. 1-5 show exemplary graphs for processing recursive streaming queries in accordance with some implementations of the present concepts.
- FIG. 6 is a flowchart of exemplary recursive streaming query processing techniques in accordance with some implementations of the present concepts.
- This patent application pertains to stream processing and more specifically to recursive streaming queries. A data stream or streaming data can be thought of as events or notifications that are generated in real-time or near real-time. For introductory discussion purposes, an event can be thought of as including event data or payload and a timestamp.
- Processing recursive streaming queries can entail the use of one or more recursions. A recursion can be thought of as a function that is defined in terms of itself so that it can involve potentially infinite or unbounded computation. In a streaming data scenario, computation resources are reserved for specific events until the resources are no longer needed. The present implementations offer solutions for detecting when recursive processing is completed up to a specific point in time. Thus, the recursion may remain infinite, but the present techniques can identify specific time periods for which the recursive processing of streaming queries is complete. Computation resources can then be freed up to the specific point in time.
- For instance, consider introductory FIG. 1 that illustrates an exemplary recursive streaming query processing method generally at 100. Accompanying streaming data upon which the method can be implemented is evidenced at 102. Generally, the method processes at least one input stream (i.e., streaming data 102) associated with a recursive streaming query at 104. At 106, the method also advances time for the recursive streaming query to a specific point when two conditions are met. First, the at least one input stream has advanced to the specific point and second, recursive computations on the input stream are complete to that point.
- Assume for purposes of explanation, that streaming data 102 is emitted from a temperature sensor 108 and processed on a query graph 110. The temperature sensor is offered as a simple example of a source of streaming data and the skilled artisan should recognize many other sources, some examples of which are described below in relation to FIGS. 2-4. Further, in this example only a single data stream 102 is input into query graph 110. Other examples where multiple data streams are input into a query graph are described below in relation to FIGS. 2-5.
- A recursive streaming query relating to streaming data 102 can be performed on query graph 110 such as by performing a recursive step 112 via a recursive loop 114.
- A recursive streaming query based on streaming data 102 can, in some instances, be characterized as infinite or running forever. However, portions of the recursive streaming query can be executed on recursive loop 114 to generate an output 116. The present implementations can detect when output 116 has advanced to a specific point in time.
- In summary, even though the recursive streaming query may run indefinitely, the present implementations can detect when the query graph 110 has advanced to a specific point in time as portions of the recursive query are completed. This can also be thought of as detecting forward time progress. Stated another way, the technique can detect when a region of the query graph upstream of a certain point, such as point 118, has completed processing, including recursive processing, relative to a specific point in time. The technique can cause the query graph to issue a notice from point 118 that computations upstream from that point have advanced to the specific point in time.
- FIG. 1 introduces the concept that query graph 110 can process a recursive streaming query. FIG. 2 introduces examples of components that can accomplish the computations associated with processing a recursive streaming query.
- FIG. 2 shows a query graph 210 that includes a plurality of operators 212 for processing a recursive search query from two input streams 214(1), 214(2).
- In this case, query graph 210 includes six operators 212(1), 212(2), 212(3), 212(4), 212(5), and 212(6). The term “operator” 212 is used in that the operators “operate”, or perform computations, upon the streaming data responsive to the recursive search query to generate an output from the graph at 216. Briefly, an operator can receive one or more inputs and process the inputs according to a set of conditions. If the conditions are satisfied, then the operator can generate an output that can be delivered to one or more other operators.
- In the present case, operator 212(1) can be termed a “project” operator; operator 212(2) can be termed a “union” operator; operator 212(3) can be termed a “join” operator; operator 212(4) can be termed a “select” operator; operator 212(5) is another project operator; and operator 212(6) can be termed a “flying fixed-point” (FFP) operator. The function of these operators is described in more detail below.
- Considered from one perspective, query graph 210 can be viewed as being defined by its operators since the number, type, and/or arrangement of operators can be adapted to specific recursive search queries. So, a query is achieved by operating on one or more input streams with the selected operators to generate an output.
- Input data streams 214(1) and 214(2) describe a changing graph, composed of nodes and edges. Input stream 214(2) describes the (possibly changing) nodes in the changing graph, while input stream 214(1) describes the changing set of edges between nodes. In other words, 214(1) and 214(2) can be thought of as defining a dynamic input graph that is operated on by query graph 210. The graph is dynamic in that the input streams can change over time.
- For instance, consider an input graph where each edge is labeled with a number and the user wants to know the shortest path from one node to another node. Until the moment that the actual graph is generated, the number of steps that might be in that shortest path cannot be bounded. So, the query also has an unbounded nature in that the graph is unknown at the time of query creation.
- The present concepts can be applied to many interesting streaming graph-search problems, such as finding a minimum path to a destination on a road network from a changing location and given changing traffic conditions. Another potential application can be regular expression matching over streams. Another application can be any form of looping where the process cannot bound the number of iterations at the time the recursive streaming query is created.
- Query graph 210 offers an example of how streaming query results are computed recursively through an example query. When viewed formally the present example can rely upon the following graph reachability query:
- Given a directed graph G=(N, E) with nodes N={n_i | i=1 . . . j} and edges E={(n1_i, n2_i) | i=1 . . . k}, compute all pairs (n1, n2), n1 ∈ N, n2 ∈ N, such that n2 is reachable from n1 through one or more edges in E.
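- For orientation, the reachability query stated above can be sketched for the static case, in which the complete edge set is known up front. This is only a baseline under that assumption and is not the streaming plan of query graph 210; the function name `reachable_pairs` and the sample node names are hypothetical.

```python
from typing import Iterable, Set, Tuple

Edge = Tuple[str, str]


def reachable_pairs(edges: Iterable[Edge]) -> Set[Tuple[str, str]]:
    """All (n1, n2) such that n2 is reachable from n1 via one or more edges."""
    edges = set(edges)
    closure = set(edges)          # paths of length one
    frontier = set(edges)         # pairs discovered in the previous round
    while frontier:
        # Extend only the newly found paths by one more edge.
        step = {(a, d) for (a, b) in frontier for (c, d) in edges if b == c}
        frontier = step - closure
        closure |= frontier
    return closure


pairs = reachable_pairs([("a", "b"), ("b", "c")])
```

The loop stops when a round adds no new pair, which is the fixed point that the recursive streaming plan has to detect without knowing the graph in advance.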
- Note that the present techniques solve the formal problem stated above under the assumption that the graph is not known at compile time. Furthermore, the graph may change over time. The description of the graph is, therefore, in and of itself streaming. While this example might seem contrived, it is, in fact, a good starting point for discussing streaming queries over networks and roads, where both edge properties (e.g., traffic conditions) and graph structure (e.g., links failing and recovering in a network) are volatile.
- This discussion introduces techniques for calculating results and lays the foundation for examining recursive streaming queries. For ease of explanation, assume that this recursive streaming query has a single window of infinite size, there are no retractions (for example, to revise erroneous or speculative items) in the input stream, and that there are no punctuations to deal with. All of these assumptions will be removed in later sections.
- As mentioned above, query graph 210 receives two input data streams 214(1) and 214(2). Data stream 214(1) relates to edges and data stream 214(2) relates to source nodes. Also note that the plan is a directed graph of streaming versions of relational operators, where each arrow in the diagram is a data stream, and is labeled with the schema of the events traveling along the data stream. Assume for discussion purposes that all stream events are tagged with the application time Vs at which the event becomes valid.
- The data streams can be interpreted as describing a changing relation. Since the present discussion assumes a single window of infinite duration, the contents of the relation at any time t can be all of the events with Vs≦t. Operators 212(1)-212(6) then output event streams that describe the changing view computed over the changing input according to the relational semantics of individual operators.
- As introduced above, the present configuration utilizes an FFP operator 212(6). The FFP operator offers a means to achieve recursion. The FFP operator generates a multicast output 220 that is forwarded to a conventional, non-recursive output indicated at 220(1), as well as to one of its descendants in the operator graph. In this case, output 220(2) recursively loops back to union operator 212(2), thereby forming a recursive loop 222. The result can be thought of as a form of recursion which terminates when a fixed point is reached.
- Another interesting feature of the illustrated configuration is the schema elements labeled “bv”. These are, in fact, bit vectors, each of which is k bits long. The present techniques can use this bit vector to track visited nodes in query graph 210 and avoid infinite looping through cycles.
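- A minimal sketch of such a visited-node bit vector follows. The helper names (`bit`, `visited`, `mark`, `show`) are hypothetical, and the sketch assumes one bit per node with the leftmost bit standing for the first node; the source does not prescribe a particular encoding.

```python
K = 4  # number of nodes in the hypothetical graph; one bit per node


def bit(index: int, k: int = K) -> int:
    """Bit mask for the node with 1-based position `index`, leftmost bit first."""
    return 1 << (k - index)


def visited(bv: int, index: int, k: int = K) -> bool:
    """True if the path described by `bv` already includes node `index`."""
    return bv & bit(index, k) != 0


def mark(bv: int, index: int, k: int = K) -> int:
    """Return `bv` with node `index` recorded as visited."""
    return bv | bit(index, k)


def show(bv: int, k: int = K) -> str:
    """Render the bit vector as a k-character string of 0s and 1s."""
    return format(bv, f"0{k}b")


path = bit(3)            # a path that starts at node 3
path = mark(path, 1)     # follow an edge to node 1
```

Before a path is extended to a node, `visited` can be consulted; an extension to an already visited node is dropped, which bounds every path to at most k distinct nodes and so prevents endless looping through a cycle.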
- FIG. 3 shows a graph 310 that can be used as input for input stream 214(2) of FIG. 2. FIG. 3 illustrates nodes 302(1), 302(2), 302(3), and 302(4). Individual nodes 302(1)-302(4) are labeled with both the node name as well as the valid time for the node insertion event. Similarly, the graph also illustrates edges 304(1), 304(2), 304(3), and 304(4) with accompanying valid times of their edge insertion events. Viewed in light of FIG. 2, nodes 302(1)-302(4) are what would flow in on the nodes input 214(2). Similarly, edges 304(1)-304(4) are what flow in on the edges input 214(1).
- For the sake of concreteness and clarity, the present discussion will follow the execution of the query plan to completion for each distinct moment in time. The discussion will also rely upon the assumption that each operator processes input events in batches such that all input events with the same valid time are processed at once. The discussion is directed to the behavior of this plan at the four distinct points in time from time 1 to time 4. Since the present example includes 4 distinct nodes 302(1)-302(4), bv is 4 bits long.
- Time 1: the technique receives four input events on the nodes data stream 214(2), which correspond to nodes n1, n2, n3, and n4 (i.e., 302(1)-302(4)). Recall that an event can be thought of as a payload and a timestamp. So for instance, node 302(1) with a timestamp of 1 is an event. Note that the projection above the nodes stream produces the following 4 events:
- (1, n1, n1, 1000), (1, n2, n2, 0100),
- (1, n3, n3, 0010), (1, n4, n4, 0001)
- In FIG. 2, these events then travel through the union operator 212(2) and lodge in the join operator's right join synopsis as indicated at 224. Since there is no input on the left side of the join operator 212(3), the process has reached a fixed point. (If it was desired to limit the set of nodes considered as source nodes for reachability, the technique could limit the nodes stream 214(2) to only those nodes.)
- Time 2: the technique can receive one event in the edges data stream 214(1). This edge travels up to the join operator 212(3), which then lodges it in its left synopsis at 226. The event is:
- (2, n3, n1)
- This event means that starting at time two, the input relation on the left side of the join operator 212(3) contains an edge going from n3 to n1. Given the join condition, this edge joins to one row on the right side: (1, n3, n3, 0010). The join operator 212(3) then outputs:
- (2, n3, n1, n3, n3, 0010)
- The select operator 212(4) then checks if there is a cycle by seeing if the path described above already includes the destination in the new, derived path. This determination is made by checking the 1st bit, since the technique is following the path to n1. Since this bit is not set, the event reaches the project operator 212(5), which removes unneeded columns and sets the appropriate bit in bv. The result is:
- (2, n3, n1, 1010)
- This result concludes that there exists a path from n3 to n1, and that this path first appeared at valid time 2. The technique now reaches the FFP operator 212(6), which both outputs the result from the query graph at 220(1), and inserts it into the union operator 212(2) below the join operator 212(3) via output 220(2). The join operator 212(3) then lodges the event in the right synopsis at 224, but is unable to join it to anything in its left synopsis at 226. The technique has now reached a fixed point.
- Time 3: The technique receives two events in the edges data stream 214(1). These events travel up to the join operator 212(3) and lodge in its left synopsis 226. The events are:
- (3, n1, n2), (3, n2, n3)
- Note that at this point, the left join synopsis 226 contains the following entries:
- (3, n1, n2), (3, n2, n3), (2, n3, n1)
- By joining the two new events to entries in the right synopsis 224, the join operator 212(3) produces:
- (3, n1, n2, n1, n1, 1000), (3, n1, n2, n3, n1, 1010),
- (3, n2, n3, n2, n2, 0100)
- All three events get past the select operator 212(4) since all the checked bits are 0, and therefore the process has not encountered a cycle yet. After projection by projection operator 212(5), these three events become:
- (3, n1, n2, 1100), (3, n3, n2, 1110),
- (3, n2, n3, 0110)
- These entries are now output by the FFP operator 212(6) and loop around again to lodge in the join operator's right synopsis at 224. This time, however, the technique has not yet reached a fixed point. By joining the three new events to the join operator's left synopsis 226, the technique produces the following events:
- (3, n2, n3, n1, n2, 1100), (3, n2, n3, n3, n2, 1110),
- (3, n3, n1, n2, n3, 0110)
- Continuing the query, the technique checks for cycles using select operator 212(4). Unlike previous times, this time, the technique finds a cycle. The second event has already visited n3. The technique therefore does not pass this event through to the next round of recursion and only continues with the first and third events. After projection, these become:
- (3, n1, n3, 1110), (3, n2, n1, 1110)
- These are now output and passed back to the union operator 212(2) for another round of recursion. These entries lodge in the join operator's right synopsis 224, and produce two new events. It is not hard to see that these new events cannot get past the select operator 212(4) since the first three bits are set for both events. The technique has again reached a fixed point. Note that the following output has been produced so far:
- (2, n3, n1, 1010), (3, n1, n2, 1100), (3, n3, n2, 1110),
- (3, n2, n3, 0110), (3, n1, n3, 1110), (3, n2, n1, 1110)
- This output succinctly says that each of the first three nodes is reachable from all the other first three nodes.
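- The walkthrough up to this point can be replayed with a small simulation. This is a sketch of the join, select, project, and feedback steps under the stated batching assumption (all events with the same valid time are processed together, in increasing time order); it is not the operator implementation of the described system. The names `run`, `extend`, `NODES`, and `BIT` are hypothetical, and each output event is stamped with the valid time of the batch that produced it.

```python
from collections import defaultdict

NODES = ["n1", "n2", "n3", "n4"]
# One bit per node, leftmost bit for n1 (n1 -> 1000, n4 -> 0001).
BIT = {n: 1 << (len(NODES) - 1 - i) for i, n in enumerate(NODES)}


def extend(t, edge, path):
    """Join + select + project for one edge and one stored path.
    Returns the longer path, or None if the join fails or a cycle is found."""
    (a, b), (_, start, end, bv) = edge, path
    if end != a or bv & BIT[b]:      # no match, or destination already visited
        return None
    return (t, start, b, bv | BIT[b])


def run(node_events, edge_events):
    """node_events: [(time, node)]; edge_events: [(time, src, dst)].
    Returns emitted path events as (time, start, end, bit-string)."""
    by_time = defaultdict(lambda: ([], []))
    for t, n in node_events:
        by_time[t][0].append(n)
    for t, a, b in edge_events:
        by_time[t][1].append((a, b))

    left, right, output = [], [], []   # edge synopsis, path synopsis, results
    for t in sorted(by_time):
        nodes, edges = by_time[t]
        # New edges lodge in the left synopsis and join with stored paths.
        derived = [q for e in edges for p in right if (q := extend(t, e, p))]
        left.extend(edges)
        # Seed paths reach the join through the union without being output.
        arrivals = [(t, n, n, BIT[n]) for n in nodes]
        while arrivals or derived:     # loop until a fixed point is reached
            output.extend(derived)
            arrivals += derived
            right.extend(arrivals)
            derived = [q for p in arrivals for e in left if (q := extend(t, e, p))]
            arrivals = []
    width = len(NODES)
    return [(t, s, e, format(bv, f"0{width}b")) for t, s, e, bv in output]


nodes = [(1, n) for n in NODES]
edges = [(2, "n3", "n1"), (3, "n1", "n2"), (3, "n2", "n3")]
results = run(nodes, edges)
```

With the node and edge events of times 1 through 3, `results` contains the six path events listed above, in the same order. The inner `while` loop ends exactly when a round of recursion derives nothing new, which corresponds to the fixed points noted in the walkthrough.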
- Time 4: The technique receives an event in the edges data stream 214(1). This edge lodges in the join operator's left synopsis 226, and is:
- (4, n3, n4)
- The join operator 212(3) then produces:
- (4, n3, n4, n3, n3, 0010), (4, n3, n4, n2, n3, 0110),
- (4, n3, n4, n1, n3, 1110)
- All of these events get through the select operator 212(4) since none have their 4th bits set, and become:
- (4, n3, n4, 0011), (4, n2, n4, 0111),
- (4, n1, n4, 1111)
- The events are then output by the FFP operator 212(6), loop around, and lodge in the join operator's right synopsis 224 without joining to anything. The technique has again reached a fixed point. Note that the output at time 4 says that n4 may be reached from any other node.
- There are a few interesting observations that can be derived from this example.
- First, for clarity, the above discussion presented the example in a way that quiesced the query between time increments. The same result, although possibly with a different output order, would have been achieved if new input were allowed into the recursive loop 222 before a fixed point had been reached. This outcome is possible because of the order insensitivity of the operators used in this recursive query plan. Operators, such as aggregation and difference, do not have this property, and can require either quiescence of the recursive loop between increasing valid time increments or implementations capable of speculative execution, when used in recursive queries. There will be further discussion of this point in later sections.
- Second, the query avoided infinite loops by maintaining a careful notion of progress in the form of the visited bit vector. This notion of progress can be a key to proving that a particular recursive query terminates with the correct answer, and is discussed formally in the formalism section below.
- Traditional notions of punctuations would likely fail if used in the context of this query, since operators in the recursive loop wait on themselves for a punctuation. The punctuations would therefore become blocked at the union and join operators 212(2), 212(3), respectively, which would receive punctuations from their non-recursive inputs, but never the recursive one. This issue is addressed fully in the formalism section below.
- The following discussion formally defines concepts related to streams, punctuations, and queries. The discussion also describes what is required for an operator implementation to be speculation friendly, and proves that the FFP operator 212(6) functions correctly with appropriate inputs, streams, and operators.
- The present concepts can utilize a formal model of streams that tends to encompass most previous stream models. Formally, a stream R is a potentially unbounded sequence e1, e2, . . . of events. An event e consists of one or more control parameters c1, c2, . . . , cn, plus an optional payload p, which is written as e=<c1, c2, . . . , cn; p>. A payload will typically be a relational tuple (i.e., an ordered sequence of data values), but might be something else, such as a punctuation pattern. The technique utilizes a notion of conformance of a payload p to a schema RR. In other words, a stream R conforms to schema RR if the payload of every event in R conforms to RR.
- The exact nature of control parameters varies from system to system. Some of the alternative implementations can include a single control parameter that contains a sequence number assigned at the inputs to a query. Another example can include a control parameter that indicates what the event represents (regular tuple, punctuation, end of stream), and a second control parameter giving a timestamp supplied by the stream source. Another example can include a control parameter indicating whether the event represents a positive tuple (insertion) or negative tuple (deletion). Still a further example can include a pair of control parameters defining a time interval over which the payload is valid.
- The present implementations do not constrain the details of the control parameters. Instead, some implementations require that for stream R(RR), any prefix P of R can be reconstituted into a linear sequence r1, r2, . . . , rm of snapshots over RR. Each snapshot is just a finite relation over RR. It is useful to think of how each additional event modifies the reconstitution. For example, with the first alternative described above, the technique can treat an event <sn, p> as adding a new snapshot to the list that adds p to the previous snapshot. That is, it extends r1, r2, . . . , r_(sn-1) to r1, r2, . . . , r_(sn-1), r_sn, where r_sn = r_(sn-1) ∪ {p}. For the final alternative offered above, the technique can view snapshots as being indexed by timestamps, and an event <s, e; p> as inserting p into any snapshot r_tk in r_t1, r_t2, . . . , r_tm where s≦tk<e, plus possibly adding a snapshot r_e to the end of the list if e>tm.
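- The two reconstitutions just described can be sketched as follows. The function names are hypothetical. The interval variant makes a simplifying assumption that differs from the text: snapshots are indexed by the integer ticks 1 through a fixed `horizon`, rather than by the timestamps that happen to occur in the stream, and no snapshot is appended past the horizon.

```python
from typing import Any, Dict, FrozenSet, Iterable, List, Set, Tuple


def reconstitute_sequence(events: Iterable[Tuple[int, Any]]) -> List[FrozenSet[Any]]:
    """Events <sn, p> in increasing sn: each adds a snapshot equal to the
    previous snapshot plus the new payload p."""
    snapshots: List[FrozenSet[Any]] = []
    current: FrozenSet[Any] = frozenset()
    for _sn, payload in events:
        current = current | {payload}
        snapshots.append(current)
    return snapshots


def reconstitute_intervals(events: Iterable[Tuple[int, int, Any]],
                           horizon: int) -> Dict[int, Set[Any]]:
    """Events <s, e; p>: p belongs to every snapshot at tick t with s <= t < e.
    Assumes integer ticks 1..horizon as snapshot indices."""
    snapshots: Dict[int, Set[Any]] = {t: set() for t in range(1, horizon + 1)}
    for s, e, payload in events:
        for t in range(max(s, 1), min(e, horizon + 1)):
            snapshots[t].add(payload)
    return snapshots


by_sequence = reconstitute_sequence([(1, "a"), (2, "b")])
by_interval = reconstitute_intervals([(1, 3, "x"), (2, 5, "y")], horizon=4)
```

In the sequence model every event only appends a snapshot, whereas in the interval model a late event can change snapshots that already exist, which is why the notion of a stabilization point is needed.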
- Some implementations can treat a stream R as representing a potentially infinite list r1, r2, . . . that is the limit of the reconstitution as the technique takes longer and longer prefixes of R. This sequence can be thought of as the canonical history of R, and the intent of applying a function f to R can be considered to be a stream S whose canonical history is f(r1), f(r2), . . . . However, there is no guarantee that R converges to a well defined canonical history in the limit. New events might continue to update a particular snapshot indefinitely. Thus, some implementations can require that a stream make progress, meaning that for each snapshot ri, there comes a point in the stream where ri no longer changes.
- For an event e in stream R, let P be the prefix of R up to e, and P:e be P with the addition of e. Let the reconstitution of P be r1, r2, . . . , rm, and the reconstitution of P:e be s1, s2, . . . , sn. Then define the stabilization point of e relative to R, stable(e), as the maximum i such that:
- r1=s1, r2=s2, . . . , ri=si.
- That is, e does not modify any of r1, r2, . . . , ri. It can be considered that stream R progresses if for any index j, there is a point after which for any event e, stable(e)≧j. At that point, snapshot rj is stabilized; it will no longer change. If R progresses, then every snapshot eventually stabilizes, and the canonical history is well defined. In this case, the technique can use R@i to denote snapshot ri in the canonical history of R.
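- The definition of stable(e) can be made concrete with a short sketch. It assumes interval events of the form (s, e, payload), snapshots indexed by integer ticks 1 through a fixed `horizon`, and set semantics for payloads; the function names are hypothetical, and the value 0 is returned when even the first snapshot changes.

```python
def reconstitute(events, horizon):
    """Snapshots r_1..r_horizon for interval events (s, e, payload), s <= t < e."""
    return [frozenset(p for s, e, p in events if s <= t < e)
            for t in range(1, horizon + 1)]


def stable(prefix, event, horizon):
    """Largest i such that snapshots 1..i are identical with and without `event`."""
    before = reconstitute(prefix, horizon)
    after = reconstitute(prefix + [event], horizon)
    i = 0
    while i < horizon and before[i] == after[i]:
        i += 1
    return i


prefix = [(1, 4, "a"), (2, 6, "b")]
late = stable(prefix, (3, 5, "c"), horizon=8)    # first changed snapshot is r_3
early = stable(prefix, (1, 2, "d"), horizon=8)   # already changes r_1
```

Here `late` is 2 because the new event leaves r1 and r2 untouched, while `early` is 0. A stream progresses, in the sense defined above, when these values eventually stay at or above any chosen index.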
- Note that snapshots in a reconstitution or canonical history need not be indexed by sequential integers. Any strictly increasing sequence can work and some implementations can use timestamps in the sequel.
- The above discussion considers only progressing streams, so that the canonical history is always defined. However, at least some of the implementations can detect progress and then make use of it. For some streams, this task is easy, for example in the first alternative offered above when events are assumed to arrive in order of increasing sequence number. The present approach, however, also entails handling disordered streams (at least in the recursive part of the query). This approach can utilize a form of punctuation to explicitly mark progress. An event e in stream R constitutes a punctuation at i if every event d after e in R has stable(d)>i. Then it can be stated that stream R explicitly progresses if for any index j, there is some event e in R that is a punctuation at i, where i>j. In some cases, such as ordered streams, “normal” events can serve as punctuations. However, to handle disordered streams, these implementations can utilize specific punctuation events (flagged as such with a control parameter). It can be assumed that stream operators produce explicitly progressing output given explicitly progressing inputs. Thus, the stream operator can propagate punctuation appropriately.
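- The guarantee carried by a punctuation can be checked mechanically on a finite trace. The sketch below assumes a hypothetical trace encoding in which ("dp", i) is a punctuation at index i and ("ev", k) is a data event whose stabilization point k has already been computed; it reports the positions of data events that contradict an earlier punctuation.

```python
def punctuation_violations(stream):
    """stream: list of ("dp", i) markers and ("ev", stable_point) data events.
    Returns positions of data events that break an earlier punctuation."""
    high = None            # largest index punctuated so far
    bad = []
    for pos, (kind, value) in enumerate(stream):
        if kind == "dp":
            high = value if high is None else max(high, value)
        elif high is not None and value <= high:
            bad.append(pos)    # stable(d) must exceed every earlier dp index
    return bad


trace = [("ev", 0), ("dp", 2), ("ev", 3), ("ev", 2)]
violations = punctuation_violations(trace)
```

In this trace the last event has a stabilization point of 2 although a punctuation at 2 was already seen, so its position is reported. A stream that explicitly progresses would produce no violations while its punctuation indices grow without bound.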
- The FFP operator 212(6) introduced above can also make use of speculative punctuation, which is similar to regular punctuation, but does not actually guarantee stream progress. The following discussion will refer to non-speculative punctuation as definite punctuation for purposes of distinguishing the two. The discussion below uses dp(i) to denote a definite punctuation event at index i, and sp(i) to denote a speculative punctuation event at index i.
- The following discussion relates to an implementation that leverages complex event detection and response (CEDR) technologies. A brief introduction to CEDR technologies follows.
- Conventional stream systems separate the notion of application time and system time, where application time is the clock that event providers use to timestamp tuples created by the providers, and system time is the clock of the receiving stream processor. The disclosed architecture, referred to throughout this description as the CEDR system, further refines application time into occurrence time and valid time, thereby providing a tri-temporal model of occurrence time, valid time, and system time.
- A temporal stream model is used to characterize streams, engine operator semantics, and consistency levels for handling out-of-order or invalidated data. In one implementation, the tritemporal model is employed. The temporal model employed herein, however, is simplified in the sense of modeling valid time and system time (occurrence time is omitted). For the purposes of this description, this is sufficient, since only these two notions of time are necessary to understand the disclosed speculative output and consistency levels.
- A CEDR data stream is modeled as a time varying relation. For most operators, an interpretation is used that a data stream models a series of updates on the history of a table, in contrast to conventional work which models the physical table updates themselves. In CEDR, a stream is modeled as an append-only relation. Each tuple in the relation is an event, and has a logical ID and a payload. Each tuple also has a validity interval, which indicates the range of time when the payload is in the underlying table. Similar to the convention in temporal databases, the interval is closed at the beginning, and open at the end. Valid start and end times are denoted as Vs and Ve, respectively. When an event arrives at a CEDR stream processing system, its CEDR (or system) time, denoted as C, is assigned by the system clock. Since, in general, CEDR systems use different clocks from event providers, valid time and CEDR time are not assumed to be comparable.
- CEDR has the ability to introduce the history of new payloads with insert events. Since these insert events model the history of the associated payload, both valid start and valid end times are provided. In addition, CEDR streams can also shrink the lifetime of payloads using retraction events. These retractions can reduce the associated valid end times, but are not permitted to change the associated valid start times. Retraction events provide new valid end times, and are uniquely associated with the payloads whose lifetimes are being reduced. A full retraction is a retraction where the new valid end time is equal to the valid start time. Further details about CEDR technologies can be obtained from U.S. patent application Ser. No. 11/937,118, filed on Nov. 8, 2007, the contents of which are hereby incorporated by reference in their entirety.
- In some CEDR implementations, there are four control parameters, EventType, VStart, VEnd, and VNewEnd. Snapshots in CEDR are indexed by timestamps. The EventType can be Insert, Retract, CTI or EOS. For Insert, VStart and VEnd indicate the range of snapshot indices over which the payload is valid. That is, the payload belongs to all snapshots in that range. Note that the interval is closed at the beginning and open at the end, so the payload is not in the snapshot associated with VEnd. For Retract, all of VStart, VEnd and the payload should match a previously seen event e, and VNewEnd, where VStart≦VNewEnd<VEnd, effectively specifies a new VEnd for e. A Retract removes its payload from snapshots with indices equal or later than VNewEnd. In terms of progress, if e is an Insert event, stable(e)=VStart. If e is a Retract, then stable(e)=VNewEnd. A CTI (current time increment) event is a (definite) punctuation at index VStart. EOS stands for “End of Stream”, and is only issued if a stream is ceasing output.
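- The control parameters described above can be illustrated with a small Python sketch. It is not the CEDR implementation: the `Event` class, the `snapshot` helper, and the choice to identify a payload's lifetime by the pair (payload, VStart) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set, Tuple


@dataclass(frozen=True)
class Event:
    event_type: str                  # "Insert", "Retract" or "CTI"
    v_start: int
    v_end: Optional[int] = None
    v_new_end: Optional[int] = None
    payload: Optional[str] = None


def stable(e: Event) -> int:
    """stable(e) = VStart for an Insert, VNewEnd for a Retract."""
    if e.event_type == "Insert":
        return e.v_start
    if e.event_type == "Retract":
        return e.v_new_end
    raise ValueError("stable() is only defined here for Insert and Retract")


def snapshot(events: Iterable[Event], index: int) -> Set[str]:
    """Payloads in the snapshot at `index` after applying events in arrival order."""
    lifetimes: Dict[Tuple[str, int], int] = {}   # (payload, VStart) -> current VEnd
    for e in events:
        key = (e.payload, e.v_start)
        if e.event_type == "Insert":
            lifetimes[key] = e.v_end
        elif e.event_type == "Retract":
            # A Retract must match a previously seen event and can only shrink it.
            if lifetimes.get(key) == e.v_end and e.v_start <= e.v_new_end < e.v_end:
                lifetimes[key] = e.v_new_end
    # The validity interval is closed at the beginning and open at the end.
    return {p for (p, vs), ve in lifetimes.items() if vs <= index < ve}


events = [
    Event("Insert", 1, 10, payload="a"),
    Event("Insert", 3, 6, payload="b"),
    Event("Retract", 1, 10, v_new_end=5, payload="a"),
]
print(snapshot(events, 4), snapshot(events, 5), stable(events[2]))
```

- In this sketch the Retract removes payload "a" from every snapshot with index 5 or later, and its stable point is 5, so snapshots before index 5 are unaffected by it.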
- To accommodate the algebraic representation of queries with FFP operators, the present techniques can view a relational query Q over which a fixed point can be computed as having two relational parameters, r and s, designated as Q(r, s). Parameter r can name an external input (and can be generalized to a set of relations). Parameter s can name the recursion parameter, which represents data headed around the recursive loop. Some implementations can require that schema(Q)=schema(s), and that Q is monotone on its second argument. This can be represented as Q(r, s) ⊆ Q(r, s ∪ s1) for any s1.
- The technique can now define the fixed point of Q on r. Let
-
Q0(r) = Q(r, Ø)
-
Qi(r) = Q(r, Qi−1(r)) for i > 0
- The technique can specify that tuple t has level i if it appears in Qi(r). The fixed point of Q on r is
-
Q*(r) = ∪ Qi(r) over all i ≧ 0.
- One potential goal for recursive queries over a stream R is to compute the fixed point of each snapshot in the canonical history of R. That is, given progressing stream R and query Q, it can be desirable to produce a progressing stream S such that, for every index i,
-
S@i = Q*(R@i).
- An S that satisfies these conditions is called a fixed-point stream for R under Q, written S ∈ Q*(R). (This membership notation is used because there could be many streams with this property.)
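- For a single snapshot r, the fixed point Q*(r) can be computed by direct iteration over the levels. The following is a minimal Python sketch under stated assumptions: the function names `fixed_point` and `reach` are hypothetical, `reach` is an illustrative reachability query that is monotone on its second argument, and the loop stops when two successive levels agree, which assumes the query reaches such a level on the given input.

```python
from typing import Callable, FrozenSet, Set, Tuple

Pair = Tuple[int, int]


def fixed_point(q: Callable[[FrozenSet[Pair], FrozenSet[Pair]], FrozenSet[Pair]],
                r: FrozenSet[Pair]) -> Set[Pair]:
    """Union of the levels Q0(r), Q1(r), ... up to the first repeated level."""
    level = q(r, frozenset())        # Q0(r) = Q(r, Ø)
    result = set(level)
    while True:
        nxt = q(r, level)            # Qi(r) = Q(r, Qi-1(r))
        result |= nxt
        if nxt == level:             # two successive levels agree: stop
            return result
        level = nxt


def reach(edges: FrozenSet[Pair], s: FrozenSet[Pair]) -> FrozenSet[Pair]:
    """Illustrative query: the edges, plus each pair in s extended by one edge."""
    return frozenset(edges) | frozenset(
        (a, c) for (a, b) in s for (b2, c) in edges if b == b2
    )


edges = frozenset({(1, 2), (2, 3), (3, 4)})
print(sorted(fixed_point(reach, edges)))
```

- A streaming implementation would not recompute each level from scratch as this sketch does; the sketch only illustrates what each snapshot of the output stream S is required to contain.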
- As noted in the introduction, it can be desirable to avoid certain kinds of divergent behavior in computing fixed points. The need for finite answers and finite derivations are captured in the following two definitions.
- Definition: Query Q(r, s) is convergent if for each value of r, there exists a k such that Qk(r)=Qk+1(r).
- If Q(r, s) converges at k, then
-
Q*(r) = ∪ Qi(r) over 0≦i≦k.
- This shows that the result of Q on any value of r is finite.
- Definition: Query Q(r, s) is strongly convergent if for each value of r, there exists a k such that Qk(r)=Ø.
- Note that strongly convergent implies convergent, and that for a strongly convergent query Q, there is a maximum level (k) that any tuple t in Q*(r) has, hence the number of derivations is finite.
- Expressing Q with appropriate algebraic operators can allow FFP operators to be used with a target query Q(r, s). Operators are appropriate if they behave correctly with regard to speculative punctuation. A streaming operator G can be considered to be speculation-friendly if the following three conditions hold.
- First, G speculates correctly.
- Second, G does not block on definite punctuation.
- Third, G is forward moving.
- These conditions are discussed below.
- In relation to the first condition, G speculates correctly if given a speculative punctuation sp(i) in one input stream, and that every other input stream is explicitly progressing, G will eventually emit speculative punctuation sp(j) where j≦i. Moreover, if it turns out that sp(i) actually holds (that is, G receives no later event e with stable(e)≦i), then sp(j) actually holds (G will emit no event d with stable(d)≦j). Also, if G has previously emitted a definite punctuation dp(k), then j≧min(i, k). This last condition says that G doesn't “back up” from previously emitted definite punctuation. In practice, it will always turn out that i>k, so j>k.
- In one instance, to speculate correctly, G will typically track definite punctuation on its other inputs. In this implementation i=j. However, an alternative implementation (termed the probing approach) is described below where j is sometimes less than i.
- In relation to the second condition, where G does not block on definite punctuation, G is presumed to produce explicitly progressing output on explicitly progressing input. This method further instructs operators to emit output in the absence of any particular definite punctuation. Such a G must output the same collection of non-punctuation events on any two input streams with the same non-punctuation events. Any monotonic operator has a non-blocking implementation. (Handling non-monotonic operators by being able to revise previous outputs is discussed below under the heading “FFP in CEDR”).
- The third condition where G is forward moving utilizes an input event e. If input event e for G contributes to output event d, then the technique specifies that stable(e)≦stable(d). In practice, it is unlikely that an operator G could arbitrarily shift events backward in time without violating the first condition.
- Using the FFP operator to compute fixed points relative to a query Q(r, s) utilizes an algebraic query tree T(O, Ir, Is) for Q. O, Ir and Is are essentially “ports” of this query tree, where O connects to an output stream, Ir connects to an external input stream R, and Is will be for recursive input. The FFP operator can also be viewed as having ports: FFP(I, OE, OR). Here I connects to an input stream, OE connects to the external output stream, and OR connects to the recursive output stream. The technique can apply FFP to T and R to make the following connections: R connects to Ir, O connects to I, and OR connects to Is.
- OE will connect either directly to a client, or to the input of a downstream operator. This discussion denotes this arrangement of operators by FFP(R, T). When FFP, T and R are connected in this manner, a recursive loop is created that passes from OR to Is to O to I.
-
FIG. 4 shows the recursive loop in this reachability query as a dashed line 402. FIG. 4 retains the operators 212(1)-212(6) and data streams 214(1)-214(2) introduced in relation to FIG. 2 and these components will not be reintroduced here for sake of brevity. Note that for this example, Q, and hence T, has two external input streams, one for nodes and one for edges. Q can be thought of as a conceptual entity. For computation purposes, Q will be represented as a tree of operators, T. In FIG. 4, T can consist of the operators 212(1)-212(5), but not 212(6), which is the FFP operator. - In defining the FFP operator 212(6), this technique views the FFP operator as operating in phases, iterating over segments of its input separated by speculative punctuations. (These phases in general will be different from the levels of recursion defined earlier.) The discussion assumes that at startup, the FFP operator 212(6) emits a speculative punctuation sp(tmin) on OR at 404, where tmin is known to be before the stable points of all events on all external input streams.
- A segment of input for FFP operator 212(6) is a maximal sequence of events e1, e2, . . . , em, sp(t) received on I, where none of the ei's is a speculative punctuation. In this implementation, by maximality, e1 must either be the first event on I, or be preceded immediately by a speculative punctuation. The present techniques can allow that a segment can have e1, e2, . . . , em be the empty list.
- For each segment e1, e2, . . . , em, sp(t) that the FFP operator 212(6) receives on I, it performs the following steps.
- F1. Emit e1, e2, . . . , em, on output OE.
- F2. Emit those events in e1, e2, . . . , em that are not definite punctuations on output OR.
- F3.a. If stable(ei)>t for 1≦i≦m, then emit dp(t) on output OR, followed by sp(u) for some u≧t+c (for a fixed constant c).
- F3.b. Otherwise, emit sp(t) on output OR.
- The constant c can be chosen as the minimal possible time interval, sometimes called a chronon. Note that FFP operator 212(6) may only ever have one speculative punctuation circulating on the recursive loop at a time. Its strategy is to keep circulating a speculative punctuation sp(t) until it determines that the punctuation is valid, then it converts it to a definite punctuation and starts speculating at a later point. The next section will present conditions under which such speculation must always eventually succeed.
- While this definition of FFP or FFP operator might seem to indicate that it operates in a batch-oriented fashion, in fact, as seen in the reachability example and the implementation in the “FFP in CEDR” section, steps F1-F3 can be pipelined and run in a continuous fashion. Hence the “Flying” in “Flying fixed-point” operator.
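- Steps F1-F3 can be sketched for a single segment as follows. This is a batch-style Python illustration under stated assumptions, not the pipelined implementation: the `Ev` class and `process_segment` function are hypothetical names, the test in step F3.a is applied to the non-punctuation events of the segment only, and u is chosen as exactly t + c.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Ev:
    kind: str    # "data", "dp" (definite punctuation) or "sp" (speculative punctuation)
    index: int   # stable(e) for data events, punctuation index otherwise


def process_segment(events: List[Ev], t: int, c: int = 1) -> Tuple[List[Ev], List[Ev]]:
    """Apply F1-F3 to the segment e1..em, sp(t); return the (OE, OR) outputs."""
    out_external = list(events)                               # F1: forward everything on OE
    out_recursive = [e for e in events if e.kind != "dp"]     # F2: drop definite punctuation
    data = [e for e in events if e.kind == "data"]
    if all(e.index > t for e in data):                        # F3.a: speculation held
        out_recursive += [Ev("dp", t), Ev("sp", t + c)]
    else:                                                     # F3.b: recirculate sp(t)
        out_recursive.append(Ev("sp", t))
    return out_external, out_recursive


oe, orr = process_segment([Ev("data", 5), Ev("data", 7)], t=3)
print(orr[-2:])   # sp(3) is confirmed as dp(3); speculation moves on to index 4
oe, orr = process_segment([Ev("data", 2)], t=3)
print(orr[-1])    # an event at or before index 3 was seen, so sp(3) goes around again
```

- An empty segment satisfies the F3.a test trivially, which matches the intuition that a speculative punctuation returning with no intervening events can be made definite.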
- This section describes the results of the foundation introduced above.
- Theorem: Let T(O, Ir, Is) be a query tree for a strongly convergent query Q(r, s). If T uses speculation-friendly operators and R is an explicitly progressing stream, then FFP(R, T) outputs an explicitly progressing stream S ∈ Q*(R).
- A proof is provided below in two main parts. The first part establishes that S is a fixed-point stream for R under Q. The second part shows that S is explicitly progressing. This proof is provided for discussion purposes in relation to specific implementations. Other implementations can achieve recursive data stream processing without relying on the absolute assertion expressed in this proof.
- That S is a fixed-point stream for R under Q does not rely on the handling of speculative punctuations at all. Rather, it follows from the fact that FFP sends all input back around the recursive loop, that operators on that loop do not block on definite punctuations, and that R is progressing. The proof of this part is an induction on the level of recursion. Consider a specific snapshot r=R@t in the canonical history of R. The general statement is that FFP eventually receives (hence outputs to OE) all events needed for Qm(r) for every m.
- The basis case is that FFP receives Q0(r)=Q(r, Ø) on I. This case holds since R will eventually progress past t and stabilize r. Since T will have received all of Ø at this point, it will output all of Q(r, Ø) to I. (There is no problem if T receives more data, because Q is assumed monotone on its second input.)
- The inductive case follows from the observation that if the FFP operator has received all of Qk−1(r) on its input I, it will emit it on recursive output OR. Thus, T will eventually produce all tuples in
-
Q(r, Qk−1(r)) = Qk(r).
- Since Q is strongly convergent, there is some j such that Qj(r)=Ø. Thus once FFP has received all input up through Qj(r), there will be no more output events for Q*(r), and the output of FFP will progress past time t.
- Demonstrating the explicit progress of S requires two things. (1) Any dp(t) that FFP emits on OE must be correctly placed. That is, no later event e will be emitted with stable(e)<t. (2) For any index u, FFP will eventually emit a definite punctuation dp(t) for some t≧u.
- For (1), it is noted that FFP will always see the end of a segment (that is, the next speculative punctuation). After FFP emits any events on OR in step F2, it will necessarily emit a speculative punctuation on OR in step F3.a or F3.b. Because every operator on the recursive loop is speculation-friendly, each must eventually pass on the speculative punctuation until it gets back to I. Now consider segment e1, e2, . . . , em, sp(t) that satisfies the if-statement in step F3.a. When e1, e2, . . . , em are sent out again on OR, any event d they will produce in the next segment will have stable(d)>t, since all operators on the recursive loop are forward moving. This situation will be true for all subsequent segments, by similar reasoning. Thus the speculative punctuation sp(t) was actually valid, and FFP can convert it safely to dp(t). Since R is explicitly progressing, T will eventually produce a definite punctuation dp(u) where u≧t. That punctuation will be correctly placed in the output of T by the properties of its operators, and hence will be correctly placed in the output of FFP.
- For (2), it is noted that a speculative punctuation sp(t) can only be recirculated a finite number of times by step F3.b before step F3.a applies. Since the input of FFP progresses, as shown in the first part of the proof, there must eventually be a segment where e1, e2, . . . , em all have stable points after t. Further, each time the technique uses step F3.a, it increases the index for the speculative punctuation by at least c. Thus, the technique must eventually speculate at some index v≧u.
- The hypotheses in the above theorem are actually stronger than they need be. Any operators in T that are not on the recursive loop do not need to be speculation-friendly. They only need to satisfy the condition that they emit explicitly progressing output on explicitly progressing input.
- Until this section, the discussion of FFP operators is framed in a way which may be applied to most streaming systems. This section discusses how recursion can work in the CEDR stream-processing system. Among the discussed topics are how speculative CTIs fit into the CEDR event model, and how specific operators respond to these new events. Also discussed is the handling of speculative output, which is a native capability of the CEDR event processing system, in recursive queries. Further, the interaction between the CEDR style of windowing and recursion is discussed. Finally, consequences in terms of the sharing of computation between windows with shared events are discussed.
- In the CEDR event processing system, physical streams may already contain definite punctuations called CTIs (current time increment). These punctuations come with a timestamp t. When one of these events is received by the listener, there is a guarantee that all events which affect snapshots earlier than t have been received. Operators use this guarantee to garbage collect (i.e., reclaim) state that will not affect future output. CEDR operators, except “align”, do not need these events to unblock output, since they do not have to block in the first place. Rather, they can produce speculative results incorporating all the received events, and correct these results later if necessary using retractions.
- Like a definite CTI, a speculative CTI (specCTI) comes with a timestamp t. Note that in order to handle specCTIs correctly, each operator should guarantee that it is handling these events in a speculation-friendly way. Recall our definition from the “formalism” section.
- G is speculation-friendly if the following three conditions hold:
- S1. G speculates correctly.
- S2. G does not block on definite punctuation.
- S3. G is forward moving.
- Some implementations guarantee these requirements in CEDR by treating the SpecCTI similarly to a definite CTI, except for two things:
- First, do not garbage collect based on speculative CTIs, as the recursion might not be finished.
- Second, the described strategy can require that SpecCTIs loop through the recursion unchanged. This restriction may force the SpecCTI to become lodged until another operator input catches up or until a SpecCTI may be safely emitted for the requested time.
- S1 is easily upheld for unitary or unary operators. After unblocking any necessary output, thus possibly producing speculative output, the technique simply allows the SpecCTI through. Binary operators are a bit trickier. Assuming that one branch is in the recursive loop, the technique lodges the SpecCTI in the binary operator until it receives, from the non-recursive child, a definite CTI with timestamp greater than or equal to the SpecCTI timestamp. This delay ensures that all input from the non-recursive side that could influence the output states prior to the SpecCTI has been absorbed and emitted by the operator before emitting the SpecCTI. Since the technique assumes that all non-recursive inputs are explicitly progressing, a time must exist where the specCTI becomes dislodged and passes through. At this time, since specCTIs are treated, for the purpose of producing output, like definite CTIs, all speculative output up to that point in time must have been emitted.
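- The lodging behavior of a binary operator can be sketched as a small state machine. The Python below is an illustration under stated assumptions: the class name `SpecCtiGate` and its methods are hypothetical, only punctuation handling is modeled (data events and the operator's actual join or union logic are omitted), and at most one SpecCTI is assumed to be in flight at a time.

```python
from typing import List, Optional


class SpecCtiGate:
    """Punctuation handling for a binary operator with one recursive input."""

    def __init__(self) -> None:
        self.latest_cti: Optional[int] = None   # latest definite CTI from the non-recursive child
        self.lodged: Optional[int] = None       # timestamp of a held SpecCTI, if any

    def _release(self) -> List[int]:
        # Release the lodged SpecCTI once the non-recursive side has caught up.
        if (self.lodged is not None and self.latest_cti is not None
                and self.latest_cti >= self.lodged):
            ts, self.lodged = self.lodged, None
            return [ts]
        return []

    def on_spec_cti(self, ts: int) -> List[int]:
        """A SpecCTI arrives on the recursive input; return SpecCTIs emitted now."""
        self.lodged = ts
        return self._release()

    def on_definite_cti(self, ts: int) -> List[int]:
        """A definite CTI arrives from the non-recursive child."""
        if self.latest_cti is None or ts > self.latest_cti:
            self.latest_cti = ts
        return self._release()


gate = SpecCtiGate()
print(gate.on_spec_cti(10))      # lodged: no definite CTI seen yet
print(gate.on_definite_cti(7))   # still lodged: 7 < 10
print(gate.on_definite_cti(10))  # dislodged: the SpecCTI passes through unchanged
```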
- S2 is trivially upheld by the CEDR operators, none of which block on definite punctuation except align. Rather, CEDR operators speculate as an alternative to blocking.
- S3 is also trivially upheld by all operators in the CEDR algebra except AlterLifetime, which is the only operator in the algebra which can emit an event that includes valid times outside the range of the input event which generated it. Since AlterLifetime is used for windowing in the employed algebra, the technique could require that all windowing be done on inputs outside the recursive loop.
- One implementation of the FFP operator augments the CEDR multicast operator to handle specCTIs. This implementation tracks the “high water mark” of the stabilization point for all Insert and Retract events, and uses this value to speculate with. Its handling of specCTIs and other events follows the algorithm given in Section 3.4, except it performs steps F1-F3 on the fly. To test the if-condition in step F3.a, it remembers the timestamp in the currently circulating specCTI, and sets a flag if it sees an earlier event before the specCTI returns. Thus, this implementation uses a fixed amount of state.
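- The on-the-fly variant described above can be sketched with three pieces of state. This Python illustration rests on several assumptions that the text does not fix: the class and method names are hypothetical, events are reduced to their stabilization points, the flag is set for any event with a stabilization point at or before the circulating specCTI (mirroring the test in step F3.a), and the next speculation point is taken as the larger of the high water mark and t + c.

```python
from typing import List, Tuple

Out = Tuple[str, str, int]   # (output port, event kind, timestamp)


class StreamingFfp:
    def __init__(self, t_min: int, c: int = 1) -> None:
        self.c = c
        self.circulating = t_min    # timestamp of the specCTI currently on the loop
        self.saw_earlier = False    # flag: event at or before `circulating` was seen
        self.high_water = t_min     # largest stabilization point seen so far

    def start(self) -> List[Out]:
        return [("OR", "specCTI", self.circulating)]

    def on_data(self, stable_point: int) -> List[Out]:
        """Insert or Retract on I: forward at once to both outputs (F1, F2)."""
        self.high_water = max(self.high_water, stable_point)
        if stable_point <= self.circulating:
            self.saw_earlier = True
        return [("OE", "data", stable_point), ("OR", "data", stable_point)]

    def on_definite_cti(self, t: int) -> List[Out]:
        """Definite CTIs go to the external output only."""
        return [("OE", "CTI", t)]

    def on_spec_cti(self, t: int) -> List[Out]:
        """The circulating specCTI has returned on I (end of a segment)."""
        if self.saw_earlier:                        # F3.b: recirculate unchanged
            self.saw_earlier = False
            return [("OR", "specCTI", t)]
        nxt = max(self.high_water, t + self.c)      # F3.a: confirm, then speculate later
        self.circulating = nxt
        return [("OR", "CTI", t), ("OR", "specCTI", nxt)]


ffp = StreamingFfp(t_min=0)
ffp.start()
ffp.on_data(5)
ffp.on_data(3)
print(ffp.on_spec_cti(0))   # confirmed at 0, next speculation at the high water mark
```

- The state consists of two timestamps and one flag regardless of how many events pass through, which is consistent with the fixed-state property noted above.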
- An observant reader will note that rather than allowing SpecCTIs to become lodged in an operator, some implementations can immediately emit them with the latest timestamp that their other inputs allow, which, in these cases, would be guaranteed to be less than the original SpecCTI. Rather than using a scheme with a high water mark, these techniques could instead initially emit a SpecCTI with a timestamp of infinity, and then retry the timestamp that comes back until the technique can emit a definite CTI. After emitting the definite CTI, the technique could then emit another SpecCTI at infinity, etc. This alternative approach is termed “specCTI probing”.
- The CEDR stream processing system uses operators that inherently speculate very aggressively by issuing full or partial retractions for previous events in the input stream. Using this mechanism, operators are free to speculate as aggressively as producing, at any given time, all output under the assumption that the input received so far is all the input. Speculation may then be throttled back using the Align operator, and permanence of output may be forced by the finalize operator for the purpose of managing state in the absence of frequent-enough CTIs. Some of the described implementations of FFP can handle the Retract events that are sometimes issued by speculative operators, by virtue of starting with operator implementations that handle Retracts.
- Note that this form of speculation also significantly increases the expressiveness of recursive queries, since it allows the recursive use of operators such as aggregation and difference.
- In CEDR, rather than associating windows with operators, some techniques associate windows with data. More specifically, some of these techniques can associate with every event, an interval (as opposed to other systems, which use a single timestamp). This interval is actually the time during which a particular payload, associated with the event, is in the snapshots being modeled by the stream. This treatment has the effect of assigning payloads to windows, such that the valid time interval of the event determines the output times during which a windowed version of any operator includes the payload. The AlterLifetime operator can be used to explicitly set these windows.
- This section explains how the user of FFP operators can implement arbitrary NFAs, a common paradigm for implementing pattern matching. As with the above examples, these techniques get the ability to speculate, incrementally window, and handle out-of-order inputs as a consequence of using existing operators.
-
FIG. 5 shows the resulting plan in the form of query graph 510. In the present case, query graph 510 includes seven operators 512(1)-512(7). Operators 512(1) and 512(2) are join operators; operator 512(3) is a project operator; operator 512(4) is a union operator; operator 512(5) is a multicast operator; operator 512(6) is another project operator, and operator 512(7) is an FFP operator. - Two data streams serve as sample input to query graph 510; the first input stream is in the form of state machine 514(1), while the second input stream is in the form of symbols 514(2). An example of input data of the state machine 514(1) is evidenced generally at 516. An example of input data of the symbols 514(2) is evidenced generally at 518. Output from the query graph 510 is evidenced generally at 520.
- Note that, rather than being compiled into the plan, the state machine is given as a streaming input, and may, in theory, change over time. Thus, the plan is actually a streaming program for executing arbitrary, evolving automata.
- For clarity, as with the above examples, the discussion again assumes that the window is infinite, and explains the role of the various operators 512(1)-512(7) with the given input. The particular automaton executed here searches for the pattern AB*A. The query can output all discovered event sequences that constitute partial and complete patterns, and their associated states in the automaton. The starting state is called S, and the final state is called F. (Note that there may be multiple final states, and that one could filter the output for final states if desired.)
- The state machine input 514(1) is described using a set of transitions such that each transition absorbs an accompanying input. The symbols input 514(2) is a description of the sequence in which an attempt is made to find patterns. Each event has a sequence number, and a symbol, which may match a symbol in the automata transition table.
- A detailed query description was provided above in relation to the reachability example. For sake of brevity only a sketch of query behavior is provided in the present discussion. The state machine input 514(1) is loaded into a right join synopsis 522 of the lower join 512(1). When input comes along the symbols input 514(2) to left synopsis 524, join 512(1) finds all transitions which can be made using this symbol, and passes these transitions to the join 512(2) above at left synopsis 526, which looks for partial patterns which have ended in the starting state of one of the activated transitions, and which sequentially precede the new symbol. For all such matches, the technique has found a new (partial or complete) pattern, which is output and recursively inserted back into a right synopsis 528 of the upper join 512(2). - Along the left branch of the multicast above the symbols input 514(2), the technique creates a seed start state on each input symbol and recursively inserts it into the right join synopsis 528 of the upper join 512(2). - The notion of progress used to bound the computation in this example is that transitions can be followed along increasing sequence numbers. The technique is therefore bounded in the number of recursive steps at any given moment by the number of received symbols, which, at any given moment, is finite.
- Note that in the example given above, the input sequence is: ‘ABBA’. Given that the query returns partial and complete discovered patterns, the technique should output the following patterns and their associated end sequence IDs:
- ‘A’:1, ‘AB’:2, ‘ABB’:3, ‘ABBA’:4, ‘A’:4
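- The listed output can be reproduced with a small batch fixed-point computation. The Python sketch below is an illustration under stated assumptions and does not model the streaming operators of the plan: the transition table for AB*A is assumed (the table in the figure is not reproduced in the text), the intermediate state name "M" is hypothetical, and "sequentially precede" is interpreted as ending at the sequence number immediately before the new symbol.

```python
from typing import List, Set, Tuple

Transition = Tuple[str, str, str]   # (source state, symbol, destination state)
Pattern = Tuple[str, str, int]      # (matched text, current state, end sequence number)


def match_patterns(transitions: List[Transition],
                   symbols: List[Tuple[int, str]]) -> Set[Pattern]:
    # Seed a length-0 pattern in start state S just before each input symbol.
    patterns: Set[Pattern] = {("", "S", seq - 1) for seq, _ in symbols}
    while True:
        # Extend every pattern by each symbol that immediately follows it
        # and for which a transition out of its current state exists.
        derived = {
            (text + sym, dst, seq)
            for (text, state, end) in patterns
            for (seq, sym) in symbols
            for (src, label, dst) in transitions
            if end == seq - 1 and src == state and label == sym
        }
        grown = patterns | derived
        if grown == patterns:        # nothing new: fixed point reached
            return patterns
        patterns = grown


# Assumed transition table for AB*A: S -A-> M, M -B-> M, M -A-> F.
table = [("S", "A", "M"), ("M", "B", "M"), ("M", "A", "F")]
stream = [(1, "A"), (2, "B"), (3, "B"), (4, "A")]
found = match_patterns(table, stream)
print(sorted((text, end) for text, _, end in found if text))
```

- Each extension strictly increases the end sequence number, so the loop performs at most one more iteration than there are received symbols, in line with the bound on recursive steps described above.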
- Note that there are actually 4 extra outputs in FIG. 5. These outputs correspond to the 4 seed patterns introduced by the left side of the multicast, and are regarded as patterns of length 0. - In summary, the above description offers systems and techniques for processing recursive streaming queries. The description further defines how query graphs utilized in the processing can be updated to specific points in time even while the recursive streaming query may remain ongoing. The above described techniques/methods and systems can be implemented on any type of networkable computing device(s) as should be recognized by the skilled artisan.
-
FIG. 6 illustrates a flowchart of a method or technique 600 that is consistent with at least some implementations of the present concepts. The order in which the technique 600 is described is not intended to be construed as a limitation, and any number of the described blocks can be combined in any order to implement the technique, or an alternate technique. Furthermore, the technique can be implemented in any suitable hardware, software, firmware, or combination thereof, such that a computing device can implement the technique. In one case, the technique is stored on a computer-readable storage media as a set of instructions such that execution by a computing device causes the computing device to perform the technique. - The technique processes a recursive streaming query through a query graph at block 602. Query graphs consist of operators connected to one another via streams. Non-limiting examples of operators and potential arrangements of operators in a query graph are detailed above in relation to FIGS. 2-5. - The technique detects when output produced by executing the query graph advances to a specific point at block 604. One implementation involves circulating speculative CTIs through a recursive loop of the query graph to detect when the output has advanced to the specific point. Examples of this and other exemplary techniques are described above. - The above described concepts detail the surprising conclusion that implementing recursive streaming query plans, through the introduction of a cycle in the query graph, is simple, highly expressive, and practical. At least some of these concepts can immediately benefit from all the capabilities of existing operators such as incremental window evaluation, disorder tolerance, and speculation.
- The present concepts can be employed in implementations that are sufficiently expressive to attack both graph-walking queries and regular-expression pattern matching. In the case of pattern matching, the associated query plan is actually linear in the number of transitions of the finite automaton that detects the pattern, resulting in a highly efficient algorithm. Even further expressiveness is achieved in CEDR by speculating when necessary to ensure disorder tolerance. This allows operators such as aggregation and difference to be used in recursive loops, which is useful for expressing branch and bound execution strategies.
- Detecting forward time progress is relatively straightforward with the addition of speculative CTIs, which function similarly to regular CTIs. The above discussion includes two implementations: a blocking speculative-CTI strategy based on high water marks and a non-blocking version based on probing.
- Although techniques, methods, devices, systems, etc., pertaining to recursive streaming query scenarios are described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described. Rather, the specific features and acts are disclosed as exemplary forms of implementing the claimed methods, devices, systems, etc.
Claims (20)
1. A system, comprising:
first and second networked stream operators for operating on a recursive streaming query, wherein the first operator is configured to receive at least two input streams and to generate an output stream based upon a first set of conditions and wherein the second operator is positioned downstream from the first operator and is configured to generate a multicast output, and wherein a portion of the multicast output is recursively directed back as an input to the first operator thereby forming a recursive loop.
2. The system of claim 1, wherein the first operator comprises a union operator and the second operator comprises a flying fixed-point (FFP) operator and wherein at least one additional operator is interposed between the union operator and the FFP operator.
3. The system of claim 1, wherein the at least one additional operator comprises a join operator that is configured to receive output from the union operator and at least one other stream as inputs.
4. The system of claim 1, wherein the second operator comprises a flying fixed-point (FFP) operator that is configured to probe the recursive loop with speculative times to determine a specific point in time to which the recursive loop has completed processing.
5. The system of claim 4, wherein the speculative times are based on timestamps of input events.
6. The system of claim 1, wherein the first operator comprises a union operator and the second operator comprises a flying fixed-point (FFP) operator and wherein at least one additional operator is interposed between the union operator and the FFP operator and wherein the FFP operator and the one or more additional operators are configured to correctly handle stream events that arrive out of time order.
7. The system of claim 6, wherein the FFP operator and the at least one additional operator are configured to correctly handle stream events that retract or reduce a validity of previously seen stream events.
8. The system of claim 1, wherein the second operator is configured to probe the recursive loop with speculative times to determine a point in time to which the recursive loop has completed processing.
9. A method, comprising:
processing at least one input stream associated with a recursive streaming query; and,
advancing time for the recursive streaming query to a specific point when the at least one input stream has advanced to the specific point and recursive computations on the input stream are complete to the specific point.
10. The method of claim 9, wherein the processing comprising processing two input streams.
11. The method of claim 9, wherein the processing comprising introducing speculative time events to determine the specific point.
12. The method of claim 9, further comprising generating an output from the processing and wherein the output includes information about the advancing.
13. The method of claim 12, wherein the generating comprises generating the output as an event that contains time information associated with the advancing.
14. The method of claim 9, wherein the advancing occurs while recursive computations continue for events subsequent to the point.
15. The method of claim 9, wherein the advancing is repeated for additional times that are subsequent to the specific point.
16. A computer-readable storage media having instructions stored thereon that when executed by a computing device cause the computing device to perform acts, comprising:
processing a recursive streaming query through a query graph; and,
detecting when output produced by executing the query graph advances to a specific point.
17. The computer-readable storage media of claim 16, wherein the processing comprises a recursive step on the query graph.
18. The computer-readable storage media of claim 17, wherein the detecting comprises detecting a passage of time on the recursive step.
19. The computer-readable storage media of claim 16, wherein the detecting comprises probing the query graph with the specific point in time.
20. The computer-readable storage media of claim 19, wherein the probing is accomplished with a speculative event.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/246,509 US20100088325A1 (en) | 2008-10-07 | 2008-10-07 | Streaming Queries |
US13/298,159 US9229986B2 (en) | 2008-10-07 | 2011-11-16 | Recursive processing in streaming queries |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/246,509 US20100088325A1 (en) | 2008-10-07 | 2008-10-07 | Streaming Queries |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/298,159 Division US9229986B2 (en) | 2008-10-07 | 2011-11-16 | Recursive processing in streaming queries |
Publications (1)
Publication Number | Publication Date |
---|---|
US20100088325A1 (en) | 2010-04-08 |
Family ID: 42076613
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/246,509 Abandoned US20100088325A1 (en) | 2008-10-07 | 2008-10-07 | Streaming Queries |
US13/298,159 Active US9229986B2 (en) | 2008-10-07 | 2011-11-16 | Recursive processing in streaming queries |
Country Status (1)
Country | Link |
---|---|
US (2) | US20100088325A1 (en) |
- 2008-10-07: US 12/246,509, published as US20100088325A1, status: Abandoned
- 2011-11-16: US 13/298,159, published as US9229986B2, status: Active
Patent Citations (36)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5765037A (en) * | 1985-10-31 | 1998-06-09 | Biax Corporation | System for executing instructions with delayed firing times |
US6253313B1 (en) * | 1985-10-31 | 2001-06-26 | Biax Corporation | Parallel processor system for processing natural concurrencies and method therefor |
US5321837A (en) * | 1991-10-11 | 1994-06-14 | International Business Machines Corporation | Event handling mechanism having a process and an action association process |
US5546570A (en) * | 1995-02-17 | 1996-08-13 | International Business Machines Corporation | Evaluation strategy for execution of SQL queries involving recursion and table queues |
US6236998B1 (en) * | 1996-08-29 | 2001-05-22 | Nokia Telecommunications Oy | Event recording in a service database system |
US6920468B1 (en) * | 1998-07-08 | 2005-07-19 | Ncr Corporation | Event occurrence detection method and apparatus |
US6327587B1 (en) * | 1998-10-05 | 2001-12-04 | Digital Archaeology, Inc. | Caching optimization with disk and/or memory cache management |
US20020083049A1 (en) * | 1998-10-05 | 2002-06-27 | Michael Forster | Data exploration system and method |
US6601058B2 (en) * | 1998-10-05 | 2003-07-29 | Michael Forster | Data exploration system and method |
US6629106B1 (en) * | 1999-02-26 | 2003-09-30 | Computing Services Support Solutions, Inc. | Event monitoring and correlation system |
US6681230B1 (en) * | 1999-03-25 | 2004-01-20 | Lucent Technologies Inc. | Real-time event processing system with service authoring environment |
US6604102B2 (en) * | 1999-07-06 | 2003-08-05 | Hewlett-Packard Development Company, Lp. | System and method for performing database operations on a continuous stream of tuples |
US20030163465A1 (en) * | 2002-02-28 | 2003-08-28 | Morrill Daniel Lawrence | Processing information about occurrences of multiple types of events in a consistent manner |
US20040111396A1 (en) * | 2002-12-06 | 2004-06-10 | Eldar Musayev | Querying against a hierarchical structure such as an extensible markup language document |
US20040172599A1 (en) * | 2003-02-28 | 2004-09-02 | Patrick Calahan | Systems and methods for streaming XPath query |
US20040205082A1 (en) * | 2003-04-14 | 2004-10-14 | International Business Machines Corporation | System and method for querying XML streams |
US20050138081A1 (en) * | 2003-05-14 | 2005-06-23 | Alshab Melanie A. | Method and system for reducing information latency in a business enterprise |
US7349925B2 (en) * | 2004-01-22 | 2008-03-25 | International Business Machines Corporation | Shared scans utilizing query monitor during query execution to improve buffer cache utilization across multi-stream query environments |
US7310638B1 (en) * | 2004-10-06 | 2007-12-18 | Metra Tech | Method and apparatus for efficiently processing queries in a streaming transaction processing system |
US20060100969A1 (en) * | 2004-11-08 | 2006-05-11 | Min Wang | Learning-based method for estimating cost and statistics of complex operators in continuous queries |
US20090204551A1 (en) * | 2004-11-08 | 2009-08-13 | International Business Machines Corporation | Learning-Based Method for Estimating Costs and Statistics of Complex Operators in Continuous Queries |
US20060136448A1 (en) * | 2004-12-20 | 2006-06-22 | Enzo Cialini | Apparatus, system, and method for database provisioning |
US20060149849A1 (en) * | 2005-01-03 | 2006-07-06 | Gilad Raz | System for parameterized processing of streaming data |
US20060230071A1 (en) * | 2005-04-08 | 2006-10-12 | Accenture Global Services Gmbh | Model-driven event detection, implication, and reporting system |
US7840592B2 (en) * | 2005-04-14 | 2010-11-23 | International Business Machines Corporation | Estimating a number of rows returned by a recursive query |
US20060282695A1 (en) * | 2005-06-09 | 2006-12-14 | Microsoft Corporation | Real time event stream processor to ensure up-to-date and accurate result |
US20070237410A1 (en) * | 2006-03-24 | 2007-10-11 | Lucent Technologies Inc. | Fast approximate wavelet tracking on streams |
US20070294217A1 (en) * | 2006-06-14 | 2007-12-20 | Nec Laboratories America, Inc. | Safety guarantee of continuous join queries over punctuated data streams |
US20080016095A1 (en) * | 2006-07-13 | 2008-01-17 | Nec Laboratories America, Inc. | Multi-Query Optimization of Window-Based Stream Queries |
US7702689B2 (en) * | 2006-07-13 | 2010-04-20 | Sap Ag | Systems and methods for querying metamodel data |
US20090100029A1 (en) * | 2007-10-16 | 2009-04-16 | Oracle International Corporation | Handling Silent Relations In A Data Stream Management System |
US20090106190A1 (en) * | 2007-10-18 | 2009-04-23 | Oracle International Corporation | Support For User Defined Functions In A Data Stream Management System |
US20090106218A1 (en) * | 2007-10-20 | 2009-04-23 | Oracle International Corporation | Support for user defined aggregations in a data stream management system |
US20090125635A1 (en) * | 2007-11-08 | 2009-05-14 | Microsoft Corporation | Consistency sensitive streaming operators |
US20090228465A1 (en) * | 2008-03-06 | 2009-09-10 | Saileshwar Krishnamurthy | Systems and Methods for Managing Queries |
US20090319501A1 (en) * | 2008-06-24 | 2009-12-24 | Microsoft Corporation | Translation of streaming queries into sql queries |
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090125635A1 (en) * | 2007-11-08 | 2009-05-14 | Microsoft Corporation | Consistency sensitive streaming operators |
US8315990B2 (en) | 2007-11-08 | 2012-11-20 | Microsoft Corporation | Consistency sensitive streaming operators |
US9229986B2 (en) | 2008-10-07 | 2016-01-05 | Microsoft Technology Licensing, Llc | Recursive processing in streaming queries |
US20110093866A1 (en) * | 2009-10-21 | 2011-04-21 | Microsoft Corporation | Time-based event processing using punctuation events |
US9348868B2 (en) | 2009-10-21 | 2016-05-24 | Microsoft Technology Licensing, Llc | Event processing with XML query based on reusable XML query template |
US8413169B2 (en) | 2009-10-21 | 2013-04-02 | Microsoft Corporation | Time-based event processing using punctuation events |
US9158816B2 (en) | 2009-10-21 | 2015-10-13 | Microsoft Technology Licensing, Llc | Event processing with XML query based on reusable XML query template |
US10394625B2 (en) | 2010-12-13 | 2019-08-27 | Microsoft Technology Licensing, Llc | Reactive coincidence |
WO2012082660A3 (en) * | 2010-12-13 | 2013-01-17 | Microsoft Corporation | Reactive coincidence |
US9477537B2 (en) | 2010-12-13 | 2016-10-25 | Microsoft Technology Licensing, Llc | Reactive coincidence |
US10255238B2 (en) * | 2010-12-22 | 2019-04-09 | Software Ag | CEP engine and method for processing CEP queries |
US20120166469A1 (en) * | 2010-12-22 | 2012-06-28 | Software Ag | CEP engine and method for processing CEP queries |
US9584379B2 (en) | 2013-06-20 | 2017-02-28 | Microsoft Technology Licensing, Llc | Sorted event monitoring by context partition |
EP3011456B1 (en) * | 2013-06-20 | 2018-04-11 | Microsoft Technology Licensing, LLC | Sorted event monitoring by context partition |
US9767217B1 (en) * | 2014-05-28 | 2017-09-19 | Google Inc. | Streaming graph computations in a distributed processing system |
US20170116276A1 (en) * | 2015-10-23 | 2017-04-27 | Oracle International Corporation | Parallel execution of queries with a recursive clause |
US10452655B2 (en) | 2015-10-23 | 2019-10-22 | Oracle International Corporation | In-memory cursor duration temp tables |
US10642831B2 (en) | 2015-10-23 | 2020-05-05 | Oracle International Corporation | Static data caching for queries with a clause that requires multiple iterations to execute |
US10678792B2 (en) * | 2015-10-23 | 2020-06-09 | Oracle International Corporation | Parallel execution of queries with a recursive clause |
US10783142B2 (en) | 2015-10-23 | 2020-09-22 | Oracle International Corporation | Efficient data retrieval in staged use of in-memory cursor duration temporary tables |
Also Published As
Publication number | Publication date |
---|---|
US20120084322A1 (en) | 2012-04-05 |
US9229986B2 (en) | 2016-01-05 |
Similar Documents
Publication | Title |
---|---|
US9229986B2 (en) | Recursive processing in streaming queries |
Chandramouli et al. | High-performance dynamic pattern matching over disordered streams | |
US20090125635A1 (en) | Consistency sensitive streaming operators | |
Greco et al. | Stratification criteria and rewriting techniques for checking chase termination | |
Kolchinsky et al. | Real-time multi-pattern detection over event streams | |
CN110622156A (en) | Incremental graph computation for querying large graphs | |
Kolchinsky et al. | Lazy evaluation methods for detecting complex events | |
Li | Computing complete answers to queries in the presence of limited access patterns | |
Greco et al. | Chase termination: A constraints rewriting approach | |
Neumann | Query simplification: graceful degradation for join-order optimization | |
US20070078816A1 (en) | Common sub-expression elimination for inverse query evaluation | |
Baumgartner et al. | Model evolution with equality—revised and implemented | |
Chandramouli et al. | On-the-fly progress detection in iterative stream queries | |
Jiang et al. | Scalable structural index construction for JSON analytics | |
US9286570B2 (en) | Property reactive modifications in a rete network | |
US11875199B2 (en) | Real-time multi-pattern detection over event streams | |
US10339454B2 (en) | Building a hybrid reactive rule engine for relational and graph reasoning | |
Stratulat | Validating back-links of FOLID cyclic pre-proofs | |
Gfeller et al. | Faster or-join enactment for bpmn 2.0 | |
Frochaux et al. | Puzzling over subsequence-query extensions: Disjunction and generalised gaps | |
Vidal et al. | Parallel AI Planning on the SCC | |
US11693862B2 (en) | Efficient adaptive detection of complex event patterns | |
Fesefeldt | Proving termination of pointer programs on top of symbolic execution | |
Winkler et al. | Optimizing mkbTT | |
Nigam et al. | An Operational Semantics for Network Datalog. |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: MICROSOFT CORPORATION, WASHINGTON. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: GOLDSTEIN, JONATHAN D.; MAIER, DAVID E.; SIGNING DATES FROM 20080930 TO 20081002; REEL/FRAME: 021659/0261 |
STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO PAY ISSUE FEE |
AS | Assignment | Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: MICROSOFT CORPORATION; REEL/FRAME: 034766/0509. Effective date: 20141014 |