CN116016262A - Method and device for detecting call chain consistency in real time based on union - Google Patents

Info

Publication number
CN116016262A
Authority
CN
China
Prior art keywords
span
real time
call chain
analysis platform
agent
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211692298.8A
Other languages
Chinese (zh)
Other versions
CN116016262B (en
Inventor
林思捷
黄锰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianyi Cloud Technology Co Ltd
Original Assignee
Tianyi Cloud Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianyi Cloud Technology Co Ltd filed Critical Tianyi Cloud Technology Co Ltd
Priority to CN202211692298.8A priority Critical patent/CN116016262B/en
Publication of CN116016262A publication Critical patent/CN116016262A/en
Application granted granted Critical
Publication of CN116016262B publication Critical patent/CN116016262B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention provides a method and a device for detecting call chain continuity in real time based on a union-find (disjoint-set) structure. The method comprises: step S1: an application executes business logic, and the agent generates Span data; step S2: the agent marks whether each Span is a leaf node; step S3: the agent uploads Span data in batches; step S4: the analysis platform receives the Span data; step S5: the analysis platform performs continuity detection on the Spans in real time; step S6: the analysis platform performs integrity detection on the Spans in real time; step S7: the analysis platform executes the call chain decision. This overcomes the drawbacks of tail-based coherent sampling, reduces memory consumption on the analysis node, reduces analysis delay, and improves the performance and stability of the platform service.

Description

Method and device for detecting call chain consistency in real time based on union
Technical Field
The invention belongs to the field of data analysis, and more particularly to the technical field of call chain detection and sampling for multi-link microservice architectures.
Background
As internet services expand and architectures are upgraded, systems become increasingly complex, and more of them become distributed. Distributed tracing (the distributed call chain) is one of the most common means of monitoring the performance of distributed applications. A call chain consists of a series of Span segments that record the complete call path from the user request through the distributed services. By analyzing the call chain, the performance of an application can be monitored, the root cause of performance problems in large-scale distributed applications can be found quickly, and the availability of the distributed application is significantly improved.
Currently, there are many open source systems that implement call chain tracing and analysis, such as Jaeger, Zipkin, Elastic APM, Pinpoint, and SkyWalking. As application traffic increases, the volume of link data increases with it. Full collection puts computation and storage pressure on the monitoring service, and storing in full call chain data that is rarely accessed is a waste of storage resources. Sampling is currently the fastest and most mainstream way to solve these problems. Common approaches are head-based coherent sampling and tail-based coherent sampling.
The basic idea of head-based coherent sampling is that, when the entry service receives a request, a ratio algorithm decides whether the whole link is kept. The probability of hitting abnormal or slow-responding call chains is low. The basic idea of tail-based coherent sampling is to buffer all Span data of the same link and then make a business decision on the whole link, for example deciding whether to store the call chain according to the status code of the request result or whether an exception occurred. The data must be held for a certain time, which consumes memory resources and delays the analysis.
Another example is the patent "Link tracking sampling method and system" (CN111478806A), which proposes judging from the request sequence whether the sampling-rate condition is met, and judging from sampling features whether to sample. In essence it combines traditional head and tail sampling: head sampling first randomly eliminates most call chains, then a tail sampling method makes decisions on the screened call chains and keeps those that meet the feature condition. That patent does not address the low probability of head sampling hitting abnormal and slow-responding call chains, nor the memory consumption and delay of tail sampling.
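The difference between the two sampling styles described above can be sketched as follows. This is a minimal illustration rather than the behaviour of any particular tracing system; the function and class names, the dictionary keys (`trace_id`, `status`, `error`) and the "status code of 500 or above" rule are all assumptions made for the example.

```python
import random
from collections import defaultdict


def head_sample(trace_id: str, rate: float, rng: random.Random) -> bool:
    """Head-based decision: taken once at the entry service, before the outcome is known."""
    return rng.random() < rate


class TailSampler:
    """Tail-based decision: buffer every span of a trace, then judge the whole trace."""

    def __init__(self) -> None:
        self.buffer = defaultdict(list)  # trace_id -> buffered spans (this is the memory cost)

    def add(self, span: dict) -> None:
        self.buffer[span["trace_id"]].append(span)

    def decide(self, trace_id: str) -> bool:
        """Keep the trace if any buffered span failed; the buffer entry is released."""
        spans = self.buffer.pop(trace_id, [])
        return any(s.get("error") or s.get("status", 200) >= 500 for s in spans)


rng = random.Random(0)
sampler = TailSampler()
sampler.add({"trace_id": "t1", "status": 200})
sampler.add({"trace_id": "t1", "status": 503})
keep_t1 = sampler.decide("t1")
print("head decision at 10%:", head_sample("t1", 0.10, rng))
print("tail decision:", keep_t1)
```

The head decision never sees the failing span, while the tail decision does but only after holding every span in memory until someone calls `decide`; knowing when it is safe to make that call is the question the rest of the document addresses.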
To address these problems, the invention optimizes tail-based coherent sampling by sampling in real time with an extended union-find algorithm, which relieves the pressure on the analysis nodes and can improve the performance and stability of the server.
Disclosure of Invention
The present invention has been made in view of the above-described problems.
According to one aspect of the invention, a method for detecting call chain continuity in real time based on union-find is provided, the method comprising the following steps:
step S1: an application executes business logic, and the agent generates Span data;
step S2: the agent marks whether each Span is a leaf node;
step S3: the agent uploads Span data in batches;
step S4: the analysis platform receives the Span data;
step S5: the analysis platform performs continuity detection on the Spans in real time;
step S6: the analysis platform performs integrity detection on the Spans in real time;
step S7: the analysis platform performs the call chain decision.
The invention also provides a device for detecting call chain consistency in real time based on union-find, the device comprising:
a generation module: an application executes business logic, and the agent generates Span data;
a marking module: the agent marks whether each Span is a leaf node;
an uploading module: the agent uploads Span data in batches;
a receiving module: the analysis platform receives the Span data;
detection module 1: the analysis platform performs continuity detection on the Spans in real time;
detection module 2: the analysis platform performs integrity detection on the Spans in real time;
an execution module: the analysis platform performs the call chain decision.
Compared with the prior art, the application has the following beneficial effects:
the continuity and integrity of the call chain can be detected accurately in real time, while memory consumption is significantly reduced and analysis and statistics performance is improved.
Drawings
The above and other objects, features and advantages of the present invention will become more apparent from the following more particular description of embodiments of the present invention, as illustrated in the accompanying drawings. The accompanying drawings are included to provide a further understanding of embodiments of the invention and are incorporated in and constitute a part of this specification; together with the embodiments, they serve to explain the invention and do not limit it. In the drawings, like reference numerals generally refer to like parts or steps.
FIG. 1 illustrates an exemplary block diagram of union-find-based real-time detection in accordance with one embodiment of the present invention;
FIG. 2 shows a visualized presentation of the Spans of a call chain in accordance with one embodiment of the invention;
FIG. 3 illustrates the main flow diagram for detecting call chain continuity in real time, according to one embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, exemplary embodiments according to the present invention will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are only some embodiments of the present invention and not all embodiments of the present invention, and it should be understood that the present invention is not limited by the example embodiments described herein. Based on the embodiments of the invention described in the present application, all other embodiments that a person skilled in the art would have without inventive effort shall fall within the scope of the invention.
Embodiment one:
In general, the data of one complete call chain is created and reported separately by each system. The traditional Span specification only defines whether a parent Span exists and lacks any definition of whether child Spans exist. When a Span is reported to the service platform in the traditional manner, the platform therefore cannot tell whether the Span has child nodes, cannot tell in real time whether the child Spans of the currently reported Span have been reported, and so cannot judge in real time whether the call chain is complete.
To solve these problems, a method and a device for detecting call chain consistency in real time based on union-find are provided. Specifically, the technical scheme adds an isLeaf (leaf node) identifier to each Span to indicate whether the Span has child Spans, and combines it with the union-find algorithm to judge whether the currently reported call chain satisfies continuity and integrity. When each application reports its Spans, the root Span of every Span is looked up recursively through its ParentSpanId, and the platform checks whether all Spans it holds share the same root Span. If they do, all Spans currently on the service platform are connected to one another and the continuity check is satisfied. The platform then checks whether every non-leaf Span has found its corresponding child Span; when this condition is also met, the current call chain is considered to satisfy the integrity requirement.
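One way to read the scheme above is as a classic disjoint-set structure in which each reported Span is merged with its parent identifier. The sketch below is an illustration under stated assumptions, not the patented implementation: the names `DisjointSet` and `check` are hypothetical, spans are plain `(span_id, parent_span_id, is_leaf)` tuples, a root is recognised by a parent that is either `None` or equal to its own id, and "continuity" is taken to mean one connected set in which every referenced identifier has actually been reported.

```python
class DisjointSet:
    """Union-find over span identifiers, with path compression and union by size."""

    def __init__(self):
        self.parent = {}
        self.size = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        self.size.setdefault(x, 1)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:  # path compression
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]


def check(spans):
    """Return (continuous, integral) for an iterable of (span_id, parent_span_id, is_leaf)."""
    ds = DisjointSet()
    reported, has_child, non_leaf = set(), set(), set()
    for span_id, parent_id, is_leaf in spans:
        if parent_id is None:          # assumption: a root may carry no parent id
            parent_id = span_id
        reported.add(span_id)
        ds.union(span_id, parent_id)
        if parent_id != span_id:
            has_child.add(parent_id)
        if not is_leaf:
            non_leaf.add(span_id)
    components = {ds.find(s) for s in list(ds.parent)}
    continuous = len(components) == 1 and set(ds.parent) <= reported
    integral = non_leaf <= has_child   # every non-leaf span has at least one child reported
    return continuous, integral


print(check([(1, 1, False), (2, 1, False), (4, 1, False)]))
print(check([(3, 2, True)]))
```

Because the flag as described only records whether any child exists, this sketch treats a non-leaf Span as satisfied once one of its children has arrived.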
The terms used are explained as follows:
Trace: a call chain. From a user-initiated request, through the calls between the microservices, to the final response, a number of call records are generated; the combination of these call records is called the data of one complete call chain. Call records carrying the same traceId belong to the same call chain.
Span: a call record, also called a span. It records the information of one call, including TraceId, SpanId, ParentSpanId, the called service name, timestamp, input/output parameters, execution exception information, and so on. Calls are generally divided into intra-system calls and inter-system RPC calls.
Intra-system call: typically a method call within the same thread; the TraceId of the call chain is passed through ThreadLocal.
Inter-system RPC call: a cross-system call, commonly over HTTP or Dubbo; the TraceId is passed through the specific protocol.
Based on the above, the application provides a method for detecting call chain consistency in real time based on union-find.
Referring specifically to FIG. 3, a method for detecting call chain continuity in real time based on union-find according to an embodiment of the present invention includes:
step S1: an application executes business logic, and the agent generates Span data;
step S2: the agent marks whether each Span is a leaf node;
step S3: the agent uploads Span data in batches;
step S4: the analysis platform receives the Span data;
step S5: the analysis platform performs continuity detection on the Spans in real time;
step S6: the analysis platform performs integrity detection on the Spans in real time;
step S7: the analysis platform performs the call chain decision.
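Steps S1 to S3 on the agent side could look roughly like the sketch below. The text does not say how the agent learns that a Span has children, so the rule used here is an assumption: a Span is marked non-leaf when a child Span is started under it in the same process, or when the application signals an outgoing RPC through the hypothetical `mark_has_child` call. The `Agent` class, the `upload` callback and `batch_size` are likewise illustrative names.

```python
from dataclasses import dataclass, asdict
from typing import Callable, Optional


@dataclass
class Span:
    span_id: int
    parent_span_id: Optional[int]
    app: str
    is_leaf: bool = True


class Agent:
    def __init__(self, app: str, upload: Callable[[list], None], batch_size: int = 100):
        self.app = app
        self.upload = upload          # hypothetical transport to the analysis platform
        self.batch_size = batch_size
        self.open = {}                # span_id -> Span still running
        self.finished = []            # completed spans waiting for the next batch

    def start(self, span_id: int, parent_span_id: Optional[int] = None) -> Span:
        """S1: generate a Span; a local parent stops being a leaf."""
        if parent_span_id in self.open:
            self.open[parent_span_id].is_leaf = False
        span = Span(span_id, parent_span_id, self.app)
        self.open[span_id] = span
        return span

    def mark_has_child(self, span_id: int) -> None:
        """S2: called before an outgoing RPC, whose child Span is created remotely."""
        self.open[span_id].is_leaf = False

    def finish(self, span_id: int) -> None:
        self.finished.append(self.open.pop(span_id))
        if len(self.finished) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        """S3: upload the finished spans as one batch."""
        if self.finished:
            self.upload([asdict(s) for s in self.finished])
            self.finished = []


received = []                         # stands in for the analysis platform (S4)
agent = Agent("Application A", upload=received.append, batch_size=10)
agent.start(1)
agent.start(2, parent_span_id=1)
agent.mark_has_child(2)               # Span 2 calls another application
agent.finish(2)
agent.start(4, parent_span_id=1)
agent.mark_has_child(4)               # Span 4 calls another application
agent.finish(4)
agent.finish(1)
agent.flush()
print(received[0])
```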
As an example, a specific method is as follows. A user-initiated request generates a call chain with the following data structure:
Current Span    Application      Parent Span    isLeaf
1               Application A    1              False
2               Application A    1              False
3               Application B    2              True
4               Application A    1              False
5               Application C    4              True
The time at which each application reports its Spans cannot be determined in advance; the order may be (A, B, C), (B, C, A), (C, B, A), (A, C, B), and so on.
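Since the reporting order is not fixed, it is worth checking that the two conditions only hold once every application has reported, whatever the order. The sketch below replays all six orders over the example rows; it is an illustration with hypothetical function names, it recomputes both checks from scratch after each batch for simplicity, and it assumes, as in the example data, that the root Span is the one whose parent id equals its own id.

```python
from itertools import permutations

# (span_id, application, parent_span_id, is_leaf), copied from the example data
SPANS = [
    (1, "A", 1, False),
    (2, "A", 1, False),
    (3, "B", 2, True),
    (4, "A", 1, False),
    (5, "C", 4, True),
]


def is_complete(spans) -> bool:
    parent = {sid: pid for sid, _, pid, _ in spans}

    def root(sid):
        hops = 0
        while sid in parent and parent[sid] != sid and hops <= len(parent):
            sid, hops = parent[sid], hops + 1
        # None means an ancestor has not been reported yet (or the data is cyclic)
        return sid if sid in parent and parent[sid] == sid else None

    roots = {root(sid) for sid in parent}
    continuous = len(roots) == 1 and None not in roots
    has_child = {pid for sid, _, pid, _ in spans if pid != sid}
    integral = all(is_leaf or sid in has_child for sid, _, _, is_leaf in spans)
    return continuous and integral


def completion_points(order):
    """Completeness after each application's batch arrives, in the given order."""
    seen, results = [], []
    for app in order:
        seen += [s for s in SPANS if s[1] == app]
        results.append(is_complete(seen))
    return results


for order in permutations("ABC"):
    print(order, completion_points(order))
```

Every order prints `[False, False, True]`: when B or C arrives before A the root cannot be reached, and when A arrives first a non-leaf Span is still waiting for its child.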
Taking the reporting order (A, B, C) as an example, A first reports the data of Span1, Span2 and Span4 to the service platform. Using the union-find algorithm, the service platform finds that the root node reachable from every currently reported Span is Span1, so the call chain continuity requirement is met:
Node         Span1    Span2    Span4
Root node    Span1    Span1    Span1
Reachable    True     True     True
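The continuity table for A's batch can be reproduced with a parent-pointer walk. This is a sketch under the same assumption as the example data (the root's parent id equals its own id); the function name `find` and the path-compression step are illustrative choices rather than details given in the text.

```python
def find(parent, span_id):
    """Return the root reachable from span_id, or None if an ancestor is not reported yet."""
    path = []
    node = span_id
    while node in parent and parent[node] != node:
        if len(path) > len(parent):
            return None               # malformed data: a cycle of parent links
        path.append(node)
        node = parent[node]
    if node not in parent:
        return None                   # walked off the reported spans
    for visited in path:
        parent[visited] = node        # path compression: later lookups take one hop
    return node


batch_a = {1: 1, 2: 1, 4: 1}          # span_id -> parent_span_id, as reported by A
roots = {sid: find(batch_a, sid) for sid in list(batch_a)}
continuous = None not in roots.values() and len(set(roots.values())) == 1

for sid, root in roots.items():
    print(f"Span{sid}: root=Span{root}, reachable={root is not None}")
print("continuity met:", continuous)

# If B had reported first, Span3's parent (Span2) would be unknown:
only_b = {3: 2}
print("Span3 root when only B has reported:", find(only_b, 3))
```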
Analysis of the isLeaf flags shows that Span2 and Span4 are non-leaf nodes whose child Spans cannot yet be found, so the integrity requirement is not met:
Node                  Span1    Span2    Span4
Is leaf node          False    False    False
Child node present    True     False    False
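The integrity table for A's batch follows from one pass over the reported Spans. The sketch below is illustrative; the helper names are hypothetical, and a non-leaf Span counts as satisfied as soon as at least one Span naming it as parent has been reported.

```python
def child_presence(spans):
    """spans: span_id -> (parent_span_id, is_leaf). Returns span_id -> child reported?"""
    present = {sid: False for sid in spans}
    for sid, (pid, _) in spans.items():
        if pid != sid and pid in present:   # a root pointing at itself is not its own child
            present[pid] = True
    return present


def integral(spans) -> bool:
    present = child_presence(spans)
    return all(is_leaf or present[sid] for sid, (_, is_leaf) in spans.items())


batch_a = {1: (1, False), 2: (1, False), 4: (1, False)}
print(child_presence(batch_a))
print("integrity met:", integral(batch_a))

all_spans = dict(batch_a)
all_spans.update({3: (2, True), 5: (4, True)})
print("integrity met after B and C:", integral(all_spans))
```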
After receiving the Span reported by B, the service platform detects the continuity state, and the continuity requirement is met:
Node         Span1    Span2    Span4    Span3
Root node    Span1    Span1    Span1    Span1
Reachable    True     True     True     True
Detecting the integrity state with the isLeaf flags: because Span4 cannot find its child Span, the integrity requirement is not met:
[Table image BDA0004021721390000052; not reproduced in this text]
After receiving the Span reported by C, the service platform detects the continuity state, and the continuity requirement is met:
[Table image BDA0004021721390000053; not reproduced in this text]
Detecting the integrity state with the isLeaf flags, the integrity requirement is met:
[Table image BDA0004021721390000054; not reproduced in this text]
Based on the continuity and integrity states, the service platform detects in real time that all Span data of the current call chain have been reported, and can then proceed with the subsequent sampling decision analysis.
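The walkthrough for the order (A, B, C) can also be run incrementally, updating a small per-trace state as each Span arrives instead of re-scanning everything. The sketch below is one possible arrangement under the same assumptions as the example data; the class name `TraceState` and its fields are hypothetical.

```python
class TraceState:
    """Per-trace bookkeeping on the analysis platform, updated as spans arrive."""

    def __init__(self):
        self.parent = {}           # span_id -> parent_span_id
        self.has_child = set()     # ids that some reported span names as its parent
        self.needs_child = set()   # non-leaf spans with no child reported yet

    def add(self, span_id, parent_id, is_leaf):
        self.parent[span_id] = parent_id
        if parent_id != span_id:
            self.has_child.add(parent_id)
            self.needs_child.discard(parent_id)
        if not is_leaf and span_id not in self.has_child:
            self.needs_child.add(span_id)

    def _root(self, node):
        path = []
        while node in self.parent and self.parent[node] != node and len(path) <= len(self.parent):
            path.append(node)
            node = self.parent[node]
        if node not in self.parent or self.parent[node] != node:
            return None            # ancestor missing, or malformed cycle
        for visited in path:
            self.parent[visited] = node   # path compression
        return node

    def status(self):
        roots = {self._root(s) for s in list(self.parent)}
        continuous = len(roots) == 1 and None not in roots
        return continuous, not self.needs_child


batches = {
    "A": [(1, 1, False), (2, 1, False), (4, 1, False)],
    "B": [(3, 2, True)],
    "C": [(5, 4, True)],
}
state = TraceState()
history = []
for app in ("A", "B", "C"):
    for span in batches[app]:
        state.add(*span)
    history.append(state.status())
    print(app, "-> (continuity, integrity) =", history[-1])
```

Once both flags are true the platform can take its sampling decision for the trace and release this state, rather than holding the Spans for a fixed waiting period.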
The invention overcomes the drawbacks of tail-based coherent sampling, reduces memory consumption on the analysis node, reduces analysis delay, and improves the performance and stability of the platform service.
Embodiment two:
The invention also provides a device for detecting call chain consistency in real time based on union-find, the device comprising:
a generation module: an application executes business logic, and the agent generates Span data;
a marking module: the agent marks whether each Span is a leaf node;
an uploading module: the agent uploads Span data in batches;
a receiving module: the analysis platform receives the Span data;
detection module 1: the analysis platform performs continuity detection on the Spans in real time;
detection module 2: the analysis platform performs integrity detection on the Spans in real time;
an execution module: the analysis platform performs the call chain decision.
As an example, a specific apparatus performs the following method. A user-initiated request generates a call chain with the following data structure:
Current Span    Application      Parent Span    isLeaf
1               Application A    1              False
2               Application A    1              False
3               Application B    2              True
4               Application A    1              False
5               Application C    4              True
The time at which each application reports its Spans cannot be determined in advance; the order may be (A, B, C), (B, C, A), (C, B, A), (A, C, B), and so on.
Taking the reporting order (A, B, C) as an example, A first reports the data of Span1, Span2 and Span4 to the service platform. Using the union-find algorithm, the service platform finds that the root node reachable from every currently reported Span is Span1, so the call chain continuity requirement is met:
Node         Span1    Span2    Span4
Root node    Span1    Span1    Span1
Reachable    True     True     True
Analysis of the isLeaf flags shows that Span2 and Span4 are non-leaf nodes whose child Spans cannot yet be found, so the integrity requirement is not met:
Node                  Span1    Span2    Span4
Is leaf node          False    False    False
Child node present    True     False    False
After receiving the Span reported by B, the service platform detects the continuity state, and the continuity requirement is met:
Node         Span1    Span2    Span4    Span3
Root node    Span1    Span1    Span1    Span1
Reachable    True     True     True     True
Detecting the integrity state with the isLeaf flags: because Span4 cannot find its child Span, the integrity requirement is not met:
[Table image BDA0004021721390000071; not reproduced in this text]
After receiving the Span reported by C, the service platform detects the continuity state, and the continuity requirement is met:
[Table image BDA0004021721390000072; not reproduced in this text]
Detecting the integrity state with the isLeaf flags, the integrity requirement is met:
[Table image BDA0004021721390000073; not reproduced in this text]
Based on the continuity and integrity states, the service platform detects in real time that all Span data of the current call chain have been reported, and can then proceed with the subsequent sampling decision analysis.
The method for detecting call chain continuity in real time based on union-find provided by the invention analyzes call chain integrity accurately, improves analysis and statistics performance in high-concurrency scenarios, overcomes the drawbacks of tail-based coherent sampling, reduces memory consumption on the analysis node, reduces analysis delay, and improves the performance and stability of the platform service.
Although the illustrative embodiments have been described herein with reference to the accompanying drawings, it is to be understood that the above illustrative embodiments are merely illustrative and are not intended to limit the scope of the present invention thereto. Various changes and modifications may be made therein by one of ordinary skill in the art without departing from the scope and spirit of the invention. All such changes and modifications are intended to be included within the scope of the present invention as set forth in the appended claims.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in order to streamline the invention and aid in understanding one or more of the various inventive aspects, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the description of exemplary embodiments of the invention. However, the method of the present invention should not be construed as reflecting the following intent: i.e., the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention. It will be understood by those skilled in the art that all of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or units of any method or apparatus so disclosed, may be combined in any combination, except combinations where the features are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features but not others included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not denote any order. These words may be interpreted as names.
The foregoing description is merely illustrative of specific embodiments of the present invention and the scope of the present invention is not limited thereto, and any person skilled in the art can easily think about variations or substitutions within the scope of the present invention. The protection scope of the invention is subject to the protection scope of the claims.

Claims (2)

1. A method for detecting call chain consistency in real time based on union-find, the method comprising:
step S1: an application executes business logic, and the agent generates Span data;
step S2: the agent marks whether each Span is a leaf node;
step S3: the agent uploads Span data in batches;
step S4: the analysis platform receives the Span data;
step S5: the analysis platform performs continuity detection on the Spans in real time;
step S6: the analysis platform performs integrity detection on the Spans in real time;
step S7: the analysis platform performs the call chain decision.
2. An apparatus for detecting call chain consistency in real time based on union-find, the apparatus comprising:
a generation module: an application executes business logic, and the agent generates Span data;
a marking module: the agent marks whether each Span is a leaf node;
an uploading module: the agent uploads Span data in batches;
a receiving module: the analysis platform receives the Span data;
detection module 1: the analysis platform performs continuity detection on the Spans in real time;
detection module 2: the analysis platform performs integrity detection on the Spans in real time;
an execution module: the analysis platform performs the call chain decision.
CN202211692298.8A 2022-12-28 2022-12-28 Method and device for detecting call chain consistency in real time based on union Active CN116016262B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211692298.8A CN116016262B (en) 2022-12-28 2022-12-28 Method and device for detecting call chain consistency in real time based on union

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211692298.8A CN116016262B (en) 2022-12-28 2022-12-28 Method and device for detecting call chain consistency in real time based on union

Publications (2)

Publication Number Publication Date
CN116016262A true CN116016262A (en) 2023-04-25
CN116016262B CN116016262B (en) 2024-05-24

Family

ID=86034895

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211692298.8A Active CN116016262B (en) 2022-12-28 2022-12-28 Method and device for detecting call chain consistency in real time based on union

Country Status (1)

Country Link
CN (1) CN116016262B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8738414B1 (en) * 2010-12-31 2014-05-27 Ajay R. Nagar Method and system for handling program, project and asset scheduling management
US20140168228A1 (en) * 2012-12-13 2014-06-19 Nvidia Corporation Fine-grained parallel traversal for ray tracing
CA2988957A1 (en) * 2015-06-09 2016-12-15 Ultrata Llc Infinite memory fabric hardware implementation with router
WO2017004858A1 (en) * 2015-07-08 2017-01-12 重庆邮电大学 Subscriber tag-based shunting method and system in 5g network
CN106649055A (en) * 2017-01-10 2017-05-10 山东浪潮云服务信息科技有限公司 Domestic CPU (central processing unit) and operating system based software and hardware fault alarming system and method
CN110069358A (en) * 2019-04-18 2019-07-30 彩讯科技股份有限公司 Call chain trace analysis method, apparatus, electronic equipment and storage medium
CN111478806A (en) * 2020-04-02 2020-07-31 聚好看科技股份有限公司 Link tracking sampling method and system
WO2021242466A1 (en) * 2020-05-28 2021-12-02 Splunk, Inc. Computing performance analysis for spans in a microservices-based architecture
CN114078597A (en) * 2020-07-29 2022-02-22 甲骨文国际公司 Decision trees with support from text for healthcare applications
CN114168421A (en) * 2021-12-09 2022-03-11 上海甄云信息科技有限公司 Customized code compatibility analysis system and method based on micro-service call chain
CN114721898A (en) * 2022-03-16 2022-07-08 缀初网络技术(上海)有限公司 Edge cloud server utilization rate prediction method and device based on boosting algorithm and storage medium
CN115276226A (en) * 2022-07-11 2022-11-01 广西电网有限责任公司 Power distribution network operation monitoring and commanding platform and method based on micro-service technology

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
M. TAASSORI ET AL.: "Compact Leakage-Free Support for Integrity and Reliability", 《2020 ACM/IEEE 47TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA)》, 13 July 2020 (2020-07-13) *
吴晓;马骏;陶先平;吕建;: "层次化网关转发的agent迁移技术与应用", 南京大学学报(自然科学), no. 02, 30 March 2010 (2010-03-30) *
邹丹丹;丁建兵;王希栋;叶晓舟;欧阳晔: "基于微服务调用链双向搜索的故障根因定位方法", 《通信技术》, 20 November 2022 (2022-11-20) *

Also Published As

Publication number Publication date
CN116016262B (en) 2024-05-24

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant