US20140229414A1 - Systems and methods for detecting anomalies - Google Patents
- Publication number
- US20140229414A1 (U.S. application Ser. No. 14/143,185)
- Authority
- US
- United States
- Prior art keywords
- surprise
- historical
- scores
- property values
- property
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/079—Root cause analysis, i.e. error or fault diagnosis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
- G06F11/0721—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment within a central processing unit [CPU]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
Definitions
- the present invention relates generally to data processing, and in some embodiments, to detecting anomalies in computer-based systems.
- enterprises maintain and operate large numbers of computer systems (e.g., servers) that may each run a layered set of software.
- these computer systems provide functionality for the operation of the enterprise or to provide outbound services to their customers.
- the enterprise may monitor the hardware and software layers of these servers by logging processing load, memory usage, and many other monitored signals at frequent intervals.
- the enterprise may occasionally suffer disruptions, where some of its services were degraded or even completely unavailable to customers.
- the enterprise will perform a post-mortem analysis of the monitored signals in an effort to debug the system. For example, the enterprise may analyze the memory usage to identify a program that may be performing improperly, or view the processing load to determine whether more hardware is needed.
- FIG. 1 illustrates a block diagram depicting a network architecture of a system, according to some embodiments, having a client-server architecture configured for exchanging data over a network.
- FIG. 2 illustrates a block diagram showing components provided within the system of FIG. 1 according to some embodiments.
- FIG. 3 is a diagram showing sampled values of a number of searches performed on a computer system that are observed over a time period, such as a twenty-four hour period, according to an example embodiment.
- FIG. 4 is a diagram showing additional sampled values from two additional days, as compared to FIG. 3 , according to an example embodiment.
- FIG. 5 is a diagram of a plot of metric data over time for a metric of a computer system, according to an example embodiment.
- FIG. 6 is a histogram charting surprise scores from a number of queries submitted to a computer system, according to an example embodiment.
- FIG. 7 is another histogram showing the surprise scores according to a logarithmic function, according to an example embodiment.
- FIG. 8 is a histogram showing quantiles for a metric over a two-week period, according to an example embodiment.
- FIG. 9 is a plot of the quantiles in a time series, according to an example embodiment.
- FIG. 10 is a histogram that includes the quantiles shown in FIG. 8 but with a new quantile, according to an example embodiment.
- FIG. 11 is a plot of the quantiles in a time series with a new quantile, according to an example embodiment.
- FIG. 12 is a flowchart diagram illustrating a method for detecting an anomaly in a computer system, according to an example embodiment.
- FIG. 13 is a diagram illustrating a property value table that may be generated based on executing the probes, according to an example embodiment.
- FIG. 14 is a diagram showing property values for a probe-property type pair, according to an example embodiment.
- FIG. 15 is a diagram illustrating a surprise score table, according to an example embodiment.
- FIG. 16 is a chart showing a measurement of a feature of the surprise scores generated by an operation, according to an example embodiment.
- FIG. 17 is a chart illustrating surprise score features over time, according to an example embodiment.
- FIG. 18 illustrates a diagrammatic representation of a machine in the example form of a computer system within which a set of instructions may be executed to cause the machine to perform any one or more of the methodologies discussed herein.
- Described in detail herein is an apparatus and method for detecting anomalies in a computer system. For example, some embodiments may be used to address the problem of how to monitor signals in a computer system to detect disruptions before they affect users, and to do so with few false positives. Some embodiments may address this problem by analyzing signals for strange behavior that may be referred to as an anomaly. Example embodiments can then scan multiple monitored signals, and raise an alert when the site monitoring system detects an anomaly.
- multiple probes are executed on an evolving data set (e.g., a listing database). Each probe may return a result.
- Property values are then derived from a respective result returned by one of the probes.
- a property value may be a value that quantifies a property or aspect of the result, such as, for example, a number of listings returned, a proportion of classified listings, a measurement of the prices in a listing, and the like.
- Surprise scores corresponding to the property values are generated, where each surprise score is generated based on a comparison between a corresponding property value and historical property values.
- the corresponding property value and the historical property values are derived from results returned from the same probe.
- Historical surprise scores generated by the anomaly detection engine are accessed. Responsive to a comparison between the plurality of surprise scores and the plurality of historical surprise scores, a monitoring system is alerted of an anomaly regarding the evolving data set.
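The steps summarized above can be sketched as follows. This is an illustrative reading of the described flow, not code from the patent; the function names, data shapes, and the median-based scoring and thresholding choices are assumptions.

```python
from statistics import median

def surprise_scores(current, history):
    """Score how far each current property value is from its own history.

    `current` maps property names to the latest value derived from a probe
    result; `history` maps the same names to lists of historical values
    derived from results of the same probe. Shapes and names are hypothetical.
    """
    scores = {}
    for name, value in current.items():
        past = history[name]
        expected = median(past)
        deviations = [abs(v - expected) for v in past]
        scale = median(deviations) or 1.0  # guard against zero deviation
        scores[name] = abs(value - expected) / scale
    return scores

def detect_anomaly(scores, historical_scores, factor=3.0):
    """Alert when current surprise scores stand out against historical ones."""
    threshold = factor * median(historical_scores)
    return any(s > threshold for s in scores.values())
```

Under this sketch, a property value far outside its usual spread yields a large surprise score, and an alert fires only when that score is also large relative to historically observed surprise scores.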
- FIG. 1 illustrates a network diagram depicting a network system 100 , according to one embodiment, having a client-server architecture configured for exchanging data over a network.
- a networked system 102 forms a network-based publication system that provides server-side functionality, via a network 104 (e.g., the Internet or Wide Area Network (WAN)), to one or more clients and devices.
- FIG. 1 further illustrates, for example, one or both of a web client 106 (e.g., a web browser) and a programmatic client 108 executing on device machines 110 and 112 .
- the publication system 100 comprises a marketplace system.
- the publication system 100 comprises other types of systems such as, but not limited to, a social networking system, a matching system, a recommendation system, an electronic commerce (e-commerce) system, a search system, and the like.
- Each of the device machines 110 , 112 comprises a computing device that includes at least a display and communication capabilities with the network 104 to access the networked system 102 .
- the device machines 110 , 112 comprise, but are not limited to, remote devices, work stations, computers, general purpose computers, Internet appliances, hand-held devices, wireless devices, portable devices, wearable computers, cellular or mobile phones, portable digital assistants (PDAs), smart phones, tablets, ultrabooks, netbooks, laptops, desktops, multi-processor systems, microprocessor-based or programmable consumer electronics, game consoles, set-top boxes, network PCs, mini-computers, and the like.
- Each of the device machines 110 , 112 may connect with the network 104 via a wired or wireless connection.
- one or more portions of network 104 may be an ad hoc network, an intranet, an extranet, a virtual private network (VPN), a local area network (LAN), a wireless LAN (WLAN), a wide area network (WAN), a wireless WAN (WWAN), a metropolitan area network (MAN), a portion of the Internet, a portion of the Public Switched Telephone Network (PSTN), a cellular telephone network, a wireless network, a WiFi network, a WiMax network, another type of network, or a combination of two or more such networks.
- Each of the device machines 110 , 112 includes one or more applications (also referred to as “apps”) such as, but not limited to, a web browser, messaging application, electronic mail (email) application, an e-commerce site application (also referred to as a marketplace application), and the like.
- this application is configured to locally provide the user interface and at least some of the functionalities with the application configured to communicate with the networked system 102 , on an as needed basis, for data and/or processing capabilities not locally available (such as access to a database of items available for sale, to authenticate a user, to verify a method of payment, etc.).
- the given one of the device machines 110 , 112 may use its web browser to access the e-commerce site (or a variant thereof) hosted on the networked system 102 .
- although two device machines 110 , 112 are shown in FIG. 1 , more or fewer than two device machines can be included in the system 100 .
- An Application Program Interface (API) server 114 and a web server 116 are coupled to, and provide programmatic and web interfaces respectively to, one or more application servers 118 .
- the application servers 118 host one or more marketplace applications 120 and payment applications 122 .
- the application servers 118 are, in turn, shown to be coupled to one or more database servers 124 that facilitate access to one or more databases 126 .
- the marketplace applications 120 may provide a number of e-commerce functions and services to users that access networked system 102 .
- E-commerce functions/services may include a number of publisher functions and services (e.g., search, listing, content viewing, payment, etc.).
- the marketplace applications 120 may provide a number of services and functions to users for listing goods and/or services or offers for goods and/or services for sale, searching for goods and services, facilitating transactions, and reviewing and providing feedback about transactions and associated users.
- the marketplace applications 120 may track and store data and metadata relating to listings, transactions, and user interactions.
- the marketplace applications 120 may publish or otherwise provide access to content items stored in application servers 118 or databases 126 accessible to the application servers 118 and/or the database servers 124 .
- the payment applications 122 may likewise provide a number of payment services and functions to users.
- the payment applications 122 may allow users to accumulate value (e.g., in a commercial currency, such as the U.S. dollar, or a proprietary currency, such as “points”) in accounts, and then later to redeem the accumulated value for products or items (e.g., goods or services) that are made available via the marketplace applications 120 .
- the payment applications 122 may form part of a payment service that is separate and distinct from the networked system 102 .
- the payment applications 122 may be omitted from the system 100 .
- at least a portion of the marketplace applications 120 may be provided on the device machines 110 and/or 112 .
- system 100 shown in FIG. 1 employs a client-server architecture
- embodiments of the present disclosure are not limited to such an architecture, and may equally well find application in, for example, a distributed or peer-to-peer architecture system.
- the various marketplace and payment applications 120 and 122 may also be implemented as standalone software programs, which do not necessarily have networking capabilities.
- the web client 106 accesses the various marketplace and payment applications 120 and 122 via the web interface supported by the web server 116 .
- the programmatic client 108 accesses the various services and functions provided by the marketplace and payment applications 120 and 122 via the programmatic interface provided by the API server 114 .
- the programmatic client 108 may, for example, be a seller application (e.g., the TurboLister application developed by eBay Inc., of San Jose, Calif.) to enable sellers to author and manage listings on the networked system 102 in an off-line manner, and to perform batch-mode communications between the programmatic client 108 and the networked system 102 .
- FIG. 1 also illustrates a third party application 128 , executing on a third party server machine 130 , as having programmatic access to the networked system 102 via the programmatic interface provided by the API server 114 .
- the third party application 128 may, utilizing information retrieved from the networked system 102 , support one or more features or functions on a website hosted by the third party.
- the third party website may, for example, provide one or more promotional, marketplace, or payment functions that are supported by the relevant applications of the networked system 102 .
- FIG. 2 illustrates a block diagram showing components provided within the networked system 102 according to some embodiments.
- the networked system 102 may be hosted on dedicated or shared server machines (not shown) that are communicatively coupled to enable communications between server machines.
- the components themselves are communicatively coupled (e.g., via appropriate interfaces) to each other and to various data sources, so as to allow information to be passed between the applications or so as to allow the applications to share and access common data.
- the components may access one or more databases 126 via the database servers 124 .
- the networked system 102 may provide a number of publishing, listing, and/or price-setting mechanisms whereby a seller (also referred to as a first user) may list (or publish information concerning) goods or services for sale or barter, a buyer (also referred to as a second user) can express interest in or indicate a desire to purchase or barter such goods or services, and a transaction (such as a trade) may be completed pertaining to the goods or services.
- the networked system 102 may comprise at least one publication engine 202 and one or more selling engines 204 .
- the publication engine 202 may publish information, such as item listings or product description pages, on the networked system 102 .
- the selling engines 204 may comprise one or more fixed-price engines that support fixed-price listing and price setting mechanisms and one or more auction engines that support auction-format listing and price setting mechanisms (e.g., English, Dutch, Chinese, Double, Reverse auctions, etc.).
- the various auction engines may also provide a number of features in support of these auction-format listings, such as a reserve price feature whereby a seller may specify a reserve price in connection with a listing and a proxy-bidding feature whereby a bidder may invoke automated proxy bidding.
- the selling engines 204 may further comprise one or more deal engines that support merchant-generated offers for products and services.
- a listing engine 206 allows sellers to conveniently author listings of items, or allows authors to author publications.
- the listings pertain to goods or services that a user (e.g., a seller) wishes to transact via the networked system 102 .
- the listings may be an offer, deal, coupon, or discount for the good or service.
- Each good or service is associated with a particular category.
- the listing engine 206 may receive listing data such as title, description, and aspect name/value pairs.
- each listing for a good or service may be assigned an item identifier.
- a user may create a listing that is an advertisement or other form of information publication. The listing information may then be stored to one or more storage devices coupled to the networked system 102 (e.g., databases 126 ).
- Listings also may comprise product description pages that display a product and information (e.g., product title, specifications, and reviews) associated with the product.
- the product description page may include an aggregation of item listings that correspond to the product described on the product description page.
- the listing engine 206 also may allow buyers to conveniently author listings or requests for items desired to be purchased.
- the listings may pertain to goods or services that a user (e.g., a buyer) wishes to transact via the networked system 102 .
- Each good or service is associated with a particular category.
- the listing engine 206 may receive as much or as little listing data, such as title, description, and aspect name/value pairs, as the buyer is aware of about the requested item.
- the listing engine 206 may parse the buyer's submitted item information and may complete incomplete portions of the listing.
- the listing engine 206 may parse the description, extract key terms and use those terms to make a determination of the identity of the item. Using the determined item identity, the listing engine 206 may retrieve additional item details for inclusion in the buyer item request. In some embodiments, the listing engine 206 may assign an item identifier to each listing for a good or service.
- the listing engine 206 allows sellers to generate offers for discounts on products or services.
- the listing engine 206 may receive listing data, such as the product or service being offered, a price and/or discount for the product or service, a time period for which the offer is valid, and so forth.
- the listing engine 206 permits sellers to generate offers from the sellers' mobile devices. The generated offers may be uploaded to the networked system 102 for storage and tracking.
- the listing engine 206 allows users to navigate through various categories, catalogs, or inventory data structures according to which listings may be classified within the networked system 102 .
- the listing engine 206 allows a user to successively navigate down a category tree comprising a hierarchy of categories (e.g., the category tree structure) until a particular set of listings is reached.
- Various other navigation applications within the listing engine 206 may be provided to supplement the searching and browsing applications.
- the listing engine 206 may record the various user actions (e.g., clicks) performed by the user in order to navigate down the category tree.
- Searching the networked system 102 is facilitated by a searching engine 208 .
- the searching engine 208 enables keyword queries of listings published via the networked system 102 .
- the searching engine 208 receives the keyword queries from a device of a user and conducts a review of the storage device storing the listing information. The review will enable compilation of a result set of listings that may be sorted and returned to the client device (e.g., device machine 110 , 112 ) of the user.
- the searching engine 208 may record the query (e.g., keywords) and any subsequent user actions and behaviors (e.g., navigations, selections, or click-throughs).
- the searching engine 208 also may perform a search based on a location of the user.
- a user may access the searching engine 208 via a mobile device and generate a search query. Using the search query and the user's location, the searching engine 208 may return relevant search results for products, services, offers, auctions, and so forth to the user.
- the searching engine 208 may identify relevant search results both in a list form and graphically on a map. Selection of a graphical indicator on the map may provide additional details regarding the selected search result.
- the user may specify, as part of the search query, a radius or distance from the user's current location to limit search results.
- the searching engine 208 also may perform a search based on an image.
- the image may be taken from a camera or imaging component of a client device or may be accessed from storage.
- the networked system 102 may further include an anomaly detection engine 212 and a probe module 210 to perform various anomaly detection functionalities or operations as set forth in greater detail below.
- some example embodiments may be configured to detect anomalies in an evolving data set by comparing surprise scores of property values received from a probe module.
- some simplified examples of analyzing property values are now described to highlight some potential aspects addressed by example embodiments. For example, as a warm-up problem, consider a signal from a high software layer: the number of searches (“srp”) received or performed by the networked system 102 of FIG. 1 . In some cases, srp may be tracked by the probe module 210 periodically, say, for example, every two minutes.
- FIG. 3 is a diagram showing sampled values 300 of srp observed over a time period, such as a twenty-four hour period, according to an example embodiment.
- the vertical axis range may represent sampled values of the number of searches performed over a two minute period, whereas the horizontal axis may represent time, which ranges from midnight to midnight, PDT.
- the anomaly detecting engine 212 may identify that sampled values 302 and 304 , occurring around 4:00 AM and 10:30 PM, respectively, are suspicious because the sampled values 302 and 304 each exhibit a comparatively drastic deviation from their neighboring values.
- FIG. 4 is a diagram showing sampled values 400 that include sampled values from the prior two days, relative to the sampled values 300 of FIG. 3 , according to an example embodiment. Based on the sampled values 400 , one may reasonably conclude that the sampled value 302 should be categorized as an anomaly, but not the sampled value 304 . Such is the case because the sampled value 304 is consistent with the other two days of samples, whereas the sampled value 302 is inconsistent with the other two days.
- FIGS. 3 and 4 suggest that comparing a property value of a result against historical property values may be used as a simple feature for detecting anomalies in a computer system.
- the feature may be based on comparing a current value of srp with its respective value 24 hours ago, 48 hours ago, 72 hours ago, and so forth.
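A minimal sketch of such a time-lagged feature, assuming evenly spaced samples; the helper name and the averaging over prior days are illustrative assumptions, not taken from the patent.

```python
def lagged_deviation(samples, samples_per_day, days=3):
    """Compare the latest sample with the same time-of-day on prior days.

    `samples` is a list of evenly spaced measurements (e.g., srp counts
    sampled every two minutes give 720 samples per day). Returns the
    unsigned deviation of the latest sample from the prior-day baseline.
    """
    current = samples[-1]
    peers = [samples[-1 - k * samples_per_day] for k in range(1, days + 1)]
    baseline = sum(peers) / len(peers)
    return abs(current - baseline)
```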
- the probe module 210 may periodically issue a query (or a set of queries) and log values for one or more properties relating to the search result returned by the query (or each query in the set of queries). Examples of properties that may be tracked by the probe module 210 include a number of items returned from a search query, the average list price, a measurement of the number of items that are auctions relative to non-auction items, a number of classified listings, a number of searches executed, etc.
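Deriving property values from a probe result might look like the following sketch; the result shape (a list of listing records with price and auction fields) and the property names are illustrative assumptions.

```python
def property_values(result):
    """Derive a few trackable property values from one probe's result."""
    n = len(result)
    if n == 0:
        return {"item_count": 0, "median_price": 0.0, "auction_fraction": 0.0}
    prices = sorted(item["price"] for item in result)
    return {
        "item_count": n,
        "median_price": prices[n // 2],
        "auction_fraction": sum(1 for item in result if item["is_auction"]) / n,
    }
```

Logged periodically per query, values like these form the per-probe history against which later results can be compared.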
- the anomaly detection engine 212 may repeatedly cycle through a fixed set (e.g., tens, hundreds, thousands, and so forth) of queries to build a historical model of the property values for each of the search queries over time.
- FIG. 5 is a diagram of a plot of property values 500 over time for a property of a computer system, according to an example embodiment.
- the property values may include one or more sampled values for a property, which are sampled over time.
- the horizontal axis of FIG. 5 represents time, with the right-hand side representing the most recent samples.
- the vertical axis of FIG. 5 represents values of the property being monitored by the anomaly detecting engine 212 .
- the property values 500 may represent the median sales price for the listings returned when the probe module 210 submits the search query “htc hd2” to the searching engine 208 .
- the plot may include a fitted line 504 to represent expected values from the property over time. As may be appreciated from FIG. 5 , the property values 500 exhibit some noise (e.g., values that deviate from the fitted line 504 ). However, even when compared to the noise within the property values 500 , the property value 502 may represent an anomaly because it deviates significantly from the fitted line 504 when compared to the other values of the property values 500 .
- the anomaly detecting engine 212 may determine whether a value of a property represents an anomaly caused by a site disruption based in part on calculating surprise scores for the property value.
- a surprise score may be a measurement used to quantify how out of the norm a value for a property is based on historical values for that property.
- the anomaly detecting engine 212 may quantify the surprise score for a value of a property by computing the (unsigned) deviation of each property value from an expected value.
- one specific implementation of calculating a surprise score may involve dividing the deviation of a value from the expected value (e.g., the fitted line 504 ) by the median deviation of all the values.
- the anomaly detecting engine 212 may assign the value 502 a surprise score of 7.3 (e.g., 97.9/13.4).
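The surprise-score calculation just described can be sketched as follows. This is a minimal illustration under the assumption that expected values come from a fitted line (as in FIG. 5) and that the normalizer is the median of the deviations of all values; the helper names are not part of the disclosure.

```python
# Minimal sketch of the surprise-score calculation: each value's unsigned
# deviation from its expected value, divided by the median deviation
# across all values.

def median(xs):
    """Median of a non-empty sequence."""
    s, n = sorted(xs), len(xs)
    return s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2

def surprise_scores(values, expected):
    """Surprise score for each value, normalized by the median deviation."""
    deviations = [abs(v - e) for v, e in zip(values, expected)]
    med = median(deviations)
    return [d / med for d in deviations]

# A value whose deviation is large relative to the median deviation
# receives a proportionally large surprise score.
scores = surprise_scores([1.5, 2.0, 3.0, 100.0], [1.0, 1.5, 1.5, 1.5])
```

In the example above, a deviation of 97.9 against a median deviation of 13.4 yields the surprise score of about 7.3.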
- FIG. 6 is a histogram charting surprise scores from a number of queries submitted to a computer system, according to an example embodiment.
- a surprise score of 7 is not unusual because it is not far off in value from the other surprise scores in the histogram.
- FIG. 7 is another histogram showing the surprise scores according to a logarithmic function, according to an example embodiment. Since log(7.3) ≈ 2, it is clear that a value of 2 is not all that unusual. Quantitatively, the percentile of the surprise score for the query “htc hd2” is about 96%. Ringing an alarm for a surprise this large may generate a large number of false positives.
- FIGS. 6 and 7 are diagrams illustrating the difficulty in getting a low false positive rate when using statistical methods for detecting anomalies, according to an example embodiment.
- An example system may, for example, periodically execute 3000 queries six times a day to log measured data relating to 40 different properties.
- the anomaly detection engine 212 can construct a feature based on the number of surprise scores that deviate from historical norms. A sudden change in the number of high surprise scores, for example, might be a good indicator of a site disruption. This is done separately for each property being monitored by the anomaly detection engine 212 . To make this quantitative, instead of counting the number of queries with a high surprise, some embodiments of the anomaly detection engine 212 can examine a quantile (e.g., the 0.9th quantile) of the surprise values for a property. Using the quantiles to detect anomalies is now described.
- the surprise score of the most recent property value depends on at least the following: a property type (e.g., mean sales price listed), a property value (e.g., a value for the mean sales price listed), the probe (e.g., a query that generates a result of listed items for sale), and a collection window of recent property values for the probe.
- FIG. 8 is a histogram showing quantiles 800 for a property type (e.g., a median sale price) over a two-week period, according to an example embodiment. As shown in FIG. 8 , the quantiles are clustered near 3.6, with a range from 3.0 to 4.6.
- FIG. 9 is a plot of the quantiles in a time series, according to an example embodiment. This shows the feature (e.g., the quantile of the surprise score) is fairly smooth, and might be, in some cases, a candidate for anomaly detection.
- FIG. 10 is a histogram that includes the quantiles 800 shown in FIG. 8 but with a new quantile 1002 , according to an example embodiment.
- the new quantile 1002 may be calculated based on the value of the property when the anomaly detection engine 212 executes the set of queries again.
- FIG. 11 is a plot of the quantiles in a time series with the new quantile 1002 , according to an example embodiment. For example, while FIG. 9 shows quantiles up to 07:00 on Nov 28, FIG. 11 adds the quantile 1002 .
- the distribution of quantiles shown in FIG. 9 is roughly normal, with a mean of 3.7 and a standard deviation of 0.3.
- the new value of the new quantile 1002 shown in FIGS. 10 and 11 is 29.6, which is far outside the range of the historical quantiles and therefore a strong indication of an anomaly.
- an individual property for a particular query will have sudden jumps in values. Although these sudden jumps may represent outliers, an outlier, in and of itself, should not necessarily raise an alert. Instead, example embodiments may use the number of queries that have such jumps as a signal for raising an alert of an anomaly. Feature selection may therefore proceed as follows: for each (probe, property) pair, compute a measure of surprise that indicates whether the latest property value is an outlier. The anomaly detection engine 212 then has a surprise number for each query. A few large surprise numbers are expected, but not too many.
- the anomaly detection engine 212 may in some embodiments select the 90th quantile of surprise values (e.g., sort the surprise values from low to high and return the value at the 90th percentile position, or use a non-sorting function to calculate a quantile or ranking of surprise scores). This is the feature. Any outlier detection method may then be used to raise an alert. For example, in an example embodiment, the anomaly detection engine 212 may take the last 30 days' worth of signals and compute their mean and standard deviation. If the latest quantile of the signal is more than a threshold deviation (e.g., 5σ) from the mean, the anomaly detection engine 212 raises an alert.
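The feature-and-alert logic described above can be sketched as follows. This is an illustrative implementation only: the 90th-quantile feature, the historical window, and the sigma threshold mirror the example figures in this disclosure, while the helper names and the simple sort-and-index quantile are assumptions.

```python
# Sketch of the feature-and-alert logic: compute a quantile of the latest
# surprise scores, then compare it against the mean and standard
# deviation of historical feature values.

def quantile(xs, q):
    """Value at quantile q (0 <= q < 1) of xs, by sorting and indexing."""
    s = sorted(xs)
    return s[min(int(q * len(s)), len(s) - 1)]

def should_alert(latest_surprises, historical_features, n_sigma=5.0):
    """Compute the 90th-quantile feature of the latest surprise scores;
    alert if it lies more than n_sigma standard deviations from the mean
    of the historical feature values (e.g., the last 30 days' signals)."""
    feature = quantile(latest_surprises, 0.9)
    n = len(historical_features)
    mean = sum(historical_features) / n
    std = (sum((f - mean) ** 2 for f in historical_features) / n) ** 0.5
    return abs(feature - mean) > n_sigma * std, feature
```

With historical features hovering near 3.7 and a standard deviation around 0.3 (as in FIG. 9), a new quantile of 29.6 lies dozens of standard deviations out and would trigger an alert, while a value near 3.8 would not.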
- FIG. 12 is a flowchart diagram illustrating a method 1200 for detecting an anomaly in a computer system, according to an example embodiment.
- the method 1200 may begin at operation 1202 when the probe module 210 executes probes on an evolving data set. Each probe returns a result derived from the evolving data set.
- the probe module 210 may issue a set of queries to the searching engine 208 of FIG. 2 .
- the probe module 210 may then receive search results for each of the queries issued to the search engine 208 . It is to be appreciated that each of the probes (e.g., search queries) may be different and, accordingly, each result may also be different.
- the probe module 210 is further configured to derive property values for each result returned from the probes.
- a property value may include data that quantifies a property or aspect of a result.
- the property value may represent, for example, a value for the property of the number of items returned in the result, the average list price in the result, a measurement of the number of items that are auctions relative to non-auction items in the result, a number of classified listings in the result, or any other suitable property.
- the execution of operation 1202 may result in a data table that includes a number of property values that each correspond to one of the probes executed by the probe module 210 .
- the table may include multiple columns, where each column corresponds to a different property type. This is shown in FIG. 13 .
- FIG. 13 is a diagram illustrating a property value table 1300 that may be generated based on executing the probes, according to an example embodiment.
- the property value table 1300 may store the property values (organized by property types 1304 ) collected for a single iteration of the probes 1302 . As FIG. 13 shows, for a single probe, the probe module 210 may derive property values for multiple property types.
- the probe module 210 may derive multiple property values, each corresponding to a different probe.
- a single property value may be specific to a probe-property type pair.
- property value 1310 may be specific to the Probe 2 -Property Type 2 pair.
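One way to represent such a property value table (an illustrative data-structure sketch, not the disclosed implementation; the probe and property names are assumptions) is a mapping keyed by probe-property type pair:

```python
# Illustrative sketch of the property value table of FIG. 13: a mapping
# keyed by (probe, property type), filled once per iteration of the
# probes. Names are assumptions for exposition.

property_table = {}

def record(probe, property_type, value):
    """Store the property value derived for one probe-property type pair."""
    property_table[(probe, property_type)] = value

# One iteration of the probes might fill the table like this:
record("Probe 2", "Property Type 2", 42.0)
record("Probe 2", "Property Type 3", 17.5)
record("Probe 3", "Property Type 2", 40.5)
```

Keying on the pair makes it straightforward to look up both the current value and the historical values for the same pair in later steps.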
- the anomaly detecting engine 212 may generate surprise scores for each of the property values.
- Each surprise score may be generated based on a comparison of a property value and historical property values that correspond to a probe-property type pair. For example, for a probe, the surprise score may be based on a function of the property value and a deviation from an expected value.
- An expected value may be determined based on a robust estimation of the central tendency of the historical property values for that probe.
- a median, mode, mean, and trimmed mean are all examples of an estimation of the central tendency of the value for the feature that can be used to generate a surprise score.
- the surprise score for a value may be based on a standard deviation from the tendency for that value of the property.
- operation 1204 may generate surprise scores for the latest results, where each surprise score corresponds to one of the queries in the set of queries.
- FIG. 14 is a diagram showing property values 1400 for a probe-property type pair, according to an example embodiment.
- the property values 1400 may include the property value 1310 received as part of a current iteration of executing the probes.
- the property value 1310 may be specific to the Probe 2 -Property 2 pair.
- the property values 1400 may also include historical property values 1402 that were obtained in past iterations of executing the probes.
- the historical property values 1402 are specific to the same probe-property type pair as the property value 1310 (e.g., Probe 2 -Property 2 pair).
- the surprise score is based on the deviation of the sample property value 1310 from the historical property values 1402 .
- FIG. 15 is a diagram illustrating a surprise score table 1500 , according to an example embodiment.
- the surprise score table 1500 includes a surprise score for each of the probe-property type pairs.
- the surprise score table 1500 is generated based on calculating a surprise score in the manner discussed with respect to FIG. 14 . That is, a surprise score is generated for each probe-property type pair based on a comparison between the property value corresponding to the probe-property type pair and the historical property values for the probe-property type pair.
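The construction of such a surprise score table can be sketched like so. This is a minimal illustration reusing a median-based normalization (deviation from the historical median, divided by the median absolute deviation); the helper names are assumptions, not the disclosed implementation.

```python
# Sketch of how a surprise score table (as in FIG. 15) might be built:
# for each probe-property type pair, the current value is compared
# against the historical values for the same pair.

def median(xs):
    """Median of a non-empty sequence."""
    s, n = sorted(xs), len(xs)
    return s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2

def surprise_table(current, history):
    """Map each probe-property type pair to a surprise score: deviation
    from the historical median, divided by the median absolute deviation."""
    table = {}
    for pair, value in current.items():
        hist = history[pair]
        center = median(hist)
        mad = median([abs(h - center) for h in hist])
        table[pair] = abs(value - center) / mad if mad else 0.0
    return table

current = {("Probe 2", "Property Type 2"): 110.0}
history = {("Probe 2", "Property Type 2"): [98.0, 100.0, 102.0, 100.0, 99.0]}
scores = surprise_table(current, history)
```

Here the historical values cluster near 100, so a current value of 110 stands out with a large surprise score.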
- the anomaly detecting engine 212 may access a plurality of historical surprise scores generated by the anomaly detection engine 212 .
- the historical surprise scores accessed at operation 1206 may be based on past iterations of executing the probes. Further, in some cases, the historical surprise scores may be specific to a particular property type.
- the anomaly detection engine 212 may alert a monitoring system of an anomaly regarding the evolving data set.
- the idea of operation 1208 is that an alert is generated if the surprise scores for a given property type (e.g., Property 2 ), across all probes (e.g., Probes 1-6 ), are out of the norm relative to the historical surprise scores for those probe-property type pairs.
- the comparison used by operation 1208 may be based on a feature derived from the surprise scores.
- FIG. 16 is a chart showing a measurement of a feature of the surprise scores generated by operation 1204 , according to an example embodiment.
- the measurement of the feature shown in FIG. 16 is for the surprise scores generated for a single iteration of the execution of the probes.
- the chart 1600 may measure the feature for the surprise scores generated across Probes 1-6 for Property Type 2 .
- the feature may be a measurement of how far a surprise score deviates from an expected value.
- An expected value for the surprise score may be calculated based on a robust estimation of the tendency of the historical surprise scores for that property type.
- the feature may be a quantile of the surprise scores, or a quantile of the data derived from the surprise scores (e.g., the deviation from the expected value). Still further, in some cases, the feature may be a measurement or count of the number of surprise scores that deviate beyond a threshold amount from the expected value, or that exceed a fixed surprise score.
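The candidate features listed above can be sketched side by side (an illustrative comparison only; both functions operate on the surprise scores for one property type across all probes, and the names are assumptions):

```python
# Two candidate features over the surprise scores for one property type.

def quantile_feature(scores, q=0.9):
    """Feature 1: the q-quantile of the surprise scores."""
    s = sorted(scores)
    return s[min(int(q * len(s)), len(s) - 1)]

def count_feature(scores, threshold=5.0):
    """Feature 2: the count of surprise scores exceeding a fixed threshold."""
    return sum(1 for sc in scores if sc > threshold)
```

The quantile feature varies smoothly with the overall distribution of surprise, while the count feature reacts directly to the number of outlying probes; either can then be tracked over time.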
- the feature of the surprise scores may then be compared against historical surprise scores from past iterations of executing the probes. This is shown in FIG. 17 .
- FIG. 17 is a chart illustrating surprise score features 1700 over time, according to an example embodiment.
- the surprise score feature 1702 may be a feature of the surprise scores for a current iteration of the execution of the probes.
- the historical surprise score features 1704 are features of surprise scores from past iterations of executing the probes.
- operation 1208 may alert if the feature of the surprise score 1702 deviates from the historical surprise score features 1704 beyond a threshold amount.
- the operations 1206 and 1208 shown in FIG. 12 may be repeated across all the property types monitored by the probe module 210 .
- the operations 1206 and 1208 may execute across Property Types 1-5 .
- the computer system may be an inventory data store.
- the probe module 210 may be configured to detect as property types, among other things, the number of items stored per category, the number of auction items per category, and the like.
- the computer system may be a computer infrastructure (e.g., a collection of computer servers).
- the probe module 210 may be configured to detect as property types, among other things, a processor load, bandwidth consumption, thread count, running processes count, memory usage, throughput count, rate of disk seeks, rate of packets transmitted or received, or rate of response.
- the property values tracked by the anomaly detection engine 212 may include dimensions in addition to what is described above.
- a table may be used to store the property values, where the columns are the metrics tracked by the different probe modules, and the rows are the different values for those property types at different times.
- An extension would be a 3D-table or cube.
- For each (property, value) cell there may be a series of aspects instead of a single number.
- An aspect may be a vertical stack out of the page. In the inventory data store example above, the aspect might be different countries.
- a cell in a table may be related to a specific query (perhaps ‘iPhone 5S’) and property (perhaps number of results). But using the aspects, the results vary by country, so the single cell is replaced by a stack of entries, one for each country.
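The aspect extension can be sketched by adding a third key to the table (an illustrative structure; the country aspect and sample figures follow the example above, and all names are assumptions):

```python
# Illustrative sketch of the aspect extension: the single cell for a
# (probe, property type) pair becomes a stack of entries, one per aspect
# value (here, country). Names and values are assumptions.

cube = {}

def record_aspect(probe, property_type, aspect, value):
    """Store one aspect entry in the cell for a probe-property type pair."""
    cube.setdefault((probe, property_type), {})[aspect] = value

# The 'iPhone 5S' / number-of-results cell, stacked by country:
record_aspect("iPhone 5S", "number of results", "US", 1250)
record_aspect("iPhone 5S", "number of results", "DE", 430)
```

Each cell is now a small per-aspect dictionary rather than a single number, which is exactly the "stack of entries" the 3D-table or cube describes.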
- the anomaly detection engine 212 may be configured to detect a problem with the search software (a disruption) before users do.
- the property values may be received from the same interfaces and using the same computer systems used by the end users.
- the metric data received from the probe module 210 is a proxy for, if not identical to, the user experience of users of that computer system. Accordingly, it is to be appreciated that when this disclosure states the anomaly detection engine 212 may detect a problem before users do, it may simply mean that the anomaly detection engine 212 can detect and report a problem without intervention from a user. Thus, compared to traditional systems, example embodiments may use the anomaly detection engine 212 to provide comparatively quick detection of site problems.
- FIG. 18 shows a diagrammatic representation of a machine in the example form of a computer system 1800 within which a set of instructions, for causing the machine to perform any one or more of the methodologies discussed herein, may be executed.
- the computer system 1800 comprises, for example, any of the device machine 110 , device machine 112 , applications servers 118 , API server 114 , web server 116 , database servers 124 , or third party server 130 .
- the machine may operate as a standalone device or may be connected (e.g., networked) to other machines. In a networked deployment, the machine may operate in the capacity of a server or a device machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment.
- the machine may be a server computer, a client computer, a personal computer (PC), a tablet, a set-top box (STB), a Personal Digital Assistant (PDA), a smart phone, a cellular telephone, a web appliance, a network router, switch or bridge, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine.
- the example computer system 1800 includes a processor 1802 (e.g., a central processing unit (CPU), a graphics processing unit (GPU), or both), a main memory 1804 and a static memory 1806 , which communicate with each other via a bus 1808 .
- the computer system 1800 may further include a video display unit 1810 (e.g., liquid crystal display (LCD), organic light emitting diode (OLED), touch screen, or a cathode ray tube (CRT)).
- the computer system 1800 also includes an alphanumeric input device 1812 (e.g., a physical or virtual keyboard), a cursor control device 1814 (e.g., a mouse, a touch screen, a touchpad, a trackball, a trackpad), a disk drive unit 1816 , a signal generation device 1818 (e.g., a speaker) and a network interface device 1820 .
- the disk drive unit 1816 includes a machine-readable medium 1822 on which is stored one or more sets of instructions 1824 (e.g., software) embodying any one or more of the methodologies or functions described herein.
- the instructions 1824 may also reside, completely or at least partially, within the main memory 1804 and/or within the processor 1802 during execution thereof by the computer system 1800 , the main memory 1804 and the processor 1802 also constituting machine-readable media.
- the instructions 1824 may further be transmitted or received over a network 1826 via the network interface device 1820 .
- While the machine-readable medium 1822 is shown in an example embodiment to be a single medium, the term “machine-readable medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions.
- the term “machine-readable medium” shall also be taken to include any medium that is capable of storing, encoding or carrying a set of instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of the present invention.
- the term “machine-readable medium” shall accordingly be taken to include, but not be limited to, solid-state memories, optical and magnetic media, and carrier wave signals.
- Certain embodiments may be implemented as logic or a number of modules, engines, components, or mechanisms.
- a module, engine, logic, component, or mechanism may be a tangible unit capable of performing certain operations and configured or arranged in a certain manner.
- In example embodiments, one or more computer systems (e.g., a standalone, client, or server computer system) or one or more components of a computer system (e.g., a processor or a group of processors) may be configured by software (e.g., an application or application portion) or firmware (note that software and firmware can generally be used interchangeably herein, as is known by a skilled artisan) as a module that operates to perform certain operations as described herein.
- a module may be implemented mechanically or electronically.
- a module may comprise dedicated circuitry or logic that is permanently configured (e.g., within a special-purpose processor, application specific integrated circuit (ASIC), or array) to perform certain operations.
- a module may also comprise programmable logic or circuitry (e.g., as encompassed within a general-purpose processor or other programmable processor) that is temporarily configured by software or firmware to perform certain operations. It will be appreciated that a decision to implement a module mechanically, in dedicated and permanently configured circuitry, or in temporarily configured circuitry (e.g., configured by software) may be driven by, for example, cost, time, energy-usage, and package size considerations.
- module should be understood to encompass a tangible entity, be that an entity that is physically constructed, permanently configured (e.g., hardwired), non-transitory, or temporarily configured (e.g., programmed) to operate in a certain manner or to perform certain operations described herein.
- modules or components are temporarily configured (e.g., programmed)
- each of the modules or components need not be configured or instantiated at any one instance in time.
- the modules or components comprise a general-purpose processor configured using software
- the general-purpose processor may be configured as respective different modules at different times.
- Software may accordingly configure the processor to constitute a particular module at one instance of time and to constitute a different module at a different instance of time.
- Modules can provide information to, and receive information from, other modules. Accordingly, the described modules may be regarded as being communicatively coupled. Where multiples of such modules exist contemporaneously, communications may be achieved through signal transmission (e.g., over appropriate circuits and buses) that connect the modules. In embodiments in which multiple modules are configured or instantiated at different times, communications between such modules may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple modules have access. For example, one module may perform an operation and store the output of that operation in a memory device to which it is communicatively coupled. A further module may then, at a later time, access the memory device to retrieve and process the stored output. Modules may also initiate communications with input or output devices and can operate on a resource (e.g., a collection of information).
Abstract
Apparatus and method for detecting anomalies in a computer system are disclosed herein. In some embodiments, multiple probes are executed on an evolving data set. Each probe may return a result. Property values are then derived from a respective result returned by a corresponding probe. Surprise scores corresponding to the property values are generated, where each surprise score is generated based on a comparison between a corresponding property value and historical property values. The corresponding property value and the historical property values are derived from results returned from the same probe. Historical surprise scores generated by the anomaly detection engine are accessed. Responsive to a comparison between the plurality of surprise scores and the plurality of historical surprise scores, a monitoring system is alerted of an anomaly regarding the evolving data set.
Description
- The present invention relates generally to data processing, and in some embodiments, to detecting anomalies in computer-based systems.
- In many cases, enterprises maintain and operate large numbers of computer systems (e.g., servers) that may each run a layered set of software. In some cases, these computer systems provide functionality for the operation of the enterprise or to provide outbound services to their customers. In many cases, the enterprise may monitor the hardware and software layers of these servers by logging processing load, memory usage, and many other monitored signals at frequent intervals.
- Unfortunately, the enterprise may occasionally suffer disruptions, where some of its services are degraded or even completely unavailable to customers. To resolve these disruptions, the enterprise will perform a post-mortem analysis of the monitored signals in an effort to debug the system. For example, the enterprise may analyze the memory usage to identify a program that may be performing improperly, or view the processing load to determine whether more hardware is needed.
- Thus, traditional systems may utilize methods and systems for addressing anomalies that involve debugging a computer system after the anomaly has affected the computer system and, by extension, the users.
- Some embodiments are illustrated by way of example and not limitation in the figures of the accompanying drawings, in which:
FIG. 1 illustrates a block diagram depicting a network architecture of a system, according to some embodiments, having a client-server architecture configured for exchanging data over a network.
FIG. 2 illustrates a block diagram showing components provided within the system of FIG. 1 according to some embodiments.
FIG. 3 is a diagram showing sampled values of a number of searches performed on a computer system that are observed over a time period, such as a twenty-four hour period, according to an example embodiment.
FIG. 4 is a diagram showing additional sampled values from two additional days, as compared to FIG. 3, according to an example embodiment.
FIG. 5 is a diagram of a plot of metric data over time for a metric of a computer system, according to an example embodiment.
FIG. 6 is a histogram charting surprise scores from a number of queries submitted to a computer system, according to an example embodiment.
FIG. 7 is another histogram showing the surprise scores according to a logarithmic function, according to an example embodiment.
FIG. 8 is a histogram showing quantiles for a metric over a two-week period, according to an example embodiment.
FIG. 9 is a plot of the quantiles in a time series, according to an example embodiment.
FIG. 10 is a histogram that includes the quantiles shown in FIG. 8 but with a new quantile, according to an example embodiment.
FIG. 11 is a plot of the quantiles in a time series with a new quantile, according to an example embodiment.
FIG. 12 is a flowchart diagram illustrating a method for detecting an anomaly in a computer system, according to an example embodiment.
FIG. 13 is a diagram illustrating a property value table that may be generated based on executing the probes, according to an example embodiment.
FIG. 14 is a diagram showing property values for a probe-property type pair, according to an example embodiment.
FIG. 15 is a diagram illustrating a surprise score table, according to an example embodiment.
FIG. 16 is a chart showing a measurement of a feature of the surprise scores generated by an operation, according to an example embodiment.
FIG. 17 is a chart illustrating surprise score features over time, according to an example embodiment.
FIG. 18 illustrates a diagrammatic representation of a machine in the example form of a computer system within which a set of instructions may be executed to cause the machine to perform any one or more of the methodologies discussed herein.
- The headings provided herein are for convenience only and do not necessarily affect the scope or meaning of the terms used.
- Described in detail herein is an apparatus and method for detecting anomalies in a computer system. For example, some embodiments may be used to address the problem of how to monitor signals in a computer system to detect disruptions before they affect users, and to do so with few false positives. Some embodiments may address this problem by analyzing signals for strange behavior that may be referred to as an anomaly. Example embodiments can then scan multiple monitored signals, and raise an alert when the site monitoring system detects an anomaly.
- Various modifications to the example embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments and applications without departing from the scope of the invention. Moreover, in the following description, numerous details are set forth for the purpose of explanation. However, one of ordinary skill in the art will realize that the invention may be practiced without the use of these specific details. In other instances, well-known structures and processes are not shown in block diagram form in order not to obscure the description of the invention with unnecessary detail. Thus, the present disclosure is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.
- For example, in some embodiments, multiple probes (e.g., queries) are executed on an evolving data set (e.g., a listing database). Each probe may return a result. Property values are then derived from a respective result returned by one of the probes. A property value may be a value that quantifies a property or aspect of the result, such as, for example, a number of listings returned, a portion of classified listings, a measurement of the prices in a listing, and the like.
- Surprise scores corresponding to the property values are generated, where each surprise score is generated based on a comparison between a corresponding property value and historical property values. The corresponding property value and the historical property values are derived from results returned from the same probe. Historical surprise scores generated by the anomaly detection engine are accessed. Responsive to a comparison between the plurality of surprise scores and the plurality of historical surprise scores, a monitoring system is alerted of an anomaly regarding the evolving data set.
-
FIG. 1 illustrates a network diagram depicting a network system 100, according to one embodiment, having a client-server architecture configured for exchanging data over a network. A networked system 102 forms a network-based publication system that provides server-side functionality, via a network 104 (e.g., the Internet or Wide Area Network (WAN)), to one or more clients and devices. FIG. 1 further illustrates, for example, one or both of a web client 106 (e.g., a web browser) and a programmatic client 108 executing on device machines 110 and 112. In one embodiment, the publication system 100 comprises a marketplace system. In another embodiment, the publication system 100 comprises other types of systems such as, but not limited to, a social networking system, a matching system, a recommendation system, an electronic commerce (e-commerce) system, a search system, and the like. - Each of the device machines 110, 112 comprises a computing device that includes at least a display and communication capabilities with the network 104 to access the
networked system 102. The device machines 110, 112 comprise, but are not limited to, remote devices, work stations, computers, general purpose computers, Internet appliances, hand-held devices, wireless devices, portable devices, wearable computers, cellular or mobile phones, portable digital assistants (PDAs), smart phones, tablets, ultrabooks, netbooks, laptops, desktops, multi-processor systems, microprocessor-based or programmable consumer electronics, game consoles, set-top boxes, network PCs, mini-computers, and the like. Each of the device machines 110, 112 may connect with the network 104 via a wired or wireless connection. For example, one or more portions of network 104 may be an ad hoc network, an intranet, an extranet, a virtual private network (VPN), a local area network (LAN), a wireless LAN (WLAN), a wide area network (WAN), a wireless WAN (WWAN), a metropolitan area network (MAN), a portion of the Internet, a portion of the Public Switched Telephone Network (PSTN), a cellular telephone network, a wireless network, a WiFi network, a WiMax network, another type of network, or a combination of two or more such networks. - Each of the device machines 110, 112 includes one or more applications (also referred to as “apps”) such as, but not limited to, a web browser, messaging application, electronic mail (email) application, an e-commerce site application (also referred to as a marketplace application), and the like. In some embodiments, if the e-commerce site application is included in a given one of the device machines 110, 112, then this application is configured to locally provide the user interface and at least some of the functionalities with the application configured to communicate with the
networked system 102, on an as-needed basis, for data and/or processing capabilities not locally available (such as access to a database of items available for sale, to authenticate a user, to verify a method of payment, etc.). Conversely, if the e-commerce site application is not included in a given one of the device machines 110, 112, the given one of the device machines 110, 112 may use its web browser to access the e-commerce site (or a variant thereof) hosted on the networked system 102. Although two device machines 110, 112 are shown in FIG. 1, more or fewer than two device machines can be included in the system 100. - An Application Program Interface (API) server 114 and a web server 116 are coupled to, and provide programmatic and web interfaces respectively to, one or more application servers 118. The application servers 118 host one or
more marketplace applications 120 and payment applications 122. The application servers 118 are, in turn, shown to be coupled to one or more database servers 124 that facilitate access to one or more databases 126. - The
marketplace applications 120 may provide a number of e-commerce functions and services to users that access the networked system 102. E-commerce functions/services may include a number of publisher functions and services (e.g., search, listing, content viewing, payment, etc.). For example, the marketplace applications 120 may provide a number of services and functions to users for listing goods and/or services or offers for goods and/or services for sale, searching for goods and services, facilitating transactions, and reviewing and providing feedback about transactions and associated users. Additionally, the marketplace applications 120 may track and store data and metadata relating to listings, transactions, and user interactions. In some embodiments, the marketplace applications 120 may publish or otherwise provide access to content items stored in application servers 118 or databases 126 accessible to the application servers 118 and/or the database servers 124. The payment applications 122 may likewise provide a number of payment services and functions to users. The payment applications 122 may allow users to accumulate value (e.g., in a commercial currency, such as the U.S. dollar, or a proprietary currency, such as “points”) in accounts, and then later to redeem the accumulated value for products or items (e.g., goods or services) that are made available via the marketplace applications 120. While the marketplace and payment applications 120 and 122 are shown in FIG. 1 to both form part of the networked system 102, it will be appreciated that, in alternative embodiments, the payment applications 122 may form part of a payment service that is separate and distinct from the networked system 102. In other embodiments, the payment applications 122 may be omitted from the system 100. In some embodiments, at least a portion of the marketplace applications 120 may be provided on the device machines 110 and/or 112. - Further, while the
system 100 shown in FIG. 1 employs a client-server architecture, embodiments of the present disclosure are not limited to such an architecture, and may equally well find application in, for example, a distributed or peer-to-peer architecture system. The various marketplace and payment applications 120 and 122 may also be implemented as standalone software programs, which do not necessarily have networking capabilities. - The web client 106 accesses the various marketplace and
payment applications 120 and 122 via the web interface supported by the web server 116. Similarly, the programmatic client 108 accesses the various services and functions provided by the marketplace and payment applications 120 and 122 via the programmatic interface provided by the API server 114. The programmatic client 108 may, for example, be a seller application (e.g., the TurboLister application developed by eBay Inc., of San Jose, Calif.) to enable sellers to author and manage listings on the networked system 102 in an off-line manner, and to perform batch-mode communications between the programmatic client 108 and the networked system 102. -
FIG. 1 also illustrates a third party application 128, executing on a third party server machine 130, as having programmatic access to the networked system 102 via the programmatic interface provided by the API server 114. For example, the third party application 128 may, utilizing information retrieved from the networked system 102, support one or more features or functions on a website hosted by the third party. The third party website may, for example, provide one or more promotional, marketplace, or payment functions that are supported by the relevant applications of the networked system 102. -
FIG. 2 illustrates a block diagram showing components provided within the networked system 102 according to some embodiments. The networked system 102 may be hosted on dedicated or shared server machines (not shown) that are communicatively coupled to enable communications between server machines. The components themselves are communicatively coupled (e.g., via appropriate interfaces) to each other and to various data sources, so as to allow information to be passed between the applications or so as to allow the applications to share and access common data. Furthermore, the components may access one or more databases 126 via the database servers 124. - The
networked system 102 may provide a number of publishing, listing, and/or price-setting mechanisms whereby a seller (also referred to as a first user) may list (or publish information concerning) goods or services for sale or barter, a buyer (also referred to as a second user) can express interest in or indicate a desire to purchase or barter such goods or services, and a transaction (such as a trade) may be completed pertaining to the goods or services. To this end, the networked system 102 may comprise at least one publication engine 202 and one or more selling engines 204. The publication engine 202 may publish information, such as item listings or product description pages, on the networked system 102. In some embodiments, the selling engines 204 may comprise one or more fixed-price engines that support fixed-price listing and price setting mechanisms and one or more auction engines that support auction-format listing and price setting mechanisms (e.g., English, Dutch, Chinese, Double, Reverse auctions, etc.). The various auction engines may also provide a number of features in support of these auction-format listings, such as a reserve price feature whereby a seller may specify a reserve price in connection with a listing and a proxy-bidding feature whereby a bidder may invoke automated proxy bidding. The selling engines 204 may further comprise one or more deal engines that support merchant-generated offers for products and services. - A listing engine 206 allows sellers to conveniently author listings of items or authors to author publications. In one embodiment, the listings pertain to goods or services that a user (e.g., a seller) wishes to transact via the
networked system 102. In some embodiments, the listings may be an offer, deal, coupon, or discount for the good or service. Each good or service is associated with a particular category. The listing engine 206 may receive listing data such as title, description, and aspect name/value pairs. Furthermore, each listing for a good or service may be assigned an item identifier. In other embodiments, a user may create a listing that is an advertisement or other form of information publication. The listing information may then be stored to one or more storage devices coupled to the networked system 102 (e.g., databases 126). Listings also may comprise product description pages that display a product and information (e.g., product title, specifications, and reviews) associated with the product. In some embodiments, the product description page may include an aggregation of item listings that correspond to the product described on the product description page. - The listing engine 206 also may allow buyers to conveniently author listings or requests for items desired to be purchased. In some embodiments, the listings may pertain to goods or services that a user (e.g., a buyer) wishes to transact via the
networked system 102. Each good or service is associated with a particular category. The listing engine 206 may receive as much or as little listing data, such as title, description, and aspect name/value pairs, as the buyer is aware of about the requested item. In some embodiments, the listing engine 206 may parse the buyer's submitted item information and may complete incomplete portions of the listing. For example, if the buyer provides a brief description of a requested item, the listing engine 206 may parse the description, extract key terms, and use those terms to make a determination of the identity of the item. Using the determined item identity, the listing engine 206 may retrieve additional item details for inclusion in the buyer item request. In some embodiments, the listing engine 206 may assign an item identifier to each listing for a good or service. - In some embodiments, the listing engine 206 allows sellers to generate offers for discounts on products or services. The listing engine 206 may receive listing data, such as the product or service being offered, a price and/or discount for the product or service, a time period for which the offer is valid, and so forth. In some embodiments, the listing engine 206 permits sellers to generate offers from the sellers' mobile devices. The generated offers may be uploaded to the
networked system 102 for storage and tracking. - In a further example embodiment, the listing engine 206 allows users to navigate through various categories, catalogs, or inventory data structures according to which listings may be classified within the
networked system 102. For example, the listing engine 206 allows a user to successively navigate down a category tree comprising a hierarchy of categories (e.g., the category tree structure) until a particular set of listings is reached. Various other navigation applications within the listing engine 206 may be provided to supplement the searching and browsing applications. The listing engine 206 may record the various user actions (e.g., clicks) performed by the user in order to navigate down the category tree. - Searching the
networked system 102 is facilitated by a searching engine 208. For example, the searching engine 208 enables keyword queries of listings published via the networked system 102. In example embodiments, the searching engine 208 receives the keyword queries from a device of a user and conducts a review of the storage device storing the listing information. The review will enable compilation of a result set of listings that may be sorted and returned to the client device (e.g., device machine 110, 112) of the user. The searching engine 208 may record the query (e.g., keywords) and any subsequent user actions and behaviors (e.g., navigations, selections, or click-throughs). - The searching engine 208 also may perform a search based on a location of the user. A user may access the searching engine 208 via a mobile device and generate a search query. Using the search query and the user's location, the searching engine 208 may return relevant search results for products, services, offers, auctions, and so forth to the user. The searching engine 208 may identify relevant search results both in a list form and graphically on a map. Selection of a graphical indicator on the map may provide additional details regarding the selected search result. In some embodiments, the user may specify, as part of the search query, a radius or distance from the user's current location to limit search results.
- The searching engine 208 also may perform a search based on an image. The image may be taken from a camera or imaging component of a client device or may be accessed from storage.
- In addition to the above described modules, the
networked system 102 may further include an anomaly detection engine 212 and a probe module 210 to perform various anomaly detection functionalities or operations as set forth in greater detail below. - As explained above, some example embodiments may be configured to detect anomalies in an evolving data set by comparing surprise scores of property values received from a probe module. However, before describing the methods and systems for detecting anomalies in a computer system in great detail, some simplified examples of analyzing property values are now described to highlight some potential aspects addressed by example embodiments. For example, as a warm-up problem, consider a signal from a high software layer: the number of searches (“srp”) received or performed by the
networked system 102 of FIG. 1. In some cases, srp may be tracked by the probe module 210 periodically, say, for example, every two minutes. FIG. 3 is a diagram showing sampled values 300 of srp observed over a time period, such as a twenty-four hour period, according to an example embodiment. The vertical axis range may represent sampled values of the number of searches performed over a two-minute period, whereas the horizontal axis may represent time, which ranges from midnight to midnight, PDT. In analyzing the sampled values 300 of srp, the anomaly detecting engine 212 may identify that sampled values 302 and 304 stand apart from the other sampled values 300 shown in FIG. 3. That is, based on the sample values 300, prior art systems are unlikely to reliably determine (e.g., without issuing too many false positives) whether samples 302 and 304 represent anomalies. - But now consider
FIG. 4, which shows additional sampled values for the number of searches per time period. For example, FIG. 4 is a diagram showing sampled values 400 that include sampled values from the prior two days, relative to the sampled values 300 of FIG. 3, according to an example embodiment. Based on the sampled values 400, one may reasonably conclude that the sampled value 302 should be categorized as an anomaly, but not the sampled value 304. Such is the case because the sampled value 304 is consistent with the other two days of samples, whereas the sampled value 302 is inconsistent with the other two days. -
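The day-over-day comparison suggested by FIGS. 3 and 4 can be sketched as follows. This is an illustrative sketch, not the patented implementation; the function name and the figure of 720 two-minute samples per day are assumptions.

```python
from statistics import median

def day_over_day_deviation(current, history, samples_per_day=720):
    """Deviation of the current sample from the typical value observed
    at the same time of day on up to three prior days (24, 48, and 72
    hours ago)."""
    references = [history[-k * samples_per_day]
                  for k in (1, 2, 3) if k * samples_per_day <= len(history)]
    if not references:
        return 0.0  # not enough history to compare against
    return abs(current - median(references))
```

A large deviation suggests the current sample is inconsistent with the prior days, as with the sampled value 302; a small deviation suggests consistency, as with the sampled value 304.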
FIGS. 3 and 4 suggest that comparing a property value of a result against historical property values may be used as a simple feature for detecting anomalies in a computer system. The feature may be based on comparing a current value of srp with its respective value 24 hours ago, 48 hours ago, 72 hours ago, and so forth. - Another example of detecting anomalies in a computer system is now described with reference to
FIGS. 5-11. In this example computer system, the probe module 210 may periodically issue a query (or a set of queries) and log values for one or more properties relating to the search result returned by the query (or each query in the set of queries). Examples of properties that may be tracked by the probe module 210 include a number of items returned from a search query, the average list price, a measurement of the number of items that are auctions relative to non-auction items, a number of classified listings, a number of searches executed, etc. The anomaly detection engine 212 may repeatedly cycle through a fixed set (e.g., tens, hundreds, thousands, and so forth) of queries to build a historical model of the property values for each of the search queries over time. -
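One iteration of the probe cycle described above might be sketched as follows; the function and callback names are illustrative assumptions, not part of the original disclosure.

```python
def run_probe_iteration(probes, property_types, execute_probe, derive_property):
    """Execute each probe (e.g., a search query) once and derive a value
    for every monitored property type, producing one row per probe."""
    table = {}
    for probe in probes:
        result = execute_probe(probe)  # e.g., submit the query, get listings
        table[probe] = {ptype: derive_property(ptype, result)
                        for ptype in property_types}
    return table
```

Repeating such an iteration every collection period builds the historical model of property values per query.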
FIG. 5 is a diagram of a plot of property values 500 over time for a property of a computer system, according to an example embodiment. The property values may include one or more sampled values of a property, which are sampled over time. The horizontal axis of FIG. 5 represents time, with the right-hand side representing the most recent samples. The vertical axis of FIG. 5 represents values of the property being monitored by the anomaly detecting engine 212. By way of example and not limitation, the property values 500 may represent the median sales price for the listings returned when the probe module 210 submits the search query “htc hd2” to the searching engine 208. As shown in FIG. 5, the plot may include a fitted line 504 to represent expected values from the property over time. As may be appreciated from FIG. 5, the property values 500 exhibit some noise (e.g., values that deviate from the fitted line 504). However, even when compared to the noise within the property values 500, the property value 502 may represent an anomaly because the property value 502 deviates significantly from the fitted line 504 when compared to the other values of the property values 500. - In some cases, the
anomaly detecting engine 212 may determine whether a value of a property represents an anomaly caused by a site disruption based in part on calculating surprise scores for the property value. A surprise score may be a measurement used to quantify how far out of the norm a value for a property is based on historical values for that property. For example, the anomaly detecting engine 212 may quantify the surprise score for a value of a property by computing the (unsigned) deviation of each property value from an expected value. For example, one specific implementation of calculating a surprise score may involve dividing the deviation of a value from the expected value (e.g., the fitted line 504) by the median deviation of all the values. Assuming the deviation for the value 502 is 97.9 and the median deviation for all the values of the property values 500 is 13.4, the anomaly detecting engine 212 may assign the value 502 a surprise score of 7.3 (e.g., 97.9/13.4). - Some embodiments may address the issue of whether a particular surprise score (e.g., a surprise score of 7.3, as discussed above) should trigger an alert that there may be an anomaly in the computer system.
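The specific implementation described above, the unsigned deviation from the fitted value divided by the median deviation of all values in the window, can be sketched as follows; function and argument names are illustrative:

```python
from statistics import median

def surprise_score(values, fitted):
    """Surprise of the most recent sample: its unsigned deviation from
    the fitted (expected) value, divided by the median deviation of all
    samples in the window."""
    deviations = [abs(v - f) for v, f in zip(values, fitted)]
    return deviations[-1] / median(deviations)
```

With a latest deviation of 97.9 and a median deviation of 13.4, as in the example, this yields a surprise score of about 7.3.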
FIG. 6 is a histogram charting surprise scores from a number of queries submitted to a computer system, according to an example embodiment. In the context of FIG. 6, a surprise score of 7 is not unusual because a surprise score of 7 is not far off in value from other surprise scores. In fact, according to FIG. 6, there are many other queries that result in surprise scores of higher value than 7. - To clarify the surprise scores shown in
FIG. 6, FIG. 7 is another histogram showing the surprise scores on a logarithmic scale, according to an example embodiment. Since log(7.3)≈2, it is clear that a value of 2 is not all that unusual. Quantitatively, the percentile of the surprise score for the query “htc hd2” is about 96%. Ringing an alarm for a surprise this large may generate a large number of false positives. - Incidentally,
FIGS. 6 and 7 are diagrams illustrating the difficulty of achieving a low false positive rate when using statistical methods for detecting anomalies, according to an example embodiment. An example system may, for example, periodically execute 3000 queries six times a day to log measurement data relating to 40 different properties. Thus, under these constraints, the anomaly detection engine 212 may generate 3000×40×6=720,000 graphs or tables each day. Achieving even as low as 1 false positive per day would require only triggering on graphs with a surprise score so high that it happens 0.00014% of the time. It is to be appreciated that such a stringent cutoff is likely to overlook many real disruptions. - One way around the difficulty of avoiding false positives may be through aggregation of surprising values across multiple queries. Since there will always be a few queries with a high surprise, the
anomaly detection engine 212 can construct a feature based on the number of surprise scores that deviate from historical norms. A sudden change in the number of high surprise scores, for example, might be a good indicator of a site disruption. This is done separately for each property being monitored by the anomaly detection engine 212. To make this quantitative, instead of counting the number of queries with a high surprise, some embodiments of the anomaly detection engine 212 can examine a quantile (e.g., the 0.9 quantile) of the surprise values for a property. Using the quantiles to detect anomalies is now described. - The surprise score of the most recent property value, as computed above with reference to
FIG. 5, depends on at least the following: a property type (e.g., mean sales price listed), a property value (e.g., a value for the mean sales price listed), the probe (e.g., a query that generates a result of listed items for sale), and a collection window of recent property values for the probe. When the surprise scores for each probe are calculated, the quantile of the surprise scores may be calculated. The quantile is computed by picking a property type and collection window, gathering up the surprise scores for the property type, across all the probes, within the collection window, and then taking the 90% quantile of those surprise scores. So there is a quantile for each property type-collection window pair. Every four hours the following process is performed for each property being monitored: rerun the set of queries to obtain current property values for each query with respect to the property, recompute the surprise scores for the values obtained by rerunning the set of queries, and then determine the 90% quantile for these surprise scores. This gives a new value for the 90% quantile for the property, which may be compared against the historical quantile to determine whether an anomaly exists. -
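The per-property quantile step described above can be sketched with a nearest-rank quantile; the choice of quantile method and the names are assumptions:

```python
import math

def surprise_quantile(scores, q=0.9):
    """Nearest-rank q-quantile of the surprise scores gathered across
    all probes for one property type within a collection window."""
    ordered = sorted(scores)
    rank = max(0, math.ceil(q * len(ordered)) - 1)
    return ordered[rank]
```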
FIG. 8 is a histogram showing quantiles 800 for a property type (e.g., a median sale price) over a two-week period, according to an example embodiment. As shown in FIG. 8, the quantiles are clustered near 3.6, with a range from 3.0 to 4.6. In another view, FIG. 9 is a plot of the quantiles in a time series, according to an example embodiment. This shows the feature (e.g., the quantile of the surprise score) is fairly smooth, and might be, in some cases, a candidate for anomaly detection. - An example of an anomaly that corresponds to a genuine disruption is now described.
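A simple way to test a new quantile against such a historical series is a mean/standard-deviation rule; the function name and threshold below are assumptions:

```python
from statistics import mean, stdev

def quantile_is_anomalous(latest, history, threshold=5.0):
    """True when the latest quantile lies more than `threshold` standard
    deviations from the mean of the historical quantiles."""
    mu, sigma = mean(history), stdev(history)
    return abs(latest - mu) / sigma > threshold
```

For a history with mean 3.7 and standard deviation 0.3, a new quantile of 29.6 lies about 86 standard deviations out and would be flagged.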
FIG. 10 is a histogram that includes the quantiles 800 shown in FIG. 8 but with a new quantile 1002, according to an example embodiment. The new quantile 1002 may be calculated based on the value of the property when the anomaly detection engine 212 executes the set of queries again. In another view, FIG. 11 is a plot of the quantiles in a time series with the new quantile 1002, according to an example embodiment. For example, while FIG. 9 shows quantiles up to 07:00 on Nov. 28, FIG. 10 adds the quantile 1002. The following observations are made. FIG. 9 is vaguely normal with a mean of 3.7 and a standard deviation of 0.3. The value of the new quantile 1002 shown in FIGS. 10 and 11 is 29.6, which is |29.6−3.7|/0.3≈86 standard deviations from the mean quantile. So the historical value of the quantile appears to be a useful feature for detecting anomalies. - Summarizing the above, it is expected that an individual property for a particular query will have sudden jumps in values. Although these sudden jumps may represent outliers, an outlier, in and of itself, should not necessarily raise an alert. Instead, example embodiments may use the number of queries that have such jumps as a good signal for raising an alert of an anomaly. So a selection of features may go like this. For each (probe, property value) pair, we compute a measure of surprise and measure whether the latest property value of the property is an outlier. The
anomaly detection engine 212 then has a surprise number for each query. It is expected to have a few large surprise numbers, but not too many. To quantify this, the anomaly detection engine 212 may in some embodiments select the 90th quantile of surprise values (e.g., sort the surprise values from low to high and return the value at the 90th percentile, or use a non-sorting function to calculate a quantile or ranking of surprise scores). This is our feature. Now we can use any outlier detection method to raise an alert. For example, in an example embodiment, the anomaly detection engine 212 may take the last 30 days' worth of signals and compute their mean and standard deviation. If the latest quantile of the signal is more than a threshold deviation (e.g., 5σ) from the mean, the anomaly detection engine 212 raises an alert. - A method for detecting an anomaly in a computer system is now described in greater detail. For example,
FIG. 12 is a flowchart diagram illustrating a method 1200 for detecting an anomaly in a computer system, according to an example embodiment. - As
FIG. 12 illustrates, the method 1200 may begin at operation 1202 when the probe module 210 executes probes on an evolving data set. Each probe returns a result derived from the evolving data set. By way of example and not limitation, the probe module 210 may issue a set of queries to the searching engine 208 of FIG. 2. The probe module 210 may then receive search results for each of the queries issued to the search engine 208. It is to be appreciated that each of the probes (e.g., search queries) may be different and, accordingly, each result may also be different. - In some embodiments, as part of
operation 1202, the probe module 210 is further configured to derive property values for each result returned from the probes. As discussed above, a property value may include data that quantifies a property or aspect of a result. To illustrate, again by way of example and not limitation, where the probe module 210 is configured to transmit a set of queries to the searching engine 208, the property value may represent, for example, a value for the property of the number of items returned in the result, the average list price in the result, a measurement of the number of items that are auctions relative to non-auction items in the result, a number of classified listings in the result, or any other suitable property. - Thus, in some embodiments, the execution of
operation 1202 may result in a data table that includes a number of property values that each correspond to one of the probes executed by the probe module 210. Further, as the probe module 210 may monitor more than one property type, the table may include multiple columns, where each column corresponds to a different property type. This is shown in FIG. 13. FIG. 13 is a diagram illustrating a property value table 1300 that may be generated based on executing the probes, according to an example embodiment. The property value table 1300 may store the property values (organized by property types 1304) collected for a single iteration of the probes 1302. As FIG. 13 shows, for a single probe, the probe module 210 may derive property values for multiple property types. Further, for a single property type, the probe module 210 may derive multiple property values, each corresponding to a different probe. Thus, a single property value may be specific to a probe-property type pair. For example, property value 1310 may be specific to the Probe2-Property Type2 pair. - With reference back to
FIG. 12, at operation 1204, the anomaly detecting engine 212 may generate surprise scores for each of the property values. Each surprise score may be generated based on a comparison of a property value and historical property values that correspond to a probe-property type pair. For example, for a probe, the surprise score may be based on a function of the property value and a deviation from an expected value. An expected value may be determined based on a robust estimation of the tendency of the historical property values for that probe. A median, mode, mean, and trimmed mean are all examples of a robust estimation of the tendency of the value for the feature that can be used to generate a surprise score. In some cases, the surprise score for a value may be based on a standard deviation from the tendency for that value of the property. Thus, operation 1204 may generate a surprise score for the latest results, where each surprise score corresponds to one of the queries in the set of queries. For example, FIG. 14 is a diagram showing property values 1400 for a probe-property type pair, according to an example embodiment. As shown in FIG. 14, the property values 1400 may include the property value 1310 received as part of a current iteration of executing the probes. As discussed with respect to FIG. 13, the property value 1310 may be specific to the Probe2-Property2 pair. The property values 1400 may also include historical property values 1402 that were obtained in past iterations of executing the probes. The historical property values 1402 are specific to the same probe-property type pair as the property value 1310 (e.g., the Probe2-Property2 pair). The surprise score is based on the deviation of the sample property value 1310 from the historical property values 1402. - As discussed above with respect to the
operation 1204 shown in FIG. 12, a surprise score is generated for each probe-property type pair. This is shown in FIG. 15. FIG. 15 is a diagram illustrating a surprise score table 1500, according to an example embodiment. The surprise score table 1500 includes a surprise score for each of the probe-property type pairs. The surprise score table 1500 is generated based on calculating a surprise score in the manner discussed with respect to FIG. 14. That is, a surprise score is generated for each probe-property type pair based on a comparison between the property value corresponding to the probe-property type pair and the historical property values for the probe-property type pair. - With reference back to
FIG. 12, at operation 1206, the anomaly detecting engine 212 may access a plurality of historical surprise scores generated by the anomaly detection engine 212. In some cases, the historical surprise scores accessed at operation 1206 may be based on past iterations of executing the probes. Further, in some cases, the historical surprise scores may be specific to a particular property type. - At
operation 1208, responsive to a comparison between the plurality of surprise scores and the plurality of historical surprise scores, the anomaly detection engine 212 may alert a monitoring system of an anomaly regarding the evolving data set. With momentary reference to FIG. 15, the idea of operation 1208 is that an alert is generated if the surprise scores for a given property type (e.g., Property2), across all probes (e.g., Probes 1-6), are out of the norm relative to historical surprise scores for those probe-property type pairs. - The comparison used by
operation 1208 may be based on a feature derived from the surprise scores. To illustrate, FIG. 16 is a chart showing a measurement of a feature of the surprise scores generated by operation 1204, according to an example embodiment. The measurement of the feature shown in FIG. 16 is for the surprise scores generated for a single iteration of the execution of the probes. For example, the chart 1600 may measure the feature for the surprise scores generated across Probes 1-6 for Property Type2. For example, the feature may be a measurement of how far a surprise score deviates from an expected value. An expected value for the surprise score may be calculated based on a robust estimation of the tendency of the historical surprise scores for that property type. In some cases, the feature may be a quantile of the surprise scores, or a quantile of data derived from the surprise scores (e.g., the deviation from the expected value). Still further, in some cases, the feature may be a measurement or count of the number of surprise scores that deviate beyond a threshold amount from the expected value, or that exceed a fixed surprise score. - As part of
operation 1208, the feature of the surprise scores may then be compared against historical surprise scores from past iterations of executing the probes. This is shown in FIG. 17. FIG. 17 is a chart illustrating surprise score features 1700 over time, according to an example embodiment. For example, the surprise score feature 1702 may be a feature of the surprise scores for a current iteration of the execution of the probes, while the historical surprise score features 1704 are features of surprise scores from past iterations of executing the probes. Here, operation 1208 may alert if the surprise score feature 1702 deviates from the historical surprise score features 1704 beyond a threshold amount. - It is to be appreciated that the
operations of FIG. 12 may be repeated across all the property types monitored by the probe module 210. For example, with reference to FIG. 5, the operations may be repeated for each of the property types shown there. - It is to be further appreciated that although much of this disclosure discusses anomaly detection in the context of a search engine, other example embodiments may use the anomaly detection methods and systems described herein to detect anomalies in other types of computer systems. For example, the computer system may be an inventory data store. In such a case, the probe module 210 may be configured to detect as property types, among other things, the number of items stored per category, the number of auction items per category, and the like.
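As a concrete illustration of the feature computations described above, the sketch below derives a robust expected value from historical surprise scores and computes the three example features (deviation from the expected value, a quantile, and a count of large deviations). The function names, the choice of the median as the robust estimate, and the 0.95 quantile are illustrative assumptions rather than details taken from the disclosure.

```python
import statistics

def expected_value(historical_scores):
    # Robust estimate of central tendency; the median is one common
    # choice (an assumption here -- the text only says "robust
    # estimate of the central tendency").
    return statistics.median(historical_scores)

def deviation_feature(scores, historical_scores):
    # Feature: how far each surprise score is from the expected value.
    exp = expected_value(historical_scores)
    return [abs(s - exp) for s in scores]

def quantile_feature(scores, q=0.95):
    # Feature: a quantile of the surprise scores (q is illustrative).
    ordered = sorted(scores)
    idx = min(int(q * len(ordered)), len(ordered) - 1)
    return ordered[idx]

def count_feature(scores, historical_scores, threshold):
    # Feature: number of scores deviating beyond a threshold amount
    # from the expected value.
    exp = expected_value(historical_scores)
    return sum(1 for s in scores if abs(s - exp) > threshold)
```

For example, with scores from six probes for a single property type and the historical scores for the same probe-property pairs, any one of these features collapses the per-probe scores into a single number for the current iteration.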
- As another example, the computer system may be a computer infrastructure (e.g., a collection of computer servers). In such a case, the probe module 210 may be configured to detect as property types, among other things, a processor load, bandwidth consumption, thread count, running processes count, memory usage, throughput count, rate of disk seeks, rate of packets transmitted or received, or rate of response.
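The comparison of a current surprise score feature against features from past iterations, as illustrated in FIG. 17, might be sketched as follows. The use of the median and the median absolute deviation as a scale, and the multiplier k, are assumptions standing in for the unspecified "threshold amount".

```python
import statistics

def should_alert(current_feature, historical_features, k=3.0):
    # Alert when the current iteration's feature deviates from the
    # historical surprise score features beyond a threshold amount.
    # Median/MAD scaling and k=3 are illustrative assumptions.
    med = statistics.median(historical_features)
    mad = statistics.median([abs(f - med) for f in historical_features])
    scale = mad if mad > 0 else 1.0
    return abs(current_feature - med) > k * scale
```

A stable system would produce a current feature close to the historical median, so no alert fires; a disruption pushes the feature well outside the historical spread and triggers the alert.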
- In other embodiments, the property values tracked by the
anomaly detection engine 212 may include dimensions in addition to what is described above. For example, in the embodiment discussed above, one may conceptualize that a table may be used to store the property values, where the columns are the metrics tracked by the different probe modules, and the rows are the different values for those property types at different times. An extension would be a 3D-table or cube. For each (property, value) cell, there may be a series of aspects instead of a single number. An aspect may be visualized as a vertical stack extending out of the page. In example (1), the aspect might be different countries. Thus, a cell in a table may be related to a specific query (perhaps 'iPhone 5S') and property (perhaps number of results). But using the aspects, the results vary by country, so the single cell is replaced by a stack of entries, one for each country. - As mentioned in the previous section, the
anomaly detection engine 212 may be configured to detect a problem with the search software (a disruption) before users do. In some cases, the property values may be received via the same interfaces and the same computer systems used by the end users. In such cases, the metric data received from the probe module 210 is a proxy for, if not identical to, the user experience of users of that computer system. Accordingly, it is to be appreciated that when this disclosure states the anomaly detection engine 212 may detect a problem before users do, it may simply mean that the anomaly detection engine 212 can detect and report a problem without intervention from a user. Thus, compared to traditional systems, example embodiments may use the anomaly detection engine 212 to provide comparatively quick detection of site problems. -
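The cube extension described above can be sketched as a mapping from a (query, property) cell to a stack of per-aspect entries. The keys and values below are illustrative (the 'iPhone 5S' query and number-of-results property come from the example in the text; the country codes and counts are invented for illustration).

```python
# 2-D table: (query, property type) -> a single value.
table = {
    ("iPhone 5S", "number_of_results"): 1200,
}

# 3-D cube: each cell becomes a stack of entries, one per aspect
# (here the aspect is the country).
cube = {
    ("iPhone 5S", "number_of_results"): {
        "US": 1200,
        "DE": 860,
        "JP": 640,
    },
}

def cell(query, prop, aspect=None):
    # Without an aspect, look up the flat table; with an aspect,
    # index into that cell's stack of per-aspect entries.
    if aspect is None:
        return table[(query, prop)]
    return cube[(query, prop)][aspect]
```

Surprise scores could then be tracked per (query, property, aspect) triple rather than per (query, property) pair, with the rest of the pipeline unchanged.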
FIG. 18 shows a diagrammatic representation of a machine in the example form of a computer system 1800 within which a set of instructions, for causing the machine to perform any one or more of the methodologies discussed herein, may be executed. The computer system 1800 comprises, for example, any of the device machine 110, device machine 112, application servers 118, API server 114, web server 116, database servers 124, or third party server 130. In alternative embodiments, the machine operates as a standalone device or may be connected (e.g., networked) to other machines. In a networked deployment, the machine may operate in the capacity of a server or a device machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment. The machine may be a server computer, a client computer, a personal computer (PC), a tablet, a set-top box (STB), a Personal Digital Assistant (PDA), a smart phone, a cellular telephone, a web appliance, a network router, switch or bridge, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine. Further, while only a single machine is illustrated, the term "machine" shall also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein. - The
example computer system 1800 includes a processor 1802 (e.g., a central processing unit (CPU), a graphics processing unit (GPU), or both), a main memory 1804 and a static memory 1806, which communicate with each other via a bus 1808. The computer system 1800 may further include a video display unit 1810 (e.g., liquid crystal display (LCD), organic light emitting diode (OLED), touch screen, or a cathode ray tube (CRT)). The computer system 1800 also includes an alphanumeric input device 1812 (e.g., a physical or virtual keyboard), a cursor control device 1814 (e.g., a mouse, a touch screen, a touchpad, a trackball, a trackpad), a disk drive unit 1816, a signal generation device 1818 (e.g., a speaker) and a network interface device 1820. - The
disk drive unit 1816 includes a machine-readable medium 1822 on which is stored one or more sets of instructions 1824 (e.g., software) embodying any one or more of the methodologies or functions described herein. The instructions 1824 may also reside, completely or at least partially, within the main memory 1804 and/or within the processor 1802 during execution thereof by the computer system 1800, the main memory 1804 and the processor 1802 also constituting machine-readable media. - The
instructions 1824 may further be transmitted or received over a network 1826 via the network interface device 1820. - While the machine-
readable medium 1822 is shown in an example embodiment to be a single medium, the term “machine-readable medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions. The term “machine-readable medium” shall also be taken to include any medium that is capable of storing, encoding or carrying a set of instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of the present invention. The term “machine-readable medium” shall accordingly be taken to include, but not be limited to, solid-state memories, optical and magnetic media, and carrier wave signals. - It will be appreciated that, for clarity purposes, the above description describes some embodiments with reference to different functional units or processors. However, it will be apparent that any suitable distribution of functionality between different functional units, processors or domains may be used without detracting from the invention. For example, functionality illustrated to be performed by separate processors or controllers may be performed by the same processor or controller. Hence, references to specific functional units are only to be seen as references to suitable means for providing the described functionality, rather than indicative of a strict logical or physical structure or organization.
- Certain embodiments described herein may be implemented as logic or a number of modules, engines, components, or mechanisms. A module, engine, logic, component, or mechanism (collectively referred to as a “module”) may be a tangible unit capable of performing certain operations and configured or arranged in a certain manner. In certain example embodiments, one or more computer systems (e.g., a standalone, client, or server computer system) or one or more components of a computer system (e.g., a processor or a group of processors) may be configured by software (e.g., an application or application portion) or firmware (note that software and firmware can generally be used interchangeably herein as is known by a skilled artisan) as a module that operates to perform certain operations described herein.
- In various embodiments, a module may be implemented mechanically or electronically. For example, a module may comprise dedicated circuitry or logic that is permanently configured (e.g., within a special-purpose processor, application specific integrated circuit (ASIC), or array) to perform certain operations. A module may also comprise programmable logic or circuitry (e.g., as encompassed within a general-purpose processor or other programmable processor) that is temporarily configured by software or firmware to perform certain operations. It will be appreciated that a decision to implement a module mechanically, in dedicated and permanently configured circuitry, or in temporarily configured circuitry (e.g., configured by software) may be driven by, for example, cost, time, energy-usage, and package size considerations.
- Accordingly, the term “module” should be understood to encompass a tangible entity, be that an entity that is physically constructed, permanently configured (e.g., hardwired), non-transitory, or temporarily configured (e.g., programmed) to operate in a certain manner or to perform certain operations described herein. Considering embodiments in which modules or components are temporarily configured (e.g., programmed), each of the modules or components need not be configured or instantiated at any one instance in time. For example, where the modules or components comprise a general-purpose processor configured using software, the general-purpose processor may be configured as respective different modules at different times. Software may accordingly configure the processor to constitute a particular module at one instance of time and to constitute a different module at a different instance of time.
- Modules can provide information to, and receive information from, other modules. Accordingly, the described modules may be regarded as being communicatively coupled. Where multiples of such modules exist contemporaneously, communications may be achieved through signal transmission (e.g., over appropriate circuits and buses) that connect the modules. In embodiments in which multiple modules are configured or instantiated at different times, communications between such modules may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple modules have access. For example, one module may perform an operation and store the output of that operation in a memory device to which it is communicatively coupled. A further module may then, at a later time, access the memory device to retrieve and process the stored output. Modules may also initiate communications with input or output devices and can operate on a resource (e.g., a collection of information).
- Although the present invention has been described in connection with some embodiments, it is not intended to be limited to the specific form set forth herein. One skilled in the art would recognize that various features of the described embodiments may be combined in accordance with the invention. Moreover, it will be appreciated that various modifications and alterations may be made by those skilled in the art without departing from the scope of the invention.
- The Abstract is provided to allow the reader to quickly ascertain the nature of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims. In addition, in the foregoing Detailed Description, it can be seen that various features are grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed embodiments require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed embodiment. Thus the following claims are hereby incorporated into the Detailed Description, with each claim standing on its own as a separate embodiment.
Claims (20)
1. A computer-implemented system comprising:
a probe module implemented by one or more processors and configured to execute a plurality of probes on an evolving data set, each probe from the plurality of probes returning a result, the probe module further configured to derive a plurality of property values, each property value from the plurality of property values being derived from a respective result returned by a corresponding probe of the plurality of probes; and
an anomaly detection engine implemented by the one or more processors and configured to:
generate a plurality of surprise scores corresponding to the plurality of property values, each surprise score being generated based on a comparison of a corresponding property value from the plurality of property values and historical property values, the corresponding property value and the historical property values having been derived from results returned from the same probe,
access a plurality of historical surprise scores generated by the anomaly detection engine, and
responsive to a comparison between the plurality of surprise scores and the plurality of historical surprise scores, alert a monitoring system of an anomaly regarding the evolving data set.
2. The computer-implemented system of claim 1 , wherein the plurality of surprise scores are generated at a first iteration, and the historical surprise scores were generated at one or more past iterations.
3. The computer-implemented system of claim 1 , wherein:
the probe module is further configured to derive a plurality of additional property values, each additional property value from the plurality of additional property values being derived from the respective result returned by the corresponding probe of the plurality of probes, the plurality of additional property values relating to a different property than the plurality of property values; and
the anomaly detection engine is further configured to:
generate a plurality of additional surprise scores corresponding to the plurality of additional property values, each additional surprise score being generated based on a comparison of a corresponding additional property value from the plurality of additional property values and additional historical property values, the corresponding additional property value and the additional historical property values having been derived from results returned from the same probe,
access a plurality of additional historical surprise scores generated by the anomaly detection engine, and
responsive to a comparison between the plurality of additional surprise scores and the plurality of additional historical surprise scores, alert the monitoring system of an additional anomaly regarding the evolving data set.
4. The computer-implemented system of claim 1 , wherein the comparison between the plurality of surprise scores and the plurality of historical surprise scores includes determining whether the plurality of surprise scores deviate from a historical distribution of the plurality of historical surprise scores.
5. The computer-implemented system of claim 1 , wherein the comparison between the plurality of surprise scores and the plurality of historical surprise scores includes determining whether a number of surprise scores from the plurality of surprise scores deviates from the plurality of historical surprise scores.
6. The computer-implemented system of claim 1 , wherein the comparison between the plurality of surprise scores and the plurality of historical surprise scores includes determining whether a quantile of the plurality of surprise scores deviates from a historical distribution of quantiles derived from the plurality of historical surprise scores.
7. The computer-implemented system of claim 1 , wherein the evolving data set relates to data obtained by monitoring a computer system.
8. The computer-implemented system of claim 1 , wherein the evolving data set relates to data obtained by monitoring an inventory of items listed in a database.
9. The computer-implemented system of claim 1 , wherein the evolving data set relates to data obtained by monitoring performance metrics of a plurality of servers.
10. The computer-implemented system of claim 9 , wherein the plurality of property values relate to at least one of: a processor load, a bandwidth consumption, a thread count, a running processes count, a memory usage, a throughput count, a rate of disk seeks, a rate of packets transmitted or received, or a rate of response.
11. A computer-implemented method comprising:
executing a plurality of probes on an evolving data set, each probe from the plurality of probes returning a result;
deriving a plurality of property values, each property value from the plurality of property values being derived from a respective result returned by a corresponding probe of the plurality of probes;
generating a plurality of surprise scores corresponding to the plurality of property values, each surprise score being generated based on a comparison of a corresponding property value from the plurality of property values and historical property values, the corresponding property value and the historical property values having been derived from results returned from the same probe;
accessing a plurality of historical surprise scores generated by an anomaly detection engine; and
responsive to a comparison between the plurality of surprise scores and the plurality of historical surprise scores, alerting a monitoring system of an anomaly regarding the evolving data set.
12. The computer-implemented method of claim 11 , wherein the plurality of surprise scores are generated at a first iteration, and the historical surprise scores were generated at one or more past iterations.
13. The computer-implemented method of claim 11 , wherein the comparison between the plurality of surprise scores and the plurality of historical surprise scores includes determining whether the plurality of surprise scores deviate from a historical distribution of the plurality of historical surprise scores.
14. The computer-implemented method of claim 11 , wherein the comparison between the plurality of surprise scores and the plurality of historical surprise scores includes determining whether a number of surprise scores from the plurality of surprise scores deviates from the plurality of historical surprise scores.
15. The computer-implemented method of claim 11 , wherein the comparison between the plurality of surprise scores and the plurality of historical surprise scores includes determining whether a quantile of the plurality of surprise scores deviates from a historical distribution of quantiles derived from the plurality of historical surprise scores.
16. The computer-implemented method of claim 11 , wherein the evolving data set relates to data obtained by monitoring a computer system.
17. The computer-implemented method of claim 11 , wherein the evolving data set relates to data obtained by monitoring an inventory of items listed in a database.
18. The computer-implemented method of claim 11 , wherein the evolving data set relates to data obtained by monitoring performance metrics of a plurality of servers.
19. The computer-implemented method of claim 18 , wherein the plurality of property values relate to at least one of: a processor load, a bandwidth consumption, a thread count, a running processes count, a memory usage, a throughput count, a rate of disk seeks, a rate of packets transmitted or received, or a rate of response.
20. A non-transitory computer-readable medium storing executable instructions thereon, which, when executed by a processor, cause the processor to perform operations comprising:
executing a plurality of probes on an evolving data set, each probe from the plurality of probes returning a result;
deriving a plurality of property values, each property value from the plurality of property values being derived from a respective result returned by a corresponding probe of the plurality of probes;
generating a plurality of surprise scores corresponding to the plurality of property values, each surprise score being generated based on a comparison of a corresponding property value from the plurality of property values and historical property values, the corresponding property value and the historical property values having been derived from results returned from the same probe;
accessing a plurality of historical surprise scores generated by an anomaly detection engine; and
responsive to a comparison between the plurality of surprise scores and the plurality of historical surprise scores, alerting a monitoring system of an anomaly regarding the evolving data set.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/143,185 US20140229414A1 (en) | 2013-02-08 | 2013-12-30 | Systems and methods for detecting anomalies |
US15/141,225 US20160239368A1 (en) | 2013-02-08 | 2016-04-28 | Systems and methods for detecting anomalies |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201361762420P | 2013-02-08 | 2013-02-08 | |
US14/143,185 US20140229414A1 (en) | 2013-02-08 | 2013-12-30 | Systems and methods for detecting anomalies |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/141,225 Continuation US20160239368A1 (en) | 2013-02-08 | 2016-04-28 | Systems and methods for detecting anomalies |
Publications (1)
Publication Number | Publication Date |
---|---|
US20140229414A1 true US20140229414A1 (en) | 2014-08-14 |
Family
ID=51296747
Family Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/143,185 Abandoned US20140229414A1 (en) | 2013-02-08 | 2013-12-30 | Systems and methods for detecting anomalies |
US14/176,474 Active 2035-05-03 US10558512B2 (en) | 2013-02-08 | 2014-02-10 | Ballast water tank recirculation treatment system |
US15/141,225 Abandoned US20160239368A1 (en) | 2013-02-08 | 2016-04-28 | Systems and methods for detecting anomalies |
Family Applications After (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/176,474 Active 2035-05-03 US10558512B2 (en) | 2013-02-08 | 2014-02-10 | Ballast water tank recirculation treatment system |
US15/141,225 Abandoned US20160239368A1 (en) | 2013-02-08 | 2016-04-28 | Systems and methods for detecting anomalies |
Country Status (2)
Country | Link |
---|---|
US (3) | US20140229414A1 (en) |
WO (1) | WO2014124357A1 (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160274807A1 (en) * | 2015-03-20 | 2016-09-22 | Ricoh Company, Ltd. | Information processing apparatus, information processing method, and information processing system |
WO2017060778A3 (en) * | 2015-09-05 | 2017-07-20 | Nudata Security Inc. | Systems and methods for detecting and scoring anomalies |
US9842204B2 (en) | 2008-04-01 | 2017-12-12 | Nudata Security Inc. | Systems and methods for assessing security risk |
US9946864B2 (en) | 2008-04-01 | 2018-04-17 | Nudata Security Inc. | Systems and methods for implementing and tracking identification tests |
US9990487B1 (en) | 2017-05-05 | 2018-06-05 | Mastercard Technologies Canada ULC | Systems and methods for distinguishing among human users and software robots |
US10007776B1 (en) | 2017-05-05 | 2018-06-26 | Mastercard Technologies Canada ULC | Systems and methods for distinguishing among human users and software robots |
US10127373B1 (en) | 2017-05-05 | 2018-11-13 | Mastercard Technologies Canada ULC | Systems and methods for distinguishing among human users and software robots |
US10169906B2 (en) | 2013-03-29 | 2019-01-01 | Advanced Micro Devices, Inc. | Hybrid render with deferred primitive batch binning |
US10528533B2 (en) * | 2017-02-09 | 2020-01-07 | Adobe Inc. | Anomaly detection at coarser granularity of data |
WO2020131391A1 (en) * | 2018-12-20 | 2020-06-25 | Microsoft Technology Licensing, Llc | Automatic anomaly detection in computer processing pipelines |
CN113242839A (en) * | 2018-12-14 | 2021-08-10 | Abb瑞士股份有限公司 | Water treatment system and water treatment method |
US20210395106A1 (en) * | 2018-10-17 | 2021-12-23 | Organo Corporation | Water quality management method, ion adsorption device, information processing device and information processing system |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2011058578A1 (en) * | 2009-11-13 | 2011-05-19 | Mehta Virendra J | Method and system for purifying water |
EP2977355A4 (en) * | 2013-03-22 | 2016-03-16 | Tech Cross Co Ltd | Ballast water treatment system |
US9828266B2 (en) | 2014-08-27 | 2017-11-28 | Algenol Biotech LLC | Systems and methods for sterilizing liquid media |
CN104673665B (en) * | 2015-03-05 | 2016-06-22 | 韩先锋 | The clump count warning devices of ballast water for ship |
US20170057833A1 (en) * | 2015-09-01 | 2017-03-02 | Xylem IP Holdings LLC. | Recirculating type active substance treatment system |
WO2017071944A1 (en) * | 2015-10-27 | 2017-05-04 | Koninklijke Philips N.V. | Anti-fouling system, controller and method of controlling the anti-fouling system |
GB201614497D0 (en) * | 2016-08-25 | 2016-10-12 | Rs Hydro Ltd | Water quality sensing |
CN107473328B (en) * | 2017-09-27 | 2020-11-20 | 广船国际有限公司 | Seawater treatment system and seawater treatment control method |
CN111399378A (en) * | 2020-03-16 | 2020-07-10 | 上海交通大学 | Ballast water sterilization device optimization method and system based on fuzzy and closed-loop control |
US11319038B1 (en) * | 2020-12-31 | 2022-05-03 | Clean Wake, Llc | Systems and methods for decontaminating watercraft |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050044406A1 (en) * | 2002-03-29 | 2005-02-24 | Michael Stute | Adaptive behavioral intrusion detection systems and methods |
US7337206B1 (en) * | 2002-07-15 | 2008-02-26 | Network Physics | Method for detecting congestion in internet traffic |
US8306931B1 (en) * | 2009-08-06 | 2012-11-06 | Data Fusion & Neural Networks, LLC | Detecting, classifying, and tracking abnormal data in a data stream |
US8639797B1 (en) * | 2007-08-03 | 2014-01-28 | Xangati, Inc. | Network monitoring of behavior probability density |
US20140074731A1 (en) * | 2012-09-13 | 2014-03-13 | Fannie Mae | System and method for automated data discrepancy analysis |
US20150051847A1 (en) * | 2011-11-22 | 2015-02-19 | Electric Power Research Institute, Inc. | System and method for anomaly detection |
US20150186989A1 (en) * | 2013-12-27 | 2015-07-02 | Ebay Inc. | Pricing and listing configuration recommendation engine |
US20150227992A1 (en) * | 2006-11-16 | 2015-08-13 | Genea Energy Partners, Inc. | Building Optimization Platform And Web-Based Invoicing System |
US20150237215A1 (en) * | 2009-07-17 | 2015-08-20 | Jaan Leemet | Determining Usage Predictions And Detecting Anomalous User Activity Through Traffic Patterns |
Family Cites Families (48)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3155609A (en) | 1960-05-09 | 1964-11-03 | Pampel Leonard Fredrick | Stabilization of a closed or open water system through the selective utilization of light |
US3917945A (en) | 1973-06-21 | 1975-11-04 | Hayashi Katsuhiki | Method and apparatus for detecting the degree of contamination of waste water |
US4298467A (en) | 1977-06-06 | 1981-11-03 | Panlmatic Company | Water treatment system |
US4752401A (en) | 1986-02-20 | 1988-06-21 | Safe Water Systems International, Inc. | Water treatment system for swimming pools and potable water |
US5948272A (en) | 1986-04-29 | 1999-09-07 | Lemelson; Jerome H. | System and method for detecting and neutralizing microorganisms in a fluid using a laser |
US4992380A (en) | 1988-10-14 | 1991-02-12 | Nalco Chemical Company | Continuous on-stream monitoring of cooling tower water |
DK96989D0 (en) | 1989-02-28 | 1989-02-28 | Faxe Kalkbrud Aktieselskabet | PROCEDURE FOR MONITORING BIOLOGICAL PROCESSES |
US5322569A (en) | 1991-10-08 | 1994-06-21 | General Dynamics Corporation | Ultraviolet marine anti-biofouling systems |
US5411889A (en) | 1994-02-14 | 1995-05-02 | Nalco Chemical Company | Regulating water treatment agent dosage based on operational system stresses |
CN1069162C (en) | 1994-05-02 | 2001-08-08 | 诺尔科化学公司 | Compositions of fluorescent biocides for use as improved antimicrobials |
US5466425A (en) | 1994-07-08 | 1995-11-14 | Amphion International, Limited | Biological decontamination system |
DK0828692T3 (en) | 1995-05-11 | 1999-08-23 | Biobalance As | New method for controlling biodegradation |
CN1183382C (en) | 1997-06-11 | 2005-01-05 | 纳尔科化学公司 | Solid-state fluorometer and methods of use therefor |
JP3920504B2 (en) | 1999-08-10 | 2007-05-30 | 株式会社荏原製作所 | UV sterilizer |
NO312413B1 (en) | 2000-01-04 | 2002-05-06 | Forinnova As | Method and apparatus for preventing the bloom of microorganisms in an aqueous system |
US6500345B2 (en) | 2000-07-31 | 2002-12-31 | Maritime Solutions, Inc. | Apparatus and method for treating water |
US6403030B1 (en) | 2000-07-31 | 2002-06-11 | Horton, Iii Isaac B. | Ultraviolet wastewater disinfection system and method |
US7160370B2 (en) | 2000-12-22 | 2007-01-09 | Saltech Corporation | Systems and methods for contaminant detection within a fluid, ultraviolet treatment and status notification |
JP3881183B2 (en) | 2001-03-13 | 2007-02-14 | 株式会社荏原製作所 | UV irradiation equipment |
CA2341089C (en) * | 2001-03-16 | 2002-07-02 | Paul F. Brodie | Ship ballast water sterilization method and system |
US20080206095A1 (en) | 2001-07-11 | 2008-08-28 | Duthie Robert E | Micro-organism reduction in liquid by use of a metal halide ultraviolet lamp |
- 2013
  - 2013-12-30 US US14/143,185 patent/US20140229414A1/en not_active Abandoned
- 2014
  - 2014-02-10 US US14/176,474 patent/US10558512B2/en active Active
  - 2014-02-10 WO PCT/US2014/015539 patent/WO2014124357A1/en active Application Filing
- 2016
  - 2016-04-28 US US15/141,225 patent/US20160239368A1/en not_active Abandoned
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050044406A1 (en) * | 2002-03-29 | 2005-02-24 | Michael Stute | Adaptive behavioral intrusion detection systems and methods |
US7337206B1 (en) * | 2002-07-15 | 2008-02-26 | Network Physics | Method for detecting congestion in internet traffic |
US20150227992A1 (en) * | 2006-11-16 | 2015-08-13 | Genea Energy Partners, Inc. | Building Optimization Platform And Web-Based Invoicing System |
US8639797B1 (en) * | 2007-08-03 | 2014-01-28 | Xangati, Inc. | Network monitoring of behavior probability density |
US20150237215A1 (en) * | 2009-07-17 | 2015-08-20 | Jaan Leemet | Determining Usage Predictions And Detecting Anomalous User Activity Through Traffic Patterns |
US8306931B1 (en) * | 2009-08-06 | 2012-11-06 | Data Fusion & Neural Networks, LLC | Detecting, classifying, and tracking abnormal data in a data stream |
US20150051847A1 (en) * | 2011-11-22 | 2015-02-19 | Electric Power Research Institute, Inc. | System and method for anomaly detection |
US20140074731A1 (en) * | 2012-09-13 | 2014-03-13 | Fannie Mae | System and method for automated data discrepancy analysis |
US20150186989A1 (en) * | 2013-12-27 | 2015-07-02 | Ebay Inc. | Pricing and listing configuration recommendation engine |
Cited By (31)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9946864B2 (en) | 2008-04-01 | 2018-04-17 | Nudata Security Inc. | Systems and methods for implementing and tracking identification tests |
US11036847B2 (en) | 2008-04-01 | 2021-06-15 | Mastercard Technologies Canada ULC | Systems and methods for assessing security risk |
US10997284B2 (en) | 2008-04-01 | 2021-05-04 | Mastercard Technologies Canada ULC | Systems and methods for assessing security risk |
US10839065B2 (en) | 2008-04-01 | 2020-11-17 | Mastercard Technologies Canada ULC | Systems and methods for assessing security risk |
US9842204B2 (en) | 2008-04-01 | 2017-12-12 | Nudata Security Inc. | Systems and methods for assessing security risk |
US10169906B2 (en) | 2013-03-29 | 2019-01-01 | Advanced Micro Devices, Inc. | Hybrid render with deferred primitive batch binning |
US10162539B2 (en) * | 2015-03-20 | 2018-12-25 | Ricoh Company, Ltd. | Information processing apparatus, information processing method, and information processing system |
US20160274807A1 (en) * | 2015-03-20 | 2016-09-22 | Ricoh Company, Ltd. | Information processing apparatus, information processing method, and information processing system |
US10129279B2 (en) | 2015-09-05 | 2018-11-13 | Mastercard Technologies Canada ULC | Systems and methods for detecting and preventing spoofing |
US10965695B2 (en) | 2015-09-05 | 2021-03-30 | Mastercard Technologies Canada ULC | Systems and methods for matching and scoring sameness |
CN108780479B (en) * | 2015-09-05 | 2022-02-11 | 万事达卡技术加拿大无限责任公司 | System and method for detecting and scoring anomalies |
WO2017060778A3 (en) * | 2015-09-05 | 2017-07-20 | Nudata Security Inc. | Systems and methods for detecting and scoring anomalies |
CN108780479A (en) * | 2015-09-05 | 2018-11-09 | 万事达卡技术加拿大无限责任公司 | For to the abnormal system and method for being detected and scoring |
US9749358B2 (en) | 2015-09-05 | 2017-08-29 | Nudata Security Inc. | Systems and methods for matching and scoring sameness |
US9813446B2 (en) | 2015-09-05 | 2017-11-07 | Nudata Security Inc. | Systems and methods for matching and scoring sameness |
US9800601B2 (en) | 2015-09-05 | 2017-10-24 | Nudata Security Inc. | Systems and methods for detecting and scoring anomalies |
US9749356B2 (en) | 2015-09-05 | 2017-08-29 | Nudata Security Inc. | Systems and methods for detecting and scoring anomalies |
US10212180B2 (en) | 2015-09-05 | 2019-02-19 | Mastercard Technologies Canada ULC | Systems and methods for detecting and preventing spoofing |
US9979747B2 (en) | 2015-09-05 | 2018-05-22 | Mastercard Technologies Canada ULC | Systems and methods for detecting and preventing spoofing |
US9749357B2 (en) | 2015-09-05 | 2017-08-29 | Nudata Security Inc. | Systems and methods for matching and scoring sameness |
US10749884B2 (en) | 2015-09-05 | 2020-08-18 | Mastercard Technologies Canada ULC | Systems and methods for detecting and preventing spoofing |
US10805328B2 (en) | 2015-09-05 | 2020-10-13 | Mastercard Technologies Canada ULC | Systems and methods for detecting and scoring anomalies |
US10528533B2 (en) * | 2017-02-09 | 2020-01-07 | Adobe Inc. | Anomaly detection at coarser granularity of data |
US10127373B1 (en) | 2017-05-05 | 2018-11-13 | Mastercard Technologies Canada ULC | Systems and methods for distinguishing among human users and software robots |
US10007776B1 (en) | 2017-05-05 | 2018-06-26 | Mastercard Technologies Canada ULC | Systems and methods for distinguishing among human users and software robots |
US9990487B1 (en) | 2017-05-05 | 2018-06-05 | Mastercard Technologies Canada ULC | Systems and methods for distinguishing among human users and software robots |
US20210395106A1 (en) * | 2018-10-17 | 2021-12-23 | Organo Corporation | Water quality management method, ion adsorption device, information processing device and information processing system |
CN113242839A (en) * | 2018-12-14 | 2021-08-10 | Abb瑞士股份有限公司 | Water treatment system and water treatment method |
WO2020131391A1 (en) * | 2018-12-20 | 2020-06-25 | Microsoft Technology Licensing, Llc | Automatic anomaly detection in computer processing pipelines |
US10901746B2 (en) | 2018-12-20 | 2021-01-26 | Microsoft Technology Licensing, Llc | Automatic anomaly detection in computer processing pipelines |
CN113227978A (en) * | 2018-12-20 | 2021-08-06 | 微软技术许可有限责任公司 | Automatic anomaly detection in computer processing pipelines |
Also Published As
Publication number | Publication date |
---|---|
US20140224714A1 (en) | 2014-08-14 |
US10558512B2 (en) | 2020-02-11 |
WO2014124357A1 (en) | 2014-08-14 |
US20160239368A1 (en) | 2016-08-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20160239368A1 (en) | Systems and methods for detecting anomalies | |
JP7465939B2 (en) | A Novel Non-parametric Statistical Behavioral Identification Ecosystem for Power Fraud Detection | |
US10354309B2 (en) | Methods and systems for selecting an optimized scoring function for use in ranking item listings presented in search results | |
US9323811B2 (en) | Query suggestion for e-commerce sites | |
US11074546B2 (en) | Global back-end taxonomy for commerce environments | |
US11392963B2 (en) | Determining and using brand information in electronic commerce | |
US20160012124A1 (en) | Methods for automatic query translation | |
US11734736B2 (en) | Building containers of uncategorized items | |
US10140339B2 (en) | Methods and systems for simulating a search to generate an optimized scoring function | |
US20220083556A1 (en) | Managing database offsets with time series | |
US20150254680A1 (en) | Utilizing product and service reviews | |
US9424352B2 (en) | View item related searches | |
AU2014321274B2 (en) | Recommendations for selling past purchases | |
WO2013090475A1 (en) | Recognizing missing offerings in a marketplace | |
US20160019623A1 (en) | International search result weghting | |
US20150095147A1 (en) | Monetizing qualified leads | |
US20130262507A1 (en) | Method and system to provide inline saved searches | |
US20150134417A1 (en) | Methods, systems, and apparatus for dynamic consumer segmentation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: EBAY INC., CALIFORNIA
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:GOLDBERG, DAVID;SHAN, YINAN SYNC;SIGNING DATES FROM 20131213 TO 20131219;REEL/FRAME:031857/0546

AS | Assignment |
Owner name: PAYPAL, INC., CALIFORNIA
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:EBAY INC.;REEL/FRAME:036171/0144
Effective date: 20150717

STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION