US20140229414A1 - Systems and methods for detecting anomalies - Google Patents

Systems and methods for detecting anomalies

Info

Publication number
US20140229414A1
US20140229414A1 (application US14/143,185)
Authority
US
United States
Prior art keywords
surprise
historical
scores
property values
property
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US14/143,185
Inventor
David Goldberg
Yinan Sync Shan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
PayPal Inc
Original Assignee
eBay Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by eBay Inc
Priority to US14/143,185
Assigned to EBAY INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: SHAN, YINAN SYNC; GOLDBERG, DAVID
Publication of US20140229414A1
Assigned to PAYPAL, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: EBAY INC.
Priority to US15/141,225


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/079Root cause analysis, i.e. error or fault diagnosis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0721Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment within a central processing unit [CPU]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/04Inference or reasoning models

Definitions

  • the present invention relates generally to data processing, and in some embodiments, to detecting anomalies in computer-based systems.
  • enterprises maintain and operate large numbers of computer systems (e.g., servers) that may each run a layered set of software.
  • these computer systems provide functionality for the operation of the enterprise or provide outbound services to its customers.
  • the enterprise may monitor the hardware and software layers of these servers by logging processing load, memory usage, and many other monitored signals at frequent intervals.
  • the enterprise may occasionally suffer disruptions, where some of its services are degraded or even completely unavailable to customers.
  • the enterprise will perform a post-mortem analysis of the monitored signals in an effort to debug the system. For example, the enterprise may analyze the memory usage to identify a program that may be performing improperly, or view the processing load to determine whether more hardware is needed.
  • FIG. 1 illustrates a block diagram depicting a network architecture of a system, according to some embodiments, having a client-server architecture configured for exchanging data over a network.
  • FIG. 2 illustrates a block diagram showing components provided within the system of FIG. 1 according to some embodiments.
  • FIG. 3 is a diagram showing sampled values of a number of searches performed on a computer system that are observed over a time period, such as a twenty-four hour period, according to an example embodiment.
  • FIG. 4 is a diagram showing additional sampled values from two additional days, as compared to FIG. 3 , according to an example embodiment.
  • FIG. 5 is a diagram of a plot of metric data over time for a metric of a computer system, according to an example embodiment.
  • FIG. 6 is a histogram charting surprise scores from a number of queries submitted to a computer system, according to an example embodiment.
  • FIG. 7 is another histogram showing the surprise scores according to a logarithmic function, according to an example embodiment.
  • FIG. 8 is a histogram showing quantiles for a metric over a two-week period, according to an example embodiment.
  • FIG. 9 is a plot of the quantiles in a time series, according to an example embodiment.
  • FIG. 10 is a histogram that includes the quantiles shown in FIG. 8 but with a new quantile, according to an example embodiment.
  • FIG. 11 is a plot of the quantiles in a time series with a new quantile, according to an example embodiment.
  • FIG. 12 is a flowchart diagram illustrating a method for detecting an anomaly in a computer system, according to an example embodiment.
  • FIG. 13 is a diagram illustrating a property value table that may be generated based on executing the probes, according to an example embodiment.
  • FIG. 14 is a diagram showing property values for a probe-property type pair, according to an example embodiment.
  • FIG. 15 is a diagram illustrating a surprise score table, according to an example embodiment.
  • FIG. 16 is a chart showing a measurement of a feature of the surprise scores generated by operation, according to an example embodiment.
  • FIG. 17 is a chart illustrating surprise score features over time, according to an example embodiment.
  • FIG. 18 illustrates a diagrammatic representation of a machine in the example form of a computer system within which a set of instructions may be executed to cause the machine to perform any one or more of the methodologies discussed herein.
  • Described in detail herein is an apparatus and method for detecting anomalies in a computer system. For example, some embodiments may be used to address the problem of how to monitor signals in a computer system to detect disruptions before they affect users, and to do so with few false positives. Some embodiments may address this problem by analyzing signals for strange behavior that may be referred to as an anomaly. Example embodiments can then scan multiple monitored signals, and raise an alert when the site monitoring system detects an anomaly.
  • multiple probes are executed on an evolving data set (e.g., a listing database). Each probe may return a result.
  • Property values are then derived from a respective result returned by one of the probes.
  • a property value may be a value that quantifies a property or aspect of the result, such as, for example, a number of listings returned, a proportion of classified listings, a measurement of the prices in a listing, and the like.
  • Surprise scores corresponding to the property values are generated, where each surprise score is generated based on a comparison between a corresponding property value and historical property values.
  • the corresponding property value and the historical property values are derived from results returned from the same probe.
  • Historical surprise scores generated by the anomaly detection engine are accessed. Responsive to a comparison between the plurality of surprise scores and the plurality of historical surprise scores, a monitoring system is alerted of an anomaly regarding the evolving data set.
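The patent gives no code, but the probe-to-alert flow summarized above can be sketched roughly as follows. All names here are hypothetical, and the median/median-deviation baseline merely stands in for whatever expected-value model an implementation actually uses:

```python
from statistics import median

def surprise_scores_for_run(probe_values, value_history):
    # One surprise score per probe: the unsigned deviation of the
    # current property value from the median of that probe's own
    # historical property values, scaled by the median deviation.
    scores = {}
    for probe_id, value in probe_values.items():
        hist = value_history[probe_id]
        expected = median(hist)
        spread = median(abs(h - expected) for h in hist) or 1.0
        scores[probe_id] = abs(value - expected) / spread
    return scores

def count_high_scores(scores, threshold=3.0):
    # A feature to compare against historical runs: a sudden jump in
    # the number of high-surprise probes suggests a site-wide anomaly
    # rather than noise in a single probe.
    return sum(score > threshold for score in scores.values())
```

The alert decision itself would then compare this feature (or, as discussed later, a quantile of the scores) against the values it took in historical runs.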
  • FIG. 1 illustrates a network diagram depicting a network system 100 , according to one embodiment, having a client-server architecture configured for exchanging data over a network.
  • a networked system 102 forms a network-based publication system that provides server-side functionality, via a network 104 (e.g., the Internet or Wide Area Network (WAN)), to one or more clients and devices.
  • FIG. 1 further illustrates, for example, one or both of a web client 106 (e.g., a web browser) and a programmatic client 108 executing on device machines 110 and 112 .
  • the publication system 100 comprises a marketplace system.
  • the publication system 100 comprises other types of systems such as, but not limited to, a social networking system, a matching system, a recommendation system, an electronic commerce (e-commerce) system, a search system, and the like.
  • Each of the device machines 110 , 112 comprises a computing device that includes at least a display and communication capabilities with the network 104 to access the networked system 102 .
  • the device machines 110 , 112 comprise, but are not limited to, remote devices, work stations, computers, general purpose computers, Internet appliances, hand-held devices, wireless devices, portable devices, wearable computers, cellular or mobile phones, portable digital assistants (PDAs), smart phones, tablets, ultrabooks, netbooks, laptops, desktops, multi-processor systems, microprocessor-based or programmable consumer electronics, game consoles, set-top boxes, network PCs, mini-computers, and the like.
  • Each of the device machines 110 , 112 may connect with the network 104 via a wired or wireless connection.
  • one or more portions of network 104 may be an ad hoc network, an intranet, an extranet, a virtual private network (VPN), a local area network (LAN), a wireless LAN (WLAN), a wide area network (WAN), a wireless WAN (WWAN), a metropolitan area network (MAN), a portion of the Internet, a portion of the Public Switched Telephone Network (PSTN), a cellular telephone network, a wireless network, a WiFi network, a WiMax network, another type of network, or a combination of two or more such networks.
  • Each of the device machines 110 , 112 includes one or more applications (also referred to as “apps”) such as, but not limited to, a web browser, messaging application, electronic mail (email) application, an e-commerce site application (also referred to as a marketplace application), and the like.
  • this application is configured to locally provide the user interface and at least some of the functionalities with the application configured to communicate with the networked system 102 , on an as needed basis, for data and/or processing capabilities not locally available (such as access to a database of items available for sale, to authenticate a user, to verify a method of payment, etc.).
  • the given one of the device machines 110 , 112 may use its web browser to access the e-commerce site (or a variant thereof) hosted on the networked system 102 .
  • Although two device machines 110 , 112 are shown in FIG. 1 , more or fewer than two device machines can be included in the system 100 .
  • An Application Program Interface (API) server 114 and a web server 116 are coupled to, and provide programmatic and web interfaces respectively to, one or more application servers 118 .
  • the application servers 118 host one or more marketplace applications 120 and payment applications 122 .
  • the application servers 118 are, in turn, shown to be coupled to one or more database servers 124 that facilitate access to one or more databases 126 .
  • the marketplace applications 120 may provide a number of e-commerce functions and services to users that access the networked system 102 .
  • E-commerce functions/services may include a number of publisher functions and services (e.g., search, listing, content viewing, payment, etc.).
  • the marketplace applications 120 may provide a number of services and functions to users for listing goods and/or services or offers for goods and/or services for sale, searching for goods and services, facilitating transactions, and reviewing and providing feedback about transactions and associated users.
  • the marketplace applications 120 may track and store data and metadata relating to listings, transactions, and user interactions.
  • the marketplace applications 120 may publish or otherwise provide access to content items stored in application servers 118 or databases 126 accessible to the application servers 118 and/or the database servers 124 .
  • the payment applications 122 may likewise provide a number of payment services and functions to users.
  • the payment applications 122 may allow users to accumulate value (e.g., in a commercial currency, such as the U.S. dollar, or a proprietary currency, such as “points”) in accounts, and then later to redeem the accumulated value for products or items (e.g., goods or services) that are made available via the marketplace applications 120 .
  • the payment applications 122 may form part of a payment service that is separate and distinct from the networked system 102 .
  • the payment applications 122 may be omitted from the system 100 .
  • at least a portion of the marketplace applications 120 may be provided on the device machines 110 and/or 112 .
  • Although the system 100 shown in FIG. 1 employs a client-server architecture, embodiments of the present disclosure are not limited to such an architecture, and may equally well find application in, for example, a distributed or peer-to-peer architecture system.
  • the various marketplace and payment applications 120 and 122 may also be implemented as standalone software programs, which do not necessarily have networking capabilities.
  • the web client 106 accesses the various marketplace and payment applications 120 and 122 via the web interface supported by the web server 116 .
  • the programmatic client 108 accesses the various services and functions provided by the marketplace and payment applications 120 and 122 via the programmatic interface provided by the API server 114 .
  • the programmatic client 108 may, for example, be a seller application (e.g., the TurboLister application developed by eBay Inc., of San Jose, Calif.) to enable sellers to author and manage listings on the networked system 102 in an off-line manner, and to perform batch-mode communications between the programmatic client 108 and the networked system 102 .
  • FIG. 1 also illustrates a third party application 128 , executing on a third party server machine 130 , as having programmatic access to the networked system 102 via the programmatic interface provided by the API server 114 .
  • the third party application 128 may, utilizing information retrieved from the networked system 102 , support one or more features or functions on a website hosted by the third party.
  • the third party website may, for example, provide one or more promotional, marketplace, or payment functions that are supported by the relevant applications of the networked system 102 .
  • FIG. 2 illustrates a block diagram showing components provided within the networked system 102 according to some embodiments.
  • the networked system 102 may be hosted on dedicated or shared server machines (not shown) that are communicatively coupled to enable communications between server machines.
  • the components themselves are communicatively coupled (e.g., via appropriate interfaces) to each other and to various data sources, so as to allow information to be passed between the applications or so as to allow the applications to share and access common data.
  • the components may access one or more databases 126 via the database servers 124 .
  • the networked system 102 may provide a number of publishing, listing, and/or price-setting mechanisms whereby a seller (also referred to as a first user) may list (or publish information concerning) goods or services for sale or barter, a buyer (also referred to as a second user) can express interest in or indicate a desire to purchase or barter such goods or services, and a transaction (such as a trade) may be completed pertaining to the goods or services.
  • the networked system 102 may comprise at least one publication engine 202 and one or more selling engines 204 .
  • the publication engine 202 may publish information, such as item listings or product description pages, on the networked system 102 .
  • the selling engines 204 may comprise one or more fixed-price engines that support fixed-price listing and price setting mechanisms and one or more auction engines that support auction-format listing and price setting mechanisms (e.g., English, Dutch, Chinese, Double, Reverse auctions, etc.).
  • the various auction engines may also provide a number of features in support of these auction-format listings, such as a reserve price feature whereby a seller may specify a reserve price in connection with a listing and a proxy-bidding feature whereby a bidder may invoke automated proxy bidding.
  • the selling engines 204 may further comprise one or more deal engines that support merchant-generated offers for products and services.
  • a listing engine 206 allows sellers to conveniently author listings of items or authors to author publications.
  • the listings pertain to goods or services that a user (e.g., a seller) wishes to transact via the networked system 102 .
  • the listings may be an offer, deal, coupon, or discount for the good or service.
  • Each good or service is associated with a particular category.
  • the listing engine 206 may receive listing data such as title, description, and aspect name/value pairs.
  • each listing for a good or service may be assigned an item identifier.
  • a user may create a listing that is an advertisement or other form of information publication. The listing information may then be stored to one or more storage devices coupled to the networked system 102 (e.g., databases 126 ).
  • Listings also may comprise product description pages that display a product and information (e.g., product title, specifications, and reviews) associated with the product.
  • the product description page may include an aggregation of item listings that correspond to the product described on the product description page.
  • the listing engine 206 also may allow buyers to conveniently author listings or requests for items desired to be purchased.
  • the listings may pertain to goods or services that a user (e.g., a buyer) wishes to transact via the networked system 102 .
  • Each good or service is associated with a particular category.
  • the listing engine 206 may receive as much or as little listing data, such as title, description, and aspect name/value pairs, as the buyer is aware of about the requested item.
  • the listing engine 206 may parse the buyer's submitted item information and may complete incomplete portions of the listing.
  • the listing engine 206 may parse the description, extract key terms and use those terms to make a determination of the identity of the item. Using the determined item identity, the listing engine 206 may retrieve additional item details for inclusion in the buyer item request. In some embodiments, the listing engine 206 may assign an item identifier to each listing for a good or service.
  • the listing engine 206 allows sellers to generate offers for discounts on products or services.
  • the listing engine 206 may receive listing data, such as the product or service being offered, a price and/or discount for the product or service, a time period for which the offer is valid, and so forth.
  • the listing engine 206 permits sellers to generate offers from the sellers' mobile devices. The generated offers may be uploaded to the networked system 102 for storage and tracking.
  • the listing engine 206 allows users to navigate through various categories, catalogs, or inventory data structures according to which listings may be classified within the networked system 102 .
  • the listing engine 206 allows a user to successively navigate down a category tree comprising a hierarchy of categories (e.g., the category tree structure) until a particular set of listings is reached.
  • Various other navigation applications within the listing engine 206 may be provided to supplement the searching and browsing applications.
  • the listing engine 206 may record the various user actions (e.g., clicks) performed by the user in order to navigate down the category tree.
  • Searching the networked system 102 is facilitated by a searching engine 208 .
  • the searching engine 208 enables keyword queries of listings published via the networked system 102 .
  • the searching engine 208 receives the keyword queries from a device of a user and conducts a review of the storage device storing the listing information. The review will enable compilation of a result set of listings that may be sorted and returned to the client device (e.g., device machine 110 , 112 ) of the user.
  • the searching engine 208 may record the query (e.g., keywords) and any subsequent user actions and behaviors (e.g., navigations, selections, or click-throughs).
  • the searching engine 208 also may perform a search based on a location of the user.
  • a user may access the searching engine 208 via a mobile device and generate a search query. Using the search query and the user's location, the searching engine 208 may return relevant search results for products, services, offers, auctions, and so forth to the user.
  • the searching engine 208 may identify relevant search results both in a list form and graphically on a map. Selection of a graphical indicator on the map may provide additional details regarding the selected search result.
  • the user may specify, as part of the search query, a radius or distance from the user's current location to limit search results.
  • the searching engine 208 also may perform a search based on an image.
  • the image may be taken from a camera or imaging component of a client device or may be accessed from storage.
  • the networked system 102 may further include an anomaly detection engine 212 and a probe module 210 to perform various anomaly detection functionalities or operations as set forth in greater detail below.
  • some example embodiments may be configured to detect anomalies in an evolving data set by comparing surprise scores of property values received from a probe module.
  • some simplified examples of analyzing property values are now described to highlight some potential aspects addressed by example embodiments. For example, as a warm-up problem, consider a signal from a high software layer: the number of searches (“srp”) received or performed by the networked system 102 of FIG. 1 . In some cases, srp may be tracked by the probe module 210 periodically, say, for example, every two minutes.
  • FIG. 3 is a diagram showing sampled values 300 of srp observed over a time period, such as a twenty-four hour period, according to an example embodiment.
  • the vertical axis range may represent sampled values of the number of searches performed over a two minute period, whereas the horizontal axis may represent time, which ranges from midnight to midnight, PDT.
  • the anomaly detecting engine 212 may identify that sampled values 302 and 304 , occurring around 4:00 AM and 10:30 PM, respectively, are suspicious because the sampled values 302 and 304 each exhibit a comparatively drastic deviation from their neighboring values.
  • FIG. 4 is a diagram showing sampled values 400 that includes sampled values from the prior two days, relative to the sampled values 300 of FIG. 3 , according to an example embodiment. Based on the sampled values 400 , one may reasonably conclude that the sampled value 302 should be categorized as an anomaly, but not the sampled value 304 . Such is the case because the sampled value 304 is consistent with the other two days of samples, whereas the sampled value 302 is inconsistent with the other two days.
  • FIGS. 3 and 4 suggest that comparing a property value of a result against historical property values may be used as a simple feature for detecting anomalies in a computer system.
  • the feature may be based on comparing a current value of srp with its respective value 24 hours ago, 48 hours ago, 72 hours ago, and so forth.
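Assuming the two-minute sampling cadence mentioned above (720 samples per day), this day-over-day feature might look like the following sketch; the function name and the averaging over prior days are illustrative assumptions, not from the patent:

```python
def day_over_day_surprise(samples, idx, samples_per_day=720, days=3):
    # With one sample every two minutes there are 720 samples per day,
    # so samples[idx - 720] is "the same time yesterday", and so on.
    history = [samples[idx - d * samples_per_day]
               for d in range(1, days + 1)
               if idx - d * samples_per_day >= 0]
    if not history:
        return 0.0
    baseline = sum(history) / len(history)
    # Relative deviation from the same-time-of-day baseline.
    return abs(samples[idx] - baseline) / max(baseline, 1.0)
```

A value such as 302 in FIG. 4 would score high here because it disagrees with the same time slot on the prior days, while a value like 304 would score low.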
  • the probe module 210 may periodically issue a query (or a set of queries) and log values for one or more properties relating to the search result returned by the query (or each query in the set of queries). Examples of properties that may be tracked by the probe module 210 include a number of items returned from a search query, the average list price, a measurement of the number of items that are auctions relative to non-auction items, a number of classified listings, a number of searches executed, etc.
  • the anomaly detection engine 212 may repeatedly cycle through a fixed set (e.g., tens, hundreds, thousands, and so forth) of queries to build a historical model of the property values for each of the search queries over time.
  • FIG. 5 is a diagram of a plot of property values 500 over time for a property of a computer system, according to an example embodiment.
  • the property values may include one or more sampled values for a property, which are sampled over time.
  • the horizontal axis of FIG. 5 represents time, with the right-hand side representing the most recent samples.
  • the vertical axis of FIG. 5 represents values of the property being monitored by the anomaly detecting engine 212 .
  • the property values 500 may represent the median sales price for the listings returned when the probe module 210 submits the search query “htc hd2” to the searching engine 208 .
  • the plot may include a fitted line 504 to represent expected values of the property over time. As may be appreciated from FIG. 5 , the property values 500 exhibit some noise (e.g., values that deviate from the fitted line 504 ). However, even when compared to the noise within the property values 500 , the property value 502 may represent an anomaly because the property value 502 deviates significantly from the fitted line 504 when compared to the other values of the property values 500 .
  • the anomaly detecting engine 212 may determine whether a value of a property represents an anomaly caused by a site disruption based in part on calculating surprise scores for the property value.
  • a surprise score may be a measurement used to quantify how out of the norm a value for a property is based on historical values for that property.
  • the anomaly detecting engine 212 may quantify the surprise score for a value of a property by computing the (unsigned) deviation of each property value from an expected value.
  • one specific implementation of calculating a surprise score may involve dividing the deviation of a value from the expected value (e.g., the fitted line 504 ) by the median deviation of all the values.
  • the anomaly detecting engine 212 may assign the value 502 a surprise score of 7.3 (e.g., 97.9/13.4).
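This calculation can be sketched as follows; the function name and sample data are illustrative, not from the patent:

```python
import statistics

def surprise_scores(values, fitted):
    # Surprise score for each sampled property value: its absolute
    # deviation from the expected (fitted) value, divided by the
    # median absolute deviation of all the values.
    deviations = [abs(v - f) for v, f in zip(values, fitted)]
    median_dev = statistics.median(deviations)
    return [d / median_dev for d in deviations]

# A value whose deviation is 97.9 when the median deviation is 13.4
# receives a surprise score of about 7.3, matching the example above.
```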
  • FIG. 6 is a histogram charting surprise scores from a number of queries submitted to a computer system, according to an example embodiment.
  • as shown in FIG. 6, a surprise score of 7 is not unusual because it is not far off in value from many other surprise scores.
  • FIG. 7 is another histogram showing the surprise scores according to a logarithmic function, according to an example embodiment. Since log(7.3) ≈ 2, it is clear that a value of 2 is not all that unusual. Quantitatively, the percentile of the surprise score for the query “htc hd2” is about 96%. Ringing an alarm for a surprise this large may generate a large number of false positives.
  • FIGS. 6 and 7 are diagrams illustrating the difficulty in getting a low false positive rate when using statistical methods for detecting anomalies, according to an example embodiment.
  • An example system may, for example, periodically execute 3000 queries six times a day and log measured data relating to 40 different properties.
  • the anomaly detection engine 212 can construct a feature based on the number of surprise scores that deviate from historical norms. A sudden change in the number of high surprise scores, for example, might be a good indicator of a site disruption. This is done separately for each property being monitored by the anomaly detection engine 212. To make this quantitative, instead of counting the number of queries with a high surprise, some embodiments of the anomaly detection engine 212 can examine a quantile (e.g., the 0.9th quantile) of the surprise values for a property. Using quantiles to detect anomalies is now described.
  • the surprise score of the most recent property value depends on at least the following: a property type (e.g., mean sales price listed), a property value (e.g., a value for the mean sales price listed), the probe (e.g., a query that generates a result of listed items for sale), and a collection window of recent property values for the probe.
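The quantile-of-surprise feature described above can be sketched as follows (a nearest-rank style quantile is an illustrative choice; the function name is not from the patent):

```python
def quantile_feature(surprise_scores, q=0.9):
    # Sort the surprise scores for one property type across all probes
    # and return the value at the q-th quantile (nearest-rank style).
    ranked = sorted(surprise_scores)
    index = min(int(q * len(ranked)), len(ranked) - 1)
    return ranked[index]
```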
  • FIG. 8 is a histogram showing quantiles 800 for a property type (e.g., a median sale price) over a two-week period, according to an example embodiment. As shown in FIG. 8 , the quantiles are clustered near 3.6, with a range from 3.0 to 4.6.
  • FIG. 9 is a plot of the quantiles in a time series, according to an example embodiment. This shows the feature (e.g., the quantile of the surprise score) is fairly smooth, and might be, in some cases, a candidate for anomaly detection.
  • FIG. 10 is a histogram that includes the quantiles 800 shown in FIG. 8 but with a new quantile 1002 , according to an example embodiment.
  • the new quantile 1002 may be calculated based on the value of the property when the anomaly detection engine 212 executes the set of queries again.
  • FIG. 11 is a plot of the quantiles in a time series with the new quantile 1002, according to an example embodiment. For example, while FIG. 9 shows quantiles up to 07:00 on Nov. 28, FIG. 11 adds the new quantile 1002.
  • the distribution of quantiles in FIG. 9 is vaguely normal, with a mean of 3.7 and a standard deviation of 0.3.
  • the new value of the new quantile 1002 shown in FIGS. 10 and 11 is 29.6, which is far outside the historical distribution (more than 80 standard deviations from the mean of 3.7) and is thus clearly an anomaly.
  • an individual property for a particular query will have sudden jumps in values. Although these sudden jumps may represent outliers, an outlier, in and of itself, should not necessarily raise an alert. Instead, example embodiments may use the number of queries that have such jumps as a signal for raising an alert of an anomaly. The selection of features may proceed as follows: for each (probe, property type) pair, compute a measure of surprise indicating how much of an outlier the latest property value is. The anomaly detection engine 212 then has a surprise number for each query. A few large surprise numbers are expected, but not too many.
  • the anomaly detection engine 212 may in some embodiments select the 90th quantile of surprise values (e.g., sort the surprise values from low to high and return the value at the 90th percentile, or use a non-sorting function to calculate a quantile or ranking of surprise scores). This quantile is the feature. Any outlier detection method can then be used to raise an alert. For example, in an example embodiment, the anomaly detection engine 212 may take the last 30 days' worth of signals and compute their mean and standard deviation. If the latest quantile of the signal deviates from the mean by more than a threshold number of standard deviations (e.g., 5σ), the anomaly detection engine 212 raises an alert.
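The alerting step can be sketched as follows; the window length and sigma threshold mirror the example above, and the helper name is illustrative:

```python
import statistics

def should_alert(historical_features, latest_feature, n_sigma=5.0):
    # Compare the latest quantile feature against the mean and standard
    # deviation of the recent history (e.g., the last 30 days of
    # features); alert when it deviates by more than n_sigma deviations.
    mean = statistics.mean(historical_features)
    std = statistics.stdev(historical_features)
    return abs(latest_feature - mean) > n_sigma * std
```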
  • FIG. 12 is a flowchart diagram illustrating a method 1200 for detecting an anomaly in a computer system, according to an example embodiment.
  • the method 1200 may begin at operation 1202 when the probe module 210 executes probes on an evolving data set. Each probe returns a result derived from the evolving data set.
  • the probe module 210 may issue a set of queries to the searching engine 208 of FIG. 2 .
  • the probe module 210 may then receive search results for each of the queries issued to the search engine 208. It is to be appreciated that each of the probes (e.g., search queries) may be different and, accordingly, each result may also be different.
  • the probe module 210 is further configured to derive property values for each result returned from the probes.
  • a property value may include data that quantifies a property or aspect of a result.
  • the property value may represent, for example, a value for the property of the number of items returned in the result, the average list price in the result, a measurement of the number of items that are auctions relative to non-auction items in the result, a number of classified listings in the result, or any other suitable property.
  • the execution of operation 1202 may result in a data table that includes a number of property values that each correspond to one of the probes executed by the probe module 210 .
  • the table may include multiple columns, where each column corresponds to a different property type. This is shown in FIG. 13 .
  • FIG. 13 is a diagram illustrating a property value table 1300 that may be generated based on executing the probes, according to an example embodiment.
  • the property value table 1300 may store the property values (organized by property types 1304 ) collected for a single iteration of the probes 1302 . As FIG. 13 shows, for a single probe, the probe module 210 may derive property values for multiple property types.
  • the probe module 210 may derive multiple property values, each corresponding to a different probe.
  • a single property value may be specific to a probe-property type pair.
  • property value 1310 may be specific to the Probe 2 -Property Type 2 pair.
  • the anomaly detecting engine 212 may generate surprise scores for each of the property values.
  • Each surprise score may be generated based on a comparison of a property value and historical property values that correspond to a probe-property type pair. For example, for a probe, the surprise score may be based on a function of the property value and a deviation from an expected value.
  • An expected value may be determined based on a robust estimation of the tendency of the historical property values for that probe.
  • a median, mode, mean, and trimmed mean are all examples of a robust estimation of the tendency of the value for the feature that can be used to generate a surprise score.
  • the surprise score for a value may be based on a standard deviation from the tendency for that value of the property.
  • operation 1204 may generate a surprise score for the latest results, where each surprise score corresponds to one of the queries in the set of queries.
  • FIG. 14 is a diagram showing property values 1400 for a probe-property type pair, according to an example embodiment.
  • the property values 1400 may include the property value 1310 received as part of a current iteration of executing the probes.
  • the property value 1310 may be specific to the Probe 2 -Property 2 pair.
  • the property values 1400 may also include historical property values 1402 that were obtained in past iterations of executing the probes.
  • the historical property values 1402 are specific to the same probe-property type pair as the property value 1310 (e.g., Probe 2 -Property 2 pair).
  • the surprise score is based on the deviation of the sample property value 1310 from the historical property values 1402 .
  • FIG. 15 is a diagram illustrating a surprise score table 1500 , according to an example embodiment.
  • the surprise score table 1500 includes a surprise score for each of the probe-property type pairs.
  • the surprise score table 1500 is generated based on calculating a surprise score in the manner discussed with respect to FIG. 14 . That is, a surprise score is generated for each probe-property type pair based on a comparison between the property value corresponding to the probe-property type pair and the historical property values for the probe-property type pair.
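The table construction can be sketched as a mapping keyed by probe-property type pair. The robust-z formulation below is one plausible reading of the comparison, not the patent's exact formula, and all names and sample values are illustrative:

```python
import statistics

def build_surprise_table(current_values, historical_values):
    # current_values: {(probe, property_type): latest property value}
    # historical_values: {(probe, property_type): [past property values]}
    # Each surprise score compares the latest value against the history
    # for the same probe-property type pair.
    table = {}
    for pair, value in current_values.items():
        history = historical_values[pair]
        center = statistics.median(history)  # robust expected value
        spread = statistics.median([abs(v - center) for v in history]) or 1.0
        table[pair] = abs(value - center) / spread
    return table
```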
  • the anomaly detecting engine 212 may access a plurality of historical surprise scores generated by the anomaly detection engine 212 .
  • the historical surprise scores accessed at operation 1206 may be based on past iterations of executing the probes. Further, in some cases, the historical surprise scores may be specific to a particular property type.
  • the anomaly detection engine 212 may alert a monitoring system of an anomaly regarding the evolving data set.
  • the idea of operation 1208 is that an alert is generated if the surprise scores for a given property type (e.g., Property 2), across all probes (e.g., Probes 1-6), are out of the norm relative to historical surprise scores for those probe-property type pairs.
  • the comparison used by operation 1208 may be based on a feature derived from the surprise scores.
  • FIG. 16 is a chart showing a measurement of a feature of the surprise scores generated by operation 1204, according to an example embodiment.
  • the measurement of the feature shown in FIG. 16 is for the surprise scores generated for a single iteration of the execution of the probes.
  • the chart 1600 may measure the feature for the surprise scores generated across Probes 1-6 for Property Type 2 .
  • the feature may be a measurement of how far a surprise score deviates from an expected value.
  • An expected value for the surprise score may be calculated based on a robust estimation of the tendency of the historical surprise scores for that property type.
  • the feature may be a quantile of the surprise scores, or a quantile of the data derived from the surprise scores (e.g., the deviation from the expected value). Still further, in some cases, the feature may be a measurement or count of the number of surprise scores that deviate beyond a threshold amount from the expected value, or that exceed a fixed surprise score.
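The count-style feature mentioned here can be sketched as follows (the cutoff value and function name are illustrative assumptions):

```python
def count_feature(surprise_scores, cutoff=7.0):
    # Count how many probes produced a surprise score for this property
    # type that exceeds a fixed cutoff.
    return sum(1 for s in surprise_scores if s > cutoff)
```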
  • the feature of the surprise scores may then be compared against historical surprise scores from past iterations of executing the probes. This is shown in FIG. 17 .
  • FIG. 17 is a chart illustrating surprise score features 1700 over time, according to an example embodiment.
  • the surprise score feature 1702 may be a feature of the surprise scores for a current iteration of the execution of the probes
  • the historical surprise score features 1704 are features of surprise scores from past iterations of executing the probes.
  • operation 1208 may alert if the feature of the surprise score 1702 deviates from the historical surprise score features 1704 beyond a threshold amount.
  • the operations 1206 and 1208 shown in FIG. 12 may be repeated across all the property types monitored by the probe module 210.
  • the operations 1206 and 1208 may execute across Property Types 1-5 .
  • the computer system may be an inventory data store.
  • the probe module 210 may be configured to detect as property types, among other things, the number of items stored per category, the number of auction items per category, and the like.
  • the computer system may be a computer infrastructure (e.g., a collection of computer servers).
  • the probe module 210 may be configured to detect as property types, among other things, a processor load, bandwidth consumption, thread count, running processes count, memory usage, throughput count, rate of disk seeks, rate of packets transmitted or received, or rate of response.
  • the property values tracked by the anomaly detection engine 212 may include dimensions in addition to what is described above.
  • a table may be used to store the property values, where the columns are the metrics tracked by the different probes modules, and the rows are the different values for those property types at different times.
  • An extension would be a 3D-table or cube.
  • For each (property, value) cell there may be a series of aspects instead of a single number.
  • An aspect may be a vertical stack out of the page. In example (1), the aspect might be different countries.
  • a cell in a table may be related to a specific query (perhaps ‘iPhone 5S’) and property (perhaps number of results). But using the aspects, the results vary by country, so the single cell is replaced by a stack of entries, one for each country.
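One way to sketch the aspect extension is a mapping from (probe, property type) cells to per-aspect values; the country codes and counts below are purely illustrative:

```python
from collections import defaultdict

# Each (probe, property type) cell holds a stack of entries,
# one per aspect (here, country).
cube = defaultdict(dict)
cube[("iPhone 5S", "number_of_results")]["US"] = 1200
cube[("iPhone 5S", "number_of_results")]["DE"] = 340
```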
  • the anomaly detection engine 212 may be configured to detect a problem with the search software (a disruption) before users do.
  • the property values may be received from the same interfaces and using the same computer systems used by the end users.
  • the metric data received from the probe module 210 is a proxy for, if not identical to, the user experience of users of that computer system. Accordingly, it is to be appreciated that when this disclosure states the anomaly detection engine 212 may detect a problem before users do, it may simply mean that the anomaly detection engine 212 can detect and report a problem without intervention from a user. Thus, compared to traditional systems, example embodiments may use the anomaly detection engine 212 to provide comparatively quick detection of site problems.
  • FIG. 18 shows a diagrammatic representation of a machine in the example form of a computer system 1800 within which a set of instructions, for causing the machine to perform any one or more of the methodologies discussed herein, may be executed.
  • the computer system 1800 comprises, for example, any of the device machine 110 , device machine 112 , applications servers 118 , API server 114 , web server 116 , database servers 124 , or third party server 130 .
  • the machine may operate as a standalone device or may be connected (e.g., networked) to other machines. In a networked deployment, the machine may operate in the capacity of a server or a device machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment.
  • the machine may be a server computer, a client computer, a personal computer (PC), a tablet, a set-top box (STB), a Personal Digital Assistant (PDA), a smart phone, a cellular telephone, a web appliance, a network router, switch or bridge, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine.
  • the example computer system 1800 includes a processor 1802 (e.g., a central processing unit (CPU), a graphics processing unit (GPU), or both), a main memory 1804 and a static memory 1806 , which communicate with each other via a bus 1808 .
  • the computer system 1800 may further include a video display unit 1810 (e.g., liquid crystal display (LCD), organic light emitting diode (OLED), touch screen, or a cathode ray tube (CRT)).
  • the computer system 1800 also includes an alphanumeric input device 1812 (e.g., a physical or virtual keyboard), a cursor control device 1814 (e.g., a mouse, a touch screen, a touchpad, a trackball, a trackpad), a disk drive unit 1816 , a signal generation device 1818 (e.g., a speaker) and a network interface device 1820 .
  • the disk drive unit 1816 includes a machine-readable medium 1822 on which is stored one or more sets of instructions 1824 (e.g., software) embodying any one or more of the methodologies or functions described herein.
  • the instructions 1824 may also reside, completely or at least partially, within the main memory 1804 and/or within the processor 1802 during execution thereof by the computer system 1800 , the main memory 1804 and the processor 1802 also constituting machine-readable media.
  • the instructions 1824 may further be transmitted or received over a network 1826 via the network interface device 1820 .
  • while the machine-readable medium 1822 is shown in an example embodiment to be a single medium, the term “machine-readable medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions.
  • the term “machine-readable medium” shall also be taken to include any medium that is capable of storing, encoding or carrying a set of instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of the present invention.
  • the term “machine-readable medium” shall accordingly be taken to include, but not be limited to, solid-state memories, optical and magnetic media, and carrier wave signals.
  • modules, engines, components, or mechanisms may be implemented as logic or a number of modules, engines, components, or mechanisms.
  • a module, engine, logic, component, or mechanism may be a tangible unit capable of performing certain operations and configured or arranged in a certain manner.
  • In example embodiments, one or more computer systems (e.g., a standalone, client, or server computer system) or one or more components of a computer system (e.g., a processor or a group of processors) may be configured by software (e.g., an application or application portion) or firmware (note that software and firmware can generally be used interchangeably herein, as is known by a skilled artisan) as a module that operates to perform certain operations as described herein.
  • a module may be implemented mechanically or electronically.
  • a module may comprise dedicated circuitry or logic that is permanently configured (e.g., within a special-purpose processor, application specific integrated circuit (ASIC), or array) to perform certain operations.
  • a module may also comprise programmable logic or circuitry (e.g., as encompassed within a general-purpose processor or other programmable processor) that is temporarily configured by software or firmware to perform certain operations. It will be appreciated that a decision to implement a module mechanically, in dedicated and permanently configured circuitry, or in temporarily configured circuitry (e.g., configured by software) may be driven by, for example, cost, time, energy-usage, and package size considerations.
  • module should be understood to encompass a tangible entity, be that an entity that is physically constructed, permanently configured (e.g., hardwired), non-transitory, or temporarily configured (e.g., programmed) to operate in a certain manner or to perform certain operations described herein.
  • modules or components are temporarily configured (e.g., programmed)
  • each of the modules or components need not be configured or instantiated at any one instance in time.
  • the modules or components comprise a general-purpose processor configured using software
  • the general-purpose processor may be configured as respective different modules at different times.
  • Software may accordingly configure the processor to constitute a particular module at one instance of time and to constitute a different module at a different instance of time.
  • Modules can provide information to, and receive information from, other modules. Accordingly, the described modules may be regarded as being communicatively coupled. Where multiples of such modules exist contemporaneously, communications may be achieved through signal transmission (e.g., over appropriate circuits and buses) that connect the modules. In embodiments in which multiple modules are configured or instantiated at different times, communications between such modules may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple modules have access. For example, one module may perform an operation and store the output of that operation in a memory device to which it is communicatively coupled. A further module may then, at a later time, access the memory device to retrieve and process the stored output. Modules may also initiate communications with input or output devices and can operate on a resource (e.g., a collection of information).

Abstract

Apparatus and method for detecting anomalies in a computer system are disclosed herein. In some embodiments, multiple probes are executed on an evolving data set. Each probe may return a result. Property values are then derived from a respective result returned by a corresponding probe. Surprise scores corresponding to the property values are generated, where each surprise score is generated based on a comparison between a corresponding property value and historical property values. The corresponding property value and the historical property values are derived from results returned from the same probe. Historical surprise scores generated by the anomaly detection engine are accessed. Responsive to a comparison between the plurality of surprise scores and the plurality of historical surprise scores, a monitoring system is alerted of an anomaly regarding the evolving data set.

Description

    TECHNICAL FIELD
  • The present invention relates generally to data processing, and in some embodiments, to detecting anomalies in computer-based systems.
  • BACKGROUND
  • In many cases, enterprises maintain and operate large numbers of computer systems (e.g., servers) that may each run a layered set of software. In some cases, these computer systems provide functionality for the operation of the enterprise or provide outbound services to its customers. In many cases, the enterprise may monitor the hardware and software layers of these servers by logging processing load, memory usage, and many other monitored signals at frequent intervals.
  • Unfortunately, the enterprise may occasionally suffer disruptions, where some of its services are degraded or even completely unavailable to customers. To resolve these disruptions, the enterprise will perform a post-mortem analysis of the monitored signals in an effort to debug the system. For example, the enterprise may analyze the memory usage to identify a program that may be performing improperly, or view the processing load to determine whether more hardware is needed.
  • Thus, traditional systems may utilize methods and systems for addressing anomalies that involve debugging a computer system after the anomaly has affected the computer system and, by extension, its users.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • Some embodiments are illustrated by way of example and not limitation in the figures of the accompanying drawings, in which:
  • FIG. 1 illustrates a block diagram depicting a network architecture of a system, according to some embodiments, having a client-server architecture configured for exchanging data over a network.
  • FIG. 2 illustrates a block diagram showing components provided within the system of FIG. 1 according to some embodiments.
  • FIG. 3 is a diagram showing sampled values of a number of searches performed on a computer system that are observed over a time period, such as a twenty-four hour period, according to an example embodiment.
  • FIG. 4 is a diagram showing additional sampled values from two additional days, as compared to FIG. 3, according to an example embodiment.
  • FIG. 5 is a diagram of a plot of metric data over time for a metric of a computer system, according to an example embodiment.
  • FIG. 6 is a histogram charting surprise scores from a number of queries submitted to a computer system, according to an example embodiment.
  • FIG. 7 is another histogram showing the surprise scores according to a logarithmic function, according to an example embodiment.
  • FIG. 8 is a histogram showing quantiles for a metric over a two-week period, according to an example embodiment.
  • FIG. 9 is a plot of the quantiles in a time series, according to an example embodiment.
  • FIG. 10 is a histogram that includes the quantiles shown in FIG. 8 but with a new quantile, according to an example embodiment.
  • FIG. 11 is a plot of the quantiles in a time series with a new quantile, according to an example embodiment.
  • FIG. 12 is a flowchart diagram illustrating a method for detecting an anomaly in a computer system, according to an example embodiment.
  • FIG. 13 is a diagram illustrating a property value table that may be generated based on executing the probes, according to an example embodiment.
  • FIG. 14 is a diagram showing property values for a probe-property type pair, according to an example embodiment.
  • FIG. 15 is a diagram illustrating a surprise score table, according to an example embodiment.
  • FIG. 16 is a chart showing a measurement of a feature of the surprise scores generated by an operation, according to an example embodiment.
  • FIG. 17 is a chart illustrating surprise score features over time, according to an example embodiment.
  • FIG. 18 illustrates a diagrammatic representation of a machine in the example form of a computer system within which a set of instructions may be executed to cause the machine to perform any one or more of the methodologies discussed herein.
  • The headings provided herein are for convenience only and do not necessarily affect the scope or meaning of the terms used.
  • DETAILED DESCRIPTION
  • Described in detail herein is an apparatus and method for detecting anomalies in a computer system. For example, some embodiments may be used to address the problem of how to monitor signals in a computer system to detect disruptions before they affect users, and to do so with few false positives. Some embodiments may address this problem by analyzing signals for strange behavior that may be referred to as an anomaly. Example embodiments can then scan multiple monitored signals, and raise an alert when the site monitoring system detects an anomaly.
  • Various modifications to the example embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments and applications without departing from the scope of the invention. Moreover, in the following description, numerous details are set forth for the purpose of explanation. However, one of ordinary skill in the art will realize that the invention may be practiced without the use of these specific details. In other instances, well-known structures and processes are not shown in block diagram form in order not to obscure the description of the invention with unnecessary detail. Thus, the present disclosure is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.
  • For example, in some embodiments, multiple probes (e.g., queries) are executed on an evolving data set (e.g., a listing database). Each probe may return a result. Property values are then derived from a respective result returned by one of the probes. A property value may be a value that quantifies a property or aspect of the result, such as, for example, a number of listings returned, a portion of classified listings, a measurement of the prices in a listing, and the like.
  • Surprise scores corresponding to the property values are generated, where each surprise score is generated based on a comparison between a corresponding property value and historical property values. The corresponding property value and the historical property values are derived from results returned from the same probe. Historical surprise scores generated by the anomaly detection engine are accessed. Responsive to a comparison between the plurality of surprise scores and the plurality of historical surprise scores, a monitoring system is alerted of an anomaly regarding the evolving data set.
  • FIG. 1 illustrates a network diagram depicting a network system 100, according to one embodiment, having a client-server architecture configured for exchanging data over a network. A networked system 102 forms a network-based publication system that provides server-side functionality, via a network 104 (e.g., the Internet or Wide Area Network (WAN)), to one or more clients and devices. FIG. 1 further illustrates, for example, one or both of a web client 106 (e.g., a web browser) and a programmatic client 108 executing on device machines 110 and 112. In one embodiment, the publication system 100 comprises a marketplace system. In another embodiment, the publication system 100 comprises other types of systems such as, but not limited to, a social networking system, a matching system, a recommendation system, an electronic commerce (e-commerce) system, a search system, and the like.
  • Each of the device machines 110, 112 comprises a computing device that includes at least a display and communication capabilities with the network 104 to access the networked system 102. The device machines 110, 112 comprise, but are not limited to, remote devices, work stations, computers, general purpose computers, Internet appliances, hand-held devices, wireless devices, portable devices, wearable computers, cellular or mobile phones, portable digital assistants (PDAs), smart phones, tablets, ultrabooks, netbooks, laptops, desktops, multi-processor systems, microprocessor-based or programmable consumer electronics, game consoles, set-top boxes, network PCs, mini-computers, and the like. Each of the device machines 110, 112 may connect with the network 104 via a wired or wireless connection. For example, one or more portions of network 104 may be an ad hoc network, an intranet, an extranet, a virtual private network (VPN), a local area network (LAN), a wireless LAN (WLAN), a wide area network (WAN), a wireless WAN (WWAN), a metropolitan area network (MAN), a portion of the Internet, a portion of the Public Switched Telephone Network (PSTN), a cellular telephone network, a wireless network, a WiFi network, a WiMax network, another type of network, or a combination of two or more such networks.
  • Each of the device machines 110, 112 includes one or more applications (also referred to as “apps”) such as, but not limited to, a web browser, messaging application, electronic mail (email) application, an e-commerce site application (also referred to as a marketplace application), and the like. In some embodiments, if the e-commerce site application is included in a given one of the device machines 110, 112, then this application is configured to locally provide the user interface and at least some of the functionalities, with the application configured to communicate with the networked system 102, on an as-needed basis, for data and/or processing capabilities not locally available (such as access to a database of items available for sale, to authenticate a user, to verify a method of payment, etc.). Conversely, if the e-commerce site application is not included in a given one of the device machines 110, 112, the given one of the device machines 110, 112 may use its web browser to access the e-commerce site (or a variant thereof) hosted on the networked system 102. Although two device machines 110, 112 are shown in FIG. 1, more or fewer than two device machines can be included in the system 100.
  • An Application Program Interface (API) server 114 and a web server 116 are coupled to, and provide programmatic and web interfaces respectively to, one or more application servers 118. The application servers 118 host one or more marketplace applications 120 and payment applications 122. The application servers 118 are, in turn, shown to be coupled to one or more database servers 124 that facilitate access to one or more databases 126.
  • The marketplace applications 120 may provide a number of e-commerce functions and services to users that access networked system 102. E-commerce functions/services may include a number of publisher functions and services (e.g., search, listing, content viewing, payment, etc.). For example, the marketplace applications 120 may provide a number of services and functions to users for listing goods and/or services or offers for goods and/or services for sale, searching for goods and services, facilitating transactions, and reviewing and providing feedback about transactions and associated users. Additionally, the marketplace applications 120 may track and store data and metadata relating to listings, transactions, and user interactions. In some embodiments, the marketplace applications 120 may publish or otherwise provide access to content items stored in application servers 118 or databases 126 accessible to the application servers 118 and/or the database servers 124. The payment applications 122 may likewise provide a number of payment services and functions to users. The payment applications 122 may allow users to accumulate value (e.g., in a commercial currency, such as the U.S. dollar, or a proprietary currency, such as “points”) in accounts, and then later to redeem the accumulated value for products or items (e.g., goods or services) that are made available via the marketplace applications 120. While the marketplace and payment applications 120 and 122 are shown in FIG. 1 to both form part of the networked system 102, it will be appreciated that, in alternative embodiments, the payment applications 122 may form part of a payment service that is separate and distinct from the networked system 102. In other embodiments, the payment applications 122 may be omitted from the system 100. In some embodiments, at least a portion of the marketplace applications 120 may be provided on the device machines 110 and/or 112.
  • Further, while the system 100 shown in FIG. 1 employs a client-server architecture, embodiments of the present disclosure are not limited to such an architecture, and may equally well find application in, for example, a distributed or peer-to-peer architecture system. The various marketplace and payment applications 120 and 122 may also be implemented as standalone software programs, which do not necessarily have networking capabilities.
  • The web client 106 accesses the various marketplace and payment applications 120 and 122 via the web interface supported by the web server 116. Similarly, the programmatic client 108 accesses the various services and functions provided by the marketplace and payment applications 120 and 122 via the programmatic interface provided by the API server 114. The programmatic client 108 may, for example, be a seller application (e.g., the TurboLister application developed by eBay Inc., of San Jose, Calif.) to enable sellers to author and manage listings on the networked system 102 in an off-line manner, and to perform batch-mode communications between the programmatic client 108 and the networked system 102.
  • FIG. 1 also illustrates a third party application 128, executing on a third party server machine 130, as having programmatic access to the networked system 102 via the programmatic interface provided by the API server 114. For example, the third party application 128 may, utilizing information retrieved from the networked system 102, support one or more features or functions on a website hosted by the third party. The third party website may, for example, provide one or more promotional, marketplace, or payment functions that are supported by the relevant applications of the networked system 102.
  • FIG. 2 illustrates a block diagram showing components provided within the networked system 102 according to some embodiments. The networked system 102 may be hosted on dedicated or shared server machines (not shown) that are communicatively coupled to enable communications between server machines. The components themselves are communicatively coupled (e.g., via appropriate interfaces) to each other and to various data sources, so as to allow information to be passed between the applications or so as to allow the applications to share and access common data. Furthermore, the components may access one or more databases 126 via the database servers 124.
  • The networked system 102 may provide a number of publishing, listing, and/or price-setting mechanisms whereby a seller (also referred to as a first user) may list (or publish information concerning) goods or services for sale or barter, a buyer (also referred to as a second user) can express interest in or indicate a desire to purchase or barter such goods or services, and a transaction (such as a trade) may be completed pertaining to the goods or services. To this end, the networked system 102 may comprise at least one publication engine 202 and one or more selling engines 204. The publication engine 202 may publish information, such as item listings or product description pages, on the networked system 102. In some embodiments, the selling engines 204 may comprise one or more fixed-price engines that support fixed-price listing and price setting mechanisms and one or more auction engines that support auction-format listing and price setting mechanisms (e.g., English, Dutch, Chinese, Double, Reverse auctions, etc.). The various auction engines may also provide a number of features in support of these auction-format listings, such as a reserve price feature whereby a seller may specify a reserve price in connection with a listing and a proxy-bidding feature whereby a bidder may invoke automated proxy bidding. The selling engines 204 may further comprise one or more deal engines that support merchant-generated offers for products and services.
  • A listing engine 206 allows sellers to conveniently author listings of items or authors to author publications. In one embodiment, the listings pertain to goods or services that a user (e.g., a seller) wishes to transact via the networked system 102. In some embodiments, the listings may be an offer, deal, coupon, or discount for the good or service. Each good or service is associated with a particular category. The listing engine 206 may receive listing data such as title, description, and aspect name/value pairs. Furthermore, each listing for a good or service may be assigned an item identifier. In other embodiments, a user may create a listing that is an advertisement or other form of information publication. The listing information may then be stored to one or more storage devices coupled to the networked system 102 (e.g., databases 126). Listings also may comprise product description pages that display a product and information (e.g., product title, specifications, and reviews) associated with the product. In some embodiments, the product description page may include an aggregation of item listings that correspond to the product described on the product description page.
  • The listing engine 206 also may allow buyers to conveniently author listings or requests for items desired to be purchased. In some embodiments, the listings may pertain to goods or services that a user (e.g., a buyer) wishes to transact via the networked system 102. Each good or service is associated with a particular category. The listing engine 206 may receive as much or as little listing data, such as title, description, and aspect name/value pairs, as the buyer is aware of about the requested item. In some embodiments, the listing engine 206 may parse the buyer's submitted item information and may complete incomplete portions of the listing. For example, if the buyer provides a brief description of a requested item, the listing engine 206 may parse the description, extract key terms, and use those terms to make a determination of the identity of the item. Using the determined item identity, the listing engine 206 may retrieve additional item details for inclusion in the buyer item request. In some embodiments, the listing engine 206 may assign an item identifier to each listing for a good or service.
  • In some embodiments, the listing engine 206 allows sellers to generate offers for discounts on products or services. The listing engine 206 may receive listing data, such as the product or service being offered, a price and/or discount for the product or service, a time period for which the offer is valid, and so forth. In some embodiments, the listing engine 206 permits sellers to generate offers from the sellers' mobile devices. The generated offers may be uploaded to the networked system 102 for storage and tracking.
  • In a further example embodiment, the listing engine 206 allows users to navigate through various categories, catalogs, or inventory data structures according to which listings may be classified within the networked system 102. For example, the listing engine 206 allows a user to successively navigate down a category tree comprising a hierarchy of categories (e.g., the category tree structure) until a particular set of listings is reached. Various other navigation applications within the listing engine 206 may be provided to supplement the searching and browsing applications. The listing engine 206 may record the various user actions (e.g., clicks) performed by the user in order to navigate down the category tree.
  • Searching the networked system 102 is facilitated by a searching engine 208. For example, the searching engine 208 enables keyword queries of listings published via the networked system 102. In example embodiments, the searching engine 208 receives the keyword queries from a device of a user and conducts a review of the storage device storing the listing information. The review will enable compilation of a result set of listings that may be sorted and returned to the client device (e.g., device machine 110, 112) of the user. The searching engine 208 may record the query (e.g., keywords) and any subsequent user actions and behaviors (e.g., navigations, selections, or click-throughs).
  • The searching engine 208 also may perform a search based on a location of the user. A user may access the searching engine 208 via a mobile device and generate a search query. Using the search query and the user's location, the searching engine 208 may return relevant search results for products, services, offers, auctions, and so forth to the user. The searching engine 208 may identify relevant search results both in a list form and graphically on a map. Selection of a graphical indicator on the map may provide additional details regarding the selected search result. In some embodiments, the user may specify, as part of the search query, a radius or distance from the user's current location to limit search results.
  • The searching engine 208 also may perform a search based on an image. The image may be taken from a camera or imaging component of a client device or may be accessed from storage.
  • In addition to the above-described modules, the networked system 102 may further include an anomaly detection engine 212 and a probe module 210 to perform various anomaly detection functionalities or operations as set forth in greater detail below.
  • Anomaly Detection
  • As explained above, some example embodiments may be configured to detect anomalies in an evolving data set by comparing surprise scores of property values received from a probe module. However, before describing the methods and systems for detecting anomalies in a computer system in great detail, some simplified examples of analyzing property values are now described to highlight some potential aspects addressed by example embodiments. For example, as a warm-up problem, consider a signal from a high software layer: the number of searches (“srp”) received or performed by the networked system 102 of FIG. 1. In some cases, srp may be tracked by the probe module 210 periodically, say, for example, every two minutes. FIG. 3 is a diagram showing sampled values 300 of srp observed over a time period, such as a twenty-four hour period, according to an example embodiment. The vertical axis may represent sampled values of the number of searches performed over a two-minute period, whereas the horizontal axis may represent time, which ranges from midnight to midnight, PDT. In analyzing the sampled values 300 of srp, the anomaly detection engine 212 may identify that sampled values 302 and 304, occurring around 4:00 AM and 10:30 PM, respectively, are suspicious because the sampled values 302 and 304 each exhibit a comparatively drastic deviation from their neighboring values. It is to be appreciated that traditional statistical methods may be unable to reliably determine whether sampled values 302 and 304 should be considered anomalies based simply on the data found in the sampled values 300 shown in FIG. 3. That is, based on the sampled values 300, prior art systems are unlikely to reliably determine (e.g., without issuing too many false positives) whether samples 302 and 304 are site disruptions or not.
  • But now consider FIG. 4, which shows additional sampled values for the number of searches per time period. For example, FIG. 4 is a diagram showing sampled values 400 that includes sampled values from the prior two days, relative to the sampled values 300 of FIG. 3, according to an example embodiment. Based on the sampled values 400, one may reasonably conclude that the sampled value 302 should be categorized as an anomaly, but not the sampled value 304. Such is the case because the sampled value 304 is consistent with the other two days of samples, whereas the sampled value 302 is inconsistent with the other two days.
  • FIGS. 3 and 4 suggest that comparing a property value of a result against historical property values may be used as a simple feature for detecting anomalies in a computer system. The feature may be based on comparing a current value of srp with its respective value 24 hours ago, 48 hours ago, 72 hours ago, and so forth.
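  • The day-over-day comparison suggested above may be sketched as follows. This is a minimal Python illustration; the function name, the sampling granularity, and the use of a median over prior days are illustrative assumptions rather than details from the disclosure:

```python
from statistics import median

def day_over_day_deviation(samples, samples_per_day, num_days=2):
    """Compare the latest sample against the samples taken at the same
    time of day on each of the prior `num_days` days, in the spirit of
    FIGS. 3 and 4. `samples` is a chronologically ordered list of
    property values (e.g., searches per two-minute window)."""
    current = samples[-1]
    # Same-time-of-day values 24 hours, 48 hours, ... earlier.
    prior = [samples[-1 - d * samples_per_day] for d in range(1, num_days + 1)]
    # Unsigned deviation of the current value from the median of its
    # same-time-of-day historical counterparts.
    return abs(current - median(prior))
```

A large deviation flags a value like 302 in FIG. 4, which disagrees with the prior days, while a value like 304, consistent with those days, yields a small deviation.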
  • Another example of detecting anomalies in a computer system is now described with reference to FIGS. 5-11. In this example computer system, the probe module 210 may periodically issue a query (or a set of queries) and log values for one or more properties relating to the search result returned by the query (or each query in the set of queries). Examples of properties that may be tracked by the probe module 210 include a number of items returned from a search query, the average list price, a measurement of the number of items that are auctions relative to non-auction items, a number of classified listings, a number of searches executed, etc. The anomaly detection engine 212 may repeatedly cycle through a fixed set (e.g., tens, hundreds, thousands, and so forth) of queries to build a historical model of the property values for each of the search queries over time.
  • FIG. 5 is a diagram of a plot of property values 500 over time for a property of a computer system, according to an example embodiment. The property values may include one or more sampled values of a property, which are sampled over time. The horizontal axis of FIG. 5 represents time, with the right-hand side representing the most recent samples. The vertical axis of FIG. 5 represents values of the property being monitored by the anomaly detection engine 212. By way of example and not limitation, the property values 500 may represent the median sales price for the listings returned when the probe module 210 submits the search query “htc hd2” to the searching engine 208. As shown in FIG. 5, the plot may include a fitted line 504 to represent expected values of the property over time. As may be appreciated from FIG. 5, the property values 500 exhibit some noise (e.g., values that deviate from the fitted line 504). However, even when compared to the noise within the property values 500, the property value 502 may represent an anomaly because the property value 502 deviates significantly from the fitted line 504 when compared to the other values of the property values 500.
  • In some cases, the anomaly detection engine 212 may determine whether a value of a property represents an anomaly caused by a site disruption based in part on calculating surprise scores for the property value. A surprise score may be a measurement used to quantify how far out of the norm a value for a property is based on historical values for that property. For example, the anomaly detection engine 212 may quantify the surprise score for a value of a property by computing the (unsigned) deviation of each property value from an expected value. For example, one specific implementation of calculating a surprise score may involve dividing the deviation of a value from the expected value (e.g., the fitted line 504) by the median deviation of all the values. Assuming the deviation for the value 502 is 97.9 and the median deviation for all the values of the property values 500 is 13.4, the anomaly detection engine 212 may assign the value 502 a surprise score of 7.3 (e.g., 97.9/13.4).
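  • The surprise-score calculation described above may be sketched as follows. The function name is illustrative, but the formula (unsigned deviation from the expected value, divided by the median deviation of all values) follows the worked example in the text:

```python
from statistics import median

def surprise_scores(values, expected):
    """Assign each property value a surprise score: its unsigned
    deviation from the expected (fitted-line) value, divided by the
    median deviation over all values in the series."""
    deviations = [abs(v - e) for v, e in zip(values, expected)]
    med = median(deviations)  # robust scale estimate for the series
    return [d / med for d in deviations]
```

Applied to the worked example, a deviation of 97.9 against a median deviation of 13.4 yields 97.9/13.4 ≈ 7.3.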
  • Some embodiments may address the issue of whether a particular surprise score (e.g., a surprise score of 7.3, as discussed above) should trigger an alert that there may be an anomaly in the computer system. FIG. 6 is a histogram charting surprise scores from a number of queries submitted to a computer system, according to an example embodiment. In the context of FIG. 6, a surprise score of 7 is not unusual because a surprise score of 7 is not far off in value from other surprise scores. In fact, according to FIG. 6, there are many other queries that result in surprise scores that are of higher value than 7.
  • To clarify the surprise scores shown in FIG. 6, FIG. 7 is another histogram showing the surprise scores on a logarithmic scale, according to an example embodiment. Since log(7.3)≈2, it is clear that a value of 2 is not all that unusual. Quantitatively, the percentile of the surprise score for the query “htc hd2” is about 96%. Ringing an alarm for a surprise this large may generate a large number of false positives.
  • Incidentally, FIGS. 6 and 7 are diagrams illustrating the difficulty in getting a low false positive rate when using statistical methods for detecting anomalies, according to an example embodiment. An example system may, for example, periodically execute 3000 queries six times a day to log measurement data relating to 40 different properties. Thus, under these constraints, the anomaly detection engine 212 may generate 3000×40×6=720,000 graphs or tables each day. Even achieving as low as 1 false positive per day would require only triggering on graphs with a surprise score so high that it occurs 0.00014% of the time. It is to be appreciated that such a stringent cutoff is likely to overlook many real disruptions.
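  • The false-positive arithmetic above can be restated directly; this snippet only repeats the numbers from the text:

```python
# 3000 queries x 40 properties x 6 runs per day yields 720,000 graphs
# per day, so one false positive per day corresponds to triggering on
# only about the top 0.00014% of surprise scores.
graphs_per_day = 3000 * 40 * 6      # 720,000
trigger_rate = 1 / graphs_per_day   # ≈ 1.39e-06, i.e., ≈ 0.00014%
```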
  • One way around the difficulty of avoiding false positives may be through aggregation of surprising values across multiple queries. Since there will always be a few queries with a high surprise, the anomaly detection engine 212 can construct a feature based on the number of surprise scores that deviate from historical norms. A sudden change in the number of high surprise scores, for example, might be a good indicator of a site disruption. This is done separately for each property being monitored by the anomaly detection engine 212. To make this quantitative, instead of counting the number of queries with a high surprise, some embodiments of the anomaly detection engine 212 can examine a quantile (e.g., the 0.9 quantile) of the surprise values for a property. Using the quantiles to detect anomalies is now described.
  • The surprise score of the most recent property value, as computed above with reference to FIG. 5, depends on at least the following: a property type (e.g., mean sales price listed), a property value (e.g., a value for the mean sales price listed), the probe (e.g., a query that generates a result of listed items for sale), and a collection window of recent property values for the probe. When the surprise scores for each probe are calculated, the quantile of the surprise scores may be calculated. The quantile is computed by picking a property type and collection window, gathering up the surprise scores for the property type, across all the probes, within the collection window, and then taking the 90% quantile of those surprise scores. So there is a quantile for each property type-collection period pair. Every four hours the following process is performed for each property being monitored: rerun the set of queries again to obtain current property values for each query with respect to the property, recompute the surprise scores for the values obtained by rerunning the set of queries, and then determine the 90% quantile for these surprise scores. This gives a new value for the 90% quantile for the property which may be compared against the historical quantile to determine whether an anomaly exists.
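  • The quantile computation described above may be sketched as follows. The nearest-rank convention is an illustrative assumption, as the disclosure does not specify how the 90% quantile is interpolated:

```python
def surprise_quantile(scores, q=0.9):
    """The 90% quantile of the surprise scores gathered for one
    property type across all probes within a collection window,
    using a simple nearest-rank rule: sort the scores and take the
    value at the q-th position."""
    ordered = sorted(scores)
    idx = min(int(q * len(ordered)), len(ordered) - 1)
    return ordered[idx]
```

Recomputing this value every four hours, per property type, yields the time series of quantiles compared against the historical quantile.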
  • FIG. 8 is a histogram showing quantiles 800 for a property type (e.g., a median sale price) over a two-week period, according to an example embodiment. As shown in FIG. 8, the quantiles are clustered near 3.6, with a range from 3.0 to 4.6. In another view, FIG. 9 is a plot of the quantiles in a time series, according to an example embodiment. This shows that the feature (e.g., the quantile of the surprise scores) is fairly smooth and might be, in some cases, a candidate for anomaly detection.
  • An example of an anomaly that corresponds to a genuine disruption is now described. FIG. 10 is a histogram that includes the quantiles 800 shown in FIG. 8 but with a new quantile 1002, according to an example embodiment. The new quantile 1002 may be calculated based on the value of the property when the anomaly detection engine 212 executes the set of queries again. In another view, FIG. 11 is a plot of the quantiles in a time series with the new quantile 1002, according to an example embodiment. For example, while FIG. 9 shows quantiles up to 07:00 on Nov. 28, FIG. 10 adds the quantile 1002. The following observations are made. The distribution of quantiles in FIG. 9 is roughly normal, with a mean of 3.7 and a standard deviation of 0.3. The value of the new quantile 1002 shown in FIGS. 10 and 11 is 29.6, which is |29.6−3.7|/0.3≈86 standard deviations from the mean quantile. So the historical value of the quantile appears to be a useful feature for detecting anomalies.
  • Summarizing the above, it is expected that an individual property for a particular query will have sudden jumps in values. Although these sudden jumps may represent outliers, an outlier, in and of itself, should not necessarily raise an alert. Instead, example embodiments may use the number of queries that have such jumps as a signal for raising an alert of an anomaly. A selection of features may thus proceed as follows. For each (probe, property type) pair, a measure of surprise is computed that indicates whether the latest property value of the property is an outlier. The anomaly detection engine 212 then has a surprise number for each query. It is expected that there will be a few large surprise numbers, but not too many. To quantify this, the anomaly detection engine 212 may in some embodiments select the 90th quantile of surprise values (e.g., sort the surprise values from low to high and return the value at the 90th-percentile position, or use a non-sorting function to calculate a quantile or ranking of surprise scores). This quantile serves as the feature. Any outlier detection method can then be used to raise an alert. For example, in an example embodiment, the anomaly detection engine 212 may take the last 30 days' worth of signals and compute their mean and standard deviation. If the latest quantile of the signal is more than a threshold number of standard deviations (e.g., 5σ) from the mean, the anomaly detection engine 212 raises an alert.
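  • The alerting step summarized above may be sketched as follows. This is a minimal illustration; the threshold of 5 standard deviations is a placeholder, as the disclosure leaves the exact threshold open:

```python
from statistics import mean, stdev

def should_alert(historical_features, latest_feature, threshold=5.0):
    """Outlier test over the quantile feature: compute the mean and
    standard deviation of the feature over a trailing window (e.g.,
    30 days' worth), and alert when the latest value lies more than
    `threshold` standard deviations from the mean."""
    mu = mean(historical_features)
    sigma = stdev(historical_features)
    return abs(latest_feature - mu) / sigma > threshold
```

With the numbers from FIGS. 9-11 (mean 3.7, standard deviation 0.3), a new quantile of 29.6 sits roughly 86 standard deviations out and would trigger an alert under any reasonable threshold.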
  • A method for detecting an anomaly in a computer system is now described in greater detail. For example, FIG. 12 is a flowchart diagram illustrating a method 1200 for detecting an anomaly in a computer system, according to an example embodiment.
  • As FIG. 12 illustrates, the method 1200 may begin at operation 1202 when the probe module 210 executes probes on an evolving data set. Each probe returns a result derived from the evolving data set. By way of example and not limitation, the probe module 210 may issue a set of queries to the searching engine 208 of FIG. 2. The probe module 210 may then receive search results for each of the queries issued to the searching engine 208. It is to be appreciated that each of the probes (e.g., search queries) may be different and, accordingly, each result may also be different.
  • In some embodiments, as part of operation 1202, the probe module 210 is further configured to derive property values for each result returned from the probes. As discussed above, a property value may include data that quantifies a property or aspect of a result. To illustrate, again by way of example and not limitation, where the probe module 210 is configured to transmit a set of queries to the searching engine 208, the property value may represent, for example, a value for the property of the number of items returned in the result, the average list price in the result, a measurement of the number of items that are auctions relative to non-auction items in the result, a number of classified listings in the result, or any other suitable property.
  • Thus, in some embodiments, the execution of operation 1202 may result in a data table that includes a number of property values that each correspond to one of the probes executed by the probe module 210. Further, as the probe module 210 may monitor more than one property type, the table may include multiple columns, where each column corresponds to a different property type. This is shown in FIG. 13. FIG. 13 is a diagram illustrating a property value table 1300 that may be generated based on executing the probes, according to an example embodiment. The property value table 1300 may store the property values (organized by property types 1304) collected for a single iteration of the probes 1302. As FIG. 13 shows, for a single probe, the probe module 210 may derive property values for multiple property types. Further, for a single property type, the probe module 210 may derive multiple property values, each corresponding to a different probe. Thus, a single property value may be specific to a probe-property type pair. For example, property value 1310 may be specific to the Probe2-Property Type2 pair.
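  • One iteration of operation 1202 may be sketched as follows. The representation of the table as a nested dictionary and all function names are illustrative assumptions:

```python
def build_property_value_table(probes, property_types, execute):
    """Run each probe once and derive one property value per property
    type, producing a table in the shape of FIG. 13: one row per
    probe, one column per property type. `execute(probe)` returns the
    probe's result; each entry of `property_types` maps a property
    type name to a function deriving a value from that result."""
    table = {}
    for probe in probes:
        result = execute(probe)
        table[probe] = {name: derive(result)
                        for name, derive in property_types.items()}
    return table
```

Each cell `table[probe][property_type]` then corresponds to one probe-property type pair, such as the property value 1310 for the Probe2-Property Type2 pair.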
  • With reference back to FIG. 12, at operation 1204, the anomaly detection engine 212 may generate surprise scores for each of the property values. Each surprise score may be generated based on a comparison of a property value and historical property values that correspond to a probe-property type pair. For example, for a probe, the surprise score may be based on a function of the property value and a deviation from an expected value. An expected value may be determined based on a robust estimation of the tendency of the historical property values for that probe. A median, mode, mean, and trimmed mean are all examples of a robust estimation of the tendency of the value for the feature that can be used to generate a surprise score. In some cases, the surprise score for a value may be based on a standard deviation from the tendency for that value of the property. Thus, operation 1204 may generate a surprise score for the latest results, where each surprise score corresponds to one of the queries in the set of queries. For example, FIG. 14 is a diagram showing property values 1400 for a probe-property type pair, according to an example embodiment. As shown in FIG. 14, the property values 1400 may include the property value 1310 received as part of a current iteration of executing the probes. As discussed with respect to FIG. 13, the property value 1310 may be specific to the Probe2-Property Type2 pair. The property values 1400 may also include historical property values 1402 that were obtained in past iterations of executing the probes. The historical property values 1402 are specific to the same probe-property type pair as the property value 1310 (e.g., the Probe2-Property Type2 pair). The surprise score is based on the deviation of the sample property value 1310 from the historical property values 1402.
  • As discussed above with respect to the operation 1204 shown in FIG. 12, a surprise score is generated for each probe-property type pair. This is shown in FIG. 15, which is a diagram illustrating a surprise score table 1500, according to an example embodiment. The surprise score table 1500 includes a surprise score for each of the probe-property type pairs. The surprise score table 1500 is generated by calculating a surprise score in the manner discussed with respect to FIG. 14. That is, a surprise score is generated for each probe-property type pair based on a comparison between the property value corresponding to the probe-property type pair and the historical property values for the probe-property type pair.
  • With reference back to FIG. 12, at operation 1206, the anomaly detection engine 212 may access a plurality of historical surprise scores previously generated by the anomaly detection engine 212. In some cases, the historical surprise scores accessed at operation 1206 may be based on past iterations of executing the probes. Further, in some cases, the historical surprise scores may be specific to a particular property type.
  • At operation 1208, responsive to a comparison between the plurality of surprise scores and the plurality of historical surprise scores, the anomaly detection engine 212 may alert a monitoring system of an anomaly regarding the evolving data set. With momentary reference to FIG. 15, the idea of operation 1208 is that an alert is generated if the surprise scores for a given property type (e.g., Property Type2), across all probes (e.g., Probes1-6), are out of the norm relative to historical surprise scores for those probe-property type pairs.
  • The comparison used by operation 1208 may be based on a feature derived from the surprise scores. To illustrate, FIG. 16 is a chart showing a measurement of a feature of the surprise scores generated by operation 1204, according to an example embodiment. The measurement of the feature shown in FIG. 16 is for the surprise scores generated during a single iteration of the execution of the probes. For example, the chart 1600 may measure the feature for the surprise scores generated across Probes1-6 for Property Type2. For example, the feature may be a measurement of how far a surprise score deviates from an expected value. An expected value for the surprise score may be calculated based on a robust estimation of the tendency of the historical surprise scores for that property type. In some cases, the feature may be a quantile of the surprise scores, or a quantile of data derived from the surprise scores (e.g., the deviation from the expected value). Still further, in some cases, the feature may be a measurement or count of the number of surprise scores that deviate beyond a threshold amount from the expected value, or that exceed a fixed surprise score.
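  • The count-based variant of the feature mentioned above may be sketched as follows; the cutoff value is an illustrative assumption:

```python
def count_high_surprise(scores, cutoff):
    """Count-based feature: the number of surprise scores in one
    iteration that exceed a fixed cutoff. An alternative to the
    quantile feature for summarizing scores across all probes."""
    return sum(1 for s in scores if s > cutoff)
```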
  • As part of operation 1208, the feature of the surprise scores may then be compared against historical surprise scores from past iterations of executing the probes. This is shown in FIG. 17. FIG. 17 is a chart illustrating surprise score features 1700 over time, according to an example embodiment. For example, the surprise score feature 1702 may be a feature of the surprise scores for a current iteration of the execution of the probes, while the historical surprise score features 1704 are features of surprise scores from past iterations of executing the probes. Here, operation 1208 may alert if the feature of the surprise score 1702 deviates from the historical surprise score features 1704 beyond a threshold amount.
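One way the operation 1208 comparison might be implemented, assuming a simple mean-and-standard-deviation test over the features from past iterations (the multiplier k is an assumed tuning knob; the disclosure does not specify the test):

```python
import statistics

def should_alert(current_feature, historical_features, k=3.0):
    """Alert when the current iteration's feature is more than k standard
    deviations away from the mean of features from past iterations."""
    mean = statistics.mean(historical_features)
    stdev = statistics.pstdev(historical_features)
    if stdev == 0:
        # Degenerate history: any departure from the constant value alerts.
        return current_feature != mean
    return abs(current_feature - mean) > k * stdev

print(should_alert(9.0, [1.0, 1.2, 0.8, 1.1]))  # True: far outside the norm
print(should_alert(1.1, [1.0, 1.2, 0.8, 1.1]))  # False: within the norm
```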
  • It is to be appreciated that the operations 1206 and 1208 shown in FIG. 12 may be repeated across all the property types monitored by the probe module 210. For example, with reference to FIG. 5, the operations 1206 and 1208 may execute across Property Types1-5.
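Repeating the operations across property types can be sketched as follows (the function names are illustrative, and the comparison shown is a stand-in for the feature-based test of operation 1208):

```python
def check_all_property_types(property_types, get_scores, get_history, alert):
    """Run operations 1206 and 1208 once per monitored property type."""
    for property_type in property_types:
        scores = get_scores(property_type)    # surprise scores, operation 1204
        history = get_history(property_type)  # historical scores, operation 1206
        # Stand-in comparison: alert when the current iteration's largest
        # surprise score exceeds the historical maximum (operation 1208).
        if max(scores) > max(history):
            alert(property_type)

alerts = []
check_all_property_types(
    ["Property Type1", "Property Type2"],
    get_scores=lambda p: [5.0] if p == "Property Type2" else [0.1],
    get_history=lambda p: [1.0, 0.9],
    alert=alerts.append,
)
print(alerts)  # ['Property Type2']
```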
  • It is to be further appreciated that although much of this disclosure discusses anomaly detection in the context of a search engine, other example embodiments may use the anomaly detection methods and systems described herein to detect anomalies in other types of computer systems. For example, the computer system may be an inventory data store. In such a case, the probe module 210 may be configured to detect as property types, among other things, the number of items stored per category, the number of auction items per category, and the like.
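For example, an inventory probe of this kind might be sketched as follows, over hypothetical sample data (the field names are assumptions for illustration):

```python
from collections import Counter

# Hypothetical inventory records from an inventory data store.
inventory = [
    {"id": 1, "category": "electronics", "auction": True},
    {"id": 2, "category": "electronics", "auction": False},
    {"id": 3, "category": "books", "auction": True},
]

# Property types derived by the probe: items stored per category, and
# auction items per category.
items_per_category = Counter(item["category"] for item in inventory)
auctions_per_category = Counter(
    item["category"] for item in inventory if item["auction"]
)

print(items_per_category["electronics"])  # 2
print(auctions_per_category["books"])     # 1
```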
  • As another example, the computer system may be a computer infrastructure (e.g., a collection of computer servers). In such a case, the probe module 210 may be configured to detect as property types, among other things, a processor load, bandwidth consumption, thread count, running processes count, memory usage, throughput count, rate of disk seeks, rate of packets transmitted or received, or rate of response.
  • In other embodiments, the property values tracked by the anomaly detection engine 212 may include dimensions in addition to those described above. For example, in the embodiment discussed above, one may conceptualize a table used to store the property values, where the columns are the metrics tracked by the different probes, and the rows are the values for those property types at different times. An extension would be a 3D table, or cube. For each (property, value) cell, there may be a series of aspects instead of a single number. An aspect may be conceptualized as a vertical stack out of the page. In example (1), the aspect might be different countries. Thus, a cell in the table may relate to a specific query (perhaps ‘iPhone 5S’) and property (perhaps number of results). But using the aspects, the results vary by country, so the single cell is replaced by a stack of entries, one for each country.
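A minimal sketch of this extension, using a dictionary of per-aspect stacks (the property names, timestamps, and country codes are hypothetical):

```python
# Each (property, time) cell holds a stack of values, one per aspect --
# here, a per-country breakdown instead of a single number.
cube = {}

def record(property_type, timestamp, aspect, value):
    cube.setdefault((property_type, timestamp), {})[aspect] = value

# A single cell -- e.g., number of results for a query -- becomes a stack
# of per-country entries rather than one value.
record("num_results", "2013-12-30T00:00", "US", 1200)
record("num_results", "2013-12-30T00:00", "DE", 950)

print(cube[("num_results", "2013-12-30T00:00")])  # {'US': 1200, 'DE': 950}
```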
  • As mentioned in the previous section, the anomaly detection engine 212 may be configured to detect a problem with the search software (a disruption) before users do. In some cases, the property values may be received from the same interfaces and using the same computer systems used by the end users. In such cases, the metric data received from the probe module 210 is a proxy for, if not identical to, the user experience of users of that computer system. Accordingly, it is to be appreciated that when this disclosure states the anomaly detection engine 212 may detect a problem before users do, it may simply mean that the anomaly detection engine 212 can detect and report a problem without intervention from a user. Thus, compared to traditional systems, example embodiments may use the anomaly detection engine 212 to provide comparatively quick detection of site problems.
  • Example Computer System
  • FIG. 18 shows a diagrammatic representation of a machine in the example form of a computer system 1800 within which a set of instructions, for causing the machine to perform any one or more of the methodologies discussed herein, may be executed. The computer system 1800 comprises, for example, any of the device machine 110, device machine 112, application servers 118, API server 114, web server 116, database servers 124, or third party server 130. In alternative embodiments, the machine operates as a standalone device or may be connected (e.g., networked) to other machines. In a networked deployment, the machine may operate in the capacity of a server or a device machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment. The machine may be a server computer, a client computer, a personal computer (PC), a tablet, a set-top box (STB), a Personal Digital Assistant (PDA), a smart phone, a cellular telephone, a web appliance, a network router, switch, or bridge, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine. Further, while only a single machine is illustrated, the term “machine” shall also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein.
  • The example computer system 1800 includes a processor 1802 (e.g., a central processing unit (CPU), a graphics processing unit (GPU), or both), a main memory 1804 and a static memory 1806, which communicate with each other via a bus 1808. The computer system 1800 may further include a video display unit 1810 (e.g., liquid crystal display (LCD), organic light emitting diode (OLED), touch screen, or a cathode ray tube (CRT)). The computer system 1800 also includes an alphanumeric input device 1812 (e.g., a physical or virtual keyboard), a cursor control device 1814 (e.g., a mouse, a touch screen, a touchpad, a trackball, a trackpad), a disk drive unit 1816, a signal generation device 1818 (e.g., a speaker) and a network interface device 1820.
  • The disk drive unit 1816 includes a machine-readable medium 1822 on which is stored one or more sets of instructions 1824 (e.g., software) embodying any one or more of the methodologies or functions described herein. The instructions 1824 may also reside, completely or at least partially, within the main memory 1804 and/or within the processor 1802 during execution thereof by the computer system 1800, the main memory 1804 and the processor 1802 also constituting machine-readable media.
  • The instructions 1824 may further be transmitted or received over a network 1826 via the network interface device 1820.
  • While the machine-readable medium 1822 is shown in an example embodiment to be a single medium, the term “machine-readable medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions. The term “machine-readable medium” shall also be taken to include any medium that is capable of storing, encoding or carrying a set of instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of the present invention. The term “machine-readable medium” shall accordingly be taken to include, but not be limited to, solid-state memories, optical and magnetic media, and carrier wave signals.
  • It will be appreciated that, for clarity purposes, the above description describes some embodiments with reference to different functional units or processors. However, it will be apparent that any suitable distribution of functionality between different functional units, processors or domains may be used without detracting from the invention. For example, functionality illustrated to be performed by separate processors or controllers may be performed by the same processor or controller. Hence, references to specific functional units are only to be seen as references to suitable means for providing the described functionality, rather than indicative of a strict logical or physical structure or organization.
  • Certain embodiments described herein may be implemented as logic or a number of modules, engines, components, or mechanisms. A module, engine, logic, component, or mechanism (collectively referred to as a “module”) may be a tangible unit capable of performing certain operations and configured or arranged in a certain manner. In certain example embodiments, one or more computer systems (e.g., a standalone, client, or server computer system) or one or more components of a computer system (e.g., a processor or a group of processors) may be configured by software (e.g., an application or application portion) or firmware (note that software and firmware can generally be used interchangeably herein as is known by a skilled artisan) as a module that operates to perform certain operations described herein.
  • In various embodiments, a module may be implemented mechanically or electronically. For example, a module may comprise dedicated circuitry or logic that is permanently configured (e.g., within a special-purpose processor, application specific integrated circuit (ASIC), or array) to perform certain operations. A module may also comprise programmable logic or circuitry (e.g., as encompassed within a general-purpose processor or other programmable processor) that is temporarily configured by software or firmware to perform certain operations. It will be appreciated that a decision to implement a module mechanically, in dedicated and permanently configured circuitry, or in temporarily configured circuitry (e.g., configured by software) may be driven by, for example, cost, time, energy-usage, and package size considerations.
  • Accordingly, the term “module” should be understood to encompass a tangible entity, be that an entity that is physically constructed, permanently configured (e.g., hardwired), non-transitory, or temporarily configured (e.g., programmed) to operate in a certain manner or to perform certain operations described herein. Considering embodiments in which modules or components are temporarily configured (e.g., programmed), each of the modules or components need not be configured or instantiated at any one instance in time. For example, where the modules or components comprise a general-purpose processor configured using software, the general-purpose processor may be configured as respective different modules at different times. Software may accordingly configure the processor to constitute a particular module at one instance of time and to constitute a different module at a different instance of time.
  • Modules can provide information to, and receive information from, other modules. Accordingly, the described modules may be regarded as being communicatively coupled. Where multiples of such modules exist contemporaneously, communications may be achieved through signal transmission (e.g., over appropriate circuits and buses) that connect the modules. In embodiments in which multiple modules are configured or instantiated at different times, communications between such modules may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple modules have access. For example, one module may perform an operation and store the output of that operation in a memory device to which it is communicatively coupled. A further module may then, at a later time, access the memory device to retrieve and process the stored output. Modules may also initiate communications with input or output devices and can operate on a resource (e.g., a collection of information).
  • Although the present invention has been described in connection with some embodiments, it is not intended to be limited to the specific form set forth herein. One skilled in the art would recognize that various features of the described embodiments may be combined in accordance with the invention. Moreover, it will be appreciated that various modifications and alterations may be made by those skilled in the art without departing from the scope of the invention.
  • The Abstract is provided to allow the reader to quickly ascertain the nature of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims. In addition, in the foregoing Detailed Description, it can be seen that various features are grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed embodiments require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed embodiment. Thus the following claims are hereby incorporated into the Detailed Description, with each claim standing on its own as a separate embodiment.

Claims (20)

What is claimed is:
1. A computer-implemented system comprising:
a probe module implemented by one or more processors and configured to execute a plurality of probes on an evolving data set, each probe from the plurality of probes returning a result, the probe module further configured to derive a plurality of property values, each property value from the plurality of property values being derived from a respective result returned by a corresponding probe of the plurality of probes; and
an anomaly detection engine implemented by the one or more processors and configured to:
generate a plurality of surprise scores corresponding to the plurality of property values, each surprise score being generated based on a comparison of a corresponding property value from the plurality of property values and historical property values, the corresponding property value and the historical property values having been derived from results returned from the same probe,
access a plurality of historical surprise scores generated by the anomaly detection engine, and
responsive to a comparison between the plurality of surprise scores and the plurality of historical surprise scores, alert a monitoring system of an anomaly regarding the evolving data set.
2. The computer-implemented system of claim 1, wherein the plurality of surprise scores are generated at a first iteration, and the historical surprise scores were generated at one or more past iterations.
3. The computer-implemented system of claim 1, wherein:
the probe module is further configured to derive a plurality of additional property values, each additional property value from the plurality of additional property values being derived from the respective result returned by the corresponding probe of the plurality of probes, the plurality of additional property values relating to a different property than the plurality of property values; and
the anomaly detection engine is further configured to:
generate a plurality of additional surprise scores corresponding to the plurality of additional property values, each additional surprise score being generated based on a comparison of a corresponding additional property value from the plurality of additional property values and additional historical property values, the corresponding additional property value and the additional historical property values having been derived from results returned from the same probe,
access a plurality of additional historical surprise scores generated by the anomaly detection engine, and
responsive to a comparison between the plurality of additional surprise scores and the plurality of additional historical surprise scores, alert the monitoring system of an additional anomaly regarding the evolving data set.
4. The computer-implemented system of claim 1, wherein the comparison between the plurality of surprise scores and the plurality of historical surprise scores includes determining whether the plurality of surprise scores deviate from a historical distribution of the plurality of historical surprise scores.
5. The computer-implemented system of claim 1, wherein the comparison between the plurality of surprise scores and the plurality of historical surprise scores includes determining whether a number of surprise scores from the plurality of surprise scores deviates from the plurality of historical surprise scores.
6. The computer-implemented system of claim 1, wherein the comparison between the plurality of surprise scores and the plurality of historical surprise scores includes determining whether a quantile of the plurality of surprise scores deviates from a historical distribution of quantiles derived from the plurality of historical surprise scores.
7. The computer-implemented system of claim 1, wherein the evolving data set relates to data obtained by monitoring a computer system.
8. The computer-implemented system of claim 1, wherein the evolving data set relates to data obtained by monitoring an inventory of items listed in a database.
9. The computer-implemented system of claim 1, wherein the evolving data set relates to data obtained by monitoring performance metrics of a plurality of servers.
10. The computer-implemented system of claim 9, wherein the plurality of property values relate to at least one of: a processor load, a bandwidth consumption, a thread count, a running processes count, a memory usage, a throughput count, a rate of disk seeks, a rate of packets transmitted or received, or a rate of response.
11. A computer-implemented method comprising:
executing a plurality of probes on an evolving data set, each probe from the plurality of probes returning a result;
deriving a plurality of property values, each property value from the plurality of property values being derived from a respective result returned by a corresponding probe of the plurality of probes;
generating a plurality of surprise scores corresponding to the plurality of property values, each surprise score being generated based on a comparison of a corresponding property value from the plurality of property values and historical property values, the corresponding property value and the historical property values having been derived from results returned from the same probe;
accessing a plurality of historical surprise scores generated by the anomaly detection engine; and responsive to a comparison between the plurality of surprise scores and the plurality of historical surprise scores, alerting a monitoring system of an anomaly regarding the evolving data set.
12. The computer-implemented method of claim 11, wherein the plurality of surprise scores are generated at a first iteration, and the historical surprise scores were generated at one or more past iterations.
13. The computer-implemented method of claim 11, wherein the comparison between the plurality of surprise scores and the plurality of historical surprise scores includes determining whether the plurality of surprise scores deviate from a historical distribution of the plurality of historical surprise scores.
14. The computer-implemented method of claim 11, wherein the comparison between the plurality of surprise scores and the plurality of historical surprise scores includes determining whether a number of surprise scores from the plurality of surprise scores deviates from the plurality of historical surprise scores.
15. The computer-implemented method of claim 11, wherein the comparison between the plurality of surprise scores and the plurality of historical surprise scores includes determining whether a quantile of the plurality of surprise scores deviates from a historical distribution of quantiles derived from the plurality of historical surprise scores.
16. The computer-implemented method of claim 11, wherein the evolving data set relates to data obtained by monitoring a computer system.
17. The computer-implemented method of claim 11, wherein the evolving data set relates to data obtained by monitoring an inventory of items listed in a database.
18. The computer-implemented method of claim 11, wherein the evolving data set relates to data obtained by monitoring performance metrics of a plurality of servers.
19. The computer-implemented method of claim 18, wherein the plurality of property values relate to at least one of: a processor load, a bandwidth consumption, a thread count, a running processes count, a memory usage, a throughput count, a rate of disk seeks, a rate of packets transmitted or received, or a rate of response.
20. A non-transitory computer-readable medium storing executable instructions thereon, which, when executed by a processor, cause the processor to perform operations comprising:
executing a plurality of probes on an evolving data set, each probe from the plurality of probes returning a result;
deriving a plurality of property values, each property value from the plurality of property values being derived from a respective result returned by a corresponding probe of the plurality of probes;
generating a plurality of surprise scores corresponding to the plurality of property values, each surprise score being generated based on a comparison of a corresponding property value from the plurality of property values and historical property values, the corresponding property value and the historical property values having been derived from results returned from the same probe;
accessing a plurality of historical surprise scores generated by the anomaly detection engine; and responsive to a comparison between the plurality of surprise scores and the plurality of historical surprise scores, alerting a monitoring system of an anomaly regarding the evolving data set.
US14/143,185 2013-02-08 2013-12-30 Systems and methods for detecting anomalies Abandoned US20140229414A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US14/143,185 US20140229414A1 (en) 2013-02-08 2013-12-30 Systems and methods for detecting anomalies
US15/141,225 US20160239368A1 (en) 2013-02-08 2016-04-28 Systems and methods for detecting anomalies

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201361762420P 2013-02-08 2013-02-08
US14/143,185 US20140229414A1 (en) 2013-02-08 2013-12-30 Systems and methods for detecting anomalies

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/141,225 Continuation US20160239368A1 (en) 2013-02-08 2016-04-28 Systems and methods for detecting anomalies

Publications (1)

Publication Number Publication Date
US20140229414A1 true US20140229414A1 (en) 2014-08-14

Family

ID=51296747

Family Applications (3)

Application Number Title Priority Date Filing Date
US14/143,185 Abandoned US20140229414A1 (en) 2013-02-08 2013-12-30 Systems and methods for detecting anomalies
US14/176,474 Active 2035-05-03 US10558512B2 (en) 2013-02-08 2014-02-10 Ballast water tank recirculation treatment system
US15/141,225 Abandoned US20160239368A1 (en) 2013-02-08 2016-04-28 Systems and methods for detecting anomalies


Country Status (2)

Country Link
US (3) US20140229414A1 (en)
WO (1) WO2014124357A1 (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160274807A1 (en) * 2015-03-20 2016-09-22 Ricoh Company, Ltd. Information processing apparatus, information processing method, and information processing system
WO2017060778A3 (en) * 2015-09-05 2017-07-20 Nudata Security Inc. Systems and methods for detecting and scoring anomalies
US9842204B2 (en) 2008-04-01 2017-12-12 Nudata Security Inc. Systems and methods for assessing security risk
US9946864B2 (en) 2008-04-01 2018-04-17 Nudata Security Inc. Systems and methods for implementing and tracking identification tests
US9990487B1 (en) 2017-05-05 2018-06-05 Mastercard Technologies Canada ULC Systems and methods for distinguishing among human users and software robots
US10007776B1 (en) 2017-05-05 2018-06-26 Mastercard Technologies Canada ULC Systems and methods for distinguishing among human users and software robots
US10127373B1 (en) 2017-05-05 2018-11-13 Mastercard Technologies Canada ULC Systems and methods for distinguishing among human users and software robots
US10169906B2 (en) 2013-03-29 2019-01-01 Advanced Micro Devices, Inc. Hybrid render with deferred primitive batch binning
US10528533B2 (en) * 2017-02-09 2020-01-07 Adobe Inc. Anomaly detection at coarser granularity of data
WO2020131391A1 (en) * 2018-12-20 2020-06-25 Microsoft Technology Licensing, Llc Automatic anomaly detection in computer processing pipelines
CN113242839A (en) * 2018-12-14 2021-08-10 Abb瑞士股份有限公司 Water treatment system and water treatment method
US20210395106A1 (en) * 2018-10-17 2021-12-23 Organo Corporation Water quality management method, ion adsorption device, information processing device and information processing system


Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050044406A1 (en) * 2002-03-29 2005-02-24 Michael Stute Adaptive behavioral intrusion detection systems and methods
US7337206B1 (en) * 2002-07-15 2008-02-26 Network Physics Method for detecting congestion in internet traffic
US8306931B1 (en) * 2009-08-06 2012-11-06 Data Fusion & Neural Networks, LLC Detecting, classifying, and tracking abnormal data in a data stream
US8639797B1 (en) * 2007-08-03 2014-01-28 Xangati, Inc. Network monitoring of behavior probability density
US20140074731A1 (en) * 2012-09-13 2014-03-13 Fannie Mae System and method for automated data discrepancy analysis
US20150051847A1 (en) * 2011-11-22 2015-02-19 Electric Power Research Institute, Inc. System and method for anomaly detection
US20150186989A1 (en) * 2013-12-27 2015-07-02 Ebay Inc. Pricing and listing configuration recommendation engine
US20150227992A1 (en) * 2006-11-16 2015-08-13 Genea Energy Partners, Inc. Building Optimization Platform And Web-Based Invoicing System
US20150237215A1 (en) * 2009-07-17 2015-08-20 Jaan Leemet Determining Usage Predictions And Detecting Anomalous User Activity Through Traffic Patterns


Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050044406A1 (en) * 2002-03-29 2005-02-24 Michael Stute Adaptive behavioral intrusion detection systems and methods
US7337206B1 (en) * 2002-07-15 2008-02-26 Network Physics Method for detecting congestion in internet traffic
US20150227992A1 (en) * 2006-11-16 2015-08-13 Genea Energy Partners, Inc. Building Optimization Platform And Web-Based Invoicing System
US8639797B1 (en) * 2007-08-03 2014-01-28 Xangati, Inc. Network monitoring of behavior probability density
US20150237215A1 (en) * 2009-07-17 2015-08-20 Jaan Leemet Determining Usage Predictions And Detecting Anomalous User Activity Through Traffic Patterns
US8306931B1 (en) * 2009-08-06 2012-11-06 Data Fusion & Neural Networks, LLC Detecting, classifying, and tracking abnormal data in a data stream
US20150051847A1 (en) * 2011-11-22 2015-02-19 Electric Power Research Institute, Inc. System and method for anomaly detection
US20140074731A1 (en) * 2012-09-13 2014-03-13 Fannie Mae System and method for automated data discrepancy analysis
US20150186989A1 (en) * 2013-12-27 2015-07-02 Ebay Inc. Pricing and listing configuration recommendation engine

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9946864B2 (en) 2008-04-01 2018-04-17 Nudata Security Inc. Systems and methods for implementing and tracking identification tests
US11036847B2 (en) 2008-04-01 2021-06-15 Mastercard Technologies Canada ULC Systems and methods for assessing security risk
US10997284B2 (en) 2008-04-01 2021-05-04 Mastercard Technologies Canada ULC Systems and methods for assessing security risk
US10839065B2 (en) 2008-04-01 2020-11-17 Mastercard Technologies Canada ULC Systems and methods for assessing security risk
US9842204B2 (en) 2008-04-01 2017-12-12 Nudata Security Inc. Systems and methods for assessing security risk
US10169906B2 (en) 2013-03-29 2019-01-01 Advanced Micro Devices, Inc. Hybrid render with deferred primitive batch binning
US10162539B2 (en) * 2015-03-20 2018-12-25 Ricoh Company, Ltd. Information processing apparatus, information processing method, and information processing system
US20160274807A1 (en) * 2015-03-20 2016-09-22 Ricoh Company, Ltd. Information processing apparatus, information processing method, and information processing system
US10129279B2 (en) 2015-09-05 2018-11-13 Mastercard Technologies Canada ULC Systems and methods for detecting and preventing spoofing
US10965695B2 (en) 2015-09-05 2021-03-30 Mastercard Technologies Canada ULC Systems and methods for matching and scoring sameness
CN108780479B (en) * 2015-09-05 2022-02-11 万事达卡技术加拿大无限责任公司 System and method for detecting and scoring anomalies
WO2017060778A3 (en) * 2015-09-05 2017-07-20 Nudata Security Inc. Systems and methods for detecting and scoring anomalies
CN108780479A (en) * 2015-09-05 2018-11-09 万事达卡技术加拿大无限责任公司 For to the abnormal system and method for being detected and scoring
US9749358B2 (en) 2015-09-05 2017-08-29 Nudata Security Inc. Systems and methods for matching and scoring sameness
US9813446B2 (en) 2015-09-05 2017-11-07 Nudata Security Inc. Systems and methods for matching and scoring sameness
US9800601B2 (en) 2015-09-05 2017-10-24 Nudata Security Inc. Systems and methods for detecting and scoring anomalies
US9749356B2 (en) 2015-09-05 2017-08-29 Nudata Security Inc. Systems and methods for detecting and scoring anomalies
US10212180B2 (en) 2015-09-05 2019-02-19 Mastercard Technologies Canada ULC Systems and methods for detecting and preventing spoofing
US9979747B2 (en) 2015-09-05 2018-05-22 Mastercard Technologies Canada ULC Systems and methods for detecting and preventing spoofing
US9749357B2 (en) 2015-09-05 2017-08-29 Nudata Security Inc. Systems and methods for matching and scoring sameness
US10749884B2 (en) 2015-09-05 2020-08-18 Mastercard Technologies Canada ULC Systems and methods for detecting and preventing spoofing
US10805328B2 (en) 2015-09-05 2020-10-13 Mastercard Technologies Canada ULC Systems and methods for detecting and scoring anomalies
US10528533B2 (en) * 2017-02-09 2020-01-07 Adobe Inc. Anomaly detection at coarser granularity of data
US10127373B1 (en) 2017-05-05 2018-11-13 Mastercard Technologies Canada ULC Systems and methods for distinguishing among human users and software robots
US10007776B1 (en) 2017-05-05 2018-06-26 Mastercard Technologies Canada ULC Systems and methods for distinguishing among human users and software robots
US9990487B1 (en) 2017-05-05 2018-06-05 Mastercard Technologies Canada ULC Systems and methods for distinguishing among human users and software robots
US20210395106A1 (en) * 2018-10-17 2021-12-23 Organo Corporation Water quality management method, ion adsorption device, information processing device and information processing system
CN113242839A (en) * 2018-12-14 2021-08-10 Abb瑞士股份有限公司 Water treatment system and water treatment method
WO2020131391A1 (en) * 2018-12-20 2020-06-25 Microsoft Technology Licensing, Llc Automatic anomaly detection in computer processing pipelines
US10901746B2 (en) 2018-12-20 2021-01-26 Microsoft Technology Licensing, Llc Automatic anomaly detection in computer processing pipelines
CN113227978A (en) * 2018-12-20 2021-08-06 微软技术许可有限责任公司 Automatic anomaly detection in computer processing pipelines

Also Published As

Publication number Publication date
US20140224714A1 (en) 2014-08-14
US10558512B2 (en) 2020-02-11
WO2014124357A1 (en) 2014-08-14
US20160239368A1 (en) 2016-08-18

Similar Documents

Publication Publication Date Title
US20160239368A1 (en) Systems and methods for detecting anomalies
JP7465939B2 (en) A Novel Non-parametric Statistical Behavioral Identification Ecosystem for Power Fraud Detection
US10354309B2 (en) Methods and systems for selecting an optimized scoring function for use in ranking item listings presented in search results
US9323811B2 (en) Query suggestion for e-commerce sites
US11074546B2 (en) Global back-end taxonomy for commerce environments
US11392963B2 (en) Determining and using brand information in electronic commerce
US20160012124A1 (en) Methods for automatic query translation
US11734736B2 (en) Building containers of uncategorized items
US10140339B2 (en) Methods and systems for simulating a search to generate an optimized scoring function
US20220083556A1 (en) Managing database offsets with time series
US20150254680A1 (en) Utilizing product and service reviews
US9424352B2 (en) View item related searches
AU2014321274B2 (en) Recommendations for selling past purchases
WO2013090475A1 (en) Recognizing missing offerings in a marketplace
US20160019623A1 (en) 2016-01-21 International search result weighting
US20150095147A1 (en) Monetizing qualified leads
US20130262507A1 (en) Method and system to provide inline saved searches
US20150134417A1 (en) Methods, systems, and apparatus for dynamic consumer segmentation

Legal Events

Date Code Title Description
AS Assignment

Owner name: EBAY INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:GOLDBERG, DAVID;SHAN, YINAN SYNC;SIGNING DATES FROM 20131213 TO 20131219;REEL/FRAME:031857/0546

AS Assignment

Owner name: PAYPAL, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:EBAY INC.;REEL/FRAME:036171/0144

Effective date: 20150717

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION