US20140372175A1 - Method and system for detection, classification and prediction of user behavior trends - Google Patents

Method and system for detection, classification and prediction of user behavior trends

Info

Publication number
US20140372175A1
Authority
US
United States
Prior art keywords
data
users
user
raw data
attributes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US14/303,621
Inventor
Noopur Jain
Santanu Chaudhury
Jobin WILSON
Prateek Kapadia
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Flytxt BV
Original Assignee
Flytxt BV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Flytxt BV
Assigned to Flytxt B.V. Assignment of assignors' interest (see document for details). Assignors: KAPADIA, PRATEEK; WILSON, Jobin; JAIN, NOOPUR; CHAUDHURY, SANTANU
Publication of US20140372175A1
Priority to US16/137,328 (US11461795B2)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W76/00 Connection management
    • H04W76/10 Connection setup
    • H04W76/12 Setup of transport tunnels
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W84/00 Network topologies
    • H04W84/02 Hierarchically pre-organised networks, e.g. paging networks, cellular networks, WLAN [Wireless Local Area Network] or WLL [Wireless Local Loop]
    • H04W84/04 Large scale networks; Deep hierarchical networks
    • H04W84/042 Public Land Mobile systems, e.g. cellular systems
    • H04W84/047 Public Land Mobile systems, e.g. cellular systems using dedicated repeater stations

Definitions

  • Embodiments herein relate to the field of predictive analytics and more particularly to a method and system for detection, classification and prediction of behaviour trends using correspondence analysis.
  • the existing methods of trend recognition and predictions based on numerical time series data are based on individual users, where each user is treated as an independent entity.
  • the representation as well as grouping of millions of users (for example users in a telecommunications network) based on such time-series data is an expensive option in terms of space and time complexity.
  • the existing system lacks the mechanism for a low-dimension representation of the time series for global trending pattern of a data set.
  • FIG. 1 illustrates an overview for detection and classification of user behavior trends using correspondence analysis, according to the embodiments as disclosed herein;
  • FIG. 2 illustrates a flow diagram explaining the various steps involved in predicting the user behavior trends using the correspondence analysis, according to the embodiments as disclosed herein;
  • FIG. 3 depicts the process of reducing the dimensions of data, according to embodiments as disclosed herein;
  • FIG. 4 depicts the process of clustering, according to embodiments as disclosed herein;
  • FIG. 5 is a flowchart illustrating the process of optimizing campaigns and performing product bundling for a user based on clusters, according to embodiments as disclosed herein;
  • FIG. 6 is a graph showing the representation of users in a low dimensional feature space, according to the embodiments as disclosed herein;
  • FIG. 7 is a graph showing the grouping of users having similar trends over certain time period, according to the embodiments as disclosed herein.
  • FIG. 8 illustrates a computing environment implementing the method and system for detection and classification of user behavior trends using correspondence analysis, according to the embodiments as disclosed herein.
  • the method provides a framework for clustering or grouping the users, representing their trends in an n-dimensional space, using correspondence analysis.
  • the method reduces the n-dimensional feature space to a lower dimensional space for easy processing, better interpretation and for generating superior quality clusters. Further, the method applies the correspondence analysis so that each user is assigned a new coordinate in the lower dimension which maintains the similarity, difference and relationship between the variables.
  • clustering or grouping of the coordinates based on the similar trends of the users is performed. Further, unlabeled cluster members are assigned class membership proportional to the labeled samples in the cluster. Finally, the method predicts the future actions of the users based on the past trends that are observed from the labeled clusters. Completely unlabeled clusters may be inspected by an administrator for the purpose of manual analysis, labeling and mapping to predicted trends and actions.
  • the embodiments herein achieve a method and system that provides a scalable mechanism for grouping the users based on similar trends in n-dimensional space using correspondence analysis.
  • the method and system is applicable in the context of any user transaction based system (for example in a telecom network, banking system and so on).
  • the method provides a framework for clustering or grouping the users representing similar trends in the n-dimensional space using correspondence analysis.
  • the correspondence analysis is used to recognize the trends or nature of the users on the basis of their numerical attributes as well as temporal variation of such attributes.
  • the method and system disclosed herein reduces the n-dimensional feature space to a lower dimensional space for easy processing and interpretation, without losing the trend information of each user, using correspondence analysis. Further, each user is assigned with a new coordinate in the lower dimension which maintains a similarity, difference and the relationship between the variables, as they existed in the higher dimensional space.
  • clustering or grouping of the coordinates based on the similar trends of the users is performed.
  • unlabeled cluster members are assigned class membership proportional to the labeled samples in the cluster.
  • the method predicts the future actions of the users based on the past trends that are observed from the labeled clusters.
  • the principal object of the embodiments herein is to provide a scalable method and system for detection, classification and prediction of behaviour trends using correspondence analysis.
  • Another object of the embodiments herein is to provide a scalable method and system for effectively reducing the dimensional space using correspondence analysis on numerical multinomial data for reduction of complexity in cluster analysis and to improve quality of emerging clusters, along with superior prediction accuracies.
  • FIGS. 1 through 5 where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments.
  • FIG. 1 illustrates an overview for detection and classification of user behavior trends using correspondence analysis, according to the embodiments as disclosed herein.
  • a system for example a telecom network, banking system and so on.
  • the transactions of the users are recorded in the network 101 .
  • the network 101 maintains the raw transaction logs of all the users in a file server 102 .
  • the file server 102 comprises the files that have all the transactional details of the users.
  • the raw transactional logs that are present in the file server 102 are uploaded by scheduled data upload jobs orchestrated by cluster master 103 into a distributed file system 105 .
  • Jobs orchestrated by cluster master 103 perform clustering of the users based on their transactions that are having similar trends, in a distributed fashion, over the worker nodes 104 .
  • the n-dimensional feature space of the raw transactional logs is reduced to a lower dimensional space for easy processing and interpretation, without losing the trend information of each user.
  • correspondence analysis is used for trend recognition and dimensionality reduction of the raw transactional data of the users.
  • the correspondence analysis is a descriptive technique that is designed to analyze simple two-way and multi-way tables containing some measure of correspondence between rows and columns.
  • the correspondence analysis is used to recognize the trend of users on the basis of temporal variations of their numerical attributes.
  • Each column of the correspondence table represents a numerical attribute and all the columns will be the observations of the same variable over time at different time instances.
  • the cluster master 103 maintains the uploaded files over the worker nodes 104 and distributed file system 105 (any distributed file system or memory).
  • the raw transactional data logs are distributed across multiple machines and the correspondence analysis is applied on the data.
  • clustering or grouping of the coordinates based on the similar trends of the users is performed. Further, unlabeled cluster members are assigned class membership proportional to the labeled samples in the cluster. Finally, the method predicts the future actions of the users based on the past trends that are observed from the labeled clusters.
  • the cluster master 103 further applies association rule mining on the clusters discovered in a lower dimensional space.
  • the cluster master 103 further uses the discovered rules for user targeted applications, such as optimizing advertising campaigns, performing product bundling, pricing and so on.
  • the cluster master 103 may be a standalone device.
  • the cluster master 103 may comprise a plurality of devices, implemented using a distributed architecture.
  • the cluster master 103 may be implemented on the cloud.
  • FIG. 2 illustrates a flow diagram explaining the various steps involved in predicting the user behavior trends using the correspondence analysis, according to the embodiments as disclosed herein.
  • the method obtains ( 201 ) the raw data from the network that corresponds to a particular domain.
  • the domain may include but is not limited to a telecommunications network or a banking system.
  • in the telecom domain, all the transactions are recorded and stored in a network, and in a banking system, all the transactions of the users are stored in a bank server.
  • Xij can be value of any numerical attribute observed at different time instances.
  • Data in this case is of u*t dimension or each subject is measured in t-dimensional space.
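The u×t layout described above can be sketched as follows. This is a minimal illustration, assuming the raw logs have already been parsed into (user, period, value) rows; the function name `build_trend_matrix`, the record layout and the sample values are hypothetical and do not come from the source.

```python
from collections import defaultdict

def build_trend_matrix(records, num_periods):
    """Return (user_ids, X) where X[i][j] is the attribute total of user i in period j."""
    totals = defaultdict(lambda: [0.0] * num_periods)
    for user_id, period, value in records:
        totals[user_id][period] += value
    user_ids = sorted(totals)
    return user_ids, [totals[u] for u in user_ids]

# Hypothetical log rows: (user id, period index, minutes of usage).
records = [
    ("u1", 0, 30.0), ("u1", 1, 45.0), ("u1", 2, 60.0),
    ("u2", 0, 80.0), ("u2", 1, 40.0), ("u2", 2, 20.0),
    ("u1", 2, 5.0),  # a second transaction in the same period is summed
]
user_ids, X = build_trend_matrix(records, num_periods=3)
```

Each row of `X` is one user measured in t-dimensional space, with t = 3 in this example.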
  • the transactional data of the users can be obtained from a network designed for storing such data (for example, the telecom network or the bank server).
  • the method performs ( 202 ) pre-processing and feature selection on the raw data.
  • the preprocessing and feature selection on the raw data comprises determining the attributes of the users.
  • One such attribute of the user can be minutes of usage (may be usage of a network in telecommunications domain).
  • the method obtains ( 203 ) trend data from the raw transactional logs.
  • the trend data includes the values that change over time.
  • once the feature selection and trend data have been obtained from the raw transactional logs, the method reduces ( 204 ) the dimensionality of the raw data (which is n-dimensional multinomial data) using correspondence analysis ( 301 , 302 ) (as depicted in FIG. 3 ).
  • the new coordinates will be such that those users who are following a similar trend in the multidimensional time series domain will be closer to each other than those who are dissimilar.
  • if the data can be mapped from t-dimensional to 2 or 3-dimensional space without losing much information about the trend of the subscribers, then it will be more easily interpretable and analyzable, and more efficiently represented, than the data in t dimensions.
  • Correspondence analysis is an exploratory data analysis technique for contingency tables and multivariate or multinomial data. Correspondence analysis also emphasizes the graphical representation of the result in a lower dimension for easy interpretation, maintaining the similarity or dissimilarity between the rows and the columns of the table. Embodiments herein apply correspondence analysis in applications where the trend of high dimensional user data with numerical multidimensional attributes in the time series domain is required. Correspondence analysis is used to determine similarities and differences among the trends of users with respect to their behavior over time and to depict the same graphically in a joint low-dimensional space.
  • Correspondence analysis assigns each user a coordinate in the lower dimension maintaining the similarity, difference and relationship between the variables in the rows and columns of the table, which means those rows which are similar in their trend will be close to each other in the new low dimensional space and those which are dissimilar will be far apart.
  • Correspondence analysis is based on the eigenvalues of a matrix, so it can be used for dimension reduction in a manner similar to principal component analysis, which enables an easier interpretation of results. The similarity between users in the new low dimensional space can be graphically visualized.
  • the correspondence analysis is used to recognize the trend of users (subscribers) on the basis of their numerical attributes. Once the correspondence analysis is applied, the correspondence table is generated. Each column of the correspondence table represents a numerical attribute and all the columns will be the observations of the same variable over time at different time instances.
  • the method obtains the number of target dimensions (for example, it can be 2-dimensional or 3-dimensional based on the requirement) as an input for reducing the dimensionality of transactional data of the users.
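The reduction step can be illustrated with the textbook formulation of correspondence analysis (singular value decomposition of the standardized residuals of the table). The sketch below is not taken from the source: the function name `correspondence_rows` and the sample table are hypothetical, NumPy is assumed to be available, and all entries are assumed non-negative with non-zero row and column totals.

```python
import numpy as np

def correspondence_rows(X, n_dims=2):
    """Row principal coordinates of a users-by-time table via correspondence analysis."""
    X = np.asarray(X, dtype=float)
    P = X / X.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses (one per user)
    c = P.sum(axis=0)                    # column masses (one per time instance)
    # Standardized residuals from the independence model r * c^T.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sigma, _ = np.linalg.svd(S, full_matrices=False)
    # Principal coordinates: scale the left singular vectors and undo the row weighting.
    return (U[:, :n_dims] * sigma[:n_dims]) / np.sqrt(r)[:, None]

# Hypothetical table: two users with a rising trend, two with a falling trend.
X = [
    [10, 20, 30, 40],
    [20, 40, 60, 80],
    [40, 30, 20, 10],
    [80, 60, 40, 20],
]
coords = correspondence_rows(X, n_dims=2)
```

Users whose rows are proportional to each other (the same trend at a different volume) receive the same coordinates, which is the property the subsequent clustering step relies on.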
  • the method performs ( 205 ) the clustering of the users' attributes based on parameters to obtain unlabeled clusters based on trend similarity. Clustering of the users is performed to group the users having similar trends. In an embodiment, the method obtains clustering parameters for performing the clustering. In an embodiment, standard clustering techniques such as DBSCAN (for density based clusters) and the k-means clustering algorithm can be used for grouping the similar trends of the users in the lower dimension such that the users with similar trends will be grouped in the same cluster.
  • Embodiments disclosed herein first apply DBSCAN clustering to obtain ( 401 ) density based clusters. DBSCAN considers users whose trend differs from the majority of the users as noise because of their lower density. To avoid loss of this data, the noise is further clustered ( 402 ) using the k-means clustering algorithm, before the final clusters are obtained ( 403 ) (as depicted in FIG. 4 ).
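The two-stage clustering can be sketched in plain Python as below. This is a simplified illustration rather than the implementation used in the embodiments: the function names, the parameters `eps`, `min_pts` and `k_noise`, and the sample points are all assumptions, and the brute-force neighbour search is only suitable for small inputs.

```python
import math
import random

def dbscan(points, eps, min_pts):
    """Return one label per point: cluster id 0, 1, ... or -1 for noise."""
    labels = [None] * len(points)

    def neighbours(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1                # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster       # noise reached from a core point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)   # j is a core point, keep expanding
    return labels

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd iterations; returns the cluster index of each point."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: math.dist(p, centres[c])) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centres[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assign

def cluster_users(points, eps, min_pts, k_noise):
    """DBSCAN first (401), then k-means on the points DBSCAN marked as noise (402)."""
    labels = dbscan(points, eps, min_pts)
    noise_idx = [i for i, label in enumerate(labels) if label == -1]
    if noise_idx:
        offset = max(labels) + 1          # new ids start after the density based clusters
        k = min(k_noise, len(noise_idx))
        sub = kmeans([points[i] for i in noise_idx], k)
        for i, a in zip(noise_idx, sub):
            labels[i] = offset + a
    return labels

# Hypothetical low-dimensional user coordinates: two dense groups and three outliers.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
    (10.0, 0.0), (10.2, 0.1), (0.0, 10.0),
]
labels = cluster_users(points, eps=0.5, min_pts=3, k_noise=2)
```

After the second stage no user carries the noise label, so every user belongs to some final cluster (403).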
  • the clusters formed in lower dimension retain the properties (similarities, differences and relationships) which were there in the n dimensional space.
  • the clusters are assigned ( 206 ) labels based on label information of users according to the actions taken by them previously (historical data).
  • the clusters may be further divided into classes based on at least one other feature and each user in the cluster may be assigned to be a member of at least one class.
  • the users may be then assigned a confidence level for each predicted action, based on the class to which they belong.
  • the method predicts ( 207 ) the future actions of the users based on the trends of attributes that are observed in the case of labeled samples.
  • the prediction step forecasts the future actions of users, based on the past trends of attribute values that are observed in the case of labeled samples.
  • the prediction may be in the form of rules consisting of predicates and relationships among them along with augmented statistics such as confidence measures, indicating a degree of algorithmic confidence in each rule. For example, if there is a churn file that lists the users who have churned, the trends exhibited by these users prior to churning could be used to label other users who exhibit similar trends as potential churn candidates.
  • labeled lists corresponding to user actions that are observed in the past (for example churning, postpaid to prepaid switching and so on).
  • the number of labeled users can be identified from a particular list being present. Having more users from a labeled list (representing a class) in a cluster is a strong indication that the cluster likely represents the group of users who could potentially exhibit the same behavior.
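One way to read the proportional class membership described above is sketched here. The helper names `cluster_class_shares` and `predict`, the user ids and the class names are hypothetical; the shares are simply the fractions of each class among the labeled users of a cluster.

```python
from collections import Counter, defaultdict

def cluster_class_shares(cluster_of, label_of):
    """Map cluster id -> {class: share among that cluster's labeled users}."""
    counts = defaultdict(Counter)
    for user, cluster in cluster_of.items():
        if user in label_of:
            counts[cluster][label_of[user]] += 1
    return {
        cluster: {cls: n / sum(cnt.values()) for cls, n in cnt.items()}
        for cluster, cnt in counts.items()
    }

def predict(user, cluster_of, shares):
    """Class memberships for a user; empty if the user's cluster has no labeled members."""
    return shares.get(cluster_of[user], {})

# Hypothetical data: cluster 0 has three labeled users, cluster 1 has none.
cluster_of = {"a": 0, "b": 0, "c": 0, "d": 0, "e": 1, "f": 1}
label_of = {"a": "churn", "b": "churn", "c": "stay"}
shares = cluster_class_shares(cluster_of, label_of)
prediction_d = predict("d", cluster_of, shares)
prediction_e = predict("e", cluster_of, shares)
```

In this example the unlabeled user "d" inherits a churn membership of 2/3, while "e" sits in a completely unlabeled cluster and gets an empty result, matching the case that the text leaves to manual inspection.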
  • the various actions in flow diagram 200 may be performed in the order presented, in a different order or simultaneously. Further, in some embodiments, some actions listed in FIG. 2 may be omitted.
  • FIG. 5 is a flowchart illustrating the process of optimizing campaigns and performing product bundling for a user based on clusters, according to embodiments as disclosed herein.
  • association rule mining can be applied on each of these clusters, and thereby automatically use the discovered rules for user targeted applications, such as optimizing advertising campaigns, performing product bundling, pricing and so on.
  • Support of an item set is the fraction of all purchases in which that item set appears (e.g. if there are 100 purchases, each of which may contain multiple items such as bread, jam, butter, oil, juice, milk etc., and 20 of the purchases had bread as well as butter, then the support of bread->butter is 20%). Confidence of a rule is the number of purchases in which the two items appear together divided by the number of purchases that contain the first item (e.g. the confidence of bread->butter will be 1.0 if butter is purchased together with bread in every purchase of bread).
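The bread and butter example can be checked with a few lines of code. This sketch uses the usual association rule definitions, in which the confidence of bread->butter is measured against the purchases that contain bread; the function names and the purchase list are made up for illustration.

```python
def support(purchases, items):
    """Fraction of purchases that contain every item in `items`."""
    items = set(items)
    return sum(items <= set(p) for p in purchases) / len(purchases)

def confidence(purchases, antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = support(purchases, set(antecedent) | set(consequent))
    return both / support(purchases, antecedent)

# Hypothetical basket data: 100 purchases, 20 of which contain bread and butter.
purchases = (
    [{"bread", "butter"}] * 20
    + [{"butter", "jam"}] * 10
    + [{"milk"}] * 70
)
sup = support(purchases, {"bread", "butter"})
conf_bread_butter = confidence(purchases, {"bread"}, {"butter"})
conf_butter_bread = confidence(purchases, {"butter"}, {"bread"})
```

Here the support is 20% and the confidence of bread->butter is 1.0 because every bread purchase also contains butter, whereas the reverse rule butter->bread is weaker because some butter purchases contain no bread.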
  • the relationships between features of users within the cluster (wherein examples of the features may be the ARPU of the user, the number of SMSs sent by the user, the number of international calls made by the user and so on) and features of user who were previously converted by previously run campaigns are found ( 501 ) and the underlying hidden rules in the relationships are mined ( 502 ) using association rule mining.
  • the rules obtained are combined ( 503 ) to suggest user targeted applications, such as optimizing advertising campaigns, performing product bundling, pricing and so on.
  • each attribute of users is discretized into bins (e.g. ARPU (Average Revenue Per User) can be high, medium or low).
  • each conversion for each campaign can be treated like a “purchase”.
  • top association rules are mined.
  • conversion information corresponding to several campaigns can be obtained.
  • the discovered rules can be ranked based on how many times they occur within the cluster and then top ranking rules would be combined to generate new rules which can be the basis for designing a new campaign or optimizing an existing campaign.
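The binning and ranking steps can be sketched as follows. This is a deliberately simplified stand-in for association rule mining: it only counts how often a single discretized attribute co-occurs with a converting campaign inside one cluster. The cut points, attribute names, campaign names and helper functions are all hypothetical.

```python
from collections import Counter

def to_bin(value, low_cut, high_cut):
    """Discretize a numeric attribute into low / medium / high."""
    if value < low_cut:
        return "low"
    return "medium" if value < high_cut else "high"

def rank_pairs(conversions, cuts):
    """Count (attribute=bin, campaign) pairs, treating each conversion like a purchase."""
    counts = Counter()
    for features, campaign in conversions:
        for name, value in features.items():
            item = f"{name}={to_bin(value, *cuts[name])}"
            counts[(item, campaign)] += 1
    return counts.most_common()

# Hypothetical conversions of users in one cluster: (attributes, campaign that converted them).
conversions = [
    ({"arpu": 50, "sms": 200}, "data_pack"),
    ({"arpu": 55, "sms": 10}, "data_pack"),
    ({"arpu": 5, "sms": 300}, "sms_pack"),
]
cuts = {"arpu": (10, 40), "sms": (50, 150)}  # assumed (low_cut, high_cut) per attribute
ranked = rank_pairs(conversions, cuts)
```

The top-ranked pair in this example links high ARPU to the data pack campaign. A full implementation would mine multi-item rules with support and confidence thresholds before combining the top rules.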
  • the various actions in flow diagram 500 may be performed in the order presented, in a different order or simultaneously. Further, in some embodiments, some actions listed in FIG. 5 may be omitted.
  • FIG. 6 is a graph showing the representation of users in a low dimensional feature space, according to the embodiments as disclosed herein.
  • the graph shown in the figure depicts a two dimensional feature space with X and Y axes.
  • the graph is obtained by reducing the n-dimensional feature space and applying the correspondence analysis on the numerical time series data.
  • the transactional data of all the ten users are recorded in the telecom network.
  • This model of representing each user's numerical value of the attribute at each time instance forms multinomial data, or an array having n dimensions (for example u×n).
  • the first step involved in classification and detection of user behavior trends using correspondence analysis is the reduction of n-dimensional space to lower dimensional space.
  • the dimensionality reduction of the multinomial data is performed for easy processing and interpretation of data without losing trend information of each user.
  • the multinomial data can be reduced to lower dimension (for example 2-dimensional or 3-dimensional based on the requirement).
  • the new coordinates (as shown in the graph) will be such that those users who are following similar trend in the multidimensional time series domain will become closer to each other than those who are dissimilar as shown in the graph.
  • FIG. 7 is a graph showing the grouping of users having similar trends over certain time period, according to the embodiments as disclosed herein.
  • the users of the telecom network can be grouped or clustered as shown in the graph. Clustering or grouping of the coordinates is performed based on the similar trends of the users. These groups or clusters contain the users who are similar in their trends over certain time period. These clusters are used for group based prediction or further analysis on the group.
  • unlabeled cluster members are assigned class membership proportional to the labeled samples in the cluster.
  • the method predicts the future actions of the users based on the past trends that are observed from the labeled clusters.
  • the actions performed by the users following a similar trend can be predicted. This information is used for predicting the actions of new users of similar trend.
  • Embodiments disclosed herein may be used for video segmentation, as depicted in the following example.
  • an unsupervised segmentation of objects/users in a video needs to be done based on the similarity of their motion, which may be for safety management of a large gathering (big crowd) in a public area, to get moving areas in a scene for efficient video compression, to detect unusual events, for video surveillance or to analyze the video further for specific purposes.
  • Optical flow is used to get the subsequent position of the pixels from frame to frame. If a pixel was at position (u1, v1) in one frame and it moves to position (u2, v2) in the next frame, then the magnitude of its movement is calculated as the Euclidean distance between the two positions. Once the optical flow is obtained, the magnitude of pixel displacement is calculated consecutively over all n frames, which results in the trend of the pixels as a time series over (n−1) dimensions. Correspondence analysis will map the pixel movement data from n−1 dimensions to 2 dimensions such that pixels which belong to similar object movement will be closer to each other than those which have dissimilar motion. On clustering the pixels, all pixels within each cluster will represent a similar trend.
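The displacement series for one pixel can be computed as below. The sketch assumes the per-frame positions have already been produced by some optical flow method, which is not shown; `displacement_series` and the sample track are hypothetical.

```python
import math

def displacement_series(track):
    """Euclidean displacement between consecutive frame positions (n frames -> n-1 values)."""
    return [math.dist(a, b) for a, b in zip(track, track[1:])]

# Hypothetical positions (u, v) of one pixel over n = 4 frames.
track = [(0, 0), (3, 4), (3, 4), (6, 8)]
series = displacement_series(track)
```

Stacking one such series per pixel gives the pixels-by-(n−1) table to which correspondence analysis and clustering are then applied.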
  • Detection of unusual events may be performed if areas where object motion is not regular, or deviations from normal behavior, are detected.
  • FIG. 8 illustrates a computing environment implementing the method and system for detection and classification of user behavior trends using correspondence analysis, according to the embodiments as disclosed herein.
  • the computing environment may consist of a plurality of such units, forming a distributed cluster, over which the algorithms are executed in a scalable fashion.
  • the computing environment 801 comprises at least one processing unit 804 that is equipped with a control unit 802 and an Arithmetic Logic Unit (ALU) 803, a memory 805, a storage unit 806, a plurality of networking devices 808 and a plurality of input/output (I/O) devices 807.
  • the processing unit 804 is responsible for processing the instructions of the algorithm.
  • the processing unit 804 receives commands from the control unit in order to perform its processing. Further, any logical and arithmetic operations involved in the execution of the instructions are computed with the help of the ALU 803 .
  • the overall computing environment 801 can be composed of multiple homogeneous and/or heterogeneous cores, multiple CPUs of different kinds, special media and other accelerators.
  • the plurality of processing units 804 may be located on a single chip or over multiple chips. Further, a plurality of nodes such as 801 may be interconnected over a network to form a distributed computing environment, where the method described is executed in a distributed fashion.
  • the algorithm, comprising the instructions and code required for the implementation, is stored in either the memory unit 805 or the storage 806 or both. At the time of execution, the instructions may be fetched from the corresponding memory 805 and/or storage 806, and executed by the processing unit 804.
  • networking devices 808 or external I/O devices 807 may be connected to the computing environment to support the implementation through the networking unit and the I/O device unit.
  • Embodiments disclosed herein enable compression of large amounts of temporal data related to users to smaller and more manageable amounts of data, thereby reducing the time required for processing the data and the complexity of the system required for computing.
  • Embodiments disclosed herein enable detection of unusual events based on the raw data.
  • the unusual event may be a behaviour of a user which does not match his history and/or the cluster of users to which he belongs.
  • the unusual event may be a user of a telecommunication network sending a large number of SMSs within a short period of time, when he previously used to send only a few SMSs.
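A simple way to flag behaviour that does not match a user's own history is a deviation test on past counts. The source does not specify a detection rule, so the z-score test, the threshold of 3 and the sample history below are assumptions chosen only for illustration.

```python
from statistics import mean, pstdev

def is_unusual(history, current, threshold=3.0):
    """Flag `current` if it lies more than `threshold` standard deviations from the history mean."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return current != mu          # flat history: any change counts as unusual
    return abs(current - mu) / sigma > threshold

# Hypothetical daily SMS counts for one user, followed by two candidate observations.
history = [4, 6, 5, 5, 4, 6]
burst_flag = is_unusual(history, 120)
normal_flag = is_unusual(history, 6)
```

The same test could be run against the statistics of the user's cluster instead of the user's own history to cover the second case mentioned above.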
  • Embodiments disclosed herein account for temporal changes in user behaviour.
  • the embodiments disclosed herein can be implemented through at least one software program running on at least one hardware device and performing network management functions to control the elements.
  • the elements shown in FIGS. 1 and 5 include blocks which can be at least one of a hardware device, or a combination of hardware device and software module.

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Data Mining & Analysis (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method and system for detection, classification and prediction of user behavior trends using correspondence analysis is disclosed. The method and system reduces the n-dimensional feature space to lower dimensional space for easy processing, improved quality of emerging clusters and superior prediction accuracies. Further, the method applies the correspondence analysis so that each user is assigned with a new coordinate in the lower dimension which maintains a similarity, difference and the relationship between the variables. Once the correspondence analysis is completed, clustering or grouping of the coordinates based on the similar trends of the users is performed. Further, unlabeled cluster members are assigned class membership proportional to the labeled samples in the cluster. Finally, the method predicts the future actions of the users based on the past trends that are observed from the labeled clusters.

Description

  • The present application is based on, and claims priority from, IN Application Number 2581/CHE/2013, filed on 13 Jun. 2013, the disclosure of which is hereby incorporated by reference herein.
  • TECHNICAL FIELD
  • Embodiments herein relate to the field of predictive analytics and more particularly to a method and system for detection, classification and prediction of behaviour trends using correspondence analysis.
  • BACKGROUND
  • In competitive business environments, companies frequently desire to forecast events that influence business metrics and performance indicators. Indeed, such ability is often important for effective decision making. Information obtained from accurate event forecasts results in more efficient operations and cost savings for the business. For example, a business that forecasts particular requirements in the near future can make profitable adjustments to its business practices based on this information. As another example, if the business can accurately predict potential failures or inefficiencies in the business process, then requirements can be analyzed to mitigate such failures.
  • By recognizing future trends, companies can potentially increase efficiency and gain competitive advantage. Accurate recognition of such trends also results in significant cost savings and improved business processes.
  • In certain business applications, there are many situations where the behavior of users should be predicted and analyzed in order to take actions according to the behavioral trends. Further, the events generated by the users are sources of valuable information about their behavior, interactions and preferences, as well as temporal changes in their behavior and preferences. In the current scenario, marketers are not able to take advantage of the large amounts of user-related data that are available. This prevents service providers or marketers from providing accurate service personalization, customized personal offers and the like based on the user behavior trends. In the case of large data sets, it would be complex and expensive to predict the behavior of each and every user at an individual level.
  • The existing methods of trend recognition and predictions based on numerical time series data are based on individual users, where each user is treated as an independent entity. The representation as well as grouping of millions of users (for example users in a telecommunications network) based on such time-series data is an expensive option in terms of space and time complexity. The existing system lacks the mechanism for a low-dimension representation of the time series for global trending pattern of a data set.
  • BRIEF DESCRIPTION OF THE FIGURES
  • Embodiments herein are illustrated in the accompanying drawings, throughout which like reference letters indicate corresponding parts in the various figures. The embodiments herein will be better understood from the following description with reference to the drawings, in which:
  • FIG. 1 illustrates an overview for detection and classification of user behavior trends using correspondence analysis, according to the embodiments as disclosed herein;
  • FIG. 2 illustrates a flow diagram explaining the various steps involved in predicting the user behavior trends using the correspondence analysis, according to the embodiments as disclosed herein;
  • FIG. 3 depicts the process of reducing the dimensions of data, according to embodiments as disclosed herein;
  • FIG. 4 depicts the process of clustering, according to embodiments as disclosed herein;
  • FIG. 5 is a flowchart illustrating the process of optimizing campaigns and performing product bundling for a user based on clusters, according to embodiments as disclosed herein;
  • FIG. 6 is a graph showing the representation of users in a low dimensional feature space, according to the embodiments as disclosed herein;
  • FIG. 7 is a graph showing the grouping of users having similar trends over certain time period, according to the embodiments as disclosed herein; and
  • FIG. 8 illustrates a computing environment implementing the method and system for detection and classification of user behavior trends using correspondence analysis, according to the embodiments as disclosed herein.
  • DETAILED DESCRIPTION OF EMBODIMENTS
  • The embodiments herein and the various features and advantageous details thereof are explained more fully with reference to the non-limiting embodiments that are illustrated in the accompanying drawings and detailed in the following description. Descriptions of well-known components and processing techniques are omitted so as to not unnecessarily obscure the embodiments herein. The examples used herein are intended merely to facilitate an understanding of ways in which the embodiments herein can be practiced and to further enable those of skill in the art to practice the embodiments herein. Accordingly, the examples should not be construed as limiting the scope of the embodiments herein.
  • Provided herein is a scalable mechanism for grouping the users based on similar trends in n-dimensional space using correspondence analysis. The method provides a framework for clustering or grouping the users, representing their trends in an n-dimensional space, using correspondence analysis. The method reduces the n-dimensional feature space to a lower dimensional space for easy processing, better interpretation and for generating superior quality clusters. Further, the method applies the correspondence analysis so that each user is assigned a new coordinate in the lower dimension which maintains the similarity, difference and relationship between the variables.
  • Once the correspondence analysis is done, clustering or grouping of the coordinates based on the similar trends of the users is performed. Further, unlabeled cluster members are assigned class membership proportional to the labeled samples in the cluster. Finally, the method predicts the future actions of the users based on the past trends that are observed from the labeled clusters. Completely unlabeled clusters may be inspected by an administrator for the purpose of manual analysis, labeling and mapping to predicted trends and actions.
  • The embodiments herein achieve a method and system that provides a scalable mechanism for grouping the users based on similar trends in n-dimensional space using correspondence analysis.
  • Further, the method and system is applicable in the context of any user transaction based system (for example in a telecom network, banking system and so on). The method provides a framework for clustering or grouping the users representing similar trends in the n-dimensional space using correspondence analysis.
  • The correspondence analysis is used to recognize the trends or nature of the users on the basis of their numerical attributes as well as temporal variation of such attributes.
  • The method and system disclosed herein reduces the n-dimensional feature space to a lower dimensional space for easy processing and interpretation, without losing the trend information of each user, using correspondence analysis. Further, each user is assigned with a new coordinate in the lower dimension which maintains a similarity, difference and the relationship between the variables, as they existed in the higher dimensional space.
  • Once the correspondence analysis is done, clustering or grouping of the coordinates based on the similar trends of the users is performed.
  • Further, unlabeled cluster members are assigned class membership proportional to the labeled samples in the cluster. Finally, the method predicts the future actions of the users based on the past trends that are observed from the labeled clusters.
  • The principal object of the embodiments herein is to provide a scalable method and system for detection, classification and prediction of behaviour trends using correspondence analysis.
  • Another object of the embodiments herein is to provide a scalable method and system for effectively reducing the dimensional space using correspondence analysis on numerical multinomial data for reduction of complexity in cluster analysis and to improve quality of emerging clusters, along with superior prediction accuracies.
  • Referring now to the drawings and more particularly to FIGS. 1 through 5 where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments.
  • FIG. 1 illustrates an overview for detection and classification of user behavior trends using correspondence analysis, according to the embodiments as disclosed herein. As depicted in the FIG. 1, consider a group of users performing transactions with a system (for example a telecom network, banking system and so on). The transactions of the users are recorded in the network 101. Further, the network 101 maintains the raw transaction logs of all the users in a file server 102.
  • In an embodiment, the file server 102 comprises the files that have all the transactional details of the users.
  • The raw transactional logs that are present in the file server 102 are uploaded by scheduled data upload jobs orchestrated by cluster master 103 into a distributed file system 105.
  • Jobs orchestrated by cluster master 103 perform clustering of the users based on their transactions that are having similar trends, in a distributed fashion, over the worker nodes 104.
  • In an embodiment, the n-dimensional feature space of the raw transactional logs is reduced to a lower dimensional space for easy processing and interpretation, without losing the trend information of each user.
  • In an embodiment, correspondence analysis is used for trend recognition and dimensionality reduction of the raw transactional data of the users.
  • Typically, the correspondence analysis is a descriptive technique that is designed to analyze simple two-way and multi-way tables containing some measure of correspondence between rows and columns.
  • The correspondence analysis is used to recognize the trend of users on the basis of temporal variations of their numerical attributes. Each column of the correspondence table represents a numerical attribute and all the columns will be the observations of the same variable over time at different time instances.
  • In an embodiment, the cluster master 103 maintains the uploaded files over the worker nodes 104 and distributed file system 105 (any distributed file system or memory). The raw transactional data logs are distributed across multiple machines and the correspondence analysis is applied on the data.
  • Once the correspondence analysis is completed, clustering or grouping of the coordinates based on the similar trends of the users is performed. Further, unlabeled cluster members are assigned class membership proportional to the labeled samples in the cluster. Finally, the method predicts the future actions of the users based on the past trends that are observed from the labeled clusters.
  • The cluster master 103 further applies association rule mining on the clusters discovered in a lower dimensional space. The cluster master 103 further uses the discovered rules for user targeted applications, such as optimizing advertising campaigns, performing product bundling, pricing and so on.
  • The cluster master 103 may be a standalone device. The cluster master 103 may comprise a plurality of devices, implemented using a distributed architecture. The cluster master 103 may be implemented on the cloud.
  • FIG. 2 illustrates a flow diagram explaining the various steps involved in predicting the user behavior trends using the correspondence analysis, according to the embodiments as disclosed herein. As depicted in the flow diagram 200, initially the method obtains (201) the raw data from the network that corresponds to a particular domain. For example, the domain may include but is not limited to a telecommunications network or a banking system. In a telecom domain all the transactions are recorded and stored in a network and in a banking system, all the transactions of the users are stored in a bank server.
  • The data format used herein as an example is as follows: there are U=1, 2, 3 . . . u subjects, and for each subject the numerical value of the attribute is measured at each time instance T=1, 2, 3 . . . t. In table format, the data looks like:
  • User T1 T2 T3 T4 . . . Tt
    User 1 X11 X12 X13 X14 . . . X1t
    User 2 X21 X22 X23 X24 . . . X2t
    . . . . . . .
    User u Xu1 Xu2 Xu3 Xu4 . . . Xut
  • Here Xij can be value of any numerical attribute observed at different time instances. Data in this case is of u*t dimension or each subject is measured in t-dimensional space.
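  • As an illustration of this layout, the following minimal sketch pivots raw transaction records into the u×t table. The record format (user id, time-period index, value), the sample records and the function name `build_trend_matrix` are assumptions made for illustration only; the embodiments do not prescribe them.

```python
from collections import defaultdict


def build_trend_matrix(records, num_periods):
    """Pivot (user_id, period, value) records into {user_id: [X_i1, ..., X_it]}.

    Periods are 1-based (T1..Tt). Several records for the same user and
    period are summed; periods without any record default to 0.0.
    """
    matrix = defaultdict(lambda: [0.0] * num_periods)
    for user_id, period, value in records:
        if not 1 <= period <= num_periods:
            raise ValueError(f"period {period} outside 1..{num_periods}")
        matrix[user_id][period - 1] += value
    return dict(matrix)


# Hypothetical minutes-of-usage records: (user, month, minutes).
records = [
    ("user1", 1, 120.0), ("user1", 2, 80.0), ("user1", 2, 20.0),
    ("user2", 1, 15.0), ("user2", 3, 45.0),
]
table = build_trend_matrix(records, num_periods=3)
for user, row in table.items():
    print(user, row)
```

  • Each resulting row is one subject measured in t-dimensional space, which is the input shape assumed by the later dimensionality reduction step.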
  • The transactional data of the users can be obtained from a network designed for storing such data (for example, the telecom network or the bank server). Once the transactional data (raw data) is obtained, the method performs (202) pre-processing and feature selection on the raw data. In an embodiment, the pre-processing and feature selection on the raw data comprises determining the attributes of the users. One such attribute of the user can be minutes of usage (for example, usage of a network in the telecommunications domain).
  • Further, the method obtains (203) trend data from the raw transactional logs. The trend data includes the values that change over time.
  • Further, once the feature selection and trend data are obtained from the raw transactional logs, the method reduces (204) the dimensionality of the raw data (which is n-dimensional multinomial data) using correspondence analysis (301, 302) (as depicted in FIG. 3). In the lower-dimensional data, the new coordinates will be such that those users who are following a similar trend in the multidimensional time series domain will be closer to each other than those who are dissimilar. In an example, consider that the users were in t-dimensional space; if the data can be mapped from t to 2 or 3-dimensional space without losing much information about the trend of the subscribers, then it will be more easily interpretable, analyzable and efficiently represented in comparison to the data in t-dimensions.
  • Correspondence analysis is an exploratory data analysis technique for contingency tables and multivariate or multinomial data. Correspondence analysis also emphasizes the graphical representation of the result in a lower dimension for easy interpretation, maintaining the similarity or dissimilarity between the rows and the columns of the table. Embodiments herein apply correspondence analysis in applications where the trend of high dimensional user data with numerical multidimensional attributes in the time series domain is required. Correspondence analysis is used to determine similarities and differences among the trends of users with respect to their behavior over time and to depict the same graphically in a joint low-dimensional space. Correspondence analysis assigns each user a coordinate in the lower dimension, maintaining the similarity, difference and relationship between the variables in rows and columns of the table, which means that rows which are similar in their trend will be close to each other in the new low dimensional space and those which are dissimilar will be far apart. Correspondence analysis is based on the eigenvalues of a matrix, so it can be used for dimension reduction similar to principal component analysis, which enables an easier interpretation of results. The similarity between users in the new low dimensional space can be graphically visualized.
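  • The dimension reduction described above can be sketched as follows. This is a minimal NumPy version of textbook correspondence analysis (singular value decomposition of the standardized residuals of the correspondence matrix), not necessarily the exact procedure used in the embodiments. The function name `correspondence_analysis` is an assumption; the input is the ARPU data of Table 1. Because singular vectors are only defined up to sign and scaling conventions differ, the coordinates it returns should not be expected to match Table 2 digit for digit.

```python
import numpy as np


def correspondence_analysis(table, n_components=2):
    """Return (row principal coordinates, principal inertias).

    Textbook correspondence analysis of a non-negative users-by-time table.
    """
    N = np.asarray(table, dtype=float)
    if (N < 0).any():
        raise ValueError("table must be non-negative")
    P = N / N.sum()                    # correspondence matrix
    r = P.sum(axis=1)                  # row masses
    c = P.sum(axis=0)                  # column masses
    if (r == 0).any() or (c == 0).any():
        raise ValueError("every row and column needs a positive total")
    # Standardized residuals: D_r^{-1/2} (P - r c^T) D_c^{-1/2}
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sigma, _ = np.linalg.svd(S, full_matrices=False)
    # Row principal coordinates: D_r^{-1/2} U diag(sigma)
    coords = (U * sigma) / np.sqrt(r)[:, None]
    return coords[:, :n_components], sigma[:n_components] ** 2


# ARPU per user over four months, copied from Table 1.
arpu = np.array([
    [473.05, 740, 439, 0], [247, 100, 99, 0], [372, 508, 282, 0],
    [80, 105.1, 55, 30], [235, 334, 50, 120.17], [409, 309, 9, 500],
    [73.01, 75.05, 0, 144.01], [105, 176, 129, 509], [65, 0, 0, 10],
    [200, 0, 0, 50],
])
coords, inertia = correspondence_analysis(arpu, n_components=2)
print(np.round(coords, 3))
```

  • In this formulation two users whose rows are proportional to one another (the same trend at a different overall level) receive identical coordinates, which is the property that makes the reduced space suitable for grouping by trend rather than by magnitude.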
  • In an embodiment, the correspondence analysis is used to recognize the trend of users (subscribers) on the basis of their numerical attributes. Once the correspondence analysis is applied, the correspondence table is generated. Each column of the correspondence table represents a numerical attribute and all the columns will be the observations of the same variable over time at different time instances.
  • In an embodiment, the method obtains the number of target dimensions (for example, it can be 2-dimensional or 3-dimensional based on the requirement) as an input for reducing the dimensionality of transactional data of the users.
  • Once the dimensionality of the data is reduced using correspondence analysis, the method performs (205) the clustering of the users' attributes based on parameters to obtain unlabeled clusters based on trend similarity. Clustering of the users is performed to group the users having similar trends. In an embodiment, the method obtains clustering parameters for performing clustering of the users. In an embodiment, standard clustering techniques such as DBSCAN (for density based clusters) and the k-means clustering algorithm can be used for grouping the similar trends of the users in the lower dimension such that the users with similar trends will be grouped in the same cluster.
  • Embodiments disclosed herein first apply DBSCAN clustering to obtain (401) density based clusters. DBSCAN considers users whose trend differs from the majority of the users as noise because of their lower density. To avoid loss of this data, the noise is further clustered (402) using the k-means clustering algorithm, before the final clusters are obtained (403) (as depicted in FIG. 4).
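  • The two-stage clustering can be sketched as below. To keep the example self-contained, it uses simplified NumPy re-implementations of DBSCAN and k-means rather than a library; the function names, the parameter values (`eps`, `min_samples`, `k_noise`) and the random seed are assumptions chosen for illustration. The input points are the two-dimensional coordinates of Table 2.

```python
import numpy as np


def dbscan(X, eps, min_samples):
    """Simplified DBSCAN; returns labels with -1 marking noise points."""
    dist = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    neighbors = [np.flatnonzero(row <= eps) for row in dist]
    core = [len(nb) >= min_samples for nb in neighbors]  # counts the point itself
    labels = np.full(len(X), -1)
    cluster = 0
    for i in range(len(X)):
        if labels[i] != -1 or not core[i]:
            continue
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()
            if not core[p]:
                continue          # border points join a cluster but do not expand it
            for q in neighbors[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    stack.append(q)
        cluster += 1
    return labels


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm with random initial centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None] - centers[None, :], axis=2)
        labels = d.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if (labels == j).any() else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def two_stage_clustering(X, eps, min_samples, k_noise):
    """Density clusters first (401), then k-means on the noise (402)."""
    labels = dbscan(X, eps, min_samples)
    noise = np.flatnonzero(labels == -1)
    if len(noise) > 0:
        offset = labels.max() + 1
        labels[noise] = offset + kmeans(X[noise], min(k_noise, len(noise)))
    return labels                 # final clusters (403)


# Low-dimensional coordinates copied from Table 2 (users 1 to 10).
points = np.array([
    [0.715, 1.706], [0.14, 1.975], [0.785, 1.2], [-0.854, -0.967],
    [-1.047, -0.134], [-1.488, -0.656], [-1.266, -0.083], [-1.705, -0.622],
    [2.051, -1.46], [2.6, -0.9],
])
labels = two_stage_clustering(points, eps=0.75, min_samples=3, k_noise=1)
print(labels.tolist())
```

  • With these assumed parameters, users 9 and 10 are too sparse to form a density based cluster, so the second stage is what keeps them from being discarded as noise.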
  • The clusters formed in lower dimension retain the properties (similarities, differences and relationships) which were there in the n dimensional space.
  • Based on the trend similarity with the labeled samples, the clusters are assigned (206) labels based on label information of users according to the actions taken by them previously (historical data). The clusters may be further divided into classes based on at least one other feature and each user in the cluster may be assigned to be a member of at least one class. The users may be then assigned a confidence level for each predicted action, based on the class to which they belong.
  • Further, the method predicts (207) the future actions of the users based on the trends of attributes that are observed in the case of labeled samples. In an embodiment, the prediction step forecasts the future actions of users, based on the past trends of attribute values that are observed in the case of labeled samples. The prediction may be in the form of rules consisting of predicates and relationships among them, along with augmented statistics such as confidence measures indicating a degree of algorithmic confidence in each rule. For example, if there is a churn file that lists the users who have churned, the method could make use of the trends exhibited by these users prior to churning to label other users who exhibit similar trends as potential churn candidates. Further, there can be multiple labeled lists corresponding to user actions that were observed in the past (for example churning, postpaid to prepaid switching and so on). In each of the unlabeled clusters that emerge, the number of labeled users present from a particular list can be identified. Having more users from a labeled list (representing a class) in a cluster is a strong indication that the cluster likely represents the group of users who could potentially exhibit the same behavior. The various actions in flow diagram 200 may be performed in the order presented, in a different order or simultaneously. Further, in some embodiments, some actions listed in FIG. 2 may be omitted.
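  • The labeling and prediction steps above can be sketched as follows. Here the confidence of a label within a cluster is taken to be the fraction of cluster members that appear in the corresponding labeled list; this particular definition, the `min_confidence` threshold and all names and sample data are assumptions for illustration, not a formula given in the embodiments.

```python
from collections import Counter, defaultdict


def cluster_label_confidence(cluster_of, labeled_users):
    """Map each cluster to {label: fraction of its members carrying that label}."""
    sizes = Counter(cluster_of.values())
    counts = defaultdict(Counter)
    for user, label in labeled_users.items():
        if user in cluster_of:
            counts[cluster_of[user]][label] += 1
    return {
        cluster: {label: n / sizes[cluster] for label, n in counts[cluster].items()}
        for cluster in sizes
    }


def predict_actions(cluster_of, labeled_users, min_confidence=0.5):
    """Predict (label, confidence) for unlabeled users in well-labeled clusters."""
    confidence = cluster_label_confidence(cluster_of, labeled_users)
    predictions = {}
    for user, cluster in cluster_of.items():
        if user in labeled_users:
            continue
        best = max(confidence[cluster].items(), key=lambda kv: kv[1], default=None)
        if best is not None and best[1] >= min_confidence:
            predictions[user] = best
    return predictions


# Hypothetical cluster assignment and churn list.
cluster_of = {"u1": 0, "u2": 0, "u3": 0, "u4": 1, "u5": 1}
churn_list = {"u1": "churn", "u2": "churn"}
predictions = predict_actions(cluster_of, churn_list)
print(predictions)
```

  • Cluster 1 in this example contains no labeled users, so no prediction is made for its members; such a cluster corresponds to the completely unlabeled case that is left for manual inspection.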
  • FIG. 5 is a flowchart illustrating the process of optimizing campaigns and performing product bundling for a user based on clusters, according to embodiments as disclosed herein. After the formation of clusters in the lower dimension space, association rule mining can be applied on each of these clusters, and the discovered rules can thereby be used automatically for user targeted applications, such as optimizing advertising campaigns, performing product bundling, pricing and so on. Association rule mining is a method for discovering interesting relations between variables in large databases. It finds associations between items given historical purchase data (market-basket analysis). For example, if most people who buy bread and milk also tend to buy butter, a rule milk, bread->butter [support=5%, confidence=100%] may be discovered. Support of an item set is the fraction of all purchases in which that item set appears (e.g. if there are 100 purchases, where each purchase may contain multiple items such as bread, jam, butter, oil, juice, milk etc., and 20 of the purchases had bread as well as butter, then the support of bread->butter is 20%). Confidence is the ratio of the number of purchases in which the two items appear together to the total number of purchases containing the first item (e.g. the confidence of bread->butter will be 1.0 if, in all purchases of bread, butter is also purchased together with it).
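  • The two measures can be made concrete with a short sketch. Confidence of a rule A->B is computed here as support(A and B) divided by support(A), which is the standard definition and agrees with the bread->butter example above. The function names and the five sample purchases are assumptions for illustration.

```python
def support(purchases, items):
    """Fraction of purchases that contain every item in `items`."""
    items = set(items)
    return sum(items <= set(p) for p in purchases) / len(purchases)


def confidence(purchases, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    base = support(purchases, antecedent)
    both = support(purchases, set(antecedent) | set(consequent))
    return both / base if base else 0.0


# Hypothetical purchase history (one set of items per purchase).
purchases = [
    {"bread", "milk", "butter"},
    {"bread", "butter"},
    {"milk", "jam"},
    {"bread", "milk", "butter", "juice"},
    {"oil"},
]
print(support(purchases, {"bread", "butter"}))       # 3 of 5 purchases
print(confidence(purchases, {"bread"}, {"butter"}))  # every bread purchase has butter
print(confidence(purchases, {"milk"}, {"butter"}))   # 2 of 3 milk purchases
```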
  • After the formation of clusters in lower dimension space using other features of users within the clusters and the campaigns which historically were sent to them, the relationships between features of users within the cluster (wherein examples of the features may be the ARPU of the user, the number of SMSs sent by the user, the number of international calls made by the user and so on) and features of user who were previously converted by previously run campaigns are found (501) and the underlying hidden rules in the relationships are mined (502) using association rule mining. The rules obtained are combined (503) to suggest user targeted applications, such as optimizing advertising campaigns, performing product bundling, pricing and so on.
  • In an example, each attribute of users is discretized into bins (e.g ARPU (Average Revenue Per User) can be high, medium and low). Now each conversion for each campaign can be treated like a “purchase”. So corresponding to each campaign, top association rules are mined. Now, within each cluster, conversion information corresponding to several campaigns can be obtained. Now the discovered rules can be ranked based on how many times they occur within the cluster and then top ranking rules would be combined to generate new rules which can be the basis for designing a new campaign or optimizing an existing campaign.
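  • The binning and ranking described in this example can be sketched as below. This is a simplified frequency count of attribute-bin patterns per campaign rather than a full association rule miner; the bin thresholds, attribute names, campaign identifiers and function names are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def to_bin(value, low, high):
    """Discretize a numeric attribute into low / medium / high."""
    if value < low:
        return "low"
    if value < high:
        return "medium"
    return "high"


def rank_attribute_patterns(conversions, thresholds, top_n=3):
    """Count attribute-bin patterns (size 1 and 2) among converted users.

    conversions: list of (campaign, {attribute: numeric value}) for the users
    of one cluster; each conversion is treated like a "purchase".
    """
    counts = Counter()
    for campaign, attrs in conversions:
        items = sorted(
            f"{name}={to_bin(value, *thresholds[name])}"
            for name, value in attrs.items()
        )
        for size in (1, 2):
            for combo in combinations(items, size):
                counts[(combo, campaign)] += 1
    return counts.most_common(top_n)


# Hypothetical thresholds (low, high) and conversions within one cluster.
thresholds = {"arpu": (100, 400), "sms": (20, 100)}
conversions = [
    ("C1", {"arpu": 500, "sms": 10}),
    ("C1", {"arpu": 450, "sms": 200}),
    ("C2", {"arpu": 50, "sms": 5}),
]
ranked = rank_attribute_patterns(conversions, thresholds, top_n=1)
print(ranked)
```

  • The top-ranked patterns per cluster are the candidates that would then be combined into new rules for designing or optimizing a campaign.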
  • The various actions in flow diagram 500 may be performed in the order presented, in a different order or simultaneously. Further, in some embodiments, some actions listed in FIG. 5 may be omitted.
  • FIG. 6 is a graph showing the representation of users in a low dimensional feature space, according to the embodiments as disclosed herein. The graph shown in the figure depicts a two dimensional feature space with X and Y axes. The graph is obtained by reducing the n-dimensional feature space and applying the correspondence analysis on the numerical time series data.
  • Consider a sample of ten users in a telecom network as an example. The transactional data of all ten users is recorded in the telecom network. The transactional data (raw data) of all the users is represented using U=1, 2, 3 . . . u users; for each user, the numerical value of the attribute at each time instance T=1, 2, 3 . . . t is measured. This model of representing each user's numerical value of the attribute at each time instance forms multinomial data or an array having n-dimensions (for example u×n).
  • The first step involved in classification and detection of user behavior trends using correspondence analysis is the reduction of n-dimensional space to lower dimensional space.
  • The dimensionality reduction of the multinomial data is performed for easy processing and interpretation of data without losing trend information of each user. The multinomial data can be reduced to lower dimension (for example 2-dimensional or 3-dimensional based on the requirement). In the lower dimensional feature space (2-dimensional as in the graph), the new coordinates (as shown in the graph) will be such that those users who are following similar trend in the multidimensional time series domain will become closer to each other than those who are dissimilar as shown in the graph.
  • FIG. 7 is a graph showing the grouping of users having similar trends over certain time period, according to the embodiments as disclosed herein. Once the multinomial data is reduced to a lower dimension (2-dimensional) as described in FIG. 3, the users of the telecom network can be grouped or clustered as shown in the graph. Clustering or grouping of the coordinates is performed based on the similar trends of the users. These groups or clusters contain the users who are similar in their trends over certain time period. These clusters are used for group based prediction or further analysis on the group.
  • Further, unlabeled cluster members are assigned class membership proportional to the labeled samples in the cluster. Finally, the method predicts the future actions of the users based on the past trends that are observed from the labeled clusters.
  • From the transactional data or historical data of the users in the telecom network, the actions performed by the users following a similar trend can be predicted. This information is used for predicting the actions of new users of similar trend.
  • Consider a group of 10 users (as depicted in table 1 below, which depicts the ARPU for each user) having a similar trend.
  • TABLE 1
    ARPU Month 1 Month 2 Month 3 Month 4
    User 1 473.05 740 439 0
    User 2 247 100 99 0
    User 3 372 508 282 0
    User 4 80 105.1 55 30
    User 5 235 334 50 120.17
    User 6 409 309 9 500
    User 7 73.01 75.05 0 144.01
    User 8 105 176 129 509
    User 9 65 0 0 10
    User 10 200 0 0 50
  • Applying correspondence analysis to this type of numerical time series data maps it to a two-dimensional feature space by assigning new coordinates to the users, such that users following a similar trend will be close to each other in this new space, as indicated in table 2.
  • TABLE 2
    ARPU Dimension 1 Dimension 2
    User 1 0.715 1.706
    User 2 0.14 1.975
    User 3 0.785 1.2
    User 4 −0.854 −0.967
    User 5 −1.047 −0.134
    User 6 −1.488 −0.656
    User 7 −1.266 −0.083
    User 8 −1.705 −0.622
    User 9 2.051 −1.46
    User 10 2.6 −0.9
  • In an example, consider that various brands can be expressed as time series based on their historical stock prices (change in price over time). Assume that it is required to identify brands which are similar in terms of their stock value trends over a period of time. The dimensionality of the historical stock prices is reduced from multi-dimensional time series data into a 2D space, followed by clustering. This will result in clusters of similar brands (for example, brands like Yahoo and Amazon may fall in one cluster and so on). Once grouping is done, time series models can be learned at the cluster level (e.g. ARMA models) to make predictions of future stock values.
  • Embodiments disclosed herein may be used for video segmentation, as depicted in the following example. Consider that an unsupervised segmentation of objects/users in a video needs to be done based on the similarity of their motion, which may be for safety management of a large gathering (big crowd) in a public area, to get moving areas in a scene for efficient video compression, to detect unusual events, for video surveillance or to analyze video further for specific purposes.
  • For finding the trend of object movement in a video, the magnitude of pixel movements over frames is used as an attribute for the trend recognition. Optical flow is used to get the subsequent position of the pixels from frame to frame. If a pixel was at position (u1, v1) in one frame and it moves to position (u2, v2) in the next frame, then the magnitude of its movement is calculated as the Euclidean distance between them. Once the optical flow is obtained, the magnitude of pixel displacement is calculated consecutively over all n frames, which results in the trend of pixels as a time series over (n−1) dimensions. Correspondence analysis will map the pixel movement data from n−1 dimensions to 2 dimensions such that pixels which belong to similar object movement will be closer to each other than those which have dissimilar motion. On clustering the pixels, all pixels within each cluster will represent a similar trend.
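  • The displacement computation can be sketched as follows, assuming the per-frame positions of a pixel have already been obtained from an optical flow step (which is not shown). The function name `displacement_series` and the sample track are assumptions for illustration.

```python
import math


def displacement_series(track):
    """Euclidean displacement between consecutive frames.

    track: list of (u, v) positions of one pixel over n frames.
    Returns a list of n - 1 magnitudes, i.e. the pixel's trend time series.
    """
    return [
        math.hypot(u2 - u1, v2 - v1)
        for (u1, v1), (u2, v2) in zip(track, track[1:])
    ]


# Hypothetical positions of one pixel over four frames.
track = [(0, 0), (3, 4), (3, 4), (6, 8)]
series = displacement_series(track)
print(series)  # [5.0, 0.0, 5.0]
```

  • Stacking one such series per pixel gives a pixels by (n−1) table of non-negative values, which has the same shape as the users-by-time table used in the other examples.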
  • Often gatherings involve movement of crowds in confined spaces such as city streets, overhead bridges, or narrow passageways. Because of the small space and big crowd, there can be many catastrophic events. If the usual motion at these places can be known a priori, then it is possible to predict locations of possible stampedes and hence do better safety management in those areas.
  • Unusual events may be detected by identifying areas where object motion is not regular or where deviations from normal behavior occur.
  • FIG. 8 illustrates a computing environment implementing the method and system for detection and classification of user behavior trends using correspondence analysis, according to the embodiments as disclosed herein. The computing environment may consist of a plurality of such units, forming a distributed cluster, over which the algorithms are executed in a scalable fashion. As depicted, the computing environment 801 comprises at least one processing unit 804 that is equipped with a control unit 802 and an Arithmetic Logic Unit (ALU) 803, a memory 805, a storage unit 806, a plurality of networking devices 808 and a plurality of input/output (I/O) devices 807. The processing unit 804 is responsible for processing the instructions of the algorithm. The processing unit 804 receives commands from the control unit 802 in order to perform its processing. Further, any logical and arithmetic operations involved in the execution of the instructions are computed with the help of the ALU 803.
  • The overall computing environment 801 can be composed of multiple homogeneous and/or heterogeneous cores, multiple CPUs of different kinds, special media and other accelerators. Further, the plurality of processing units 804 may be located on a single chip or over multiple chips. Further, a plurality of nodes such as 801 may be interconnected over a network to form a distributed computing environment, where the method described is executed in a distributed fashion.
  • The algorithm, comprising the instructions and code required for the implementation, is stored in either the memory unit 805 or the storage 806 or both. At the time of execution, the instructions may be fetched from the corresponding memory 805 and/or storage 806, and executed by the processing unit 804.
  • In case of any hardware implementations various networking devices 808 or external I/O devices 807 may be connected to the computing environment to support the implementation through the networking unit and the I/O device unit.
  • Embodiments disclosed herein enable compression of large amounts of temporal data related to users into smaller and more manageable amounts of data, thereby reducing the time required for processing the data and the complexity of the system required for computing.
  • Embodiments disclosed herein enable detection of unusual events based on the raw data. The unusual event may be a behaviour of a user which does not match his history and/or the cluster of users to which he belongs. For example, the unusual event may be a user of a telecommunication network sending a large number of SMSs within a short period of time, when he previously used to send only a few SMSs.
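  • One simple way to flag behaviour that does not match a user's own history is sketched below. The embodiments do not specify a detection rule; the deviation test (distance from the historical mean measured in standard deviations), the threshold `k` and the sample values are assumptions for illustration.

```python
from statistics import mean, pstdev


def is_unusual(history, latest, k=3.0):
    """Flag `latest` if it lies more than k standard deviations from the
    mean of the user's own history (assumed rule, not from the embodiments)."""
    mu = mean(history)
    sd = pstdev(history)
    if sd == 0:
        return latest != mu
    return abs(latest - mu) > k * sd


# Hypothetical daily SMS counts for one user.
sms_history = [5, 7, 6, 4, 8]
print(is_unusual(sms_history, 250))  # sudden burst of SMSs
print(is_unusual(sms_history, 7))    # within the usual range
```

  • The same test could be applied against the statistics of the cluster to which the user belongs instead of the user's own history.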
  • Embodiments disclosed herein account for temporal changes in user behaviour.
  • The embodiments disclosed herein can be implemented through at least one software program running on at least one hardware device and performing network management functions to control the elements. The elements shown in FIGS. 1 and 5 include blocks which can be at least one of a hardware device, or a combination of hardware device and software module.
  • The foregoing description of the specific embodiments will so fully reveal the general nature of the embodiments herein that others can, by applying current knowledge, readily modify and/or adapt for various applications such specific embodiments without departing from the generic concept, and, therefore, such adaptations and modifications should and are intended to be comprehended within the meaning and range of equivalents of the disclosed embodiments. It is to be understood that the phraseology or terminology employed herein is for the purpose of description and not of limitation. Therefore, while the embodiments herein have been described in terms of preferred embodiments, those skilled in the art will recognize that the embodiments herein can be practiced with modification within the spirit and scope of the embodiments as described herein.

Claims (20)

What is claimed is:
1. A method for detection of user behaviour trends, wherein the method comprises of
performing pre-processing and feature selection on raw data by a cluster master, wherein the raw data comprises of data related to temporal behaviour of a user;
obtaining trend data from the raw data by the cluster master;
reducing dimensionality of the raw data by the cluster master to a lower dimension using correspondence analysis, wherein the data with the lower dimension causes users with similar behaviour to be closer to each other than those who are dissimilar;
performing clustering on the data with the lower dimension by the cluster master based on attributes of the user; and
assigning at least one label to the clustered data by the cluster master.
2. The method, as claimed in claim 1, wherein pre-processing and feature selection on the raw data comprises determining the attributes of the users.
3. The method, as claimed in claim 1, wherein trend data comprises behaviour that changes over time.
4. The method, as claimed in claim 1, wherein assigning at least one label to the clustered data is based on label information of previous users according to actions taken by the users previously.
5. The method, as claimed in claim 1, wherein the method further comprises of predicting future actions of the users based on the labeled clustered data.
6. The method, as claimed in claim 5, wherein the method further comprises of augmenting the predictions of future actions by generating confidence measures based on class membership proportional to the labeled clustered data.
7. The method, as claimed in claim 1, wherein the method further comprises of
applying association rule mining on the clustered data to discover at least one rule; and
using the at least one discovered rule for user targeted applications.
8. The method, as claimed in claim 7, wherein applying association rule mining on the clustered data to discover at least one rule comprises
finding relationships between features of users in a cluster, features of users who were previously converted by historical campaigns, and features of previous campaigns themselves;
mining underlying rules in the clustered data; and
discovering defining attributes of each campaign, relationship of attributes of each campaign, other attributes of the campaign and previously converted users.
9. The method, as claimed in claim 1, wherein the method further comprises detecting unusual events based on the raw data.
10. The method, as claimed in claim 1, wherein the raw data is at least one of numerical multinomial data; and an array having n-dimensions, where the raw data comprises continuous individual features.
11. A computer program product comprising computer executable program code recorded on a computer readable non-transitory storage medium, said computer executable program code when executed, causing a method for detection, classification and prediction of user behaviour trends, comprising:
performing pre-processing and feature selection on raw data, wherein the raw data comprises data related to temporal behaviour of a user;
obtaining trend data from the raw data;
reducing dimensionality of the raw data to a lower dimension using correspondence analysis, wherein the data with the lower dimension causes users with similar behaviour to be closer to each other than those who are dissimilar;
performing clustering on the data with the lower dimension based on attributes of the user; and
assigning at least one label to the clustered data.
12. The computer program product, as claimed in claim 11, wherein pre-processing and feature selection on the raw data comprises determining the attributes of the users.
13. The computer program product, as claimed in claim 11, wherein trend data comprises behaviour that changes over time.
14. The computer program product, as claimed in claim 11, wherein assigning at least one label to the clustered data is based on label information of previous users according to actions taken by the users previously.
15. The computer program product, as claimed in claim 11, wherein the method further comprises predicting future actions of the users based on the labeled clustered data.
16. The computer program product, as claimed in claim 15, wherein the method further comprises augmenting the predictions of future actions by generating confidence measures based on class membership proportional to the labeled clustered data.
17. The computer program product, as claimed in claim 11, wherein the method further comprises
applying association rule mining on the clustered data to discover at least one rule; and
using the at least one discovered rule for user targeted applications.
18. The computer program product, as claimed in claim 17, wherein applying association rule mining on the clustered data to discover at least one rule comprises
finding relationships between features of users in a cluster, features of users who were previously converted by historical campaigns, and features of previous campaigns themselves;
mining underlying rules in the clustered data; and
discovering defining attributes of each campaign, relationship of attributes of each campaign, other attributes of the campaign and previously converted users.
19. The computer program product, as claimed in claim 11, wherein the method further comprises detecting unusual events based on the raw data.
20. The computer program product, as claimed in claim 11, wherein the raw data is at least one of numerical multinomial data; and an array having n-dimensions, where the raw data comprises continuous individual features.
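
As an illustration only, the labeling and prediction steps recited in claims 4 to 6 (and 14 to 16) might be sketched as follows. The sketch assumes that clustering has already produced a cluster id for every user, and it interprets the confidence measure as the share of historically labeled cluster members that took the majority action. The function names, the dictionary layout, and the sample user ids and actions are hypothetical and do not come from the specification.

```python
from collections import Counter, defaultdict


def label_clusters(assignments, known_actions):
    """Give each cluster its majority historical action and a confidence.

    assignments:   dict user_id -> cluster_id (assumed output of clustering)
    known_actions: dict user_id -> action previously taken (hypothetical labels)
    Returns dict cluster_id -> (label, confidence); (None, 0.0) if no history.
    """
    votes = defaultdict(Counter)
    for user, cluster in assignments.items():
        if user in known_actions:
            votes[cluster][known_actions[user]] += 1

    labels = {}
    for cluster in set(assignments.values()):
        counts = votes.get(cluster)
        if not counts:
            # No previously labeled user fell into this cluster.
            labels[cluster] = (None, 0.0)
            continue
        action, n = counts.most_common(1)[0]
        # Confidence: proportion of labeled members sharing the majority action.
        labels[cluster] = (action, n / sum(counts.values()))
    return labels


def predict(user, assignments, labels):
    """Predict a user's future action as the label of the user's cluster."""
    return labels[assignments[user]]


# Hypothetical data: seven users in two clusters, five with known past actions.
assignments = {"u1": 0, "u2": 0, "u3": 0, "u4": 0, "u5": 1, "u6": 1, "u7": 1}
known_actions = {
    "u1": "churned", "u2": "churned", "u3": "renewed",
    "u5": "renewed", "u6": "renewed",
}

labels = label_clusters(assignments, known_actions)
prediction = predict("u4", assignments, labels)  # u4 has no history of its own
print(labels)
print(prediction)
```

A majority vote is only one possible reading of "label information of previous users"; a weighted vote or a per-class probability vector would fit the same claim language.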
US14/303,621 2013-01-21 2014-06-13 Method and system for detection, classification and prediction of user behavior trends Abandoned US20140372175A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/137,328 US11461795B2 (en) 2013-06-13 2018-09-20 Method and system for automated detection, classification and prediction of multi-scale, multidimensional trends

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN258CH2013 2013-01-21
IN258/CHE/2013 2013-01-21

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/137,328 Continuation-In-Part US11461795B2 (en) 2013-06-13 2018-09-20 Method and system for automated detection, classification and prediction of multi-scale, multidimensional trends

Publications (1)

Publication Number Publication Date
US20140372175A1 (en) 2014-12-18

Family

ID=51207626

Family Applications (2)

Application Number Title Priority Date Filing Date
US14/159,458 Active US9672527B2 (en) 2013-01-21 2014-01-21 Associating and consolidating MME bearer management functions
US14/303,621 Abandoned US20140372175A1 (en) 2013-01-21 2014-06-13 Method and system for detection, classification and prediction of user behavior trends

Country Status (1)

Country Link
US (2) US9672527B2 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9986535B2 (en) * 2012-03-31 2018-05-29 Tejas Networks Limited Method and system for managing mobile management entity (MME) in a telecommunication network
WO2015146910A1 (en) * 2014-03-24 2015-10-01 シャープ株式会社 Server device and terminal device
WO2015144218A1 (en) * 2014-03-26 2015-10-01 Telefonaktiebolaget L M Ericsson (Publ) Establishment of a wireless backhaul connection from a small cell rbs
US9723642B2 (en) * 2014-08-07 2017-08-01 At&T Intellectual Property I, L.P. Method and device for managing communication sessions using tunnels
CN105515720B (en) * 2014-09-25 2019-02-26 中国电信股份有限公司 The correlating method of signaling and data, device and system
EP3254189B1 (en) * 2015-02-18 2019-01-16 Huawei Technologies Co., Ltd. Upgrading of a mobile network function
CN106455131B (en) * 2015-08-10 2022-04-15 北京三星通信技术研究有限公司 Method and equipment for controlling WLAN (Wireless local area network) load
US10205507B2 (en) * 2015-08-28 2019-02-12 Tejas Networks, Ltd. Relay architecture, relay node, and relay method thereof
US9979730B2 (en) * 2015-10-30 2018-05-22 Futurewei Technologies, Inc. System and method for secure provisioning of out-of-network user equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020194159A1 (en) * 2001-06-08 2002-12-19 The Regents Of The University Of California Parallel object-oriented data mining system
US20070156673A1 (en) * 2005-12-30 2007-07-05 Accenture S.P.A. Churn prediction and management system
US20080162268A1 (en) * 2006-11-22 2008-07-03 Sheldon Gilbert Analytical E-Commerce Processing System And Methods
US20120330779A1 (en) * 1997-11-14 2012-12-27 Tuzhilin Alexander S Predicting Purchasing Requirements

Family Cites Families (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100943078B1 (en) * 2007-12-14 2010-02-18 한국전자통신연구원 A bearer control and management method in the IP-based evolved UMTS Network
US8902805B2 (en) * 2008-10-24 2014-12-02 Qualcomm Incorporated Cell relay packet routing
US20100260126A1 (en) * 2009-04-13 2010-10-14 Qualcomm Incorporated Split-cell relay packet routing
GB0907213D0 (en) * 2009-04-27 2009-06-10 Sharp Kk Relay apparatus and method
EP2443870B1 (en) * 2009-06-17 2014-10-15 InterDigital Patent Holdings, Inc. Method and apparatus for performing handover with a relay node
CN103139852B (en) * 2009-07-23 2017-08-18 华为技术有限公司 Method of the voice call fallback to circuit commutative field, apparatus and system
CN101998554A (en) * 2009-08-18 2011-03-30 中兴通讯股份有限公司 Switching method based on mobile relay and mobile radio relay system
CN103188681B (en) * 2009-09-28 2016-08-10 华为技术有限公司 Data transmission method, apparatus and system
US9900210B2 (en) * 2010-04-09 2018-02-20 Nokia Solutions And Networks Oy Establishing connectivity between a relay node and a configuration entity
WO2012016593A1 (en) * 2010-08-05 2012-02-09 Fujitsu Limited Wireless communication system and method for mapping of control messages on the un- interface
US9021073B2 (en) * 2010-08-11 2015-04-28 Verizon Patent And Licensing Inc. IP pool name lists
EP2620028B1 (en) * 2010-09-23 2020-04-29 BlackBerry Limited System and method for dynamic coordination of radio resources usage in a wireless network environment
US8514756B1 (en) * 2010-10-15 2013-08-20 Juniper Networks, Inc. Collectively addressing wireless devices
RU2547824C2 (en) * 2011-02-03 2015-04-10 Нек Корпорейшн Mobile communication system, relay-station mobility management apparatus, relay-station mobility management method and computer-readable data medium
US9426700B2 (en) * 2011-03-25 2016-08-23 Lg Electronics Inc. Method and apparatus for performing handover procedure in wireless communication system including mobile relay node
US8817690B2 (en) * 2011-04-04 2014-08-26 Qualcomm Incorporated Method and apparatus for scheduling network traffic in the presence of relays
US20120252355A1 (en) * 2011-04-04 2012-10-04 Qualcomm Incorporated Apparatus and method for handing over relays
WO2012148210A2 (en) * 2011-04-29 2012-11-01 Lg Electronics Inc. Method for processing data associated with session management and mobility management
US20120294163A1 (en) * 2011-05-19 2012-11-22 Renesas Mobile Corporation Apparatus and Method for Direct Device-to-Device Communication in a Mobile Communication System
US8811347B2 (en) * 2011-06-16 2014-08-19 Lg Electronics Inc. Method and apparatus for selecting MME in wireless communication system including mobile relay node
KR101973462B1 (en) * 2011-07-08 2019-08-26 엘지전자 주식회사 Method for performing detach procedure and terminal thereof
US9363753B2 (en) * 2011-07-19 2016-06-07 Qualcomm Incorporated Sleep mode for user equipment relays
KR101573156B1 (en) * 2011-08-12 2015-12-01 엘지전자 주식회사 Method for processing data associated with idle mode signaling reduction in a wireless communication system
EP3634079A1 (en) * 2011-08-19 2020-04-08 InterDigital Patent Holdings, Inc. Method and apparatus for using non-access stratum procedures in a mobile station to access resources of component carriers belonging to different radio access technologies
TWI590680B (en) * 2011-09-30 2017-07-01 內數位專利控股公司 Method and acess point (ap) for handover of a wireless transmitter/receiver unit (wtru) moving between a local network and another network

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11734958B2 (en) * 2015-06-19 2023-08-22 eConnect, Inc. Predicting behavior from surveillance data
US20160371547A1 (en) * 2015-06-19 2016-12-22 eConnect, Inc. Predicting behavior from surveillance data
RU2694747C1 (en) * 2015-09-21 2019-07-16 Гугл Инк. Controller, control method and program
US10650317B2 (en) 2015-09-21 2020-05-12 Google Llc Detecting and correcting potential errors in user behavior
US10902442B2 (en) 2016-08-30 2021-01-26 International Business Machines Corporation Managing adoption and compliance of series purchases
US10831755B2 (en) * 2016-10-26 2020-11-10 Seiko Epson Corporation Data processing apparatus and data processing method
US20180113911A1 (en) * 2016-10-26 2018-04-26 Seiko Epson Corporation Data processing apparatus and data processing method
US10824903B2 (en) * 2016-11-16 2020-11-03 Facebook, Inc. Deep multi-scale video prediction
US10891545B2 (en) 2017-03-10 2021-01-12 International Business Machines Corporation Multi-dimensional time series event prediction via convolutional neural network(s)
US10896371B2 (en) 2017-03-10 2021-01-19 International Business Machines Corporation Multi-dimensional time series event prediction via convolutional neural network(s)
US10402986B2 (en) * 2017-12-20 2019-09-03 Facebook, Inc. Unsupervised video segmentation
US20210311941A1 (en) * 2018-12-21 2021-10-07 Tencent Technology (Shenzhen) Company Limited Method and device for determining social rank of node in social network
CN110807052A (en) * 2019-11-05 2020-02-18 佳都新太科技股份有限公司 User group classification method, device, equipment and storage medium
CN110930068A (en) * 2019-12-10 2020-03-27 安徽新知数媒信息科技有限公司 Traditional reading material visual experience index prediction method
US20230043820A1 (en) * 2021-08-04 2023-02-09 Verizon Media Inc. Method and system for user group determination, churn identification and content selection

Also Published As

Publication number Publication date
US9672527B2 (en) 2017-06-06
US20140204864A1 (en) 2014-07-24

Similar Documents

Publication Publication Date Title
US20140372175A1 (en) Method and system for detection, classification and prediction of user behavior trends
US11461795B2 (en) Method and system for automated detection, classification and prediction of multi-scale, multidimensional trends
ur Rehman et al. Big data reduction framework for value creation in sustainable enterprises
US9792259B2 (en) Systems and/or methods for interactive exploration of dependencies in streaming data
US10860858B2 (en) Utilizing a trained multi-modal combination model for content and text-based evaluation and distribution of digital video content to client devices
Yao et al. Combining visual customer segmentation and response modeling
US10282670B2 (en) Method of website optimisation for a website hosted on a server system, and a server system
Arun et al. Big data: review, classification and analysis survey
US20150120731A1 (en) Preference based clustering
Hemalatha Market basket analysis–a data mining application in Indian retailing
CN115545103A (en) Abnormal data identification method, label identification method and abnormal data identification device
WO2020174233A1 (en) Machine-learned model selection network planning
Abdulla Application of MIS in E-CRM: A Literature Review in FMCG Supply Chain
Yang et al. D²HistoSketch: Discriminative and dynamic similarity-preserving sketching of streaming histograms
US10586163B1 (en) Geographic locale mapping system for outcome prediction
Wu et al. Mining trajectory patterns with point-of-interest and behavior-of-interest
Berkani et al. Spatio-temporal forecasting: A survey of data-driven models using exogenous data
Spurlock et al. Dynamic view selection for multi-camera action recognition
Kumar et al. Multimodal Neural Network For Demand Forecasting
Krueger et al. Visual analysis of visitor behavior for indoor event management
Nwogbaga A review of big data clustering methods and research issues
Li et al. An improved slope one algorithm for collaborative filtering
Panimalar et al. A review of churn prediction models using different machine learning and deep learning approaches in cloud environment
JP7262335B2 (en) Prediction device, learning device, prediction method, and program
Khelifi et al. A new fusion framework for motion segmentation in dynamic scenes

Legal Events

Date Code Title Description
AS Assignment

Owner name: FLYTXT B.V, NETHERLANDS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:JAIN, NOOPUR;CHAUDHURY, SANTANU;WILSON, JOBIN;AND OTHERS;SIGNING DATES FROM 20140804 TO 20140916;REEL/FRAME:033858/0375

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION