CN110674100A - User demand prediction method and framework based on full-channel operation data - Google Patents

Info

Publication number
CN110674100A
Authority
CN
China
Prior art keywords
data
user
machine learning
channel
learning model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910928706.7A
Other languages
Chinese (zh)
Other versions
CN110674100B (en)
Inventor
李虎
曾毅峰
王之良
徐飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Pudong Development Bank Co Ltd Credit Card Center
Original Assignee
Shanghai Pudong Development Bank Co Ltd Credit Card Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Pudong Development Bank Co Ltd Credit Card Center filed Critical Shanghai Pudong Development Bank Co Ltd Credit Card Center
Priority to CN201910928706.7A priority Critical patent/CN110674100B/en
Publication of CN110674100A publication Critical patent/CN110674100A/en
Application granted granted Critical
Publication of CN110674100B publication Critical patent/CN110674100B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/182Distributed file systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471Distributed queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • G06F16/254Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Mathematical Physics (AREA)
  • Operations Research (AREA)
  • Tourism & Hospitality (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Fuzzy Systems (AREA)
  • General Business, Economics & Management (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a user demand prediction method and framework based on full-channel operation data. The method comprises the following steps. Step 1, data collection: acquire users' raw operation data from every system across all channels and store it in a big data framework file system. Step 2, data processing: clean and process the raw operation data acquired from every system across all channels, then construct user portrait data. Step 3, machine learning model establishment: divide the user portrait data into a training data set for training the machine learning model and a test data set for verifying it, then train and verify the model with these two data sets to obtain the final trained and verified machine learning model. Step 4, service recommendation: evaluate the final machine learning model, use it to predict the demand corresponding to the user's raw operation data, and recommend services accordingly. Compared with the prior art, the method is better suited to large-volume data and provides a good user experience.

Description

User demand prediction method and framework based on full-channel operation data
Technical Field
The invention relates to the technical field of computers, in particular to a user demand prediction method and a user demand prediction framework based on full-channel operation data.
Background
As new technologies mature, new and advanced applications will emerge from the fusion of 5G, artificial intelligence, and the Internet of Things, creating an intelligently connected world that affects every individual, industry, society, and economy. Artificial intelligence in particular has opened a new era for mobile applications and their great potential. For several years, mobile application developers have used artificial intelligence in their innovations, for example Siri from Apple Inc. Machine learning is developing rapidly, and users now need flexible algorithms to enhance their experience.
Since 2013, the explosive growth of internet finance has pushed big data to a new peak. The vigorous development of internet finance and consumer finance is now severely testing the traditional customer operation model of the banking industry, and new means of customer operation are urgently needed. The traditional customer operation model has the following defects:
(1) insufficient information coverage: the central bank's personal credit reference center has collected data on 860 million natural persons, but only 300 million of them have credit records;
(2) insufficient information validity: credit records come mainly from financial institutions such as commercial banks and rural credit cooperatives, and have serious shortcomings in data timeliness, comprehensiveness, and hierarchy;
(3) services are numerous, and customers cannot find the service they need;
(4) the system is not intelligent enough;
(5) customer experience results cannot be fed back in time.
Disclosure of Invention
The present invention aims to overcome the defects of the prior art and provide a user demand prediction method and architecture based on full-channel operation data.
The purpose of the invention can be realized by the following technical scheme:
A user demand prediction method based on full-channel operation data comprises the following steps:
step 1: acquiring users' raw operation data from every system across all channels, and storing it in a big data framework file system;
step 2: cleaning and processing the raw operation data acquired from every system across all channels, and then constructing user portrait data;
step 3: establishing a machine learning model, dividing the user portrait data into a training data set for training the machine learning model and a test data set for verifying it, and then training and verifying the model with the training data set and the test data set to obtain the final trained and verified machine learning model;
step 4: after evaluating the final trained and verified machine learning model, using the model to predict the demand corresponding to the user's raw operation data, and recommending services accordingly.
Further, the full channels in step 1 include an animation channel, an APP channel, a WeChat channel, an IVR (Interactive Voice Response) channel, a payment service window channel, and a CSR (Customer Service Representative) human customer service channel.
Further, the user portrait data in step 2 includes customer basic information, customer transaction information, customer activity information, and customer event-tracking (buried-point) information.
Further, the method also comprises step 5: after the user receives the service corresponding to the demand, user feedback is collected and used to train and optimize the machine learning model.
Further, step 1 specifically includes: acquiring users' raw operation data from every system across all channels by using an ETL tool, and storing it in a big data framework file system.
Further, the big data framework file system in step 1 is an HDFS file system.
Further, the model evaluation indexes in step 4 cover classification, regression, ranking, clustering, topic models, and recommendation.
The invention also provides a framework for the user demand prediction method based on full-channel operation data, and the framework comprises the following modules:
an operation data acquisition module, used for acquiring raw data with an ETL tool and storing it in an HDFS file system;
a feature engineering module, used for feature construction, feature extraction, and feature selection from the raw data;
a model training module, used for training the machine learning model;
a model verification module, used for verifying the machine learning model;
and a model application module, used for running the trained and verified machine learning model online.
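As an illustration only, the five modules above can be chained as plain functions. The sketch below is not the patented implementation: every function name, the channel-count portrait, the service labels ("installment", "bill_query"), and the majority-vote "model" are assumptions chosen to keep the example self-contained.

```python
from collections import Counter, defaultdict

# Hypothetical stand-ins for the five modules; none of these names come from the patent.

def collect(sources):
    """Operation data acquisition: merge raw events from every channel system."""
    return [event for events in sources.values() for event in events]

def build_portraits(raw_events):
    """Feature engineering: count each user's events per channel."""
    portraits = defaultdict(Counter)
    for event in raw_events:
        portraits[event["user"]][event["channel"]] += 1
    return dict(portraits)

def top_channel(portrait):
    """The channel a user touches most often."""
    return portrait.most_common(1)[0][0]

def train(portraits, labels):
    """Model training: remember the most common service per dominant channel."""
    votes = defaultdict(Counter)
    for user, service in labels.items():
        votes[top_channel(portraits[user])][service] += 1
    return {ch: counts.most_common(1)[0][0] for ch, counts in votes.items()}

def validate(model, portraits, labels):
    """Model verification: accuracy on held-out users."""
    hits = sum(model.get(top_channel(portraits[u])) == s for u, s in labels.items())
    return hits / len(labels)

def recommend(model, portrait, default=None):
    """Model application: recommend a service for one user portrait."""
    return model.get(top_channel(portrait), default)

sources = {
    "app": [{"user": "u1", "channel": "app"}, {"user": "u1", "channel": "app"},
            {"user": "u2", "channel": "app"}],
    "ivr": [{"user": "u3", "channel": "ivr"}, {"user": "u1", "channel": "ivr"}],
}
portraits = build_portraits(collect(sources))
model = train(portraits, {"u1": "installment", "u3": "bill_query"})
accuracy = validate(model, portraits, {"u2": "installment"})
```

In a real deployment each function would be replaced by the corresponding module (ETL into HDFS, a proper feature pipeline, an actual learning algorithm); the sketch only shows how the outputs of one stage feed the next.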
Compared with the prior art, the invention has the following advantages:
(1) The invention adopts Hadoop-related technology. Hadoop is a big data processing framework that provides storage and computing services at scales from a single server to clusters of thousands of servers. The Hadoop Distributed File System (HDFS) provides a big data storage service that can span multiple computers, while MapReduce provides a framework for parallel processing. Storing data in HDFS theoretically allows unlimited storage capacity, which removes the storage bottleneck of traditional databases, and MapReduce provides a computing solution for massive data.
(2) The invention adopts machine learning (deep learning) related technology. As shown in FIG. 2, in conventional programming a person inputs rules (i.e., a program) and the data to be processed according to those rules, and the system outputs answers. In machine learning modeling, a person inputs data and the answers expected from that data, and the system outputs the rules, also called a model. These rules can then be applied to new data so that the computer generates answers autonomously; a machine learning system is trained rather than explicitly programmed. Many examples relevant to a task are fed into the machine learning system, which finds statistical structure in them and eventually finds rules for automating the task.
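The contrast between conventional programming and machine learning can be shown with a deliberately tiny example. In the sketch below, the transaction amounts, the answers, and the single-threshold "model" are all invented for illustration; the patent does not specify any particular learning algorithm.

```python
def handwritten_rule(amount):
    """Conventional programming: a person supplies the rule (threshold is made up)."""
    return amount >= 5000

def learn_threshold(amounts, answers):
    """Machine learning: choose the threshold that best reproduces the answers."""
    best_threshold, best_correct = None, -1
    for candidate in sorted(set(amounts)):
        correct = sum((a >= candidate) == y for a, y in zip(amounts, answers))
        if correct > best_correct:
            best_threshold, best_correct = candidate, correct
    return best_threshold

# Data plus expected answers go in; the rule (here a threshold) comes out.
amounts = [200, 800, 3000, 6000, 9000]
answers = [False, False, True, True, True]
threshold = learn_threshold(amounts, answers)

def learned_rule(amount):
    """The learned rule can now be applied to data it has never seen."""
    return amount >= threshold
```

With this data the learned threshold differs from the hand-written one, so the two rules disagree on an unseen amount such as 4000: the hand-written rule rejects it while the learned rule accepts it.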
Drawings
FIG. 1 is a schematic diagram of an architecture for practicing the present invention;
FIG. 2 is a schematic diagram of the technical advantage of the present invention;
FIG. 3 is a flow chart of a method of the present invention;
FIG. 4 is a flowchart of a method according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, shall fall within the scope of protection of the present invention.
Examples
Banks are now rapidly developing their information systems and have accumulated large amounts of valuable business and user data. Mining user demand with big data, recommending suitable products to users, and increasing operating revenue while reducing user disturbance and marketing costs is a key and urgent task for the credit card business of every bank.
Based on concepts from internet finance, big data, and machine learning, the invention establishes a customer service recommendation system that mainly comprises data acquisition, feature engineering, model training, model verification, and model application. The overall structure is shown in FIG. 1,
which includes:
1. Operation data collection
The raw data is obtained from each system by using an ETL tool and stored in an HDFS file system. The raw data needs to have the following characteristics:
a. It must be representative.
b. For classification problems, the data must not be too skewed: the sample counts of different classes should not differ by several orders of magnitude.
c. The magnitude of the data (number of samples and number of features) should be evaluated to estimate how much memory model training will consume. If the data volume is too large, consider reducing the training samples, reducing dimensionality, or using a distributed machine learning system.
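Checks b and c above can be approximated with a few lines of arithmetic. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the memory estimate assumes a dense matrix of 8-byte values, and the sample counts are invented.

```python
from collections import Counter

def imbalance_ratio(labels):
    """Ratio of the largest class size to the smallest (1.0 means balanced)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())

def dense_matrix_bytes(n_samples, n_features, bytes_per_value=8):
    """Rough memory needed to hold the feature matrix as dense 8-byte floats."""
    return n_samples * n_features * bytes_per_value

labels = ["needs_service"] * 100 + ["no_need"] * 900
ratio = imbalance_ratio(labels)                      # 900 / 100
needed = dense_matrix_bytes(1_000_000, 200)          # one million users, 200 features
needed_gib = needed / 1024 ** 3
```

A ratio in the thousands or more would indicate the several-orders-of-magnitude skew that item b warns against, and comparing `needed_gib` with the memory of the training machine gives a first indication of whether sampling, dimensionality reduction, or a distributed system is needed.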
2. Feature engineering
This covers feature construction, feature extraction, and feature selection from the raw data. Good feature engineering brings out the full value of the raw data and can significantly improve the effect and performance of an algorithm; sometimes a simple model then performs better than a complex one. Most of the time in data mining is spent on feature engineering, which is a basic and necessary step for machine learning.
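A minimal sketch of feature construction and feature selection follows. The event layout, the channel names, and the rule of dropping constant columns are assumptions made for the example; the patent does not state which features or selection criteria are used.

```python
from collections import Counter

def construct_features(events, channels):
    """Feature construction: per-user event counts, one column per channel."""
    counts = {}
    for event in events:
        counts.setdefault(event["user"], Counter())[event["channel"]] += 1
    return {user: [c[ch] for ch in channels] for user, c in counts.items()}

def select_features(rows, names):
    """Feature selection: drop columns whose value is identical for every user."""
    keep = [i for i in range(len(names))
            if len({row[i] for row in rows.values()}) > 1]
    selected = {user: [row[i] for i in keep] for user, row in rows.items()}
    return selected, [names[i] for i in keep]

channels = ["app", "ivr", "wechat"]
events = [
    {"user": "u1", "channel": "app"}, {"user": "u1", "channel": "app"},
    {"user": "u1", "channel": "ivr"},
    {"user": "u2", "channel": "app"}, {"user": "u2", "channel": "ivr"},
]
rows = construct_features(events, channels)
selected, kept_names = select_features(rows, channels)
```

Here the "ivr" and "wechat" columns carry the same value for every user and are dropped, leaving only the "app" count as an informative feature.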
3. Model training
In machine learning, training is an important step that directly affects the result. Whatever the task, a model is needed, so a reasonable algorithm must be selected for the corresponding modeling in order to make the model better and more accurate.
4. Model validation
The effectiveness of the model is verified with test data; erroneous samples are examined and the causes of the errors are analyzed, so that a breakthrough point for improving algorithm performance can be found. Error analysis mainly examines the sources of error in the data, the features, and the algorithm.
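The verification step can be sketched as a small report that returns accuracy together with the misclassified samples, so that they can be inspected by hand. The sample layout, the feature name `app_visits`, and the stand-in predictor are hypothetical.

```python
from collections import Counter

def error_report(samples, predict):
    """Return accuracy, the misclassified samples, and (actual, predicted) counts."""
    errors = []
    confusion = Counter()
    for features, actual in samples:
        predicted = predict(features)
        confusion[(actual, predicted)] += 1
        if predicted != actual:
            errors.append((features, actual, predicted))
    accuracy = 1 - len(errors) / len(samples)
    return accuracy, errors, confusion

def predict(features):
    """Stand-in model: recommend an installment plan to frequent APP visitors."""
    return "installment" if features["app_visits"] >= 3 else "none"

test_set = [
    ({"app_visits": 5}, "installment"),
    ({"app_visits": 0}, "none"),
    ({"app_visits": 2}, "installment"),
    ({"app_visits": 4}, "none"),
]
accuracy, errors, confusion = error_report(test_set, predict)
```

Looking at the two entries in `errors` shows that the visit-count feature alone cannot separate these users, which is the kind of observation that points back to the data or the features as the source of error.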
5. Model application
How the model performs online directly determines its success or failure. This covers not only accuracy, error, and similar measures, but also whether the running speed (time complexity), resource consumption (space complexity), and stability are acceptable.
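Running speed and resource consumption can be measured before a model goes online. The sketch below uses only the Python standard library (`time.perf_counter` and `tracemalloc`); the function name and the returned fields are assumptions, and a production system would normally rely on its own monitoring instead.

```python
import time
import tracemalloc

def profile_predict(predict, inputs):
    """Measure per-call latency and peak traced memory of a prediction function."""
    tracemalloc.start()
    latencies = []
    for x in inputs:
        start = time.perf_counter()
        predict(x)
        latencies.append(time.perf_counter() - start)
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    latencies.sort()
    return {
        "calls": len(latencies),
        "upper_median_seconds": latencies[len(latencies) // 2],
        "max_seconds": latencies[-1],
        "peak_bytes": peak_bytes,
    }

# Stand-in for a real model: any callable taking one input works here.
stats = profile_predict(lambda x: x * 2, [1, 2, 3])
```

The measured values depend on the machine and the load, so they are best compared against an agreed latency and memory budget rather than read as absolute numbers.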
The raw operation data of customers across all channels is aggregated daily. After simple data cleaning, processing, and storage, ① customer portrait (also called feature) data is obtained. Reasonable sample data is then selected from ① to construct the ② training data set and the ③ test data set used for machine learning. The training and test data sets are crucial to the accuracy of the model. The relevant specific information in the figure is as follows:
[Figure BDA0002219640600000051: image not reproduced in this text]
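The daily aggregation and the division into training and test data sets described above can be sketched as follows. The event layout, the 20% test fraction, and the fixed random seed are assumptions; the patent does not state how the split is performed.

```python
import random
from collections import defaultdict

def aggregate_by_day(events):
    """Count raw operation events per (user, date) pair."""
    daily = defaultdict(int)
    for event in events:
        daily[(event["user"], event["date"])] += 1
    return dict(daily)

def split_dataset(samples, test_fraction=0.2, seed=0):
    """Shuffle reproducibly, then cut into (training set, test set)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

events = [
    {"user": "u1", "date": "2019-09-01"},
    {"user": "u1", "date": "2019-09-01"},
    {"user": "u1", "date": "2019-09-02"},
]
daily_counts = aggregate_by_day(events)

samples = list(range(10))            # placeholders for ten portrait records
train_set, test_set = split_dataset(samples)
```

Keeping the two sets disjoint is what makes the verification result meaningful: a sample seen during training says nothing about how the model behaves on new customers.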
The method focuses on the direction of user demand in different channels. By collecting each user's behavior information in every channel (mainly through event tracking, i.e. buried-point collection), it forms an independent customer portrait for each user, organizing scattered unstructured data into the user's historical operation information and combining it with the user's historical transaction behavior and other information. A model is established and trained, and, with the customer portrait as input, the model algorithm calculates which services each customer may need in a future time period. After the customer has been contacted through any channel, the user's behavior information is collected and fed back into the training and optimization of the model, providing sufficient data samples for the model's evolution. This forms a closed information loop among user, channel, and model, realizes quasi-real-time cyclic prediction of customer behavior, and accurately predicts the service the customer needs most, as shown in FIG. 4.
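The closed loop among user, channel, and model can be illustrated with a toy recommender that updates itself from feedback. This is a sketch under strong assumptions: the class name, the idea of grouping users into segments, and the simple +1/-1 scoring are all invented here and stand in for the actual retraining of the machine learning model.

```python
from collections import Counter, defaultdict

class FeedbackLoop:
    """Toy closed loop: recommend, observe the user's reaction, update scores."""

    def __init__(self):
        # scores[segment][service] rises when a recommendation is accepted
        self.scores = defaultdict(Counter)

    def record(self, segment, service, accepted):
        """Feed one piece of user feedback back into the model state."""
        self.scores[segment][service] += 1 if accepted else -1

    def recommend(self, segment, default):
        """Recommend the best-scoring service, or the default if none is positive."""
        scores = self.scores.get(segment)
        if not scores:
            return default
        best = max(scores, key=scores.get)
        return best if scores[best] > 0 else default

loop = FeedbackLoop()
first = loop.recommend("app_heavy", default="bill_query")   # no feedback yet
loop.record("app_heavy", "installment", accepted=True)
loop.record("app_heavy", "installment", accepted=True)
loop.record("app_heavy", "bill_query", accepted=False)
second = loop.recommend("app_heavy", default="bill_query")
```

The recommendation for the same segment changes once feedback has been recorded, which is the behavior the closed loop is meant to produce.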
The method and architecture of the invention show clear benefits when results before and after going into production are compared: after the full-channel service demand prediction system was applied to the animation customer service channel, the installment transaction amount increased significantly compared with January to May 2018.
The method and architecture also improve the hit rate after going into production and give a good user experience: with the full-channel service demand prediction system applied to the animation customer service channel, the animation channel interface is called 12,000 times per day and the hit rate is maintained at 74%. The number of interface calls increased significantly compared with the period before launch (January to May 2018); user stickiness improved, and users are more willing to choose the animation channel to handle business when they have a service need.
The method and architecture provide personalized recommendation services for different customers: the full-channel service demand prediction system recommends different service nodes to different customers and can thus meet each customer's individual needs. Before the system went online, the same service nodes were recommended to every customer, which could not meet the needs of target customers and led to their loss, and such customer loss can cause irreparable damage to a credit card center.
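The patent does not define how the hit rate is computed. One plausible reading, assumed here, is the share of recommendation calls in which the service the user went on to use was among the recommended ones. The log layout and the sample entries below are invented for illustration and are unrelated to the figures reported above.

```python
def hit_rate(calls):
    """Share of recommendation calls whose recommended list contained the used service."""
    if not calls:
        return 0.0
    hits = sum(1 for call in calls if call["used"] in call["recommended"])
    return hits / len(calls)

# Hypothetical interface-call log entries.
calls = [
    {"recommended": ["installment", "bill_query"], "used": "installment"},
    {"recommended": ["limit_increase"], "used": "bill_query"},
    {"recommended": ["bill_query"], "used": "bill_query"},
    {"recommended": ["installment"], "used": "installment"},
]
rate = hit_rate(calls)
```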
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (8)

1. A user demand prediction method based on full-channel operation data, characterized by comprising the following steps:
step 1: acquiring users' raw operation data from every system across all channels, and storing it in a big data framework file system;
step 2: cleaning and processing the raw operation data acquired from every system across all channels, and then constructing user portrait data;
step 3: establishing a machine learning model, dividing the user portrait data into a training data set for training the machine learning model and a test data set for verifying it, and then training and verifying the model with the training data set and the test data set to obtain the final trained and verified machine learning model;
step 4: after evaluating the final trained and verified machine learning model, using the model to predict the demand corresponding to the user's raw operation data, and recommending services accordingly.
2. The user demand prediction method based on full-channel operation data according to claim 1, wherein the full channels in step 1 comprise an animation channel, an APP channel, a WeChat channel, an IVR channel, a payment service window channel, and a CSR customer service channel.
3. The user demand prediction method based on full-channel operation data according to claim 1, wherein the user portrait data in step 2 comprises customer basic information, customer transaction information, customer activity information, and customer event-tracking (buried-point) information.
4. The user demand prediction method based on full-channel operation data according to claim 1, further comprising step 5: after the user receives the service corresponding to the demand, collecting user feedback and using it to train and optimize the machine learning model.
5. The user demand prediction method based on full-channel operation data according to claim 1, wherein step 1 specifically comprises: acquiring users' raw operation data from every system across all channels by using an ETL tool, and storing it in a big data framework file system.
6. The user demand prediction method based on full-channel operation data according to claim 1, wherein the big data framework file system in step 1 is an HDFS file system.
7. The user demand prediction method based on full-channel operation data according to claim 1, wherein the model evaluation indexes in step 4 cover classification, regression, ranking, clustering, topic models, and recommendation.
8. An architecture based on the user demand prediction method based on full-channel operation data according to any one of claims 1 to 7, characterized in that the architecture comprises:
an operation data acquisition module, used for acquiring raw data with an ETL tool and storing it in an HDFS file system;
a feature engineering module, used for feature construction, feature extraction, and feature selection from the raw data;
a model training module, used for training the machine learning model;
a model verification module, used for verifying the machine learning model;
and a model application module, used for running the trained and verified machine learning model online.
CN201910928706.7A 2019-09-28 2019-09-28 User demand prediction method and framework based on full-channel operation data Active CN110674100B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910928706.7A CN110674100B (en) 2019-09-28 2019-09-28 User demand prediction method and framework based on full-channel operation data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910928706.7A CN110674100B (en) 2019-09-28 2019-09-28 User demand prediction method and framework based on full-channel operation data

Publications (2)

Publication Number Publication Date
CN110674100A true CN110674100A (en) 2020-01-10
CN110674100B CN110674100B (en) 2023-02-10

Family

ID=69079893

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910928706.7A Active CN110674100B (en) 2019-09-28 2019-09-28 User demand prediction method and framework based on full-channel operation data

Country Status (1)

Country Link
CN (1) CN110674100B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111612366A (en) * 2020-05-27 2020-09-01 中国联合网络通信集团有限公司 Channel quality evaluation method and device, electronic equipment and storage medium
CN112561598A (en) * 2020-12-23 2021-03-26 中国农业银行股份有限公司重庆市分行 Customer loss prediction and retrieval method and system based on customer portrait
CN112598443A (en) * 2020-12-25 2021-04-02 山东鲁能软件技术有限公司 Online channel business data processing method and system based on deep learning

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107423442A (en) * 2017-08-07 2017-12-01 火烈鸟网络(广州)股份有限公司 Method and system, storage medium and computer equipment are recommended in application based on user's portrait behavioural analysis
US20180240158A1 (en) * 2017-02-17 2018-08-23 Kasatria Analytics Sdn Bhd Computer implemented system and method for customer profiling using micro-conversions via machine learning
CN109509040A (en) * 2019-01-03 2019-03-22 广发证券股份有限公司 Predict modeling method, marketing method and the device of fund potential customers

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
徐璐瑶等: "基于大数据的用户画像***概述", 《电子世界》 *

Also Published As

Publication number Publication date
CN110674100B (en) 2023-02-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CB03 Change of inventor or designer information

Inventor after: Tie Jincheng

Inventor after: Li Hu

Inventor after: Zeng Yifeng

Inventor after: Wang Zhiliang

Inventor after: Xu Fei

Inventor before: Li Hu

Inventor before: Zeng Yifeng

Inventor before: Wang Zhiliang

Inventor before: Xu Fei