CN113421131B - Intelligent marketing system based on big data content - Google Patents

Publication number
CN113421131B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110822601.0A
Other languages
Chinese (zh)
Other versions
CN113421131A (en)
Inventor
孟艳冬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sinotech Nanjing Co ltd
Original Assignee
Sinotech Nanjing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sinotech Nanjing Co ltd
Priority to CN202110822601.0A
Publication of CN113421131A
Application granted
Publication of CN113421131B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/547 Remote procedure calls [RPC]; Web services
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an intelligent marketing system based on big data content. Event Time and WaterMark are introduced; data processing and pre-computation are performed by a Flink engine, integrating the underlying proprietary data assets of multiple application platforms; according to the business scenario, metrics such as the actual processing delay, the number of data parsing failures, and the time consumed by external service calls are customized; resource tuning is performed, and the metadata of data in the real-time data stream is converted; and middleware is built to associate multiple data sources for real-time data mixing. The invention belongs to the technical field of computing engines and specifically provides a general-purpose real-time computing engine: an intelligent marketing system based on big data content in which middleware is built between the platform (hardware and operating system) and the application to provide general services for two or more parties, with standard program interfaces and protocols.

Description

Intelligent marketing system based on big data content
Technical Field
The invention belongs to the technical field of computing engines, and particularly relates to an intelligent marketing system based on big data content.
Background
Prior-art intelligent marketing computation based on user-portrait identification comprises data cleaning, data computation, ID mapping, and a data query engine. The specific techniques are analysis-metric computation, data aggregation, data output, data analysis, and tag management, and the main function is intelligent marketing computation based on user-portrait identification over multi-source, multi-format data. A prior-art data processing solution based on a real-time computing engine comprises: detecting whether a new data processing rule has been input; acquiring and parsing the latest data processing rule through the real-time computing engine; and processing data according to the parsed rule to obtain result data.
The defects of existing intelligent marketing systems based on big data content are as follows:
1) The computation tag system is created from rules; when the data volume is large, data cannot be processed in real time, and the computing engine consumes very large amounts of physical resources;
2) Traditional middleware design concentrates on the transparency of the middleware, so that applications need not care about issues such as distribution; it suits only specific types of upper-layer applications, is not combined with marketing technology, and cannot adapt to the current mobile computing environment;
3) The underlying system is complex and functionally heavy, so few enterprises or customers can use it, and it cannot be made lightweight enough to meet the needs of most small and medium-sized enterprises.
Disclosure of Invention
In view of the above, and to overcome the defects of the prior art, the invention provides a real-time computing engine based on the widely used Spark and Flink. The stream data source is mainly Kafka, and the underlying cluster of the computing platform uses YARN for resource scheduling. The output of stream computation needs to cover all storage and analysis engines required by the online business, such as Elasticsearch, Kafka, and HBase. To solve the distributed heterogeneity of the computed data, middleware is built; the middleware sits between the platform (hardware and operating system) and the application and provides general services for two or more parties. The result is an intelligent marketing system based on big data content with standard program interfaces and protocols.
The technical scheme adopted by the invention is as follows: an intelligent marketing system based on big data content comprises a computing engine, wherein the marketing computation of the computing engine comprises the following steps:
step one: the computing engine includes base middleware;
step two: Flink provides exactly-once consistency semantics, has a mature set of window mechanisms, introduces Event Time and WaterMark, and provides rich state access;
step three: real-time computing capability can be summarized as the capability of a data channel; real-time computation alone may not fully meet our real-time analysis needs, so data processing and pre-computation are performed by the Flink engine and the results finally land in the corresponding storage and analysis engine, integrating the underlying proprietary data assets of multiple application platforms, or logs, behavior data, and the like collected in real time;
step four: on top of Flink's Metrics, metrics such as the actual data-processing delay, the number of data parsing failures, and the time consumed by external service calls are customized according to the business scenario; all metrics are reported to Kafka through a custom Reporter, then structured by real-time ETL and output to ES and Druid, so as to avoid data loss or delay and to carry out real-time alerting and preprocessing;
step five: on the basis of the underlying layers above, real-time tasks are first managed and controlled, which includes tuning the resources of all Flink or Spark tasks on the platform and converting the metadata of data in the real-time data stream;
step six: middleware is built to associate multiple data sources for real-time data mixing and to provide cross-source T+0 queries, completing a report data computing engine that makes up for the insufficient computing capability of reporting tools, and open computing capability is distributed to every stage that involves computation.
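The custom metrics of step four can be illustrated with a minimal Python sketch. It does not use Flink's Metrics or Reporter interfaces; the class `JobMetrics`, the field `event_time_ms`, and the snapshot layout are hypothetical names chosen for illustration. The sketch only shows how processing delay, parsing failures, and external call time might be tracked and serialized as a structured record of the kind a reporter could publish to a Kafka topic.

```python
import json
import time
from dataclasses import dataclass, field


@dataclass
class JobMetrics:
    """Hypothetical per-job metric holder; not part of Flink's API."""
    job: str
    delays_ms: list = field(default_factory=list)
    parse_failures: int = 0
    external_call_ms: list = field(default_factory=list)

    def record_event(self, raw: str, now_ms: int):
        """Parse one raw record; track its delay, or count a parsing failure."""
        try:
            event = json.loads(raw)
            self.delays_ms.append(now_ms - event["event_time_ms"])
            return event
        except (ValueError, KeyError, TypeError):
            self.parse_failures += 1
            return None

    def time_external_call(self, fn, *args):
        """Run an external call and record how long it took in milliseconds."""
        start = time.perf_counter()
        try:
            return fn(*args)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            self.external_call_ms.append(elapsed_ms)

    def report(self) -> str:
        """Serialize a structured snapshot (the payload a reporter would send)."""
        snapshot = {
            "job": self.job,
            "max_delay_ms": max(self.delays_ms, default=0),
            "parse_failures": self.parse_failures,
            "external_calls": len(self.external_call_ms),
        }
        return json.dumps(snapshot, sort_keys=True)
```

A downstream ETL step could parse each snapshot and raise an alert when, for example, `max_delay_ms` or `parse_failures` exceeds a threshold chosen for the business scenario.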
Further, the computing engine can perform stateful computation on bounded and unbounded data streams, can be deployed in a variety of cluster environments, can compute quickly on data of any scale, and can perform batch processing, interactive computation, and stream computation at the same time.
Further, applications on top of the computing engine include metric analysis, real-time features, security risk control, ETL, real-time recommendation, and the like.
Further, the base middleware comprises a routing and web server, an RPC framework, message middleware, a cache service, a configuration center, distributed transactions, task scheduling, and a database layer. The routing and web server processes and forwards communication data from other servers; the RPC framework is the remote service invocation framework of the microservice era; the message middleware is software that supports sending and receiving messages between distributed systems; the cache service is a distributed high-speed data storage layer, typically in-memory; the configuration center is a system that uniformly manages all configuration across projects; distributed transactions are transactions whose participants, transaction-supporting servers, resource servers, and transaction managers are located on different nodes of different distributed systems, and are used for database scalability; task scheduling is a system that provides timing, task orchestration, distributed batch running, and similar functions in a distributed environment; and the database layer supports elastic scaling and database and table sharding through TDDL, the database connection pool Druid, Binlog synchronization through Canal, and the like.
With this scheme, the intelligent marketing system based on big data content has the following beneficial effects:
1. The Flink computing engine simulates batch computation on top of stream computation, which gives it better technical scalability; in the long run it computes more efficiently and faster, and consumes relatively fewer resources;
2. The simplicity and flexibility of the computing engine are improved, meeting the need of multiple applications for data integration;
3. The problem of overly tight coupling between business-layer logic and data is solved; the lightweight middleware technique can be decoupled from the specific implementation of the underlying service logic, and multiple applications are connected using only the results imported in-process.
Drawings
FIG. 1 is a diagram of a data channel of an intelligent marketing system based on big data content according to the present invention.
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all embodiments of the invention; all other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in FIG. 1, the intelligent marketing system based on big data content comprises a computing engine, wherein the marketing computation of the computing engine comprises the following steps:
step one: the computing engine includes base middleware;
step two: Flink provides exactly-once consistency semantics, has a mature set of window mechanisms, introduces Event Time and WaterMark, and provides rich state access;
step three: real-time computing capability can be summarized as the capability of a data channel; real-time computation alone may not fully meet our real-time analysis needs, so data processing and pre-computation are performed by the Flink engine and the results finally land in the corresponding storage and analysis engine, integrating the underlying proprietary data assets of multiple application platforms, or logs, behavior data, and the like collected in real time;
step four: on top of Flink's Metrics, metrics such as the actual data-processing delay, the number of data parsing failures, and the time consumed by external service calls are customized according to the business scenario; all metrics are reported to Kafka through a custom Reporter, then structured by real-time ETL and output to ES and Druid, so as to avoid data loss or delay and to carry out real-time alerting and preprocessing;
step five: on the basis of the underlying layers above, real-time tasks are first managed and controlled, which includes tuning the resources of all Flink or Spark tasks on the platform and converting the metadata of data in the real-time data stream;
step six: middleware is built to associate multiple data sources for real-time data mixing and to provide cross-source T+0 queries, completing a report data computing engine that makes up for the insufficient computing capability of reporting tools, and open computing capability is distributed to every stage that involves computation.
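The Event Time and WaterMark mechanism named in step two can be illustrated with a simplified Python simulation. This is a sketch under stated assumptions, not Flink code: the function name, the tuple layout of the events, and the rule that a window fires once the watermark reaches its end are illustrative choices, and Flink's own watermark generators differ in detail.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_ms, max_out_of_order_ms):
    """Count events per event-time tumbling window.

    events: iterable of (event_time_ms, key) tuples in arrival order.
    The watermark trails the largest event time seen so far by
    max_out_of_order_ms. A window [start, start + window_ms) fires once the
    watermark reaches its end; events that arrive for an already fired
    window are counted as late and dropped.
    """
    open_windows = defaultdict(int)  # window start -> event count
    fired = []                       # (window_start, count) in firing order
    late = 0
    watermark = float("-inf")
    max_seen = float("-inf")

    for event_time, _key in events:
        start = event_time - event_time % window_ms
        if start + window_ms <= watermark:
            late += 1  # the window for this event has already fired
        else:
            open_windows[start] += 1
        # Advance the watermark based on the largest event time seen.
        max_seen = max(max_seen, event_time)
        watermark = max_seen - max_out_of_order_ms
        ready = sorted(w for w in open_windows if w + window_ms <= watermark)
        for w in ready:
            fired.append((w, open_windows.pop(w)))

    # End of input: flush whatever is still open.
    for w in sorted(open_windows):
        fired.append((w, open_windows.pop(w)))
    return fired, late
```

Because windows are assigned by the timestamp carried in each event rather than by arrival time, moderately out-of-order events still land in the correct window, while the out-of-orderness bound limits how long results are held back.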
The computing engine can perform stateful computation on bounded and unbounded data streams, can be deployed in a variety of cluster environments, can compute quickly on data of any scale, and can perform batch processing, interactive computation, and stream computation at the same time.
Applications on top of the computing engine include metric analysis, real-time features, security risk control, ETL, real-time recommendation, and the like.
The base middleware comprises a routing and web server, an RPC framework, message middleware, a cache service, a configuration center, distributed transactions, task scheduling, and a database layer. The routing and web server processes and forwards communication data from other servers; the RPC framework is the remote service invocation framework of the microservice era; the message middleware is software that supports sending and receiving messages between distributed systems; the cache service is a distributed high-speed data storage layer, typically in-memory; the configuration center is a system that uniformly manages all configuration across projects; distributed transactions are transactions whose participants, transaction-supporting servers, resource servers, and transaction managers are located on different nodes of different distributed systems, and are used for database scalability; task scheduling is a system that provides timing, task orchestration, distributed batch running, and similar functions in a distributed environment; and the database layer supports elastic scaling and database and table sharding through TDDL, the database connection pool Druid, Binlog synchronization through Canal, and the like.
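As a minimal sketch of the database and table sharding mentioned for the database layer, the following Python routes a shard key to a database index and a table index. The function name `route`, the use of CRC32 as the hash, and the shard counts are illustrative assumptions; this is not TDDL's actual routing rule.

```python
import zlib


def route(shard_key: str, db_count: int, tables_per_db: int):
    """Map a shard key to (database index, table index).

    CRC32 is used because it is stable across processes, unlike Python's
    built-in hash() for strings.
    """
    slot = zlib.crc32(shard_key.encode("utf-8")) % (db_count * tables_per_db)
    return slot // tables_per_db, slot % tables_per_db
```

Every caller that uses the same key and the same shard counts reaches the same database and table, which is what lets the layer above treat many physical tables as one logical table.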
Ordinary rule-based portraits or real-time computing techniques may not fully meet our real-time analysis needs; for example, typical needs include multidimensional analysis and ad hoc queries. Our technique performs data processing and pre-computation with the Flink engine, and the data is finally stored in the corresponding storage and analysis engine for real-time calling and access.
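The idea of pre-computing results so that multidimensional and ad hoc queries can be served without scanning raw events can be sketched as follows. This is a plain-Python illustration, not the patented implementation; the function names, the dictionary-based store, and the sample field names are assumptions.

```python
from collections import defaultdict
from itertools import combinations


def precompute_rollups(events, dimensions, measure):
    """Sum `measure` for every combination of dimension values (a small cube).

    events: iterable of dict rows. Keys of the returned store are frozensets
    of (dimension, value) pairs, so the order of filters does not matter.
    """
    store = defaultdict(float)
    for event in events:
        for r in range(len(dimensions) + 1):
            for dims in combinations(dimensions, r):
                key = frozenset((d, event[d]) for d in dims)
                store[key] += event[measure]
    return dict(store)


def query(store, **filters):
    """Ad hoc lookup answered from the precomputed store."""
    return store.get(frozenset(filters.items()), 0.0)
```

With the store built once from the stream, a call such as `query(store, channel="app")` is a single dictionary lookup, at the cost of storing one entry per combination of dimension values.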
The middleware sits between the data sources and the upper-layer applications and provides a general computing service. It does not depend on the computing capability of the database, can be flexibly plugged into applications, and supports distributed computation; it handles scenarios with different data volumes in real application environments and ensures that the data services it provides remain available as the data volume keeps growing.
When a large number of heterogeneous systems exist in an enterprise, data from different data sources is put together for computation without requiring cross-system or cross-platform operations; the data computing middleware breaks down the link barriers, provides the connecting function, and integrates the data on the different heterogeneous platforms.
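The cross-source data mixing performed by the middleware can be sketched in Python as an in-memory hash join over registered sources. The class `DataMixer`, its method names, and the sample rows are hypothetical; real sources would be database or message-queue clients rather than in-memory lists.

```python
class DataMixer:
    """Hypothetical computing middleware that joins rows from several sources."""

    def __init__(self):
        self._sources = {}

    def register(self, name, fetch):
        """Register a source; fetch is a zero-argument callable returning dict rows."""
        self._sources[name] = fetch

    def join(self, left, right, key):
        """Inner hash join of two registered sources, evaluated at query time."""
        index = {}
        for row in self._sources[right]():
            index.setdefault(row[key], []).append(row)
        result = []
        for row in self._sources[left]():
            for match in index.get(row[key], []):
                # On a column name clash the left-hand row's value wins.
                result.append({**match, **row})
        return result
```

Because the join runs inside the middleware, neither source database needs to know about the other, and the rows are fetched when the query is issued rather than from a previous day's batch export.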
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
The invention and its embodiments have been described above without limitation; what is shown in the drawings is only one embodiment of the invention, and the actual construction is not limited to it. In summary, if a person of ordinary skill in the art, informed by this disclosure, devises without creative effort a structure or embodiment similar to this technical solution without departing from the gist of the invention, it shall fall within the scope of protection of the invention.

Claims (2)

1. An intelligent marketing system based on big data content, characterized by comprising a computing engine, wherein the marketing computation of the computing engine comprises the following steps:
step one: the computing engine includes base middleware;
step two: Flink provides exactly-once consistency semantics, has a mature set of window mechanisms, introduces Event Time and WaterMark, and provides rich state access;
step three: data processing and pre-computation are performed by the Flink engine, the real-time computing capability being the capability of a data channel, and the underlying proprietary data assets of multiple application platforms, or logs and behavior data collected in real time, are integrated;
step four: on top of Flink's Metrics, the actual delay, the number of data parsing failures, and the time consumed by external service calls are customized as metrics according to the business scenario; all metrics are reported to Kafka through a custom Reporter, then structured by real-time ETL and output to ES and Druid, so as to avoid data loss or delay and to carry out real-time alerting and preprocessing;
step five: on the basis of the underlying layers above, real-time tasks are first managed and controlled, which includes tuning the resources of all Flink or Spark tasks on the platform and converting the metadata of data in the real-time data stream;
step six: middleware is built to associate multiple data sources for real-time data mixing and to provide cross-source T+0 queries, completing a report data computing engine that makes up for the insufficient computing capability of reporting tools, and open computing capability is distributed to every stage that involves computation;
the computing engine can perform stateful computation on bounded and unbounded data streams, can be deployed in a variety of cluster environments, can compute quickly on data of any scale, and can perform batch processing, interactive computation, and stream computation at the same time;
the base middleware comprises a routing and web server, an RPC framework, message middleware, a cache service, a configuration center, distributed transactions, task scheduling, and a database layer, wherein the routing and web server processes and forwards communication data from other servers; the RPC framework is the remote service invocation framework of the microservice era; the message middleware is software that supports sending and receiving messages between distributed systems; the cache service is used for distributed in-memory storage; the configuration center is a system that uniformly manages all configuration across projects; the distributed transactions are transactions whose participants, transaction-supporting servers, resource servers, and transaction managers are located on different nodes of different distributed systems, and are used for database scalability; the task scheduling is a system that provides timing, task orchestration, and distributed batch running functions in a distributed environment; and the database layer supports elastic scaling and database and table sharding through TDDL, the database connection pool Druid, and Binlog synchronization through Canal.
2. The intelligent marketing system based on big data content of claim 1, wherein applications on top of the computing engine include metric analysis, real-time features, security risk control, ETL, and real-time recommendation.
CN202110822601.0A 2021-07-21 2021-07-21 Intelligent marketing system based on big data content Active CN113421131B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110822601.0A CN113421131B (en) 2021-07-21 2021-07-21 Intelligent marketing system based on big data content

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110822601.0A CN113421131B (en) 2021-07-21 2021-07-21 Intelligent marketing system based on big data content

Publications (2)

Publication Number Publication Date
CN113421131A 2021-09-21
CN113421131B 2023-11-28

Family

ID=77721424

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110822601.0A Active CN113421131B (en) 2021-07-21 2021-07-21 Intelligent marketing system based on big data content

Country Status (1)

Country Link
CN (1) CN113421131B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102880475A (en) * 2012-10-23 2013-01-16 上海普元信息技术股份有限公司 Real-time event handling system and method based on cloud computing in computer software system
CN106100902A (en) * 2016-08-04 2016-11-09 腾讯科技(深圳)有限公司 High in the clouds index monitoring method and apparatus
CN112000636A (en) * 2020-08-31 2020-11-27 民生科技有限责任公司 User behavior statistical analysis method based on Flink streaming processing
CN112116463A (en) * 2020-05-20 2020-12-22 上海金融期货信息技术有限公司 Spark engine-based intelligent analysis system

Also Published As

Publication number Publication date
CN113421131A (en) 2021-09-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant