CN116307345A - Ceramic industry data system and acquisition method - Google Patents

Ceramic industry data system and acquisition method

Info

Publication number
CN116307345A
Authority
CN
China
Prior art keywords
data
layer
business
collected
collecting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310512446.1A
Other languages
Chinese (zh)
Inventor
梁英林
孔令超
林国友
吕火生
黄世志
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Gongqing City Zhongtaolian Supply Chain Service Co ltd
Lin Zhoujia Home Network Technology Co ltd
Linzhou Lilijia Supply Chain Service Co ltd
Foshan Zhongtaolian Supply Chain Service Co Ltd
Tibet Zhongtaolian Supply Chain Service Co Ltd
Original Assignee
Gongqing City Zhongtaolian Supply Chain Service Co ltd
Lin Zhoujia Home Network Technology Co ltd
Linzhou Lilijia Supply Chain Service Co ltd
Foshan Zhongtaolian Supply Chain Service Co Ltd
Tibet Zhongtaolian Supply Chain Service Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Gongqing City Zhongtaolian Supply Chain Service Co ltd, Lin Zhoujia Home Network Technology Co ltd, Linzhou Lilijia Supply Chain Service Co ltd, Foshan Zhongtaolian Supply Chain Service Co Ltd, Tibet Zhongtaolian Supply Chain Service Co Ltd
Priority to CN202310512446.1A
Publication of CN116307345A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/211 Schema design and management
    • G06F16/212 Schema design and management with details for data modelling support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/04 Manufacturing
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Data Mining & Analysis (AREA)
  • Tourism & Hospitality (AREA)
  • General Engineering & Computer Science (AREA)
  • General Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Quality & Reliability (AREA)
  • Manufacturing & Machinery (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Operations Research (AREA)
  • Educational Administration (AREA)
  • Development Economics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a ceramic industry data system and a data acquisition method. The system comprises: a business platform for collecting the business data of each business department; a data center for receiving the business data of each business department and processing it to generate data information; and a data product for displaying and feeding back the corresponding data information. The invention aims to solve the data-island problem, process data centrally, and establish a unified, shared data foundation platform.

Description

Ceramic industry data system and acquisition method
Technical Field
The invention relates to the technical field of data processing, in particular to a ceramic industry data system and a ceramic industry data acquisition method.
Background
In the development of the ceramic industry, enterprises have been empowered along three core paths: traditional industry, the Internet, and financial capital. Over this course of development, enterprises have accumulated a considerable amount of data. The difficulty at present is that, although the data volume is large, the data is hard to extract, and how to refine it is a problem that urgently needs solving. With the wave of digital transformation, industrial production lines urgently need digital transformation to bring enterprises higher-quality development.
Many technical solutions for data systems exist, but none suits every enterprise scenario. Such systems typically solve one particular difficulty of the enterprise while introducing other problems; moreover, the resource utilization of the overall data system is low, development work is cumbersome, and the data flow cannot be controlled across the full link.
Disclosure of Invention
The invention aims to provide a ceramic industry data system and a data acquisition method, which solve the problem of data island, perform centralized processing on data and establish a unified and shared data base platform.
To achieve the purpose, the invention adopts the following technical scheme: a ceramic industry data system, comprising:
a business platform for collecting the business data of each business department;
a data center for receiving the business data of each business department and processing it to generate data information;
and a data product for displaying and feeding back the corresponding data information.
Preferably, the data center comprises a data warehouse, data processing, and data warehouse modeling;
the data warehouse is used for storing the business data collected from each business department and classifying it into a plurality of data sources;
the data processing is used for computing on the classified data sources in two modes: real-time computation and offline computation;
and the data warehouse modeling is used for storing the computation results of the classified data sources.
Preferably, the data warehouse classifies the data into a plurality of data sources, including: device data, log data, business data, interface data, document data, and other data; the device data and the log data are collected using Fluentd; the business data is collected using Canal; the interface data is collected using Spark; the document data is collected using the Hadoop API; and the other data is collected using Spark.
Preferably, in the real-time computation mode, the device data and the log data are collected into the message queue Kafka using Fluentd, and the business data is collected into the message queue Kafka using Canal; the data in Kafka is then processed by the real-time stream computing framework Flink, and the results are stored in the database Doris.
Preferably, in the offline computation mode, the device data and the log data are collected into the message queue Kafka using Fluentd, the business data is collected into Kafka using Canal, and the data in Kafka is then collected into the system infrastructure Hadoop; the business data is collected into the database Doris using the synchronization tool DataX, the interface data is collected into Doris using the computing engine Spark, and the business data and the interface data serve as the original data layer (ODS) of Doris; the document data is collected into the database Doris using the Hadoop API, the other data is collected into the system infrastructure Hadoop using the computing engine Spark, and the document data and the other data serve as the original data layer (ODS) of Doris.
The overall offline data warehouse architecture is based mainly on Doris; Hadoop stores part of the ODS-layer data, and Hive serves as a backup for historical data.
Preferably, the data warehouse modeling comprises an original data layer, a detail topic layer, a dimension data layer, a wide-table data layer, and a data application layer;
the original data layer is used for keeping the original data without any processing;
the detail topic layer is used for cleaning the data of the original data layer and building event topic models, with topics divided by business;
the dimension data layer is used for cleaning the data of the original data layer based on the actual business to build consistent, enterprise-wide dimension tables for data analysis;
the wide-table data layer is used for joining and analyzing the detail topic layer and the dimension data layer, based on the summarized indicators of upper-layer applications and products, to build summary indicator wide tables of common granularity;
the data application layer is used for computing statistics layer by layer over the detail topic layer, the dimension data layer, and the wide-table data layer according to business requirements, and providing the results to each data application system.
A ceramic industry data acquisition method comprises the following steps:
establishing a data warehouse: building a data store for the business data collected from each business department, and classifying the business data into a plurality of data sources;
data computation: computing on the classified data sources in two modes, real-time computation and offline computation;
and establishing data warehouse modeling: storing the computation results of the classified data sources in the data warehouse model.
Preferably, the step of establishing a data warehouse specifically includes classifying the collected business data into a plurality of data sources, including: device data, log data, business data, interface data, document data, and other data; the device data and the log data are collected using Fluentd; the business data is collected using Canal; the interface data is collected using Spark; the document data is collected using the Hadoop API; and the other data is collected using Spark.
Preferably, in the data computation step, the real-time computation mode specifically includes collecting the device data and the log data into the message queue Kafka using Fluentd, collecting the business data into the message queue Kafka using Canal, processing the data in Kafka with the real-time stream computing framework Flink, and storing the results in the database Doris.
Preferably, in the data computation step, the offline computation mode specifically includes collecting the device data and the log data into the message queue Kafka using Fluentd, collecting the business data into Kafka using Canal, and then collecting the data in Kafka into the system infrastructure Hadoop; collecting the business data into the database Doris using the synchronization tool DataX, collecting the interface data into Doris using the computing engine Spark, and taking the business data and the interface data as the original data layer (ODS) of Doris; collecting the document data into the database Doris using the Hadoop API, collecting the other data into the system infrastructure Hadoop using the computing engine Spark, and taking the document data and the other data as the original data layer (ODS) of Doris.
The beneficial effects of the technical scheme are as follows: the invention builds a data warehouse covering every business department by collecting the business data of each department; the data center processes the collected data (cleaning, analysis, mining, and so on) and then produces valuable data products that display and feed back the corresponding data information.
The invention solves the problem of business data forming islands and of information not flowing between business departments; it speeds up data extraction and allows data to be processed centrally; mining and analyzing data through the data center brings higher-quality development to the industry; and turning data into assets yields corresponding valuable products, bringing real commercial value to enterprises.
The overall architecture addresses the characteristics of the different data flows in the industry and flexibly uses different technologies to collect data; it uses a unified stream-batch architecture, which reduces resource usage and speeds up data processing; and it uses layered data warehouse modeling, which makes the flow of data across the full link clearly visible.
Drawings
FIG. 1 is a schematic diagram of a frame of one embodiment of the present invention;
FIG. 2 is a schematic diagram of a technical architecture of an embodiment of the present invention;
FIG. 3 is a schematic diagram of a data flow architecture according to an embodiment of the present invention.
Detailed Description
The technical scheme of the invention is further described below by the specific embodiments with reference to the accompanying drawings.
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the invention.
In the description of the present invention, unless otherwise indicated, the meaning of "a plurality" is two or more.
Referring to FIGS. 1-3, a ceramic industry data system comprises:
a business platform for collecting the business data of each business department;
a data center for receiving the business data of each business department and processing it to generate data information;
and a data product for displaying and feeding back the corresponding data information.
The invention builds a data warehouse covering every business department by collecting the business data of each department; the data center processes the collected data (cleaning, analysis, mining, and so on) and then produces valuable data products that display and feed back the corresponding data information.
The invention solves the problem of business data forming islands and of information not flowing between business departments; it speeds up data extraction and allows data to be processed centrally; mining and analyzing data through the data center brings higher-quality development to the industry; and turning data into assets yields corresponding valuable products, bringing real commercial value to enterprises.
The overall architecture of the invention addresses the characteristics of the different data flows in the industry and flexibly uses different technologies to collect data; it uses a unified stream-batch architecture, which reduces resource usage and speeds up data processing; and it uses layered data warehouse modeling, which makes the flow of data across the full link clearly visible.
The technical terms used in the present invention are listed in the following table:
Figure SMS_1
Preferably, the data center comprises a data warehouse, data processing, and data warehouse modeling;
the data warehouse is used for storing the business data collected from each business department and classifying it into a plurality of data sources;
the data processing is used for computing on the classified data sources in two modes: real-time computation and offline computation;
and the data warehouse modeling is used for storing the computation results of the classified data sources.
Preferably, the data warehouse classifies the data into a plurality of data sources, including: device data, log data, business data, interface data, document data, and other data; the device data and the log data are collected using Fluentd; the business data is collected using Canal; the interface data is collected using Spark; the document data is collected using the Hadoop API; and the other data is collected using Spark.
At present, the industry's data sources are commonly divided into: device data, generated by factory equipment; log data, including system logs, business database logs, and the like; business data, held in MySQL, SQL Server, Elasticsearch, MongoDB, and the like; interface data, provided by third-party systems; document data; and other data. Data is collected flexibly according to the characteristics of each kind of data and the related business scenarios, and data processing is split into two lines: real-time computation and offline computation. Overall data computation and modeling are based on the database Doris, for three reasons: first, Doris supports unified stream and batch processing, which reduces the number of technologies developers must be familiar with and greatly reduces the workload of real-time development; second, Doris is primarily SQL-based, so later developers can get started quickly and easily; third, the source data is stored in one place, which makes it easier to manage.
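The classification of data sources and the collection tool assigned to each can be sketched as a simple lookup. This is only an illustration of the mapping described in the text; the `Source` enum, the `collector_for` helper, and the fallback for unknown source types are hypothetical and not part of the invention.

```python
from enum import Enum


class Source(Enum):
    """The six data source categories named in the description."""
    DEVICE = "device"
    LOG = "log"
    BUSINESS = "business"
    INTERFACE = "interface"
    DOCUMENT = "document"
    OTHER = "other"


# Collection tool per source category, as listed in the description.
COLLECTOR = {
    Source.DEVICE: "Fluentd",
    Source.LOG: "Fluentd",
    Source.BUSINESS: "Canal",
    Source.INTERFACE: "Spark",
    Source.DOCUMENT: "Hadoop API",
    Source.OTHER: "Spark",
}


def collector_for(source_type: str) -> str:
    """Return the collection tool for a source type name."""
    try:
        source = Source(source_type)
    except ValueError:
        # Assumption of this sketch: unclassified sources count as "other data".
        source = Source.OTHER
    return COLLECTOR[source]


print(collector_for("device"))
print(collector_for("business"))
```

Keeping the mapping in one table means a new source category only needs one new entry, which matches the stated goal of collecting each data flow with the technology that suits it.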
Specifically, in the real-time computation mode, the device data and the log data are collected into the message queue Kafka using Fluentd, and the business data is collected into the message queue Kafka using Canal; the data in Kafka is then processed by the real-time stream computing framework Flink, and the results are stored in the database Doris.
The real-time computation flow: the relevant data sources are collected using the technology suited to each data flow. The data is collected into the message queue Kafka, which serves as a buffer so that the pipeline stays reliable and stable when large volumes of data arrive. The data is then either computed in real time by a Flink program and written to Doris, or first synchronized to Doris using Doris's Routine Load and then computed directly in Doris.
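The shape of the real-time path (collect into a buffer, compute on the stream, store the result) can be sketched with in-memory stand-ins. Nothing here calls Kafka, Flink, or Doris: the `deque` stands in for the message queue, `run_realtime` for the streaming job, and the dictionary for a result table. The event fields (`line`, `metric`, `value`) are hypothetical sample data.

```python
from collections import deque


def run_realtime(buffer: deque, sink: dict) -> None:
    """Drain the buffer and keep a running total per (line, metric) in the sink."""
    while buffer:
        event = buffer.popleft()
        key = (event["line"], event["metric"])
        sink[key] = sink.get(key, 0) + event["value"]


# Stand-in for the message queue that buffers incoming events.
buffer = deque()
buffer.append({"line": "kiln-1", "metric": "tiles_fired", "value": 120})
buffer.append({"line": "kiln-1", "metric": "tiles_fired", "value": 80})
buffer.append({"line": "kiln-2", "metric": "tiles_fired", "value": 50})

# Stand-in for the result table in the analytical database.
result_table = {}
run_realtime(buffer, result_table)
print(result_table)
```

The buffer decouples the producers from the computation: events can pile up during a burst and be drained later without the collectors having to slow down.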
Specifically, in the offline computation mode, the device data and the log data are collected into the message queue Kafka using Fluentd, the business data is collected into Kafka using Canal, and the data in Kafka is then collected into the system infrastructure Hadoop; the business data is collected into the database Doris using the synchronization tool DataX, the interface data is collected into Doris using the computing engine Spark, and the business data and the interface data serve as the original data layer (ODS) of Doris; the document data is collected into the database Doris using the Hadoop API, the other data is collected into the system infrastructure Hadoop using the computing engine Spark, and the document data and the other data serve as the original data layer (ODS) of Doris.
The device data, log data, and business data are collected into the system infrastructure Hadoop and stored there. Because Hadoop can store heterogeneous data, it copes with data fields that keep changing and growing: when fields are added, no table structure has to be modified. The data stored at the Hadoop layer serves as the original data layer (ODS); from this layer the data is synchronized to Doris using the computing engine Spark, and the other layers of the data warehouse are built on Doris.
The business data and the interface data are collected directly into Doris because their fields change infrequently; even when a field is added, the data can still be stored in the database. This keeps data movement to a minimum and makes the data easier to manage.
The document data is collected using the Hadoop API, and other data and possible future data sources are collected using Spark; this data is collected into Hadoop and also serves as part of the ODS layer.
The overall offline data warehouse architecture is based mainly on Doris; Hadoop stores part of the ODS-layer data, and Hive serves as a backup for historical data.
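Why schema-free storage tolerates growing field sets can be shown with a small sketch. It assumes, purely for illustration, that raw records are kept as one JSON object per line; the text does not specify a file format. `io.StringIO` stands in for a file on the storage layer, and the field names are hypothetical.

```python
import io
import json


def append_records(store, records):
    """Append each record as one JSON line; no schema is declared anywhere."""
    for record in records:
        store.write(json.dumps(record, sort_keys=True) + "\n")


def read_field(store, field, default=None):
    """Read one field from every stored record, tolerating records that lack it."""
    store.seek(0)
    return [json.loads(line).get(field, default) for line in store if line.strip()]


raw_store = io.StringIO()  # stand-in for a raw file in the original data layer

# Early records carry only a temperature reading.
append_records(raw_store, [{"device": "kiln-1", "temp_c": 1180}])
# A later firmware version adds a humidity field; nothing has to be altered.
append_records(raw_store, [{"device": "kiln-1", "temp_c": 1195, "humidity": 0.12}])

print(read_field(raw_store, "temp_c"))
print(read_field(raw_store, "humidity"))
```

Old and new records coexist in the same store, and readers decide how to treat a missing field, which is the property the paragraph attributes to keeping fast-changing data on the Hadoop layer.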
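The offline collection paths can be summarized as a routing table. This sketch follows the explanatory paragraphs of this embodiment, in which document data and other data land on Hadoop; the `Route` type and the `landings` helper are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple


class Route(NamedTuple):
    source: str   # data source category
    tool: str     # collection path
    landing: str  # where the ODS copy lands


# Offline routes as described in the explanatory paragraphs of this embodiment.
OFFLINE_ROUTES = [
    Route("device", "Fluentd -> Kafka", "Hadoop"),
    Route("log", "Fluentd -> Kafka", "Hadoop"),
    Route("business", "Canal -> Kafka", "Hadoop"),
    Route("business", "DataX", "Doris"),
    Route("interface", "Spark", "Doris"),
    Route("document", "Hadoop API", "Hadoop"),
    Route("other", "Spark", "Hadoop"),
]


def landings(source: str) -> list[str]:
    """Return the sorted set of ODS landing systems for a source category."""
    return sorted({route.landing for route in OFFLINE_ROUTES if route.source == source})


print(landings("business"))
print(landings("interface"))
```

Business data appears twice because the text describes both a change-stream path through Kafka into Hadoop and a direct synchronization into Doris.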
Preferably, the data warehouse modeling comprises an original data layer, a detail topic layer, a dimension data layer, a wide-table data layer, and a data application layer;
the original data layer is used for keeping the original data without any processing;
the detail topic layer is used for cleaning the data of the original data layer and building event topic models, with topics divided by business;
the dimension data layer is used for cleaning the data of the original data layer based on the actual business to build consistent, enterprise-wide dimension tables for data analysis;
the wide-table data layer is used for joining and analyzing the detail topic layer and the dimension data layer, based on the summarized indicators of upper-layer applications and products, to build summary indicator wide tables of common granularity;
the data application layer is used for computing statistics layer by layer over the detail topic layer, the dimension data layer, and the wide-table data layer according to business requirements, and providing the results to each data application system.
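The five layers can be illustrated with a toy example in plain Python. The order and product rows, the field names, and the choice of "category" as the common granularity are all hypothetical; in the described system each layer would be a set of tables rather than in-memory lists.

```python
from collections import defaultdict

# Original data layer: raw rows kept exactly as collected.
ods_orders = [
    {"order_id": "1", "product_id": "p1", "qty": "10", "amount": "250.0"},
    {"order_id": "2", "product_id": "p2", "qty": "4", "amount": "bad"},  # dirty row
    {"order_id": "3", "product_id": "p1", "qty": "6", "amount": "150.0"},
]
ods_products = [
    {"product_id": "p1", "name": " Glazed tile ", "category": "tile"},
    {"product_id": "p2", "name": "Basin", "category": "sanitary"},
]


def to_float(text):
    """Parse a number, returning None for unparseable input."""
    try:
        return float(text)
    except ValueError:
        return None


# Detail topic layer: cleaned, typed order events.
dwd_orders = [
    {
        "order_id": row["order_id"],
        "product_id": row["product_id"],
        "qty": int(row["qty"]),
        "amount": to_float(row["amount"]),
    }
    for row in ods_orders
    if to_float(row["amount"]) is not None
]

# Dimension data layer: one consistent product dimension.
dim_product = {
    row["product_id"]: {"name": row["name"].strip(), "category": row["category"]}
    for row in ods_products
}

# Wide-table data layer: detail joined with the dimension, summarized per category.
dws_category = defaultdict(lambda: {"qty": 0, "amount": 0.0})
for row in dwd_orders:
    category = dim_product[row["product_id"]]["category"]
    dws_category[category]["qty"] += row["qty"]
    dws_category[category]["amount"] += row["amount"]

# Data application layer: a statistic ready for a report or dashboard.
ads_top_category = max(dws_category, key=lambda c: dws_category[c]["amount"])
print(ads_top_category, dict(dws_category))
```

Each layer reads only from the layers beneath it, so a bad result in the application layer can be traced back step by step to the raw rows that produced it.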
A ceramic industry data acquisition method comprises the following steps:
establishing a data warehouse: building a data store for the business data collected from each business department, and classifying the business data into a plurality of data sources;
data computation: computing on the classified data sources in two modes, real-time computation and offline computation;
and establishing data warehouse modeling: storing the computation results of the classified data sources in the data warehouse model.
Preferably, the step of establishing a data warehouse specifically includes classifying the collected business data into a plurality of data sources, including: device data, log data, business data, interface data, document data, and other data; the device data and the log data are collected using Fluentd; the business data is collected using Canal; the interface data is collected using Spark; the document data is collected using the Hadoop API; and the other data is collected using Spark.
Specifically, in the data computation step, the real-time computation mode includes collecting the device data and the log data into the message queue Kafka using Fluentd, collecting the business data into the message queue Kafka using Canal, processing the data in Kafka with the real-time stream computing framework Flink, and storing the results in the database Doris.
Preferably, in the data computation step, the offline computation mode specifically includes collecting the device data and the log data into the message queue Kafka using Fluentd, collecting the business data into Kafka using Canal, and then collecting the data in Kafka into the system infrastructure Hadoop; collecting the business data into the database Doris using the synchronization tool DataX, collecting the interface data into Doris using the computing engine Spark, and taking the business data and the interface data as the original data layer (ODS) of Doris; collecting the document data into the database Doris using the Hadoop API, collecting the other data into the system infrastructure Hadoop using the computing engine Spark, and taking the document data and the other data as the original data layer (ODS) of Doris.
In the description herein, reference to the term "embodiment," "example," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The technical principle of the present invention is described above in connection with the specific embodiments. The description is made for the purpose of illustrating the general principles of the invention and should not be taken in any way as limiting the scope of the invention. Other embodiments of the invention will be apparent to those skilled in the art from consideration of this specification without undue burden.

Claims (10)

1. A ceramic industry data system, comprising:
a business platform for collecting the business data of each business department;
a data center for receiving the business data of each business department and processing it to generate data information;
and a data product for displaying and feeding back the corresponding data information.
2. The ceramic industry data system of claim 1, wherein the data center comprises a data warehouse, data processing, and data warehouse modeling;
the data warehouse is used for storing the business data collected from each business department and classifying it into a plurality of data sources;
the data processing is used for computing on the classified data sources in two modes: real-time computation and offline computation;
and the data warehouse modeling is used for storing the computation results of the classified data sources.
3. The ceramic industry data system of claim 2, wherein the data warehouse classifies the data into a plurality of data sources, comprising: device data, log data, business data, interface data, document data, and other data; the device data and the log data are collected using Fluentd; the business data is collected using Canal; the interface data is collected using Spark; the document data is collected using the Hadoop API; and the other data is collected using Spark.
4. The ceramic industry data system of claim 3, wherein, in the real-time computation mode, the device data and the log data are collected into the message queue Kafka using Fluentd, and the business data is collected into the message queue Kafka using Canal; the data in Kafka is then processed by the real-time stream computing framework Flink, and the results are stored in the database Doris.
5. The ceramic industry data system of claim 3, wherein, in the offline computation mode, the device data and the log data are collected into the message queue Kafka using Fluentd, the business data is collected into Kafka using Canal, and the data in Kafka is then collected into the system infrastructure Hadoop; the business data is collected into the database Doris using the synchronization tool DataX, the interface data is collected into Doris using the computing engine Spark, and the business data and the interface data serve as the original data layer (ODS) of Doris; the document data is collected into the database Doris using the Hadoop API, the other data is collected into the system infrastructure Hadoop using the computing engine Spark, and the document data and the other data serve as the original data layer (ODS) of Doris.
6. The ceramic industry data system of claim 2, wherein the data warehouse modeling comprises an original data layer, a detail topic layer, a dimension data layer, a wide table data layer, and a data application layer;
the original data layer is used for retaining the original data without any processing;
the detail topic layer is used for cleaning the data of the original data layer and establishing event topic models, with topics divided by business;
the dimension data layer is used for cleaning the data of the original data layer based on the actual business, so as to construct consistent data analysis dimension tables for the whole enterprise;
the wide table data layer is used for performing association analysis on the detail topic layer and the dimension data layer, based on summaries of upper-layer application and product metrics, and constructing summary metric wide tables of common granularity;
the data application layer is used for performing layer-by-layer statistical processing on the detail topic layer, the dimension data layer, and the wide table data layer according to business requirements, and providing the results to each data application system.
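The layer relationships in claim 6 can be read as a dependency graph. The following is a minimal sketch of that reading, not the applicants' implementation: the layer keys (`raw`, `detail_topic`, and so on) and the function `build_order` are illustrative names that do not appear in the patent. It only derives one valid order in which the layers could be populated.

```python
from graphlib import TopologicalSorter

# Each layer maps to the layers it reads from, following claim 6.
# The keys are illustrative names, not identifiers from the patent.
LAYER_INPUTS = {
    "raw": set(),                                  # original data, kept unprocessed
    "detail_topic": {"raw"},                       # cleaned data, organized by business topic
    "dimension": {"raw"},                          # enterprise-wide consistent dimension tables
    "wide_table": {"detail_topic", "dimension"},   # summary metrics at a common granularity
    "application": {"detail_topic", "dimension", "wide_table"},
}


def build_order(layer_inputs):
    """Return one valid order in which the layers can be populated."""
    return list(TopologicalSorter(layer_inputs).static_order())


if __name__ == "__main__":
    print(build_order(LAYER_INPUTS))
```

The original data layer always comes first and the data application layer last; the detail topic layer and the dimension data layer depend only on the original data, so either may be built before the other.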
7. A ceramic industry data acquisition method, comprising the following steps:
establishing a data warehouse: establishing a data storage unit for storing the business data collected by each business department, and classifying the business data into a plurality of data sources;
data operation processing: operating on the classified data sources in two modes, real-time operation and offline computing;
and establishing data warehouse modeling: storing and classifying the operation processing results of the data sources in the data warehouse modeling.
8. The ceramic industry data acquisition method of claim 7, wherein, in the step of establishing a data warehouse, the collected business data is classified into a plurality of data sources, comprising: device data, log data, business data, interface data, document data, and other data; the device data and the log data are collected as data streams by Fluentd; the business data is collected as a data stream by Canal; the interface data is collected as a data stream by Spark; the document data is collected as a data stream through the Hadoop API; and the other data is collected as a data stream by Spark.
9. The ceramic industry data acquisition method of claim 8, wherein, in the step of data operation processing, the real-time operation mode is specifically that the device data and the log data are collected into the message queue Kafka by Fluentd, the business data is collected into the message queue Kafka by Canal, the data in the message queue Kafka is then processed by the real-time stream computing framework Flink, and the operation result is stored in the database Doris.
10. The ceramic industry data acquisition method of claim 8, wherein, in the step of data operation processing, the offline computing mode is specifically that the device data and the log data are collected into the message queue Kafka by Fluentd, the business data is collected into the message queue Kafka by Canal, and the data in the message queue Kafka is collected into the system infrastructure Hadoop; the business data is collected into the database Doris by the synchronization tool DataX, the interface data is collected into the database Doris by the computing engine Spark, and the business data and the interface data serve as the original data layer ODS of the database Doris; the document data is collected into the database Doris through the Hadoop API, the other data is collected into the system infrastructure Hadoop by the computing engine Spark, and the document data and the other data serve as the original data layer ODS of the database Doris.
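The collection paths recited in claims 8 to 10 can be summarized as a routing table. The sketch below is only an illustration under stated assumptions: the names `COLLECTORS`, `QUEUED_SOURCES`, `realtime_route`, and `offline_routes` are hypothetical, and the code does not call Fluentd, Canal, Kafka, Flink, Spark, DataX, Hadoop, or Doris. It merely lists the hops the claims describe for each data source.

```python
# Collection tool per data source, as listed in claim 8.
COLLECTORS = {
    "device": "Fluentd",
    "log": "Fluentd",
    "business": "Canal",
    "interface": "Spark",
    "document": "Hadoop API",
    "other": "Spark",
}

# Sources that the claims send through the Kafka message queue.
QUEUED_SOURCES = {"device", "log", "business"}


def realtime_route(source):
    """Hops of the real-time path in claim 9: collector, Kafka, Flink, Doris."""
    if source not in QUEUED_SOURCES:
        raise ValueError(f"claim 9 describes no real-time path for {source!r}")
    return [COLLECTORS[source], "Kafka", "Flink", "Doris"]


def offline_routes(source):
    """Every offline path that claim 10 describes for the given source."""
    if source not in COLLECTORS:
        raise ValueError(f"unknown data source {source!r}")
    routes = []
    if source in QUEUED_SOURCES:
        # Queued data is landed from Kafka into Hadoop.
        routes.append([COLLECTORS[source], "Kafka", "Hadoop"])
    if source == "business":
        routes.append(["DataX", "Doris"])       # synchronized into the ODS layer
    elif source == "interface":
        routes.append(["Spark", "Doris"])       # loaded into the ODS layer
    elif source == "document":
        routes.append(["Hadoop API", "Doris"])  # loaded into the ODS layer
    elif source == "other":
        routes.append(["Spark", "Hadoop"])
    return routes


if __name__ == "__main__":
    for name in COLLECTORS:
        print(name, offline_routes(name))
```

Note that business data has two offline paths in this reading: one through Canal and Kafka into Hadoop, and one through DataX directly into Doris.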
CN202310512446.1A 2023-05-09 2023-05-09 Ceramic industry data system and acquisition method Pending CN116307345A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310512446.1A CN116307345A (en) 2023-05-09 2023-05-09 Ceramic industry data system and acquisition method

Publications (1)

Publication Number Publication Date
CN116307345A true CN116307345A (en) 2023-06-23

Family

ID=86788965

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310512446.1A Pending CN116307345A (en) 2023-05-09 2023-05-09 Ceramic industry data system and acquisition method

Country Status (1)

Country Link
CN (1) CN116307345A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110502559A (en) * 2019-07-25 2019-11-26 浙江公共安全技术研究院有限公司 Data bus and transmission method for trusted and secure cross-domain data exchange
WO2020019038A1 (en) * 2018-07-25 2020-01-30 Make IT Work Pty Ltd Data warehousing system and process
CN113886465A (en) * 2021-10-11 2022-01-04 重庆长安民生物流股份有限公司 Big data analysis platform for automobile logistics
CN115033646A (en) * 2022-08-11 2022-09-09 深圳联友科技有限公司 Method for constructing real-time warehouse system based on Flink and Doris
CN115982290A (en) * 2022-12-29 2023-04-18 西安交通大学 Big data analysis system for Ethereum user behavior and data warehouse establishing method

Similar Documents

Publication Publication Date Title
CN106339509A (en) Power grid operation data sharing system based on large data technology
CN111435344A (en) Big data-based drilling acceleration influence factor analysis model
Leng et al. Framework and key enabling technologies for social manufacturing
CN115033646A (en) Method for constructing real-time warehouse system based on Flink and Doris
CN115599524A (en) Data lake system based on cooperative scheduling processing of streaming data and batch data
CN117421376A (en) Data warehouse processing method and device for online analysis of business data streams
CN116307345A (en) Ceramic industry data system and acquisition method
CN115237989A (en) Mine data acquisition system
CN116523328A (en) Intelligent decision-making method for cooperation of aviation equipment and manufacturing industry chain
CN111190704A (en) Task classification processing method based on big data processing framework
CN111209314A (en) System for processing massive log data of power information system in real time
CN112506960B (en) Multi-model data storage method and system based on ArangoDB engine
CN115391429A (en) Time sequence data processing method and device based on big data cloud computing
CN114596046A (en) Integrated platform based on unified digital model of business center station and data center station
CN113421131B (en) Intelligent marketing system based on big data content
CN113973121A (en) Internet of things data processing method and device, electronic equipment and storage medium
Su et al. Research on Enterprise Digital Operation Management Method Based on Digital Middle Platform
Cheng et al. Analysis on the Status of Big Data Processing Framework
Wen et al. Design of user behavior analysis model of e-commerce website based on Spark
Wang et al. Distributed Multi-source Service Data Stream Processing Technology and Application in Power Grid Dispatching System
Zhang et al. Research and application of streaming Data transmission and processing architecture based on Pulsar
Minglun et al. Data oriented analysis of workflow optimization
CN107291954B (en) OCL parallel query method based on MapReduce
Lixiao et al. Research and Design of Multidimensional Data Analysis and Decision Platform for Smart Factory
CN117785980A (en) Online management and analysis system and method based on block chain public chain data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination