WO2024055281A1 - Abnormal root cause analysis method and device (异常根因分析方法及装置) - Google Patents

Abnormal root cause analysis method and device

Info

Publication number
WO2024055281A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
product
root cause
production
cause analysis
Application number
PCT/CN2022/119262
Other languages
English (en)
French (fr)
Inventor
王瑜
沈鸿翔
贺王强
沈国梁
兰天
袁菲
汤玥
王海金
何德材
吴建民
王洪
Original Assignee
京东方科技集团股份有限公司 (BOE Technology Group Co., Ltd.)
北京中祥英科技有限公司 (Beijing Zhongxiangying Technology Co., Ltd.)
Application filed by 京东方科技集团股份有限公司 (BOE Technology Group Co., Ltd.) and 北京中祥英科技有限公司 (Beijing Zhongxiangying Technology Co., Ltd.)
Priority to PCT/CN2022/119262 (WO2024055281A1)
Priority to CN202280003212.8A (CN118056189A)
Publication of WO2024055281A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/34Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment

Definitions

  • the present application relates to the field of computer technology, and in particular to abnormal root cause analysis methods and devices.
  • this application provides an abnormal root cause analysis method and device.
  • an anomaly root cause analysis method, the method including:
  • the product data to be processed is obtained by fusing the production data and detection data corresponding to the target product according to the first preset parameter;
  • based on the detection data, the normal product data and abnormal product data in the product data to be processed are determined
  • an abnormal root cause analysis method including:
  • the positive sample and the negative sample are input into the second root cause analysis model to obtain the second influencing factor information of the judgment result of the target object.
  • an anomaly root cause analysis system including a data management server, an analysis server and a display;
  • the data management server is configured to store data, and extract, convert or load data; the data includes at least one of production data and detection data;
  • the analysis server is configured to obtain the product data to be processed corresponding to the target product from the data management server when receiving a task request, and to determine the normal product data and abnormal product data in the product data to be processed based on the detection data in the product data to be processed; and to input the normal product data and the abnormal product data into the first root cause analysis model to obtain the first impact factor information of the detection result of the target product, where the first impact factor includes one or more of the production data, and the first root cause analysis model indicates a tree model; the product data to be processed is obtained by fusing the production data and detection data corresponding to the target product according to the first preset parameter;
  • the display is configured to display the first impact factor information through a visual interface.
  • an abnormality root cause analysis device including:
  • the first data acquisition module is used to obtain the product data to be processed corresponding to the target product; wherein the product data to be processed is obtained by fusing the production data and detection data corresponding to the target product according to the first preset parameter;
  • a first data processing module configured to determine normal product data and abnormal product data in the product data to be processed based on the detection data
  • the first root cause determination module is used to input the normal product data and the abnormal product data into the first root cause analysis model to obtain the first influencing factor information of the detection result of the target product, where the first influencing factor includes one or more of the production data and the first root cause analysis model indicates a tree model.
  • an abnormality root cause analysis device including:
  • the second data acquisition module is used to obtain the sample data to be processed corresponding to the target object
  • a second data processing module configured to determine positive samples and negative samples in the sample data to be processed; wherein both the positive samples and the negative samples include first parameters;
  • the second root cause determination module is used to input the positive sample and the negative sample into the second root cause analysis model to obtain the second influencing factor information of the judgment result of the target object.
  • an electronic device including:
  • a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein when the processor executes the program, the abnormal root cause analysis method described in the first aspect and various possible designs of the first aspect is implemented.
  • an electronic device including:
  • a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein when the processor executes the program, the abnormal root cause analysis method described in the second aspect and various possible designs of the second aspect is realized.
  • a computer-readable storage medium is provided.
  • Computer-executable instructions are stored in the computer-readable storage medium.
  • when the processor executes the computer-executable instructions, the abnormal root cause analysis method described in the above first aspect and various possible designs of the first aspect is realized.
  • a computer-readable storage medium is provided.
  • Computer-executable instructions are stored in the computer-readable storage medium.
  • when the processor executes the computer-executable instructions, the abnormal root cause analysis method described in the above second aspect and various possible designs of the second aspect is implemented.
  • a computer program product including a computer program.
  • when the computer program is executed by a processor, the abnormal root cause analysis method as described in the first aspect and various possible designs of the first aspect is implemented.
  • a computer program product including a computer program.
  • when the computer program is executed by a processor, the abnormal root cause analysis method as described in the second aspect and various possible designs of the second aspect is realized.
  • the product data to be processed corresponding to the target product is classified based on the detection data corresponding to each target product, and normal product data and abnormal product data are obtained.
  • the normal product data indicates the product data to be processed of the target product with normal detection results.
  • the abnormal product data indicates the product data to be processed of the target product with abnormal detection results.
  • Both the normal product data and the abnormal product data include production parameters.
  • FIG. 1 is a schematic diagram of a distributed computing environment according to an exemplary embodiment of the present application.
  • Figure 2 is a schematic diagram of a software module in an anomaly root cause analysis system according to an exemplary embodiment of the present application.
  • Figure 3 is a schematic diagram of a data management server shown in this application according to an exemplary embodiment.
  • Figure 4 is a flow chart of an abnormal root cause analysis method illustrated in this application according to an exemplary embodiment.
  • FIG. 5 is a flow chart of another anomaly root cause analysis method according to an exemplary embodiment of the present application.
  • Figure 6 is a flow chart of yet another abnormality root cause analysis method according to an exemplary embodiment of the present application.
  • FIG. 7 is a hardware structure diagram of an electronic device in which an abnormality root cause analysis device is located according to an exemplary embodiment of the present application.
  • Figure 8 is a block diagram of an abnormality root cause analysis device according to an exemplary embodiment of this application.
  • Figure 9 is a block diagram of another anomaly root cause analysis device shown in this application according to an exemplary embodiment.
  • first, second, third, etc. may be used in this application to describe various information, the information should not be limited to these terms. These terms are only used to distinguish information of the same type from each other.
  • first information may also be called second information, and similarly, the second information may also be called first information.
  • the word "if" as used herein may be interpreted as "when" or "upon" or "in response to determining."
  • defects may occur during the manufacturing process. Examples of defects include particles, residue, line defects, holes, spatter, wrinkles, discoloration, and bubbles. Defects that occur in the manufacturing of semiconductor electronic devices are difficult to track.
  • the present application provides an anomaly root cause analysis system.
  • the anomaly root cause analysis system includes a distributed computing system that includes one or more networked computers configured to execute in parallel to perform at least one common task; one or more computer-readable storage media that store Instructions that cause the distributed computing system to perform the following operations.
  • the distributed computing system includes: a data management server configured to store data and extract, transform or load data, wherein the data includes at least one of production data and detection data; an analysis server, It is configured to obtain data from the data management server when receiving a task request, and perform algorithm analysis on the data to obtain the abnormal root cause (ie, impact factor information, that is, first impact factor information and/or second impact factor information ); and a display configured to provide a visual interface to display the anomaly root cause analysis results.
  • an anomaly root cause analysis system is used for defect analysis in display panel manufacturing.
  • distributed computing system generally refers to an interconnected computer network having multiple network nodes that connect multiple servers or hosts to each other or to an external network (e.g., the Internet) .
  • network node generally refers to a physical network device.
  • Example network nodes include routers, switches, hubs, bridges, load balancers, security gateways, or firewalls.
  • Host generally refers to a physical computing device configured to implement, for example, one or more virtual machines or other suitable virtualization components.
  • a host may include a server with a hypervisor configured to support one or more virtual machines or other suitable types of virtual components.
  • FIG. 1 illustrates a distributed computing environment in accordance with some embodiments of the present application.
  • multiple autonomous computers/workstations called nodes communicate with each other in a network such as a LAN (Local Area Network) to solve tasks, such as executing applications.
  • Each computer node typically includes its own processor(s), memory, and communications links to other nodes.
  • the computers may be located within a specific location (e.g., a cluster network) or may be connected through a wide area network (WAN) such as the Internet.
  • different applications can share information and resources.
  • Networks in distributed computing environments may include local area networks (LAN) and wide area networks (WAN).
  • Networks may include wired technologies (e.g., Ethernet) and wireless technologies (e.g., Code Division Multiple Access (CDMA), Global System for Mobile Communications (GSM), Universal Mobile Telecommunications System (UMTS), Bluetooth, etc.).
  • Computing nodes in a distributed network may include any computing device, such as a server or a user device. Compute nodes may also include data centers. As used herein, a computing node may refer to any computing device or to multiple computing devices (i.e., a data center). Software modules may execute on a single computing node (e.g., a server) or be distributed over multiple nodes in any suitable manner.
  • the distributed computing environment may also include one or more storage nodes for storing information related to the execution of the software modules and/or output and/or other functions generated by the execution of the software modules.
  • One or more storage nodes communicate with each other in the network and with one or more compute nodes in the network.
  • the anomaly root cause analysis system includes a distributed computing system that includes one or more networked computers configured to execute in parallel to perform at least one common task, and one or more computer-readable storage media storing instructions that, when executed by the distributed computing system, cause the distributed computing system to perform corresponding operating steps.
  • the distributed computing system includes a data management server configured to store data and to extract, transform, or load data; an analysis server connected to the data management server and configured to obtain data from the data management server and perform analysis tasks upon receiving a task request; and a display configured to display analysis task results through a visual interface.
  • the analysis server includes multiple business servers (similar to back-end servers) and multiple algorithm servers, and the multiple algorithm servers are configured to obtain data directly from the data management server.
  • the distributed computer system further includes a query engine connected to the data management server and configured to obtain data directly from the data management server.
  • the query engine is a query engine based on Impala technology.
  • the term "connected to" means having a direct flow of information or data from a first component of the system to a second component and/or from a second component of the system to the first component Relationship.
  • the above-mentioned analysis server can, when receiving a task request, obtain the to-be-processed product data corresponding to the target product from the above-mentioned data management server, determine the normal product data and abnormal product data in the to-be-processed product data based on the detection data therein, and input them into the first root cause analysis model to obtain the first impact factor information of the detection result of the target product;
  • the first root cause analysis model indicates a tree model;
  • the product data to be processed is obtained by fusing the production data and detection data corresponding to the target product according to the first preset parameter; accordingly, the above-mentioned display can display the first impact factor information through the visual interface.
  • the data management server includes an ETL module configured to extract, transform, or load data from at least one data source into a database of the data management server.
  • at least one algorithm server is configured to obtain the data to be analyzed directly from the data management server.
  • at least one algorithm server is configured to perform calculation analysis on the data to be analyzed and send the result data to the data management server.
  • At least one algorithm server deploys various general algorithms for anomaly root cause analysis, such as algorithms based on big data analysis. These can be algorithms based on specific machine learning models, for example one or more of decision trees, random forests, GBDT, LGBM, XGBoost, CatBoost, Naive Bayes, support vector machines, AdaBoost, neural network models, etc., or other statistical algorithm models, such as WOE & IV, Apriori, etc.; they also include the anomaly root cause analysis algorithms mentioned below, which are not limited here. A WOE & IV sketch is given below.
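  • As an illustrative sketch of one statistical model named above, the following Python snippet computes Weight of Evidence (WOE) and Information Value (IV) for a single categorical production parameter against a binary abnormality label; the column names and sample data are hypothetical, not taken from this application.

```python
import numpy as np
import pandas as pd

def woe_iv(df: pd.DataFrame, feature: str, label: str) -> pd.DataFrame:
    """Compute WOE and IV of one categorical feature against a 0/1 label.

    WOE(bin) = ln( P(bin | good) / P(bin | bad) )
    IV       = sum over bins of (P(bin|good) - P(bin|bad)) * WOE(bin)
    """
    grouped = df.groupby(feature)[label].agg(total="count", bad="sum")
    grouped["good"] = grouped["total"] - grouped["bad"]
    # Small additive smoothing avoids division by zero for empty bins.
    dist_good = (grouped["good"] + 0.5) / (grouped["good"].sum() + 0.5)
    dist_bad = (grouped["bad"] + 0.5) / (grouped["bad"].sum() + 0.5)
    grouped["woe"] = np.log(dist_good / dist_bad)
    grouped["iv"] = (dist_good - dist_bad) * grouped["woe"]
    return grouped

# Hypothetical data: which equipment each product passed, and whether it was abnormal.
df = pd.DataFrame({"equipment": ["E1", "E1", "E2", "E2", "E2", "E3"],
                   "abnormal":  [0,    0,    1,    1,    0,    0]})
result = woe_iv(df, "equipment", "abnormal")
print(result)
print("IV total:", result["iv"].sum())
```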
  • At least one algorithm server is configured to analyze the data to identify the cause of the anomaly.
  • the algorithm server is further configured to reason or predict whether an abnormality occurs based on the production data.
  • the term "ETL module” refers to computer program logic configured to provide functionality such as extracting, transforming, or loading data.
  • the ETL module is stored on the storage node, loaded into memory, and executed by the processor.
  • the ETL module is stored on one or more storage nodes in the distributed network, loaded into one or more memories in the distributed network, and executed by one or more processors in the distributed network.
  • the data management server stores data used in the anomaly root cause analysis system.
  • the data management server stores data required for algorithm analysis by the algorithm server.
  • a data management server stores the results of algorithmic analysis.
  • data from multiple data sources are cleaned and merged into data to be analyzed by the ETL module.
  • Examples of data used for abnormal root cause analysis include product history data, process parameter data, detected abnormality location data, etc.
  • the amount of data in the manufacturing process is huge; for example, hundreds of gigabytes of data may be generated at a factory site every day. In order to meet users' needs for defect analysis, it is necessary to increase the speed at which the algorithm server reads production data.
  • the data required for algorithm analysis is stored in a database based on Apache Hbase technology to improve efficiency and save storage space.
  • the results of algorithm analysis and other auxiliary data are stored in a data warehouse based on Apache Hive technology.
  • the data can be stored in a database using Apache Beam technology (or the Apache Beam model). It can be understood that the data management server may include one or more of an Hbase-based database, a Hive-based data warehouse, and an Apache Beam-based database.
  • Apache Hive is an open source data warehouse system built on top of Hadoop, which is used to query and analyze big data in structured and semi-structured forms stored in Hadoop files. Apache Hive is mainly used for batch processing and is therefore categorized as OLAP.
  • Apache Hbase is a non-relational, column-oriented distributed database that runs on top of the Hadoop Distributed File System (HDFS). It is a NoSQL open source database that stores data in columns. Apache Hbase is mainly used for transaction processing and is categorized as OLTP; with Apache Hbase, real-time processing is possible.
  • Apache Beam is an open source unified model for defining batch and streaming data-parallel processing pipelines. Using the open source Beam SDKs, one can build a program that defines such a pipeline.
  • various components of the data management platform may take the form of distributed data storage based on Apache Hadoop or Apache Hive, or of distributed data storage based on Apache Beam.
  • Figure 3 illustrates a data management server in some embodiments according to the present application.
  • the data management server includes a distributed file system (DFS), such as the Hadoop Distributed File System (HDFS).
  • the data management server is configured to store data collected from at least one data source.
  • the data source can be a database in the factory production system or other data sources, which are not limited here.
  • the data produced during the factory production process is stored in relational databases (such as Oracle, MySQL, etc.), but grid computing applications based on relational database management systems (RDBMS) have limited hardware scalability.
  • Data management servers include data lakes, data warehouses, and NoSQL databases.
  • the data management platform includes multiple sets of data with different contents and/or storage structures.
  • each set of data is defined as a "data layer”
  • data lakes, data warehouses, and NoSQL databases are different data layers in the data management server.
  • the data lake is configured to store a first set of data, the first set of data is formed by extracting original data from at least one data source through the ETL module, and the first set of data has the same content as the original data.
  • the ETL module first extracts raw data from at least one data source into a data management server to form a first data layer (eg, a data lake).
  • a data lake is a centralized HDFS or KUDU database configured to store any structured or unstructured data.
  • the data lake DL is configured to store a first set of data extracted by the ETL module from at least one data source.
  • the first set of data has the same content as the original data.
  • the dimensions and attributes of the original data are saved in the first set of data.
  • the first set of data stored in the data lake includes dynamically updated data.
  • the dynamically updated data includes real-time updated data from a Kudu-based database, or periodically updated data in the Hadoop distributed file system.
  • the periodically updated data stored in the Hadoop distributed file system is the periodically updated data stored in Apache Hive-based storage.
  • the dynamically updated data includes real-time updated data and periodically updated data.
  • real-time updates mean updates at sub-minute intervals (excluding one minute); periodic updates mean updates at intervals of one minute or longer. It can be understood that moving data from the data source to the first data layer backs up the data content between the two data management systems.
  • the data warehouse is configured to store the second set of data, which is formed by cleaning and standardizing the first set of data through the ETL module.
  • the data management server includes a second data tier, such as a data warehouse DW.
  • a data warehouse DW includes an internal storage system that is configured to provide data in an abstract manner, such as in table format (Table) or view format (View), without exposing the file system.
  • Data warehouse DW can be based on Apache Hive.
  • the ETL module ETLP is configured to extract, clean, transform or load the first set of data to form the second set of data.
  • the second set of data is formed by subjecting the first set of data to cleaning and standardization.
  • the data in the data warehouse layer can be understood as the data obtained after preprocessing the data in the data lake layer.
  • Preprocessing includes cleaning of data, such as removing empty fields, removing duplicates, removing useless fields, etc. Specifically, the server identifies missing values ("NA", "/", "null", "unknown") and converts them into a unified missing-value form. Preprocessing also includes standardization of data; for example, the server detects different time field formats and performs unified standard format conversion, as sketched below.
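  • A minimal illustration of the cleaning and standardization described above, assuming pandas and hypothetical column names and formats:

```python
import numpy as np
import pandas as pd

raw = pd.DataFrame({
    "product_id": ["P001", "P002", "P003", "P003"],            # last row duplicated
    "temperature": ["25.1", "NA", "24.8", "24.8"],             # mixed missing markers
    "start_time": ["2022/09/01 08:00", "2022-09-01 09:30",
                   "01-09-2022 10:15", "01-09-2022 10:15"],    # mixed time formats
})

# Cleaning: unify missing-value markers into NaN, then drop duplicate records.
MISSING_MARKERS = ["NA", "/", "null", "unknown", ""]
clean = raw.replace(MISSING_MARKERS, np.nan).drop_duplicates().copy()

# Standardization: parse heterogeneous time strings into one standard format
# (format="mixed" requires pandas >= 2.0) and convert numeric strings.
clean["start_time"] = pd.to_datetime(clean["start_time"], format="mixed", dayfirst=True)
clean["temperature"] = pd.to_numeric(clean["temperature"])
print(clean)
```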
  • preprocessing also includes data aggregation and fusion. That is, the second set of data also includes data aggregation and fusion processing of the first set of data.
  • Data summary refers to the statistics of the same fields or records in the data table, such as quantity summary, percentage calculation, etc.
  • for example, the defective rate of a substrate (glass) can be calculated as the number of defective panels contained in the substrate divided by the total number of panels.
  • Fusion refers to the fusion of data tables. For abnormal root cause analysis, abnormal content data and root cause result data are often generated in two separate data tables. Through data table fusion, the abnormal content data table and the root cause result data table can be combined into one table according to the same index field in the two tables.
  • the production data table and the inspection data table can be fused according to the same ID to form a complete data table for subsequent analysis, as sketched below.
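  • A minimal pandas sketch of this fusion and of the substrate defective-rate summary mentioned above; the table layout and column names are hypothetical:

```python
import pandas as pd

# Hypothetical production and inspection tables sharing a panel ID.
production = pd.DataFrame({
    "panel_id": ["A1", "A2", "A3", "A4"],
    "substrate_id": ["S1", "S1", "S2", "S2"],
    "equipment": ["E1", "E2", "E1", "E2"],
})
inspection = pd.DataFrame({
    "panel_id": ["A1", "A2", "A3", "A4"],
    "defect_count": [0, 3, 1, 0],
})

# Fusion: merge the two tables on the shared index field into one table.
fused = production.merge(inspection, on="panel_id", how="inner")

# Summary: per-substrate defective rate = defective panels / total panels.
fused["defective"] = fused["defect_count"] > 0
rate = fused.groupby("substrate_id")["defective"].mean()
print(fused)
print(rate)
```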
  • different data can be integrated and split based on different analysis topics to improve the efficiency of subsequent data processing.
  • the transfer of data from the first data layer to the second data layer is to further process the backed up data to facilitate data management, display and other processing operations in the data management server.
  • the preprocessing in this application can be performed in the data management server, or the data preprocessing operations (such as cleaning, fusion, etc.) can be completed during data analysis and calculation (executed by the analysis server); this application places no limit on the execution timing of the preprocessing.
  • the NoSQL database is configured to store a third set of data formed by transforming the second set of data by the ETL module.
  • the third set of data is key-value data.
  • the data management platform includes a third data layer (eg, NoSQL database).
  • the third data layer is a NoSQL-type database that stores data that can be used for computing processing, such as HBase or ClickHouse.
  • the ETL module is configured to transform the second set of data of the second data layer to form the third set of data. It can be understood that changing the data storage structure from the second data layer to the third data layer forms a NoSQL-type database structure, such as a columnar database structure like HBase.
  • the data obtained by the analysis server is data in a NoSQL database.
  • a first table is generated in a third data layer and a second table (eg, an external table) is generated in a second data layer.
  • the first table and the second table are configured to be synchronized such that when data is written to the second table, the first table will be simultaneously updated to include the corresponding data.
  • the distributed computing processing module may be used to read data written into the second data layer.
  • the MapReduce module in Hadoop can be used as a distributed computing processing module for reading data written to the second data layer.
  • the data written into the second data layer can then be written into the third data layer.
  • after the MapReduce module reads the data written to the second data layer, it can generate an HFile and bulk-load it into the third data layer.
  • first set of data, the second set of data, and the third set of data can be stored and queried based on one or more data tables.
  • the data table of the third group of data can be the same table as the second group of data, or the data table of the second group of data can be split into multiple sub-tables. Multiple sub-tables can be multiple sub-data tables with index relationships.
  • the data table of the third group of data includes multiple sub-data tables with index relationships formed by splitting the data table of the second group of data. Sub-data tables can be split based on the filtering criteria of the user interaction interface and the key and/or value information of the third group of data.
  • the first index in the plurality of index relationships corresponds to the filtering criteria of the front-end interface, for example, corresponds to the user-defined analysis scope or criteria in the user interaction interface communicating with the data management server, thereby facilitating faster data query and calculation process.
  • the plurality of sub-data tables include a first sub-table, a second sub-table and a third sub-table; the first sub-table includes data filtering options presented by the visual interface; the second sub-table includes product serial numbers; the third sub-table includes data corresponding to the product serial numbers.
  • the plurality of sub-data tables further includes a fourth sub-table that includes manufacturing site information and/or equipment information, and the third sub-table includes codes or abbreviations for the manufacturing site and/or equipment.
  • the plurality of sub-tables has an index relationship between at least two sub-tables of the plurality of sub-tables.
  • split the data in the multiple sub-tables based on filtering criteria, key and/or value information of the third set of data.
  • the plurality of sub-tables include a first sub-table (e.g., an attribute sub-table), which includes data filtering options (such as production time, production equipment, production process, etc.); a second sub-table, which includes the product serial number (for example, the substrate identification number or the batch identification number); and a third sub-table (e.g., the main sub-table), which includes the values in the third set of data corresponding to the product serial number.
  • the environmental factors described in this article include: environmental particle conditions, equipment temperature and equipment pressure, etc.
  • the second sub-table may include different designated keys, such as substrate identification numbers or lot identification numbers (eg, multiple second sub-tables).
  • the values in the third set of data correspond to the substrate identification number through an index relationship between the third sub-table and the second sub-table.
  • the plurality of sub-tables further includes a fifth sub-table (for example, a metadata sub-table), which includes values corresponding to the batch identification number in the third set of data.
  • the second sub-table also includes a batch identification number; the value corresponding to the batch identification number in the third set of data can be obtained through the index relationship between the second sub-table and the fifth sub-table.
  • the plurality of sub-tables also includes a fourth sub-table (eg, code generator sub-table) that includes manufacturing site information and/or equipment information.
  • the third sub-table includes codes or abbreviations of manufacturing sites and/or equipment; through the index relationship between the third sub-table and the fourth sub-table, the manufacturing site information and/or equipment information can be obtained from the fourth sub-table.
  • the third sub-table only stores the codes for manufacturing site and/or equipment information, which can reduce the amount of data stored.
  • tables stored in the third data layer may be divided into at least three sub-tables.
  • the first sub-table corresponds to the data range options used for user filtering or definition in the user interaction interface (such as production time, production equipment, production process, etc.).
  • the second subtable corresponds to the specified key (e.g. product ID).
  • the third sub-table corresponds to values (for example, production data and inspection data corresponding to product IDs).
  • the product range that the user needs to analyze can be determined through the first sub-table, and the corresponding data (value) in the third sub-table is queried based on the serial number (key) of the corresponding product in the second sub-table.
  • the third data layer utilizes a NoSQL database based on Hbase; the specified key in the second sub-table can be a row key; and the fused data in the third sub-table (the column-family data corresponding to the row key) can be stored in the column-family data model.
  • the values in the third sub-table may be data after fusion of production data and detection data.
  • the third data layer may also include a fourth sub-table.
  • the fourth subtable includes the characters corresponding to the codes stored in the third subtable (e.g., device name, manufacturing site). Indexes or queries between the first, second, and third subtables may be based on the code. The fourth subtable can be used to replace the code with characters before the results are presented to the user interface. A lookup sketch over such sub-tables follows.
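  • The following sketch illustrates the key/value lookup flow across such sub-tables using the third-party happybase HBase client; the connection address, table names, row keys and column layout are all hypothetical assumptions, not this application's actual schema:

```python
import happybase  # third-party HBase Thrift client; pip install happybase

# Hypothetical HBase Thrift server address.
connection = happybase.Connection("hbase-thrift-host", port=9090)

# Second sub-table: product serial numbers (row keys) reachable from a filter result.
key_table = connection.table("sub_table_keys")
# Third sub-table: fused production/inspection values keyed by the same row keys.
main_table = connection.table("sub_table_main")

# Suppose the first sub-table (filter options) resolved to this substrate ID.
substrate_id = b"S1"
row = key_table.row(substrate_id)   # look up the product serial numbers for S1
panel_keys = row.values()           # hypothetical layout: cell values are panel IDs

for panel_key in panel_keys:
    data = main_table.row(panel_key)  # column-family data for this panel
    print(panel_key, data)

connection.close()
```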
  • this document describes data flows, data transformations, and data structures between various components of a data management server.
  • the raw data collected by the data source includes production data and inspection data.
  • production data includes history data and parameter data.
  • Historical data information contains information about specific processes that a product (such as a panel or substrate) underwent during manufacturing. Examples of the specific handling a product undergoes during manufacturing include plants, processes, stations, equipment, chambers, slots, and operators.
  • Parametric data information contains information about specific environmental parameters and their changes that a product (such as a panel or substrate) is subjected to during manufacturing. Examples of specific environmental parameters and changes to which a product is subjected during manufacturing include ambient particle conditions, equipment temperature, equipment pressure, etc.
  • Defect information contains information on product quality based on inspection. Example product quality information includes defect type, defect location, defect size, etc.
  • various business data generated by a factory are integrated into multiple data sources (eg, Oracle database).
  • the ETL module ETLP uses data stack tools, SQOOP tools, kettle tools, Pentaho tools or DataX tools to extract data from multiple data sources into the data lake.
  • the data is then cleaned, transformed and loaded into data warehouses and NoSQL databases.
  • Data lakes, data warehouses, and NoSQL databases utilize tools such as Kudu, Hive, and Hbase to store large amounts of data and analysis results.
  • the information generated during various stages of the manufacturing process is obtained by various sensors and inspection equipment and is subsequently stored in multiple data sources.
  • the root cause analysis system extracts and stores them in the data management server, and the calculation and analysis results generated by the root cause analysis system are also saved in the data management server.
  • Data synchronization (data flow) between various data layers (tables) of the data management server is realized through the ETL module.
  • the ETL module is configured to obtain the parameter configuration template of the synchronization process, including network permission and database port configuration, incoming database name and table name, outgoing database name and table name, field correspondence, task type, scheduling cycle, etc.
  • the ETL module configures the parameters of the synchronization process based on the parameter configuration template.
  • the ETL module synchronizes data and cleans the synchronized data based on the process configuration template.
  • the ETL module cleans the data through SQL statements to remove null values, remove outliers, and establish correlations between related tables.
  • Data synchronization tasks include data synchronization between multiple data sources and the data management server, as well as data synchronization between various layers of the data management server.
  • data extraction to the data lake can be done in real time or offline.
  • offline mode (corresponding to the batch import described below)
  • data extraction tasks are scheduled periodically.
  • the extracted data may be stored in a Hadoop distributed file system-based storage device (eg, a Hive-based database).
  • real-time mode (corresponding to the real-time import described below)
  • data extraction tasks can be executed by OGG (Oracle GoldenGate) combined with Apache Kafka.
  • the extracted data can be stored in a Kudu-based database.
  • OGG reads log files from multiple data sources (e.g., Oracle database) to obtain add/delete data.
  • the topic information is read by Flink, and JSON is selected as the synchronization field type; a minimal consumer sketch follows.
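  • Consumption of such a change-data topic is illustrated below with the third-party kafka-python client rather than Flink (Flink jobs are typically written in Java/Scala); the topic name, broker address and OGG-style JSON fields are assumptions:

```python
import json
from kafka import KafkaConsumer  # pip install kafka-python

# Topic and broker address are hypothetical; OGG publishes change records here.
consumer = KafkaConsumer(
    "ogg-production-data",
    bootstrap_servers="kafka-broker:9092",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    auto_offset_reset="earliest",
)

for message in consumer:
    record = message.value  # one insert/update/delete captured from the source DB
    # "op_type" and "after" follow OGG's JSON format; field names are assumed here.
    print(record.get("op_type"), record.get("after"))
```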
  • the front-end interface may perform display, query, and/or analysis based on data stored in a Kudu-based database.
  • the front-end interface may perform display, query, and/or analysis based on data stored in any one or any combination of a Kudu-based database, a Hadoop distributed file system (e.g., an Apache Hive-based database), and/or an Apache Hbase-based database.
  • short-term data (e.g., data generated over a few months) is stored in a Kudu-based database, while long-term data (e.g., all data generated over all cycles) is stored in a Hadoop distributed file system (for example, an Apache Hive-based database).
  • the ETL module is configured to extract data stored in a Kudu-based database into a Hadoop distributed file system (e.g., an Apache Hive-based database).
  • data fusion may be performed based on different topics for the second set of data in the second data layer.
  • the fused data has a high degree of theming and aggregation, which greatly improves the query speed.
  • tables from a data warehouse can be used to build tables with dependencies structured based on different user needs or different topics, assigning names to the tables based on their respective purposes.
  • Various topics can correspond to different data analysis needs.
  • a topic may correspond to an analysis of anomalies attributed to one or more manufacturing node groups (e.g., one or more pieces of equipment), and data fusion based on the topic may include data fusion of manufacturing process history information and defect information.
  • a topic may correspond to an analysis of anomalies attributed to one or more parameter types, and data fusion based on the topic may include data fusion regarding parameter feature information and defect information.
  • a topic may correspond to an analysis of anomalies attributed to one or more device operations (e.g., a device defined by a corresponding operating site where the corresponding device performs the corresponding operation), and data fusion based on the topic may be performed on at least two types of information among parameter feature information, manufacturing process history information, and defect information.
  • a topic may correspond to feature extraction of at least one type of parameter information to generate parameter feature information, wherein one or more of a maximum value, a minimum value, an average value, and a median value are extracted for one type of parameter information.
  • at least one type of parameter information includes at least one device parameter, such as data on temperature, humidity, pressure, etc., and also includes environmental particle data; a feature-extraction sketch follows.
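  • A minimal pandas sketch of this per-product feature extraction, with hypothetical column names and readings:

```python
import pandas as pd

# Hypothetical time series of one equipment parameter, sampled per panel.
samples = pd.DataFrame({
    "panel_id": ["A1", "A1", "A1", "A2", "A2", "A2"],
    "temperature": [24.9, 25.3, 25.1, 26.8, 27.2, 27.0],
})

# Feature extraction: collapse each panel's readings into summary features.
features = samples.groupby("panel_id")["temperature"].agg(
    ["max", "min", "mean", "median"])
print(features)
```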
  • the data management server may be an Apache Beam-based database to implement batch and stream parallel processing of data.
  • Apache Beam can be used to connect to data sources, extract data within a preset production cycle in batches, and store it in databases such as Hive, Hbase or ClickHouse for data persistence; the analysis server then performs abnormal root cause analysis on the data through analysis algorithms, so as to accurately locate the cause of an abnormality and trace it back in time.
  • the implementation of a database based on Apache Beam includes: first, the distributed computing system receives the Beam SDK class library components; second, it builds a data pipeline and defines the data type of the key-value pairs, where, optionally, the key is the sample (product) ID and the value is the corresponding production data and detection data; third, it defines the data processing methods in the pipeline, for example calculating the total number of samples (products), the number of abnormal samples, and the abnormality rate, and the relevant ETL data processing methods mentioned above can also be defined in the pipeline; finally, it defines the data flow direction at the end of the pipeline, where optionally the data can flow to the analysis server (such as a business server or an algorithm server). A minimal pipeline sketch follows.
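  • A minimal Apache Beam (Python SDK) sketch of such a pipeline, computing the total sample count and the abnormality rate from hypothetical key-value records; in production the final step would flow to the analysis server rather than print:

```python
import apache_beam as beam  # pip install apache-beam

# Hypothetical key-value records: key = product ID, value = fused production/detection data.
records = [
    ("P001", {"equipment": "E1", "abnormal": 0}),
    ("P002", {"equipment": "E2", "abnormal": 1}),
    ("P003", {"equipment": "E2", "abnormal": 1}),
    ("P004", {"equipment": "E1", "abnormal": 0}),
]

with beam.Pipeline() as pipeline:
    kv = pipeline | "Create" >> beam.Create(records)

    total = kv | "CountAll" >> beam.combiners.Count.Globally()
    abnormal = (kv
                | "KeepAbnormal" >> beam.Filter(lambda pair: pair[1]["abnormal"] == 1)
                | "CountAbnormal" >> beam.combiners.Count.Globally())

    # Abnormality rate computed from the two singleton counts via a side input.
    rate = total | "Rate" >> beam.Map(
        lambda t, a: a / t, a=beam.pvalue.AsSingleton(abnormal))

    # End of the pipeline: print stands in for writing to the analysis server.
    rate | "Print" >> beam.Map(print)
```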
  • users can edit and configure processes for data source merging, data conversion, and data calculation operators through drag-and-drop components in the visual interface.
  • the software module also includes a load balancing server connected to the analysis server.
  • the load balancing server (for example, the first load balancing server) is configured to receive the task request and to allocate the task request to one or more of the plurality of business servers to implement the communication between the plurality of business servers. load balancing.
  • the load balancing server (for example, the second load balancing server) is configured to allocate tasks from the multiple business servers to one or more of the multiple algorithm servers to achieve load balancing among the multiple algorithm servers.
  • the load balancing server is a load balancing server based on Nginx technology.
  • the abnormal root cause analysis method is introduced below. It can be understood that all or part of the steps in this method can be implemented based on a distributed computing system, an analysis server, or an algorithm server.
  • Figure 4 is a flow chart of an abnormal root cause analysis method according to an exemplary embodiment of the present application, which includes the following steps:
  • Step 401 Obtain product data to be processed corresponding to the target product.
  • the product data to be processed is obtained by fusing the production data and detection data corresponding to the target product according to the first preset parameter.
  • when it is necessary to determine the cause that affects the detection result of a product, the product is used as the target product, and the production data and detection data of the target product are obtained.
  • the production data represents data related to the production of the target product, such as the processing equipment that the target product passes through, the production temperature of the target product, etc.
  • Production data includes production parameters (that is, production parameter names) and the parameter values corresponding to the production parameters. For example, when the production data includes the production temperature and its corresponding specific value, the production temperature is the production parameter, and the specific value corresponding to the production temperature is its corresponding parameter value.
  • the production data represents the historical information and processing information of the product during production and processing.
  • the history parameters include information such as the product ID, basic product attributes, process sections, process sites, and equipment models that the product has passed through during the production process.
  • Processing parameters include processing information of products in equipment corresponding to different process sections and/or equipment models, such as pressure, temperature, pressure holding time, etc.
  • the detection data represents data related to detecting the target product.
  • the detection data indicates the detection result of the target product after production and processing, and the detection result indicates whether the target product (that is, the target product after production and processing) is abnormal.
  • the detection data includes detection parameters (that is, detection parameter names) and parameter values corresponding to the detection parameters.
  • for example, when the detection data includes the detection result of the target product and its corresponding specific value (such as whether the target product is abnormal or the extent of the abnormality), the name of the detection result is the detection parameter, and the specific value corresponding to the detection result is the parameter value corresponding to the detection parameter.
  • when the process section ends, the product will be optically or electrically tested to check whether the product quality meets the standard, so as to obtain the corresponding test result.
  • the name of the test result is the test parameter, and the specific value of the test result is the parameter value corresponding to the test parameter.
  • the first preset parameters that exist in both the production data and the detection data are used to fuse the production data and detection data of the target product to obtain the product data of the target product.
  • the number of target products is at least one.
  • the first preset parameter is a product identification
  • each piece of production data includes the product identification and its corresponding parameter value.
  • Each piece of detection data includes the product identification and its corresponding parameter value. For each piece of production data (that is, the production data corresponding to one target product), the parameter value corresponding to the product identification in the production data is obtained, and the detection data whose product identification has the same parameter value is used as the detection data to be fused.
  • the production data and the detection data to be fused are fused, that is, merged, to obtain the product data of the target product corresponding to the production data.
  • the production data corresponding to a target product includes parameter 1 and its corresponding parameter value and parameter 2 and its corresponding parameter value; parameter 1 is the first preset parameter, then after determining the detection data to be fused, the detection data Including parameter 1 and its corresponding parameter value and parameter 3 and its corresponding parameter value.
  • the product data obtained by fusion is parameter 1 and its corresponding parameter value, parameter 2 and its corresponding parameter value, and parameter 3 and its corresponding parameter value.
  • production data and inspection data are usually obtained from two tables. Fusion of these two tables into one table through the first preset parameter can facilitate subsequent data processing and calculation.
  • the data to be processed corresponding to the target product can be obtained through the data pipeline built based on the above Apache Beam model.
  • Step 402 Determine normal product data and abnormal product data in the product data to be processed based on the detection data.
  • whether each target product is normal is determined based on the detection data (i.e., the parameter values corresponding to the detection parameters) in the product data to be processed corresponding to that target product; the product data to be processed of a target product with a normal detection result is used as normal product data, and the product data to be processed of a target product with an abnormal detection result is used as abnormal product data.
  • for example, suppose the target products include product 1, the detection parameters corresponding to product 1 include a defective-point detection result, and the parameter value corresponding to the defective-point detection result indicates that the product has defective points; then product 1 is an abnormal product, and its corresponding product data to be processed is abnormal product data.
  • if the number of defective points or the proportion of defects in the detection result reaches a preset threshold, the product is considered an abnormal product; otherwise it is a normal product.
  • Step 403 Input the normal product data and abnormal product data into the first root cause analysis model to obtain the first impact factor information of the test results of the target product.
  • the first impact factor includes one or more of the production data, where the first root cause analysis model indicates a tree model.
  • both the normal product data and the abnormal product data are input into the first root cause analysis model (such as a tree model), so that the first root cause analysis model analyzes them, using the production parameters in the production data to determine the influencing factor information that affects the detection results of the target product, that is, the first influencing factor information.
  • the first impact factor information includes the first impact factor and/or the impact score corresponding to the first impact factor.
  • the impact score indicates the degree of impact of the first impact factor on the test results of the target product.
  • when the detection result indicates that the target product is abnormal, a higher impact score for the first impact factor means that the first impact factor affects the abnormality of the target product to a higher degree, that is, the first impact factor is more likely to be the cause of the abnormality of the target product.
  • the causes of abnormal product results can be mainly attributed to abnormalities that occur during the production process.
  • the above-mentioned first influencing factor must be reflected in the production data; the abnormality may be the result of certain parameters in the production data acting alone or together. Therefore, what the first influencing factor finally presents is one or more parameters in the above-mentioned production data.
  • a root cause analysis model is needed to conduct intelligent analysis of the production parameters that caused the abnormality and obtain the impact score of each production parameter or combination of parameters, thereby determining the first impact factor information.
  • the first influencing factor includes one or more factors in the production data; the first root cause analysis model indicates a tree model.
  • both normal product data and abnormal product data are input into the first root cause analysis model, that is, the tree model.
  • the first impact factor information can be obtained through multiple rounds of training of the tree model, or the influence factors can be obtained without training the tree model by directly using the calculation principle of the tree model.
  • the normal product data and the abnormal product data are input into the tree model to obtain the first influencing factor information of the detection results of the target product, including calculating the purity index of the production data and determining the first impact factor information based on the purity index.
  • the tree model is a decision tree model. Its principle is to determine the root node and multi-level sub-nodes of the decision tree by calculating the purity index. The importance of the corresponding influencing factors gradually decreases from the root node to the successive sub-nodes. Therefore, the first influencing factor information that affects the detection results can be directly obtained through the decision tree algorithm (calculation of purity index) without training.
  • the influence score corresponding to the first influence factor can be determined through the tree model, that is, by determining the purity index.
  • the information gain index can be used as the purity index
  • the information gain rate index can be used as the purity index.
  • the Gini coefficient can be used as the purity index; a sketch of ranking production parameters by Gini-based purity gain follows.
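  • A minimal sketch of ranking production parameters by Gini-based purity gain without training, on hypothetical fused product data with a 0/1 abnormality label:

```python
import pandas as pd

def gini(labels: pd.Series) -> float:
    """Gini impurity of a 0/1 label series: 1 - p^2 - (1-p)^2 = 2p(1-p)."""
    p = labels.mean()
    return 2 * p * (1 - p)

def gini_gain(df: pd.DataFrame, feature: str, label: str) -> float:
    """Impurity reduction obtained by splitting on one production parameter."""
    parent = gini(df[label])
    weighted_child = sum(
        len(group) / len(df) * gini(group[label])
        for _, group in df.groupby(feature))
    return parent - weighted_child

# Hypothetical fused product data: production parameters plus abnormality label.
df = pd.DataFrame({
    "equipment": ["E1", "E1", "E2", "E2", "E2", "E1"],
    "site":      ["S1", "S2", "S1", "S1", "S2", "S2"],
    "abnormal":  [0,    0,    1,    1,    1,    0],
})

# Rank candidate first influence factors by purity gain, highest first.
scores = {f: gini_gain(df, f, "abnormal") for f in ["equipment", "site"]}
print(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))
```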
  • inputting the normal product data and the abnormal product data into a tree model and obtaining the first impact factor information of the detection result of the target product may also include training the tree model to obtain the first impact factor information; see the detailed steps below.
  • the product data to be processed corresponding to the target product is classified based on the detection data corresponding to each target product, and normal product data and abnormal product data are obtained.
  • the normal product data indicates the product data to be processed of the target product with normal detection results.
  • the abnormal product data indicates the product data to be processed of the target product with abnormal detection results.
  • Both normal product data and abnormal product data include production parameters.
  • the production parameters are used to determine the influencing factor information that affects the detection results of the target product, that is, the first influencing factor information, so that when a detection result is abnormal, the cause of the product abnormality can be determined. This realizes automatic analysis of product data and automatic determination of the cause of the product abnormality, that is, the root cause, without relying on manual determination, ensuring the accuracy of determining the product's influencing factors and improving determination efficiency.
  • Figure 5 is a flow chart of another anomaly root cause analysis method shown in this application according to an exemplary embodiment. This embodiment describes how to determine the first influencing factor based on the previous embodiment. The process will be described in detail below with reference to a specific embodiment. As shown in Figure 5, the method includes the following steps:
  • Step 501 Obtain product data to be processed corresponding to the target product.
  • the product data to be processed is obtained by fusing the production data and detection data corresponding to the target product according to the first preset parameter.
  • the discrete data in the product data to be processed is encoded so that the encoded data is more convenient for subsequent data analysis.
  • the encoding process is specifically: obtaining the type corresponding to the production parameter; if the type corresponding to the production parameter is a preset discrete type, the parameter value corresponding to the production parameter in the product data to be processed is encoded, and the encoding result is used as the new parameter value of the production parameter.
  • the history parameters in the product data to be processed include the name of the process site.
  • the type obtained for the process site name is the name type; if the name type belongs to the preset discrete types, it is determined that the parameter value corresponding to the process site name needs to be encoded, where the parameter value corresponding to the process site name indicates whether the target product passes through the process site corresponding to that name.
  • the specific process is: obtaining the parameter value corresponding to the process site name from the product data to be processed.
  • the parameter value corresponding to the process site name indicates that the target product passes through the process site, the parameter value corresponding to the process site name is updated to the first coded value, that is, the first coded value is used as the parameter value corresponding to the process site name.
  • the parameter value corresponding to the process site name indicates that the target product has not passed the process site
  • the parameter value corresponding to the process site name is updated to the second coded value, that is, the second coded value is used as the parameter value corresponding to the process site name.
  • the first encoding value and the second encoding value can be set according to actual needs.
  • the first encoding value is 0 and the second encoding value is 1.
  • the type corresponding to the production parameter can be obtained from a preset correlation table.
  • the parameter values corresponding to the detection parameters can also be encoded.
  • the processing is similar to the encoding of the parameter values corresponding to the production parameters and will not be described in detail here; a minimal encoding sketch follows.
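  • A minimal sketch of this discrete-parameter encoding; the type lookup table, parameter names and values are hypothetical, and, per the example above, the first code value 0 marks a product that passed the site while the second code value 1 marks one that did not:

```python
import pandas as pd

FIRST_CODE, SECOND_CODE = 0, 1   # passed -> 0, not passed -> 1, as in the text above
DISCRETE_TYPES = {"name"}        # hypothetical preset discrete types

# Hypothetical correlation table mapping production parameters to their types.
parameter_types = {"site_X_passed": "name", "temperature": "numeric"}

df = pd.DataFrame({
    "site_X_passed": ["passed", "not passed", "passed"],
    "temperature": [25.1, 26.4, 24.9],
})

for parameter, ptype in parameter_types.items():
    if ptype in DISCRETE_TYPES:
        # Replace the discrete value with its code as the new parameter value.
        df[parameter] = df[parameter].map(
            lambda v: FIRST_CODE if v == "passed" else SECOND_CODE)

print(df)  # temperature stays numeric; the site column is now 0/1 coded
```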
  • the data storage device in this application can be implemented based on the data management server mentioned above, or other storage devices and databases can be used for storage, which is not limited in this application.
  • the data storage device may be a distributed database.
  • the parallel processing of distributed databases can meet the storage and processing requirements of massive data. Users can process simple data through SQL queries, and custom functions can be used to implement complex processing. Therefore, when analyzing massive data, extracting data into a distributed database will not cause damage to the original data on the one hand, and improve the efficiency of data analysis on the other hand.
  • The method of extracting data into the storage device includes one or more of the following: 1) manual import, where the user completes a one-time import of the data to be analyzed through the interactive interface so that the storage device obtains the data to be analyzed; 2) batch import, which is similar to manual import, where the user calls the API interface or address of the distributed file system HDFS through the interactive interface to import a large amount of data in batches; 3) real-time import, where a connection is established between the original database and the storage device in the analysis system, based on technologies such as Kafka, to achieve real-time import of the original data.
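  • As an illustrative example, the following is a hedged sketch of the real-time import path, assuming the kafka-python package; the topic name and broker address are hypothetical.

```python
# Consume records published by the source database and hand them to storage.
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "production-data",                       # hypothetical topic
    bootstrap_servers=["kafka-broker:9092"], # hypothetical broker address
)
for message in consumer:
    record = message.value                   # raw bytes from the source database
    # ... parse the record and write it into the distributed storage device ...
```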
  • Step 502 Determine normal product data and abnormal product data in the product data to be processed based on the detection data.
  • Step 503 Input normal product data and abnormal product data into the tree model to train the tree model.
  • Step 504 Determine the first influencing factor information based on the trained tree model.
  • normal product data and abnormal product data are input into the tree model to train the tree model.
  • the trained tree model can output the production parameters that affect the detection results of the target product, that is, obtain the first influencing factor information.
  • the above training of the tree model means adjusting the number of production parameters and the corresponding weights of the production parameters.
  • the first influence factor information is determined based on the weight of the production parameters. Specifically, during the training process, the number of production parameters is increased or decreased, and the weight of the production parameters is adjusted. Then, select a certain number of production parameters in order of weight from high to low, and use the selected production parameters as the first influencing factor. Or use the production parameter whose weight is higher than the preset weight threshold as the first influencing factor.
  • the weight of the production parameter can be understood as the influence factor score corresponding to the production parameter.
  • the influence factor score corresponding to the production parameter can be used as the influence score of the first influence factor.
  • the above tree model is trained based on a preset training algorithm.
  • the preset training algorithm includes one or more of a backward search algorithm, a forward search algorithm, a two-way search algorithm, and a random search algorithm.
  • Exemplary backward search algorithms include recursive feature elimination (Recursive Feature Elimination), which uses a tree model to perform multiple rounds of training; after each round, a number of features with the smallest weight coefficients are eliminated, or a threshold is set and features whose weights fall below the threshold are eliminated; the next round of training is then performed on the new feature set, recursing until the number of remaining features reaches the required number.
  • model training through the backward search algorithm can reduce the number of production parameters.
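  • As an illustrative example, the following is a hedged sketch of recursive feature elimination with a tree model, assuming scikit-learn; the data shapes, sample counts, and feature counts are illustrative.

```python
# Backward search: repeatedly drop the weakest features until 10 remain.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.feature_selection import RFE

X = np.random.rand(200, 30)        # 200 products, 30 production parameters
y = np.random.randint(0, 2, 200)   # 0 = normal, 1 = abnormal

selector = RFE(DecisionTreeClassifier(random_state=0),
               n_features_to_select=10,  # required number of remaining features
               step=3)                   # features eliminated per round
selector.fit(X, y)
kept = selector.support_                 # boolean mask of retained parameters
```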
  • The forward search algorithm first selects an optimal single-feature subset as the first-round feature subset; on this basis, another feature is added to form new two-feature subsets, model training is performed, the optimal two-feature subset is selected, and the iteration continues until the optimal feature subset is found.
  • This method also belongs to the greedy algorithm of heuristic search. Here, model training through the forward search algorithm can increase the number of production parameters.
  • the two-way search algorithm means that backward and forward searches are performed simultaneously until both search for the same optimal feature subset.
  • the random search algorithm randomly generates feature subsets and then performs forward or backward search.
  • After the search, the corresponding feature subset, that is, the first influencing factors, is obtained.
  • the production parameters are fused during the tree model training process.
  • the fusion process indicates performing feature crossover and/or mutation on the production parameters to obtain new features, that is, the production parameters.
  • The fusion processing includes feature intersection processing and/or fusion processing based on the genetic algorithm (GA).
  • The fusion process based on the genetic algorithm is: first, a batch of feature sets is randomly generated, each feature set including one or more features (i.e., production parameters); after the tree model is trained, a first scoring is performed with the model's test performance as the evaluation index; the feature selection results are summarized, and new feature sets are generated through crossover, mutation, etc. of the feature sets; the features are iteratively updated with survival of the fittest; finally, the feature set with the highest evaluation, that is, the synthetic parameters, is obtained.
  • the feature intersection processing process is: combining (multiplying or Cartesian product) individual features to form a composite feature, which helps to represent non-linear relationships.
  • Linear models can be trained efficiently; therefore, extending linear models with feature combinations is an effective way to train on large-scale data sets, and many different kinds of feature combinations can be created.
  • [A x B]: a composite parameter formed by multiplying the values of two features.
  • [A x B x C x D x E]: a composite parameter formed by multiplying the values of five features.
  • [A x A]: a composite parameter formed by squaring the value of a single feature.
  • The mutation processing indicates mutating the production parameters themselves using methods such as log and square to obtain new production parameters.
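  • As an illustrative example, the following is a minimal sketch of the crossover and mutation operations described above, assuming numeric production parameters; the names A and B are hypothetical.

```python
# Feature crossing (multiplication) and mutation (log, square) of parameters.
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1.0, 2.0, 4.0], "B": [3.0, 5.0, 7.0]})

df["A_x_B"] = df["A"] * df["B"]   # crossover: [A x B]
df["A_x_A"] = df["A"] ** 2        # crossover of a feature with itself: [A x A]
df["log_A"] = np.log(df["A"])     # mutation by taking the logarithm
df["B_sq"] = df["B"] ** 2         # mutation by squaring
```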
  • the tree model is a simple machine learning model, that is, it has low complexity and can be considered as a base model. More complex integrated models can be formed by combining multiple base models. The level of complexity can be understood as depending on the number of basic models and the number of model parameters.
  • the first root cause analysis model may also indicate at least one integrated tree model, which is obtained by integrating multiple tree models, that is, the integrated tree model includes multiple tree models. It can be understood that the integrated tree model is also a kind of tree model, but it is more complex than the single tree model.
  • Inputting the normal product data and abnormal product data into the first root cause analysis model to obtain the first influencing factor information of the target product's detection results then includes: inputting the normal product data and abnormal product data into the integrated tree model to train the integrated tree model, and determining the first influencing factor information according to the trained integrated tree model.
  • the weight of each production parameter is adjusted.
  • the trained ensemble tree model can output the production parameters that affect the detection results of the target product, that is, obtain the first influencing factor information.
  • the first impact factor information is determined based on the weight of the production parameters. For example, select a certain number of production parameters in order of weight from high to low, and use the selected production parameters as the first influencing factor. Or use the production parameter whose weight is higher than the preset weight threshold as the first influencing factor.
  • For example, the weights can be constrained by an L1 regular term.
  • feature selection can be achieved based on parameter adjustment of L1 regularization.
  • the penalty term L1 regularization is generally introduced into the loss function, which can generate a sparse weight matrix, that is, a sparse model for feature selection.
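  • As an illustrative example, the following is a hedged sketch of L1-based feature selection, assuming scikit-learn; the sparse weights of an L1-penalized model mark the retained parameters, and the alpha value is an assumption.

```python
# L1 regularization drives most weights to zero, yielding a sparse model.
import numpy as np
from sklearn.linear_model import Lasso

X = np.random.rand(200, 30)
y = np.random.randint(0, 2, 200)

model = Lasso(alpha=0.05).fit(X, y)     # L1 penalty term in the loss function
selected = np.flatnonzero(model.coef_)  # indices of non-zero-weight parameters
```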
  • the integrated method includes at least one of boosting, bagging, and stacking.
  • Boosting methods usually target homogeneous weak learners (i.e., homogeneous base models), learn these weak learners sequentially in a highly adaptive manner (each base model depends on the previous ones), and combine them following some deterministic strategy.
  • Exemplary boosting methods include adaptive boosting (Adaptive Boosting, AdaBoost) and gradient boosting (Gradient Boosting).
  • Adaptive Boosting affects the error function by increasing the weight coefficient of the wrong sample points and reducing the weight coefficient of the correct sample points, thereby allowing the model to focus its learning on sample data with larger weight coefficients.
  • Gradient Boosting is calculated by changing the target value of the sample.
  • Specifically, a weak model is built to fit the negative gradient of the loss function (that is, the negative gradient value of the sample is used as the new target value), and the learned weak model is then integrated into the additive model as its latest term; weak models are constructed sequentially in this way until a threshold or other stopping condition is met.
  • the training set of bagging's individual weak learners is obtained through random sampling.
  • For example, three sampling sets can be obtained, three weak learners can be trained independently on them, and the final strong learner can then be obtained by applying an ensemble strategy to these three weak learners.
  • The bootstrap sampling method is generally used: for an original training set of m samples, one sample is randomly drawn and put into the sampling set each time, and the sample is then put back, meaning it may be drawn again in subsequent sampling. By drawing m times, a sampling set of m samples is finally obtained. Because the sampling is random, each sampling set differs from the original training set and from the other sampling sets, thereby yielding multiple different weak learners.
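  • As an illustrative example, the following is a minimal sketch of bootstrap sampling (sampling with replacement) as described above; the data is illustrative.

```python
# m draws with replacement from an m-sample set: some rows repeat, others are absent.
import numpy as np

m = 100
X = np.random.rand(m, 5)
idx = np.random.randint(0, m, size=m)  # m random draws with replacement
X_bootstrap = X[idx]                   # one sampling set for one weak learner
```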
  • Stacking methods usually consider heterogeneous weak learners (i.e., heterogeneous base models): multiple different base models are learned in parallel and combined by training a meta-model, which produces the final prediction result based on the predictions of the different weak models.
  • the ensemble tree includes any one of a random forest model, a LGBM model, a GBDT model, an XGBoost model, and a CatBoost model.
  • When the first root cause analysis model is a base model (such as a single tree model), since the model is relatively simple, the number of production parameters and the corresponding weights of the parameters need to be adjusted during training in order to obtain more accurate impact factor results.
  • When the first root cause analysis model is an integrated model (such as an integrated tree model), conventional training already achieves an accurate training effect, so no additional adjustment of the number and weights of individual production parameters is needed.
  • For example, the initial parameter information of the decision trees of the ensemble tree includes one or more of: the number of decision tree leaves ranging from 2 to 500, the number of decision trees ranging from 25 to 325, the maximum depth of the decision trees ranging from 1 to 20, the L1 regular term coefficient ranging from 1.00E-10 to 1.00E-01, and the L2 regular term coefficient ranging from 1.00E-10 to 1.00E-01.
  • For example, when the decision tree information of the LGBM model is that the number of decision tree leaves ranges from 2 to 500, the number of decision trees ranges from 25 to 325, the maximum depth of the decision trees ranges from 1 to 20, the L1 regularization coefficient is 1.00E-10 to 1.00E-01, and the L2 regularization coefficient is 1.00E-10 to 1.00E-01, the prediction effect is better than using the default values.
  • For another example, the initial parameter information of the decision trees of the ensemble tree model includes one or more of: the decision tree depth from 1 to 16, the maximum number of trees from 25 to 300, and the L2 regularization coefficient from 1 to 100.
  • the prediction effect is better than using the default value.
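  • As an illustrative example, the following is a hedged sketch of training an LGBM ensemble with initial parameters chosen inside the ranges stated above, assuming the lightgbm package; the data and the specific values are illustrative.

```python
# LGBM ensemble tree with initial parameters within the stated ranges.
import numpy as np
from lightgbm import LGBMClassifier

X = np.random.rand(500, 40)
y = np.random.randint(0, 2, 500)

model = LGBMClassifier(
    num_leaves=63,     # within 2..500
    n_estimators=100,  # within 25..325
    max_depth=8,       # within 1..20
    reg_alpha=1e-3,    # L1 coefficient, within 1.00E-10..1.00E-01
    reg_lambda=1e-3,   # L2 coefficient, within 1.00E-10..1.00E-01
).fit(X, y)

importance = model.feature_importances_  # per-parameter weight for ranking
```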
  • the number of production parameters in the product data will also affect the initial parameters of the ensemble tree model.
  • the first root cause analysis model indicates at least two ensemble tree models.
  • each ensemble tree model corresponds to a weight.
  • the complexity of the model can be further increased by integrating at least two tree models to obtain better results.
  • the final output result can be obtained by averaging or voting.
  • The averaging method sets a weight for each model and averages the model outputs accordingly; the voting method has multiple models output results separately and predicts the result based on voting rules similar to the minority obeying the majority.
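  • As an illustrative example, the following is a minimal sketch of the averaging and voting strategies described above, assuming scikit-learn; the three base models and their weights are illustrative choices.

```python
# Combine at least two tree-based models by weighted averaging or voting.
import numpy as np
from sklearn.ensemble import VotingClassifier, RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression

X = np.random.rand(200, 10)
y = np.random.randint(0, 2, 200)

ensemble = VotingClassifier(
    estimators=[("tree", DecisionTreeClassifier()),
                ("forest", RandomForestClassifier(n_estimators=50)),
                ("logit", LogisticRegression(max_iter=500))],
    voting="soft",      # "soft" averages probabilities; "hard" majority-votes
    weights=[1, 2, 1],  # per-model weights for the averaging method
).fit(X, y)
```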
  • In the above embodiments, the calculation result of one model algorithm (a tree model (trained or untrained), an ensemble tree model, multiple base tree models, etc.) is used as the impact score of the final first impact factor to obtain the first impact factor.
  • Since a single calculation method may still be inaccurate, at least two impact factor score calculation methods (e.g., purity index calculation and production parameter weight calculation) can be weighted to obtain the impact factor score of a production parameter.
  • the influence factor score of the production parameter is the influence score of the first influence factor, thereby realizing the determination of the first influence factor information. For example, if a certain number of production parameters are selected in the order of impact factor scores from high to low, the selected production parameter will be the first impact factor. Correspondingly, the impact factor score of the selected production parameter will be the first impact factor. impact score.
  • the calculation method of the influence factor score of the production parameters also includes correlation analysis indicators, distance indicators, consistency indicators, etc.
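  • As an illustrative example, the following is a hedged sketch of weighting two impact-factor score calculations (for example, a purity index and a model weight); the 0.5/0.5 weights and the score values are assumptions.

```python
# Weighted combination of two per-parameter score calculation methods.
import numpy as np

purity_scores = np.array([0.8, 0.2, 0.5])           # per production parameter
weight_scores = np.array([120, 30, 60], dtype=float)

def norm(s):
    # Normalize each method to [0, 1] before combining.
    return (s - s.min()) / (s.max() - s.min())

impact = 0.5 * norm(purity_scores) + 0.5 * norm(weight_scores)
ranking = np.argsort(impact)[::-1]  # parameters ordered by impact score, high to low
```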
  • Before inputting, the production parameters in the normal product data and the abnormal product data may first undergo dimensionality-raising or dimensionality-reduction processing; the production parameters processed in this way are then input into the first root cause analysis model.
  • factor synthesis processing can be performed on the production parameters based on the dimension-raising algorithm to obtain new synthetic parameters, that is, production parameters.
  • the production parameters can be combined with relevant factors based on the dimensionality reduction algorithm.
  • correlation factor combination processing indicates dimensionality reduction processing for production parameters with correlation.
  • Dimensionality-raising algorithms include one-hot encoding, feature crossing (Feature Cross), and other algorithms, which further explore the data's own patterns to obtain new parameters.
  • The dimensionality reduction algorithm mainly maps the data points in the original high-dimensional space to a low-dimensional space to reduce the computational cost, while also considering the correlation between features; that is, the dimensionality reduction algorithm is used for correlated production parameters.
  • Performing dimensionality reduction on the production parameters means transforming multiple correlated production parameters into one representative parameter. Therefore, when determining the first influencing factor, those multiple production parameters are no longer used; only the representative parameter corresponding to them is needed.
  • The dimensionality reduction algorithm includes one or more of the principal component analysis (PCA) algorithm, the linear discriminant analysis (Linear Discriminant Analysis, LDA) algorithm, the multidimensional scaling (Multidimensional Scaling, MDS) algorithm, and manifold learning algorithms.
  • principal component analysis can maximize the retention of the intrinsic information of the data after dimensionality reduction, and measure the importance of this direction by measuring the size of the data variance in the projection direction.
  • Principal component analysis uses a dimensionality reduction method to screen representative indicators, combining multiple characteristic variables (i.e., production parameters) into a few principal components. These new comprehensive indicators contain most of the original information; that is, n-dimensional features are mapped to k dimensions (k < n), and these k-dimensional features, recombined from the original ones, are called principal components.
  • This multi-factor combination method can not only achieve the purpose of dimensionality reduction, but also take into account the correlation and joint influence between features.
  • In the production of OLED (Organic Light-Emitting Diode) products, methods such as decision trees establish the relationship between result variables and cause variables and transform them into effective data to support decision-making, thereby quickly locating the cause of OLED product anomalies, that is, the first influencing factor.
  • Although multi-variable large-scale production data provides a large amount of rich information, it also increases the complexity of analysis; more importantly, there are interaction effects between many feature variables, so analyzing each feature variable in isolation is not comprehensive. PCA is therefore used for multi-factor combination analysis. The specific process is as follows:
  • the first step is to remove the mean and center all features.
  • the second step is to find the covariance matrix.
  • In the covariance matrix, the diagonal entries are the variances of the individual production parameters, and the off-diagonal entries are the covariances.
  • The covariance measures the degree to which two production parameters vary together, that is, the simultaneous variation of the characteristic variables. The greater the absolute value of the covariance, the greater the influence the two have on each other, and vice versa; the correlation between two production parameters can thus be determined.
  • the third step is to find the eigenvalues and eigenvectors of the covariance matrix.
  • the fourth step is to project and reduce the dimension into new features.
  • The selected k-dimensional new features correspond to the loads of the N original features in each new feature. Original features with higher loads are selected and combined, which means that the new feature represents most of the information of these original features, and these original features have a high similarity to one another.
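  • As an illustrative example, the following is a minimal numpy sketch of the four PCA steps above; the data shape and k are illustrative.

```python
# PCA: center, covariance, eigendecomposition, projection onto k new features.
import numpy as np

X = np.random.rand(100, 6)            # 100 samples, 6 production parameters

Xc = X - X.mean(axis=0)               # step 1: remove the mean (center all features)
C = np.cov(Xc, rowvar=False)          # step 2: covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)  # step 3: eigenvalues and eigenvectors

k = 2
top = eigvecs[:, np.argsort(eigvals)[::-1][:k]]
Z = Xc @ top                          # step 4: project down to k new features
loadings = top                        # loads of each original parameter per component
```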
  • the linear discriminant analysis algorithm is a supervised learning dimensionality reduction technology, that is, each sample of its data set (ie, production parameter) has a category output.
  • the main principle is to minimize the intra-class variance and maximize the inter-class variance after projection.
  • The multidimensional scaling analysis algorithm is a statistical method that classifies production parameters with many dimensions based on their similarity (close distance) or dissimilarity (far distance), that is, by calculating the distances between them.
  • the research object is vividly represented in a low-dimensional (two-dimensional or three-dimensional) space with a perceptual map, and the relative relationship between each research object (for example, production parameters) is simply and clearly explained.
  • the manifold learning algorithm is a nonlinear dimensionality reduction method that maintains a certain "invariant feature quantity" of high-dimensional data and low-dimensional data to find a low-dimensional feature representation.
  • the invariant feature quantity includes one or more of Isomap geodesic distance, LLE local reconstruction coefficient, LE data domain relationship, and LTSA local tangent space alignment.
  • The dimensionality-raising process can increase the number of production parameters, and the dimensionality-reduction process can reduce it. Therefore, whether to perform dimensionality-raising or dimensionality-reduction processing on the production parameters can be determined based on the number of production parameters in the product data to be processed.
  • If the number of production parameters is less than a first preset threshold, a dimensionality-raising algorithm is used to perform factor synthesis processing on the production parameters, that is, dimensionality-raising processing is performed.
  • If the number of production parameters is greater than or equal to the first preset threshold, dimensionality reduction processing is performed on the production parameters.
  • That is, the first root cause analysis model combines production parameters through the corresponding dimensionality-raising algorithm to obtain new parameters, that is, composite parameters, and when determining the first influencing factor, the determination is made jointly based on the production parameters and the composite parameters.
  • Alternatively, the first root cause analysis model uses a dimensionality reduction algorithm to perform correlated-factor combination processing on the production parameters, so as to reduce the number of production parameters used to determine the first influencing factor.
  • Before the data to be processed is input into the first root cause analysis model, a filtering algorithm can be used to filter the production parameters in the data to be processed to remove production parameters that have little impact on the detection results, that is, production parameters with a lower probability of being the first influencing factor, thereby improving the efficiency of the subsequent determination of the first influencing factor.
  • The filtering (Filter) algorithm selects features based on the correlation and evaluation indicators between each feature (i.e., production parameter) and the result variable (i.e., the first influencing factor).
  • Taking the correlation coefficient as the importance of each dimension's feature, some unimportant production parameters can be eliminated by setting a threshold or a number of features to keep.
  • Evaluation indicators include one or more of correlation analysis indicators (for example, the Pearson correlation coefficient, the Spearman correlation coefficient, the maximal information coefficient (MIC), etc.), distance indicators, purity indicators, and consistency indicators.
  • the Pearson correlation coefficient is used to measure the linear correlation between features and outcome variables.
  • Its value range is [-1, 1]; the closer to 1, the stronger the positive linear correlation, and the closer to -1, the stronger the negative linear correlation. It is suitable for situations where both the feature and the outcome variable are continuous numerical variables.
  • Monotonic nonlinear relationships such as exponential functions are measured using the Spearman correlation coefficient.
  • Complex nonlinear functions such as periodic functions generally use the maximum information coefficient to measure the degree of correlation between two sets of variables, thereby removing features with less correlation, that is, production parameters.
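  • As an illustrative example, the following is a hedged sketch of correlation-based filtering, assuming scipy; a parameter is dropped when the absolute correlation with the result variable falls below an assumed threshold (MIC, which typically requires a separate package, is omitted here).

```python
# Keep only production parameters sufficiently correlated with the outcome.
import numpy as np
from scipy.stats import pearsonr, spearmanr

X = np.random.rand(200, 10)
y = np.random.randint(0, 2, 200).astype(float)

keep = []
for j in range(X.shape[1]):
    r_lin, _ = pearsonr(X[:, j], y)           # linear correlation
    r_mono, _ = spearmanr(X[:, j], y)         # monotonic (e.g., exponential) relation
    if max(abs(r_lin), abs(r_mono)) >= 0.1:   # assumed threshold
        keep.append(j)
```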
  • A good feature set (a feature set includes at least one production parameter) should make the distance between products belonging to the same category as small as possible and the distance between products belonging to different categories as large as possible. Distance indicators are therefore applicable to classification problems, that is, cases where the result variable is discrete categorical data.
  • Commonly used distance indicators, that is, the calculation methods corresponding to the similarity indicators include Euclidean distance, etc.
  • The rank test is a significance test used to determine the index value corresponding to the consistency indicator. It assumes that two or more sets of variables (i.e., production parameters) come from the same distribution, that is, it tests whether there is a significant difference, and yields a pValue with a value range of [0, 1]. The smaller the pValue, the greater the difference is considered to be, the greater the impact of the feature on the outcome variable, and thus the stronger the importance of the feature.
  • When both the feature and the outcome variable are categorical data, that is, when both are discrete variables,
  • the chi-square test is used. This method counts the degree of deviation between the actual observed values of the samples (i.e., the target products) and the theoretically inferred values. The smaller the chi-square value, the smaller the deviation and the closer to consistency. Likewise, the smaller the pValue, the more strongly the null hypothesis is rejected, indicating a significant correlation.
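  • As an illustrative example, the following is a minimal sketch of the chi-square consistency test, assuming scipy; the contingency counts are illustrative.

```python
# Chi-square test for a categorical feature against a categorical outcome.
import numpy as np
from scipy.stats import chi2_contingency

# Contingency table: rows = feature categories, columns = normal/abnormal counts.
table = np.array([[30, 10],
                  [20, 40]])
chi2, p_value, dof, expected = chi2_contingency(table)
# A small p_value rejects the null hypothesis, indicating a significant correlation.
```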
  • the purity of the result after feature classification is used as the feature importance, that is, the influence factor score, which is applicable to the case where both the feature and the result variable are classified data.
  • the first impact factor is also scored to obtain the impact score corresponding to the first impact factor.
  • the first impact factors are sorted according to the impact score corresponding to the first impact factor.
  • mapping relationship between the first influencing factor and the production parameter may also be displayed.
  • Specifically, the production parameters corresponding to the first influencing factor are determined, that is, the production parameters related to the first influencing factor, and
  • the mapping relationship between the first influencing factor and its corresponding production parameters is displayed, so that relevant personnel can intuitively and quickly determine how the corresponding first influencing factor is obtained from the production parameters.
  • the mapping relationship represents how to determine the first impact factor through the production parameters corresponding to the first impact factor.
  • The mapping relationship can be a mathematical relationship, for example a mathematical formula such as the Cartesian product, or it can be related code; that is, when the relevant code is executed, the production parameters are processed accordingly to obtain the corresponding first impact factor.
  • Target products include LCD panels.
  • the manufacturing stages of the display panel at least include an array stage, a color filter (CF) stage, a cell stage and a module stage.
  • In the array stage, a thin film transistor array substrate is manufactured.
  • In this stage, a layer of material is deposited and subjected to photolithography; for example, a photoresist is deposited on the material layer, and the photoresist is subjected to exposure and subsequent development. Subsequently, the material layer is etched and the remaining photoresist is removed ("stripping").
  • manufacturing the color filter substrate involves the following steps, including: coating, exposure and development.
  • the array substrate and the color filter substrate are assembled to form a unit.
  • the cell-forming stage includes several steps, including coating and rubbing the alignment layer, injecting liquid crystal material, cell sealant application, cell alignment under vacuum, cutting, grinding and cell inspection.
  • peripheral components and circuits are assembled onto the panel.
  • the module level includes several steps, including backlight assembly, printed circuit board assembly, polarizer attachment, chip-on-film assembly, integrated circuit assembly, burn-in, and final inspection.
  • Target products include organic light-emitting diode display panels.
  • the manufacturing of the display panel includes at least four equipment processes, including the array stage, OLED stage, EAC2 stage and module stage.
  • the array stage the backplane of the display panel is manufactured, for example, including manufacturing a plurality of thin film transistors.
  • a plurality of light-emitting elements eg, organic light-emitting diodes
  • an encapsulation layer is formed to encapsulate the plurality of light-emitting elements, and optionally, a protective film is formed on the encapsulation layer.
  • the large glass (glass) is first cut into half sheets of glass (hglass), and then further cut into panels.
  • inspection equipment is used to inspect the panels to detect defects in them, such as dark spots and bright lines.
  • flexible printed circuits are bonded to panels using chip-on-film technology.
  • a cover glass is formed on the surface of the panel.
  • further inspection is performed to detect defects in the panels.
  • the data produced in the above production steps can be divided into production data and inspection data.
  • The production data comprises the resume parameter data and processing parameter data of the target product during production and processing.
  • The resume parameters include the product ID, basic product attributes, the process sections the product has passed through during production, site information, equipment models, etc.; the processing parameters include the processing parameters of the target product in different process sections or equipment, such as pressure, temperature, holding time, etc.
  • the product will be optically or electrically inspected to check whether the product quality meets the standard. Based on the test results of the inspection equipment, inspection data will be formed to identify whether the product has defects and what kind of defects occur.
  • Data containing defects are taken as negative samples, that is, abnormal product data, and
  • data not containing defects as positive samples, that is, normal product data. It can be understood that when a certain type of defect data is selected, the data of the other defect types are also positive samples.
  • the sample is a display panel motherboard
  • the ratio of the total number of defective display panels belonging to the defective type among the multiple display panels of the display panel motherboard to the total number of multiple display panels is used as one of the relevant production parameters of the sample.
  • the ratio can be called the defective ratio of the sample; or, the total number of defective display panels belonging to the defective type among the multiple display panels of the display panel motherboard is used as the defective degree characterization value in the production parameters of the sample.
  • The greater the defective-degree representation value in the production parameters of the sample, the greater the represented degree of defectiveness of that defective type.
  • Alternatively, the ratio of the total number of display panels other than the defective display panels of the defective type among the multiple display panels of the display panel motherboard to the total number of the multiple display panels
  • is used as the defective-degree representation value in the production parameters of the sample; or, the total number of display panels other than the defective display panels of the defective type among the multiple display panels of the display panel motherboard is used as the defective-degree representation value in the production parameters of the sample.
  • the smaller the defective degree representation value in the production parameters of the sample is, the greater the degree of defectiveness belonging to the defective type is represented.
  • Each production line includes multiple process sites, and each process site is used to perform certain processing on the products (including semi-finished products), such as cleaning, deposition, exposure, etching, cell alignment, detection, etc.
  • Each process site usually has multiple sample production equipment (that is, process equipment) used to perform the same processing; of course, although the processing is theoretically the same, different process equipment differs in model, status, etc., so in practice the processing effects are not exactly the same.
  • The production process of each sample passes through multiple process sites, and different samples may pass through different process sites during production; samples passing through the same process site may also be processed by different sample production equipment. Therefore, in a production line, each sample production equipment participates in, and only in, the production process of part of the samples.
  • the production parameters can be other column dimension attributes in the fusion data table except the marked columns.
  • they include passing sites, equipment, equipment parameters, etc.
  • it can be all dimension attributes in the fusion data table, or it can be preliminary filtered based on the user's selection.
  • production parameters can be directly used as the first influencing factor to judge the root cause of the event.
  • the production parameters can be combined to form synthetic parameters, which can be used to determine the first impact factor.
  • the production parameters can be varied by methods such as taking log, square, etc. to determine the first influencing factor using the parameters obtained through processing.
  • the number of the first influencing factors can be more than the production parameters, or it can be the same as the production parameters, or it can be less than the number of the production parameters.
  • Figure 6 is a flow chart of yet another anomaly root cause analysis method shown in this application according to an exemplary embodiment, including the following steps:
  • Step 601 Obtain sample data to be processed corresponding to the target object.
  • Step 602 Determine the positive samples and negative samples in the sample data to be processed. Among them, both positive samples and negative samples include the first parameter.
  • Step 603 Input the positive samples and negative samples into the second root cause analysis model to obtain the second influencing factor information of the judgment result of the target object.
  • Based on the above, the second influencing factor information leading to the determination result of the target object can be determined from the to-be-processed sample data corresponding to the target object, where the determination result corresponds to the target object.
  • For example, when the target object is a device, the judgment result indicates whether the operating status of the device is normal; when the target object is a certain commodity, the judgment result indicates whether the sales volume of the commodity is normal.
  • the target object can be determined according to the actual usage scenario.
  • the data contained in the sample data to be processed corresponding to the target object can also be determined based on actual usage.
  • the process of determining the second impact factor information is similar to the process of determining the first impact factor information, and will not be described again here.
  • this application also provides embodiments of an abnormality root cause analysis device and electronic equipment to which it is applied.
  • the embodiments of the anomaly root cause analysis device of the present application can be applied to electronic equipment, such as servers or terminal equipment.
  • the embodiment of the anomaly root cause analysis device can be implemented by software, or can be implemented by hardware or a combination of software and hardware. Taking software implementation as an example, as a device in a logical sense, it is formed by reading the corresponding computer program instructions in the non-volatile memory into the memory and running it through its processor. From the hardware level, as shown in Figure 7, it is a hardware structure diagram of the electronic equipment in which the abnormality root cause analysis device of the present application is located.
  • the electronic equipment where the abnormality root cause analysis device 731 is located in the embodiment may also include other hardware based on the actual functions of the electronic equipment, which will not be described again.
  • Figure 8 is a block diagram of an abnormality root cause analysis device according to an exemplary embodiment of the present application.
  • the device includes:
  • the first data acquisition module 810 is used to acquire product data to be processed corresponding to the target product.
  • the product data to be processed is obtained by fusing the production data and detection data corresponding to the target product according to the first preset parameter.
  • the first data processing module 820 is used to determine normal product data and abnormal product data in the product data to be processed based on the detection data.
  • the first root cause determination module 830 is used to input normal product data and abnormal product data into the first root cause analysis model to obtain the first impact factor information of the detection results of the target product.
  • The first impact factor includes one or more of the production data, where the first root cause analysis model indicates a tree model.
  • the first root cause determination module is also used to:
  • the first impact factor information is determined.
  • the first root cause determination module is also used to:
  • Normal product data and abnormal product data are input into the tree model to train the tree model.
  • the first influencing factor information is determined.
  • the production data includes production parameters.
  • Training the tree model indicates adjusting the number of production parameters and the corresponding weights of the production parameters.
  • the first influencing factor information is determined based on the weight of the production parameters.
  • the first root cause analysis model indicates at least one integrated tree model, and the integrated tree model is obtained by integrating multiple tree models.
  • the first root cause determination module is also used to:
  • Normal product data and abnormal product data are input into the ensemble tree model to train the ensemble tree model.
  • the first influencing factor information is determined.
  • the first root cause analysis model indicates at least two ensemble tree models.
  • Optionally, the decision tree information of the ensemble tree model includes one or more of: the number of decision tree leaves ranging from 2 to 500, the number of decision trees ranging from 25 to 325, the maximum depth of the decision trees ranging from 1 to 20, the L1 regular term coefficient from 1.00E-10 to 1.00E-01, and the L2 regular term coefficient from 1.00E-10 to 1.00E-01.
  • Optionally, the decision tree information of the integrated tree model includes one or more of: the decision tree depth from 1 to 16, the maximum number of trees from 25 to 300, and the L2 regular term coefficient from 1 to 100.
  • The first root cause determination module is also used to perform fusion processing of the production parameters during the training process of the tree model.
  • the fusion process indicates performing feature crossover and/or mutation on the production parameters to obtain new production parameters.
  • the first data acquisition module is specifically used for:
  • extracting data through one or more of manual import, batch import, and real-time import.
  • the first data acquisition module is specifically used for:
  • the pending data corresponding to the target product is obtained.
  • Figure 9 is a block diagram of another anomaly root cause analysis device shown in this application according to an exemplary embodiment.
  • the device includes:
  • the second data acquisition module 910 is used to acquire sample data to be processed corresponding to the target object.
  • the second data processing module 920 is used to determine positive samples and negative samples in the sample data to be processed. Among them, both positive samples and negative samples include the first parameter.
  • the second root cause determination module 930 is used to input positive samples and negative samples into the second root cause analysis model to obtain the second influencing factor information of the judgment result of the target object.
  • the present application provides an anomaly root cause analysis system, including a data management server, an analysis server and a display.
  • a data management server configured to store data, and to extract, transform, or load data.
  • the data includes at least one of production data and inspection data.
  • the analysis server is configured to obtain the product data to be processed corresponding to the target product from the data management server when receiving the task request, and determine the normal product data and abnormal product data in the product data to be processed based on the detection data in the product data to be processed. And input the normal product data and abnormal product data into the first root cause analysis model to obtain the first influencing factor information of the test results of the target product.
  • The first influencing factor includes one or more of the production data, where the first root cause analysis model indicates a tree model.
  • the product data to be processed is obtained by fusing the production data and detection data corresponding to the target product according to the first preset parameters.
  • the display is configured to display the first impact factor information through a visual interface.
  • the data management server includes data lake, data warehouse, NoSQL database and ETL modules.
  • ETL modules are configured to extract, transform or load data.
  • the data lake is configured to store a first set of data formed by extracting raw data from at least one data source by the ETL module, the first set of data having the same content as the raw data.
  • the data warehouse is configured to store a second set of data formed by cleaning and standardizing the first set of data by the ETL module.
  • the NoSQL database is configured to store the third set of data, which is formed by transforming the second set of data by the ETL module.
  • the data table of the third group of data includes multiple sub-data tables with index relationships formed by splitting the data table of the second group of data.
  • the multiple sub-data tables include a first sub-table, a second sub-table and a third sub-table.
  • the first sub-table includes data filtering options presented by the visual interface.
  • the second subtable includes the product serial number.
  • the third subtable includes data corresponding to the product serial number.
  • the plurality of sub-data tables also include a fourth sub-table, the fourth sub-table includes manufacturing site information and/or equipment information, and the third sub-table includes codes or abbreviations of the manufacturing site and/or equipment.
  • The present application also provides a computer-readable storage medium in which computer-executable instructions are stored;
  • when a processor executes the computer-executable instructions, the above-mentioned abnormal root cause analysis method is implemented.
  • The present application also provides a computer program product, including a computer program;
  • when the computer program is executed by a processor, it implements the abnormal root cause analysis method described above.
  • Since the device embodiments basically correspond to the method embodiments, for relevant details please refer to the description of the method embodiments.
  • the device embodiments described above are only illustrative.
  • The modules described as separate components may or may not be physically separated, and
  • the components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed across multiple network modules. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this application. Persons of ordinary skill in the art can understand and implement it without creative effort.

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Factory Administration (AREA)

Abstract

An anomaly root cause analysis method and device. The method includes: obtaining product data to be processed corresponding to a target product, where the product data to be processed is obtained by fusing, according to a first preset parameter, the production data and detection data corresponding to the target product; determining, according to the detection data, normal product data and abnormal product data in the product data to be processed; and inputting the normal product data and the abnormal product data into a first root cause analysis model to obtain first influencing factor information of the detection result of the target product, the first influencing factor including one or more of the production data, where the first root cause analysis model indicates a tree model. This ensures the accuracy of determining the influencing factors of the product and improves the determination efficiency.

Description

Anomaly root cause analysis method and device

Technical Field

This application relates to the field of computer technology, and in particular to anomaly root cause analysis methods and devices.
Background

With the development of technology, the manufacturing industry has grown rapidly and the number of products produced by enterprises keeps increasing. However, the products an enterprise produces often turn out to be defective, damaged, or unusable during production or use, that is, defective products. To ensure the enterprise's revenue and product quality, the causes of product anomalies need to be found so that improvements can be made.

At present, when searching for the cause of a product anomaly, an engineer generally analyzes the product's relevant data (for example, production data) to determine the cause. However, since this approach relies on the engineer's experience, it is less effective and slow.
Summary

To overcome the problems existing in the related art, this application provides anomaly root cause analysis methods and devices.
According to a first aspect of this application, an anomaly root cause analysis method is provided, the method including:

obtaining product data to be processed corresponding to a target product, where the product data to be processed is obtained by fusing, according to a first preset parameter, the production data and detection data corresponding to the target product;

determining, according to the detection data, normal product data and abnormal product data in the product data to be processed; and

inputting the normal product data and the abnormal product data into a first root cause analysis model to obtain first influencing factor information of the detection result of the target product, the first influencing factor including one or more of the production data, where the first root cause analysis model indicates a tree model.
According to a second aspect of this application, an anomaly root cause analysis method is provided, including:

obtaining sample data to be processed corresponding to a target object;

determining positive samples and negative samples in the sample data to be processed, where both the positive samples and the negative samples include a first parameter; and

inputting the positive samples and the negative samples into a second root cause analysis model to obtain second influencing factor information of the judgment result of the target object.
According to a third aspect of this application, an anomaly root cause analysis system is provided, including a data management server, an analysis server, and a display;

the data management server is configured to store data and to extract, transform, or load data, the data including at least one of production data and detection data;

the analysis server is configured to, upon receiving a task request, obtain from the data management server the product data to be processed corresponding to a target product, determine normal product data and abnormal product data in the product data to be processed according to the detection data in the product data to be processed, and input the normal product data and the abnormal product data into a first root cause analysis model to obtain first influencing factor information of the detection result of the target product, the first influencing factor including one or more of the production data, where the first root cause analysis model indicates a tree model, and the product data to be processed is obtained by fusing, according to a first preset parameter, the production data and detection data corresponding to the target product; and

the display is configured to display the first influencing factor information through a visual interface.
According to a fourth aspect of this application, an anomaly root cause analysis device is provided, including:

a first data acquisition module, configured to obtain product data to be processed corresponding to a target product, where the product data to be processed is obtained by fusing, according to a first preset parameter, the production data and detection data corresponding to the target product;

a first data processing module, configured to determine, according to the detection data, normal product data and abnormal product data in the product data to be processed; and

a first root cause determination module, configured to input the normal product data and the abnormal product data into a first root cause analysis model to obtain first influencing factor information of the detection result of the target product, the first influencing factor including one or more of the production data, where the first root cause analysis model indicates a tree model.
According to a fifth aspect of this application, an anomaly root cause analysis device is provided, including:

a second data acquisition module, configured to obtain sample data to be processed corresponding to a target object;

a second data processing module, configured to determine positive samples and negative samples in the sample data to be processed, where both the positive samples and the negative samples include a first parameter; and

a second root cause determination module, configured to input the positive samples and the negative samples into a second root cause analysis model to obtain second influencing factor information of the judgment result of the target object.
According to a sixth aspect of this application, an electronic device is provided, including:

a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the program, implements the anomaly root cause analysis method described in the first aspect and its various possible designs.
According to a seventh aspect of this application, an electronic device is provided, including:

a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the program, implements the anomaly root cause analysis method described in the second aspect and its various possible designs.
According to an eighth aspect of this application, a computer-readable storage medium is provided, the computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the anomaly root cause analysis method described in the first aspect and its various possible designs.

According to a ninth aspect of this application, a computer-readable storage medium is provided, the computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the anomaly root cause analysis method described in the second aspect and its various possible designs.
According to a tenth aspect of this application, a computer program product is provided, including a computer program that, when executed by a processor, implements the anomaly root cause analysis method described in the first aspect and its various possible designs.

According to an eleventh aspect of this application, a computer program product is provided, including a computer program that, when executed by a processor, implements the anomaly root cause analysis method described in the second aspect and its various possible designs.
The technical solutions provided by the embodiments of this application may include the following beneficial effects:

In this application, the product data to be processed corresponding to target products is classified based on the detection data corresponding to each target product, yielding normal product data and abnormal product data, where the normal product data indicates the product data to be processed of target products whose detection results are normal, the abnormal product data indicates the product data to be processed of target products whose detection results are abnormal, and both include production parameters. The normal product data and the abnormal product data are analyzed using the first root cause analysis model, so that the production data is used to determine the influencing factor information affecting the detection result of the target product, that is, the first influencing factor information. Thus, when the detection result is abnormal, the cause of the product anomaly can be determined: product data is analyzed automatically, and the cause of the product anomaly, that is, the root cause, is determined automatically without relying on manual determination, which ensures the accuracy of determining the product's influencing factors and improves the determination efficiency.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and do not limit this application.
Brief Description of the Drawings

The accompanying drawings herein are incorporated into and constitute a part of this specification; they illustrate embodiments consistent with this application and, together with the specification, serve to explain the principles of this application.
Figure 1 is a schematic diagram of a distributed computing environment according to an exemplary embodiment of this application.

Figure 2 is a schematic diagram of software modules in an anomaly root cause analysis system according to an exemplary embodiment of this application.

Figure 3 is a schematic diagram of a data management server according to an exemplary embodiment of this application.

Figure 4 is a flowchart of an anomaly root cause analysis method according to an exemplary embodiment of this application.

Figure 5 is a flowchart of another anomaly root cause analysis method according to an exemplary embodiment of this application.

Figure 6 is a flowchart of yet another anomaly root cause analysis method according to an exemplary embodiment of this application.

Figure 7 is a hardware structure diagram of the electronic device where the anomaly root cause analysis device according to an exemplary embodiment of this application is located.

Figure 8 is a block diagram of an anomaly root cause analysis device according to an exemplary embodiment of this application.

Figure 9 is a block diagram of another anomaly root cause analysis device according to an exemplary embodiment of this application.
Detailed Description

Exemplary embodiments are described in detail here, and examples thereof are shown in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with this application; rather, they are merely examples of devices and methods consistent with some aspects of this application as detailed in the appended claims.
The terms used in this application are for the purpose of describing particular embodiments only and are not intended to limit this application. The singular forms "a", "the", and "said" used in this application and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any or all possible combinations of one or more associated listed items.

It should be understood that although the terms first, second, third, etc. may be used in this application to describe various information, the information should not be limited to these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of this application, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "while", or "in response to determining".
During production or use, a product may exhibit abnormal conditions such as defects, damage, or becoming unusable. Relevant personnel need to analyze the product's relevant data (for example, production data) to determine the cause of the product anomaly. However, since this approach relies on an engineer's experience, it is less effective and slow.

Taking semiconductor or display panel products as an example, various defects may appear during manufacturing. Examples of defects include particles, residues, line defects, holes, splashes, wrinkles, discoloration, and bubbles. Defects occurring in the manufacture of semiconductor electronic devices are difficult to track.
Although this application explains its solutions in the specific scenario of industrial production (especially display panel production), the application is not limited thereto. In fact, with appropriate adjustment and modification, the embodiments of this application can also be applied to other sample anomaly detection situations, and can further be generalized to general-purpose data analysis or machine learning platforms.
In one aspect, this application provides an anomaly root cause analysis system. In some embodiments, the anomaly root cause analysis system includes a distributed computing system, which includes one or more networked computers configured to execute in parallel to perform at least one common task, and one or more computer-readable storage media storing instructions that cause the distributed computing system to perform the following operations. In some embodiments, the distributed computing system includes: a data management server configured to store data and to extract, transform, or load data, where the data includes at least one of production data and detection data; an analysis server configured to obtain data from the data management server upon receiving a task request and to perform algorithmic analysis on the data to obtain the anomaly root cause (that is, the influencing factor information, namely the first influencing factor information and/or the second influencing factor information); and a display configured to provide a visual interface to display the anomaly root cause analysis results. Optionally, the anomaly root cause analysis system is used for defect analysis in display panel manufacturing.
As used herein, the term "distributed computing system" generally refers to an interconnected computer network having multiple network nodes that connect multiple servers or hosts to one another or to an external network (for example, the Internet). The term "network node" generally refers to a physical network device. Example network nodes include routers, switches, hubs, bridges, load balancers, security gateways, or firewalls. A "host" generally refers to a physical computing device configured to implement, for example, one or more virtual machines or other suitable virtualized components. For example, a host may include a server with a hypervisor configured to support one or more virtual machines or other suitable types of virtual components.
Figure 1 shows a distributed computing environment according to some embodiments of this application. Referring to Figure 1, in a distributed computing environment, multiple autonomous computers/workstations, called nodes, communicate with one another in a network such as a LAN (local area network) to solve tasks, for example executing applications. Each computer node typically includes its own processor(s), memory, and communication links to other nodes. The computers may be located within a particular location (for example, a cluster network) or may be connected through a wide area network such as the Internet. In such a distributed computing environment, different applications can share information and resources.
The network in the distributed computing environment may include local area networks (LANs) and wide area networks (WANs). The network may include wired technologies (for example, Ethernet) and wireless technologies (for example, code division multiple access (CDMA), the Global System for Mobile communications (GSM), the Universal Mobile Telecommunications Service (UMTS), Bluetooth, etc.).
Multiple computing nodes are configured to join a resource group to provide distributed services. A computing node in the distributed network may include any computing device, such as a computing device or a user device. A computing node may also include a data center. As used herein, a computing node may refer to any computing device or multiple computing devices (that is, a data center). A software module may execute on a single computing node (for example, a server) or be distributed across multiple nodes in any suitable manner.
The distributed computing environment may also include one or more storage nodes for storing information related to the execution of the software modules and/or the output generated by their execution and/or other functions. The one or more storage nodes communicate with one another in the network, and with one or more computing nodes in the network.
Figure 2 shows the architecture of an anomaly root cause analysis system according to some embodiments of this application. Referring to Figure 2, the anomaly root cause analysis system includes a distributed computing system, which includes one or more networked computers configured to execute in parallel to perform at least one common task, and one or more computer-readable storage media storing instructions that, when executed by the distributed computing system, cause it to perform the corresponding operational steps. In some embodiments, the distributed computing system includes a data management server configured to store data and to extract, transform, or load data; an analysis server connected to the data management server and configured to obtain data from the data management server upon receiving a task request and to perform analysis tasks; and a display configured to display the analysis task results through a visual interface. The analysis server includes multiple business servers (similar to back-end servers) and multiple algorithm servers, and the multiple algorithm servers are configured to obtain data directly from the data management server. In some embodiments, the distributed computer system further includes a query engine connected to the data management server and configured to obtain data directly from the data management server. Optionally, the query engine is a query engine based on Impala technology. As used herein, in the context of this application, the term "connected to" refers to a relationship with direct information or data flow from a first component of the system to a second component and/or from the second component to the first component.
In some embodiments, the above analysis server may, upon receiving a task request, obtain from the above data management server the product data to be processed corresponding to the target product, determine normal product data and abnormal product data in the product data to be processed according to the detection data in the product data to be processed, and input the normal product data and the abnormal product data into the first root cause analysis model to obtain the first influencing factor information of the detection result of the target product, the first influencing factor including one or more of the production data, where the first root cause analysis model indicates a tree model and the product data to be processed is obtained by fusing, according to the first preset parameter, the production data and detection data corresponding to the target product. Correspondingly, the above display may display the first influencing factor information through a visual interface.
In some embodiments, the data management server includes an ETL module configured to extract, transform, or load data from at least one data source into the database of the data management server. Upon receiving an assigned task, at least one algorithm server is configured to obtain the data to be analyzed directly from the data management server. When performing anomaly analysis, at least one algorithm server is configured to perform computational analysis on the data to be analyzed and send the result data to the data management server. The at least one algorithm server deploys various general-purpose algorithms for anomaly root cause analysis, for example algorithms based on big data analysis, which may be algorithms based on specific machine learning models, such as one or more of decision trees, random forest, GBDT, LGBM, XGBoost, CatBoost, naive Bayes, support vector machines, AdaBoost, neural network models, etc., or other statistical algorithm models, such as WOE&IV, Apriori, etc.; it also includes the anomaly root cause analysis algorithms mentioned below, without limitation here. The at least one algorithm server is configured to analyze data to identify the cause of an anomaly. In another embodiment, the algorithm server is also configured to infer or predict, based on production data, whether an anomaly will occur. As used herein, the term "ETL module" refers to computer program logic configured to provide functions such as extracting, transforming, or loading data. In some embodiments, the ETL module is stored on a storage node, loaded into memory, and executed by a processor. In some embodiments, the ETL module is stored on one or more storage nodes in the distributed network, loaded into one or more memories in the distributed network, and executed by one or more processors in the distributed network.
The data management server stores the data used by the anomaly root cause analysis system. For example, the data management server stores the data required by the algorithm servers for algorithmic analysis. In another example, the data management server stores the results of algorithmic analysis. For algorithmic analysis and interactive display to users, data from multiple data sources is cleaned and merged by the ETL module into data to be analyzed. Examples of data used for anomaly root cause analysis include product resume data, process parameter data, detected anomaly location data, etc. The amount of data in a manufacturing process (for example, the manufacturing process of display panels) is enormous; for example, there may be more than hundreds of gigabytes of data per day at a factory site. To meet users' needs for defect analysis, the speed at which the algorithm servers read production data needs to be improved. In one example, the data required for algorithmic analysis is stored in a database based on Apache Hbase technology to improve efficiency and save storage space. In another example, the results of algorithmic analysis and other auxiliary data are stored in a data warehouse based on Apache Hive technology. In another example, data may be stored in a database based on Apache Beam technology (also called the Apache Beam model). It can be understood that the data management server may include one or more of an Hbase-based database, a Hive-based data warehouse, and an Apache Beam-based database.
Apache Hive is an open-source data warehouse system built on top of Hadoop for querying and analyzing big data in structured and semi-structured form stored in Hadoop files. Apache Hive is mainly used for batch processing and is therefore referred to as OLAP.
Apache Hbase is a non-relational, column-oriented distributed database that runs on top of the Hadoop Distributed File System (HDFS). In addition, it is a NoSQL open-source database that stores data by columns. Apache Hbase is mainly used for transaction processing, referred to as OLTP. With Apache Hbase, real-time processing is possible. Apache Hbase is a NoSQL database.
Apache Beam is an open-source unified model for defining batch and streaming data-parallel processing pipelines. Using its open-source Beam SDKs, a program that defines a pipeline can be built.
In one example, the various components of the data management platform (for example, the data lake and the data warehouse) may take the form of distributed data stores based on, for example, Apache Hadoop or Apache Hive, or of distributed data stores based on Apache Beam.
Figure 3 shows a data management server according to some embodiments of this application. Referring to Figure 3, in some embodiments, the data management server includes a distributed file system (DFS), for example the Hadoop Distributed File System (HDFS). The data management server is configured to store data collected from at least one data source. A data source may be a database in a factory production system or another data source, without limitation here. Usually, the data produced during factory production is stored in relational databases (such as Oracle, MySQL, etc.), but applications based on relational database management system (RDBMS) grid computing have limited hardware scalability. When the data volume reaches a certain order of magnitude, the input/output bottleneck of hard disks makes processing large amounts of data very inefficient. The parallel processing of a distributed file system can meet the challenges posed by the growing demands of data storage and computation. In the anomaly root cause analysis process, the data in the data sources is first extracted into the data management server, which greatly speeds up processing. The data management server includes a data lake, a data warehouse, and a NoSQL database. In some embodiments, the data management platform includes multiple groups of data with different contents and/or storage structures; in this application, each group of data is defined as a "data layer", and the data lake, data warehouse, and NoSQL database are different data layers in the data management server.
The data lake is configured to store a first group of data formed by the ETL module extracting raw data from at least one data source, the first group of data having the same content as the raw data. In some embodiments, the ETL module first extracts the raw data from at least one data source into the data management server to form the first data layer (for example, the data lake). The data lake is a centralized HDFS or KUDU database configured to store data of any structure, or unstructured data. The data lake DL is configured to store the first group of data extracted by the ETL module from at least one data source. The first group of data has the same content as the raw data. Optionally, the dimensions and attributes of the raw data are preserved in the first group of data. In some embodiments, the first group of data stored in the data lake includes dynamically updated data. Optionally, the dynamically updated data includes data updated in real time in a Kudu-based database, or data updated periodically in the Hadoop Distributed File System. In one example, the periodically updated data stored in the Hadoop Distributed File System is periodically updated data stored in Apache Hive-based storage. In one example, the dynamically updated data includes both real-time updated data and periodically updated data. In one example, a real-time update means an update at sub-minute granularity, excluding minutes, while a periodic update means an update at minute granularity or above, including minutes. It can be understood that the movement of data from the data sources to the first data layer is a backup of the data content between two data management systems.
The data warehouse is configured to store a second group of data formed by the ETL module cleaning and standardizing the first group of data. In some embodiments, the data management server includes a second data layer, for example the data warehouse DW. The data warehouse DW includes an internal storage system configured to provide data in an abstract manner, for example in table format (Table) or view format (View), without exposing the file system. The data warehouse DW may be based on Apache Hive. The ETL module ETLP is configured to extract, clean, transform, or load the first group of data to form the second group of data. Optionally, the second group of data is formed by cleaning and standardizing the first group of data. The data in the data warehouse layer can be understood as the data of the data lake layer after preprocessing. Preprocessing includes data cleaning, such as removing nulls, removing duplicates, and removing useless fields; specifically, the server identifies missing values ("NA", "/", "null", "unknown") and converts them into a unified missing-value form. Preprocessing also includes data standardization; for example, the server detects different time field formats and converts them into a unified standard format.
In some embodiments, preprocessing also includes data summarization and fusion; that is, the second group of data also includes the results of summarizing and fusing the first group of data. Data summarization refers to statistics over identical fields or records in a data table, such as count aggregation, percentage calculation, etc. For example, in display panel manufacturing, the defect rate of a glass substrate can be calculated as the number of defective panels contained in the substrate divided by the total number of panels. Fusion refers to the fusion of data tables. For anomaly root cause analysis, the anomaly content data and the root cause result data are often generated in two separate data tables; through table fusion, based on identical index fields in the two tables, the anomaly content data table and the root cause result data table can be fused into one table. For manufacturing, at the data lake layer, the production data table and the detection data table can be fused according to the same ID to form a complete data table for subsequent analysis. Furthermore, different data can be integrated and split based on different analysis topics to improve the efficiency of subsequent data processing. It can be understood that the movement of data from the first data layer to the second data layer further processes the backed-up data to facilitate management, presentation, and other processing operations in the data management server. It should be noted that the preprocessing in this application can be performed in the data management server, or the preprocessing operations on the data (such as cleaning and fusion) can be completed during data analysis and computation (executed by the analysis server); the timing of the preprocessing is not limited.
The NoSQL database is configured to store a third group of data formed by the ETL module transforming the second group of data. In some embodiments, the third group of data is key-value data. In some embodiments, the data management platform includes a third data layer (for example, a NoSQL database). In some embodiments, the third data layer is a NoSQL-type database that stores data available for computational processing, for example HBase or ClickHouse. The ETL module is configured to transform the second group of data of the second data layer to form the third group of data. It can be understood that moving data from the second data layer to the third data layer is a change of data storage structure, forming a NoSQL-type database structure, for example a columnar database structure such as HBase. Compared with Hive, a NoSQL database can interact and respond with the front-end interface faster in computation and query, and can better handle users' needs for real-time data query and computation. Therefore, in some embodiments, the data obtained by the analysis server (for example, the algorithm server) is data in the NoSQL database.

Regarding the process by which the second group of data is transformed into the third group of data: in one example, a first table is generated in the third data layer, and a second table (for example, an external table) is generated in the second data layer. The first table and the second table are configured to be synchronized, so that when data is written into the second table, the first table is simultaneously updated to include the corresponding data. In another example, a distributed computing processing module may be used to read the data written into the second data layer. The MapReduce module in Hadoop may be used as the distributed computing processing module for reading the data written into the second data layer. The data written into the second data layer can then be written into the third data layer. In one example, data can be written into the Hbase database based on the HBase API. In another example, once the MapReduce module reads the data written into the second data layer, it can generate HFile files that are bulk-loaded (Bulkloaded) into the third data layer.

Those skilled in the art can understand that the first group of data, the second group of data, and the third group of data can be stored and queried in the form of one or more data tables.
Optionally, the data table of the third group of data may be the same table as that of the second group of data, or the data table of the second group of data may be split into multiple sub-tables. The multiple sub-tables may be multiple sub-data tables with index relationships. In one example of this application, the data table of the third group of data includes multiple sub-data tables with index relationships formed by splitting the data table of the second group of data. The sub-data tables can be split based on the filtering criteria of the user interaction interface and the key and/or value information of the third group of data. Thus, a first index among the multiple index relationships corresponds to the filtering criteria of the front-end interface, for example, to the user-defined analysis scope or criteria in the user interaction interface communicating with the data management server, thereby facilitating a faster data query and computation process. In some embodiments, the multiple sub-data tables include a first sub-table, a second sub-table, and a third sub-table; the first sub-table includes the data filtering options presented by the visual interface; the second sub-table includes product serial numbers; and the third sub-table includes the data corresponding to the product serial numbers.
In some embodiments, the multiple sub-data tables also include a fourth sub-table, where the fourth sub-table includes manufacturing site information and/or equipment information, and the third sub-table includes codes or abbreviations of the manufacturing sites and/or equipment.
In some embodiments, the multiple sub-tables have index relationships between at least two of them. Optionally, the data in the multiple sub-tables is split based on the filtering criteria and the key and/or value information of the third group of data. In some embodiments, the multiple sub-tables include a first sub-table (for example, an attribute sub-table), which includes the data filtering options presented by the visual interface of the user interaction interface communicating with the data management server (such as production time, production equipment, production process, etc.); a second sub-table, which includes product serial numbers (for example, substrate identification numbers or lot identification numbers); and a third sub-table (for example, a main sub-table), which includes the values in the third group of data corresponding to the product serial numbers. The environmental factors described herein include environmental particle conditions, equipment temperature, equipment pressure, etc. Optionally, based on different topics, the second sub-table may include different designated keys, such as substrate identification numbers or lot identification numbers (for example, multiple second sub-tables). Optionally, the values in the third group of data correspond to the substrate identification numbers through the index relationship between the third sub-table and the second sub-table. Optionally, the multiple sub-tables also include a fifth sub-table (for example, a metadata sub-table), which includes the values in the third group of data corresponding to the lot identification numbers. Optionally, the second sub-table also includes lot identification numbers; the values in the third group of data corresponding to the lot identification numbers can be obtained through the index relationship between the second sub-table and the fifth sub-table. Optionally, the multiple sub-tables also include a fourth sub-table (for example, a code generator sub-table), which includes manufacturing site information and/or equipment information. Optionally, the third sub-table includes codes or abbreviations of the manufacturing sites and/or equipment; through the index relationship between the third sub-table and the fourth sub-table, the manufacturing site information and/or equipment information can be obtained from the fourth sub-table. The third sub-table stores only the manufacturing site and/or equipment codes, which can reduce the amount of data stored.
对于列式数据库(如HBase)，在一些实施例中，可以基于指定的键对第三数据层中的数据执行查询，以快速定位要查询的数据(例如，值value)。因此，如下面更具体地讨论的，可以将存储在第三数据层中的表拆分为至少三个子表。第一子表对应于用户交互界面中用于用户筛选或定义的数据范围选项(如生产时间、生产设备、生产工程等)。第二子表对应于指定的键(例如产品ID)。第三子表对应于值(例如，产品ID对应的生产数据和检测数据)。可以理解的是，通过第一子表可以确定用户需要分析的产品范围，从而基于第二子表中对应产品的序列号(键key)去查询第三子表中对应的数据(值value)。在一个示例中，第三数据层利用基于HBase的NoSQL数据库；第二子表中的指定的键可以是行键(row key)；并且第三子表中的融合数据(行键对应的列族数据)可以存储在列族数据模型中。可选的，第三子表中的值可以是生产数据和检测数据融合后的数据。此外，第三数据层还可以包括第四子表。第三子表中的某些字符可以例如由于其长度或其它原因而以代码形式存储。第四子表包括对应于存储在第三子表中的这些代码的字符(例如，设备名称、制造站点)。第一子表、第二子表和第三子表之间的索引或查询可基于所述代码。第四子表可用于在结果被呈现给用户界面之前用字符替换代码。
在一些实施例中，本文描述了数据管理服务器的各种组件之间的数据流、数据转换和数据结构。在一些实施例中，由数据源收集的原始数据包括生产数据和检测数据。其中，生产数据包括履历数据和参数数据。履历数据包含产品(例如面板或基板)在制造期间经过的特定处理的信息。产品在制造期间经过的特定处理的示例包括工厂、工序、站点、设备、腔室、卡槽和操作者。参数数据包含产品(例如面板或基板)在制造期间经受的特定环境参数及其变化的信息。产品在制造期间经受的特定环境参数及其变化的示例包括环境颗粒条件、设备温度和设备压力等。检测数据包括缺陷信息，缺陷信息包含基于检查的产品质量的信息。示例产品质量信息包括缺陷类型、缺陷位置和缺陷尺寸等。
在一些实施例中，工厂产生的各种业务数据(例如，与半导体电子器件制造相关的数据)均集成到多个数据源(例如，Oracle数据库)中。ETL模块ETLP例如使用数栈、Sqoop、Kettle、Pentaho或DataX等工具，将来自多个数据源的数据抽取到数据湖中。然后，数据被清洗、转换并加载到数据仓库和NoSQL数据库中。数据湖、数据仓库、NoSQL数据库利用诸如Kudu、Hive和HBase的工具存储大量数据和分析结果。
在制造过程的各个阶段中生成的信息由各种传感器和检查设备获得，并且随后被保存在多个数据源中，由根因分析系统抽取并存储至数据管理服务器中；根因分析系统生成的计算和分析结果也被保存在数据管理服务器中。通过ETL模块实现数据管理服务器的各个数据层(表)之间的数据同步(数据的流动)。例如，ETL模块被配置为获得同步过程的参数配置模板，包括网络许可和数据库端口配置、流入数据库名称和表名称、流出数据库名称和表名称、字段对应关系、任务类型、调度周期等。ETL模块基于参数配置模板配置同步过程的参数。ETL模块同步数据，并基于参数配置模板清洗同步的数据。ETL模块通过SQL语句来清洗数据，以移除空值、移除离群值，并建立相关表之间的相关性。数据同步任务包括多个数据源和数据管理服务器之间的数据同步，以及数据管理服务器的各个层之间的数据同步。
在另一示例中，可以实时地或离线地完成到数据湖的数据抽取。在离线模式中(可对应下文的批量导入)，周期性地调度数据抽取任务。可选地，在离线模式中，所抽取的数据可以存储在基于Hadoop分布式文件系统的存储装置(例如，基于Hive的数据库)中。在实时模式中(可对应下文的实时导入)，数据抽取任务可以由OGG(Oracle GoldenGate)结合Apache Kafka来执行。可选地，在实时模式中，所抽取的数据可以存储在基于Kudu的数据库中。OGG读取多个数据源(例如，Oracle数据库)中的日志文件，以获得添加/删除数据。在另一示例中，主题信息由Flink读取，JSON被选择为同步字段类型。利用JAR包对数据进行解析，并将解析后的信息发送到Kudu API，实现Kudu表数据的添加/删除。在一个示例中，前端接口可基于存储在基于Kudu的数据库中的数据来执行显示、查询和/或分析。在另一示例中，前端接口可基于存储在基于Kudu的数据库、Hadoop分布式文件系统(例如，基于Apache Hive的数据库)和/或基于Apache HBase的数据库中的任何一个或任何组合中的数据来执行显示、查询和/或分析。在另一示例中，(例如，在几个月内生成的)短期数据被存储在基于Kudu的数据库中，而长期数据(例如，在所有周期中生成的全部数据)被存储在Hadoop分布式文件系统(例如，基于Apache Hive的数据库)中。在另一示例中，ETL模块被配置为将存储在基于Kudu的数据库中的数据抽取到Hadoop分布式文件系统(例如，基于Apache Hive的数据库)中。
在一些实施例中，对于第二数据层中的第二组数据，可以基于不同的主题执行数据融合。融合后的数据主题化程度高，聚合程度高，从而大大提高了查询速度。在一个示例中，可以使用数据仓库中的表来构建具有根据不同用户需要或不同主题而构造的相关性的表，根据表各自的用途来为表分配名称。各种主题可以对应于不同的数据分析需求。在一个示例中，主题可以对应于对归因于一个或多个制造节点组(例如，一个或多个设备)的异常分析，并且基于所述主题的数据融合可以包括关于制造过程的履历信息和缺陷信息的数据融合。在另一个示例中，主题可以对应于对归因于一个或多个参数类型的异常分析，并且基于所述主题的数据融合可以包括关于参数特征信息和缺陷信息的数据融合。在另一示例中，主题可以对应于对归因于一个或多个设备操作(例如，由相应设备执行相应操作的相应操作站点定义的设备)的异常分析，并且基于所述主题的数据融合可以包括对参数特征信息、制造过程的履历信息和缺陷信息中的至少两类信息进行数据融合。在另一示例中，主题可以对应于对至少一种类型的参数信息的特征抽取以生成参数特征信息，其中，针对一种类型的参数信息抽取最大值、最小值、平均值和中值中的一个或多个。在本申请的一个示例中，至少一种类型的参数信息包括至少一种设备参数，如温度、湿度、压力等的数据，还包括环境颗粒度等数据。
在一些实施例中，数据管理服务器可以是基于Apache Beam的数据库，以实现数据的批、流并行处理。可选地，通过Apache Beam接收数据源实时产生的数据，并通过对接的分析服务器，利用预测算法对产品的质量进行实时预测或推理，并且实现用户在交互界面上的实时查询。可选地，利用Apache Beam对接数据源，批量抽取预设生产周期内的数据，存入Hive、HBase或ClickHouse等数据库中进行数据沉淀，供分析服务器通过分析算法进行异常根因分析，从而精准定位异常产生原因，进行及时追溯。
在一些实施例中，基于Apache Beam的数据库的实现方式包括：首先，分布式计算系统接收Beam SDK类库组件；其次，构建数据管道(pipeline)，定义键值对(key-value)的数据类型，可选地，键为样本(产品)ID，值为对应的生产数据和检测数据；再次，在管道内定义数据处理方法，可选地，对样本(产品)总数、异常数、异常率、到达率进行计算，还可以将前文中ETL模块处理数据的相关方法在管道内定义；最后，定义管道末尾数据流方向，可选地，数据可流向分析服务器(如业务服务器、算法服务器)。此外，用户可以通过可视化界面中的拖拽组件，对数据源合并、数据转换、数据计算算子进行编辑和流程配置。
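结合上述管道构建过程，下面给出一个基于Apache Beam Python SDK的最小草图(示意性质，并非本申请方案的限定实现)，其中的键值对样本数据与字段名均为假设，用于演示对样本总数、异常数、异常率的计算：

```python
import apache_beam as beam

# 假设的键值对输入：键为样本(产品)ID，值为对应的生产数据和检测数据
records = [
    ("P001", {"temperature": 25.3, "defect": 0}),
    ("P002", {"temperature": 27.1, "defect": 1}),
    ("P003", {"temperature": 24.8, "defect": 0}),
]

def summarize(defects):
    # 在管道内定义的数据处理方法：计算样本总数、异常数、异常率
    total = len(defects)
    abnormal = sum(defects)
    return {"total": total, "abnormal": abnormal,
            "abnormal_rate": abnormal / total if total else 0.0}

with beam.Pipeline() as p:
    _ = (
        p
        | "Create" >> beam.Create(records)                        # 构建数据管道输入
        | "ExtractLabel" >> beam.Map(lambda kv: kv[1]["defect"])  # 取出检测结果
        | "Gather" >> beam.combiners.ToList()                     # 汇总为单个列表
        | "Summarize" >> beam.Map(summarize)                      # 计算汇总统计量
        | "Output" >> beam.Map(print)                             # 管道末尾的数据流向(此处打印)
    )
```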
在一些实施例中,软件模块还包括连接到分析服务器的负载均衡服务器。可选地,负载均衡服务器(例如,第一负载均衡服务器)被配置为接收任务请求并且被配置为将任务请求分配给多个业务服务器中的一个或多个,以实现多个业务服务器之间的负载均衡。可选地,负载均衡服务器(例如,第二负载均衡服务器)被配置为将任务从多个业务服务器分配到多个算法服务器中的一个或多个,以实现多个算法服务器之间的负载均衡。可选地,负载均衡服务器是基于Nginx技术的负载均衡服务器。下面介绍异常根因分析方法,可以理解的是,该方法中的全部或者部分步骤可以基于分布式计算***、或者分析服务器或者算法服务器实现。
如图4所示,图4是本申请根据一示例性实施例示出的一种异常根因分析方法的流程图,包括以下步骤:
步骤401、获取目标产品对应的待处理产品数据。其中,待处理产品数据是根据第一预设参数对目标产品对应的生产数据和检测数据进行融合得到的。
在本实施例中,当需要确定影响产品的检测结果的原因时,将该产品作为目标产品,并获取目标产品的生产数据和检测数据。
可选的,生产数据表示与生产目标产品相关的数据,例如,目标产品所经过的加工设备、目标产品的生产温度等。生产数据包括生产参数(即生产参数名称)和生产参数对应的参数值,例如,当生产数据包括生产温度及其对应的具体值,则生产温度为生产参数,生产温度对应的具体值为生产温度对应的参数值。
可选的,生产数据表示产品在生产加工过程中的履历信息和加工信息。履历参数包括生产过程中的产品ID、产品基本属性、产品经历的工艺段、工艺站点、设备型号等信息。加工参数包括产品在不同工艺段和/或设备型号对应的设备中的加工信息,例如,压力、温度、保压时间等。
可选的，检测数据表示与检测目标产品相关的数据，例如，检测数据指示经过生产加工后的目标产品的检测结果，该检测结果指示目标产品(即经过生产加工后的目标产品)是否异常。检测数据包括检测参数(即检测参数名称)和检测参数对应的参数值，例如，当检测数据包括目标产品的检测结果及其对应的具体值(例如目标产品是否异常或异常程度)，则检测结果为检测参数，检测结果对应的具体值(例如目标产品是否异常或异常程度)，即目标产品检测结果对应的参数值。
可选的,当工艺段结束时,会对产品进行光学或电学检测,以检测产品质量是否达标,从而得到相应的检测结果,该检测结果的名称为检测参数,检测结果的具体值为检测参数对应的参数值。通过检测数据可以识别出产品是否产生不良,即是否存在异常,以及存在哪些不良。
具体的,利用生产数据和检测数据中均存在的第一预设参数,对目标产品的生产数据和检测数据进行融合,以得到目标产品的产品数据。目标产品的数量为至少一个,例如,第一预设参数为产品标识,每条生产数据均包括产品标识及其对应的参数值。每条检测数据均包括产品标识及其对应的参数值。对于每条生产数据(即一个目标产品对应的生产数据),获取该生产数据中的产品标识对应的参数值,并将参数值为该产品标识对应的参数值的检测数据作为待融合的检测数据,将该生产数据与该待融合的检测数据进行融合,即进行合并,得到该生产数据对应的目标产品的产品数据。例如,一个目标产品对应的生产数据包括参数1及其对应的参数值和参数2及其对应的参数值;参数1为第一预设参数,则在确定待融合的检测数据后,该检测数据包括参数1及其对应的参数值和参数3及其对应的参数值,融合得到产品数据为参数1及其对应的参数值、参数2及其对应的参数值和参数3及其对应的参数值。可以理解的是,通常生产数据和检测数据由两张表获取,通过第一预设参数将这两张表融合成一张表,可以方便后续数据处理和计算。
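作为示意，下面给出一个基于pandas、以产品标识作为第一预设参数进行两表融合的最小草图(其中的表结构、字段名与阈值均为假设)，并顺带演示了下文步骤402中按检测数据划分正常/异常产品数据的思路：

```python
import pandas as pd

# 生产数据表：产品标识(第一预设参数)及生产参数，字段名为假设
production = pd.DataFrame({
    "product_id": ["P001", "P002", "P003"],
    "equipment": ["E01", "E02", "E01"],
    "temperature": [25.3, 27.1, 24.8],
})

# 检测数据表：产品标识及检测参数，字段名为假设
inspection = pd.DataFrame({
    "product_id": ["P001", "P002", "P003"],
    "defect_count": [0, 5, 1],
})

# 以产品标识为相同标引字段，将两张表融合成一张待处理产品数据表
product_data = production.merge(inspection, on="product_id")

# 按检测数据划分正常/异常产品数据(阈值3为假设值)
abnormal_mask = product_data["defect_count"] >= 3
normal_data = product_data[~abnormal_mask]
abnormal_data = product_data[abnormal_mask]
print(normal_data, abnormal_data, sep="\n")
```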
可选的，在获取目标产品对应的待处理产品数据时，可以通过基于上述Apache Beam模型构建的数据管道，获取该目标产品对应的待处理产品数据。
步骤402、根据检测数据,确定待处理产品数据中的正常产品数据和异常产品数据。
在本实施例中,在得到目标产品对应的待处理产品数据后,对于每个目标产品,根据该目标产品对应的待处理产品数据中的检测数据(即检测参数对应的参数值)确定该目标产品是否正常,当该目标产品正常时,将该目标产品对应的待处理产品数据作为正常产品数据;当该目标产品异常时,将该目标产品对应的待处理产品数据作为异常产品数据。例如,目标产品包括产品1,产品1对应的检测参数包括缺陷点检测结果,缺陷点检测结果对应的参数值为目标产品存在缺陷点,则确定该产品1为异常产品,相应的,该产品1对应的待处理产品数据为异常产品数据。又例如,当检测结果中的缺陷点的数量或缺陷占比达到预设阈值时,认为是异常产品,否则为正常产品。
步骤403、将正常产品数据和异常产品数据输入至第一根因分析模型中，得到目标产品的检测结果的第一影响因子信息，第一影响因子包括生产数据中的一个或多个，其中，第一根因分析模型指示树模型。
在本实施例中,在得到待处理产品数据中的正常产品数据和异常产品数据后,将正常产品数据和异常产品数据均输入到第一根因分析模型(如树模型)中,以使第一根因分析模型对其进行分析,以利用生产数据中的生产参数确定影响目标产品的检测结果的影响因子信息,也即第一影响因子信息。
可选的,第一影响因子信息包括第一影响因子和/或第一影响因子对应的影响分数。其中,影响分数表示第一影响因子对目标产品的检测结果的影响程度,例如,检测结果指示目标产品异常,当第一影响因子对应的影响分数越高,表示该第一影响因子影响目标产品异常的程度越高,也即该第一影响因子越有可能为影响目标产品异常的原因。
应理解,对于导致产品异常结果的原因可以主要归结于生产过程中出现的异常,相应的,上述第一影响因子必然反映在生产数据中,其可能是生产数据中某些参数单独或者共同作用形成的结果。因此,该第一影响因子最终呈现的即为上述生产数据中的一个或多个参数。但由于产品的生产过程会产生大量数据,人工难以分辨究竟哪些数据或者哪些数据的组合(如下文提到的升维、降维、融合)导致了最终异常的产生,因此需要通过根因分析模型对导致出现异常的生产参数进行智能分析,得到各生产参数或者组合的影响分数,从而确定第一影响因子信息。
需要说明的是，第一影响因子包括生产数据中的一个或多个；第一根因分析模型指示树模型。
在本实施例中,将正常产品数据和异常产品数据均输入到第一根因分析模型即树模型中,得到第一影响因子信息的过程可以通过对树模型进行多轮训练最终得到影响因子,也可以不对树模型进行训练,利用树模型的计算原理得到影响因子。
在一些实施例中，将所述正常产品数据和所述异常产品数据输入至树模型中，得到所述目标产品的检测结果的第一影响因子信息，包括计算生产数据的纯度指标，基于纯度指标确定第一影响因子信息。本实施例中，树模型即为决策树模型，其原理为通过计算纯度指标从而确定决策树的根节点和多级子节点，从根节点到逐级子节点对应的影响因子重要程度逐渐降低，因此可以通过决策树算法(计算纯度指标)无需训练直接得到影响检测结果的第一影响因子信息。在本实施例中，可以通过树模型确定第一影响因子对应的影响分数，即通过纯度指标确定。当生产参数对应的纯度越高，不确定性越低，一致性也越高，也即影响分数越高。因此，可以将影响分数高于预设分数阈值的生产参数作为第一影响因子。其中，当树模型是通过ID3算法构造时，可以将信息增益指标作为纯度指标；当树模型是通过C4.5算法构造时，可以将信息增益率指标作为纯度指标；当树模型是通过CART算法构造时，可以将基尼系数作为纯度指标。
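针对上述纯度指标的计算，下面给出一个最小草图(示意性质)，手工实现基尼系数与信息增益(ID3所用指标)的计算，其中的标签与生产参数取值均为假设：

```python
import numpy as np

def gini(labels):
    """基尼系数：1 - Σ p_k^2，越低表示纯度越高。"""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def entropy(labels):
    """信息熵：-Σ p_k * log2(p_k)。"""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(labels, feature_values):
    """按某一生产参数划分后的信息增益(ID3所用的纯度指标)。"""
    weighted = 0.0
    for v in np.unique(feature_values):
        mask = feature_values == v
        weighted += mask.mean() * entropy(labels[mask])
    return entropy(labels) - weighted

# 假设的标签(0正常/1异常)与某一生产参数(如设备编号)
y = np.array([0, 0, 1, 1, 1, 0])
equipment = np.array(["E01", "E01", "E02", "E02", "E02", "E01"])
print(gini(y), information_gain(y, equipment))
```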
在一些实施例中,将所述正常产品数据和所述异常产品数据输入至树模型中,得到所述目标产品的检测结果的第一影响因子信息包括对树模型进行训练,从而得到第一影响因子信息,详细步骤见下文。
从上述描述可知,基于各个目标产品对应的检测数据对目标产品对应的待处理产品数据进行分类,得到正常产品数据和异常产品数据,该正常产品数据指示检测结果正常的目标产品的待处理产品数据,该异常产品数据指示检测结果异常的目标产品的待处理产品数据,正常产品数据和异常产品数据均包括生产参数。通过利用第一根因分析模型对正常产品数据和异常产品数据进行分析,以利用生产参数确定影响目标产品的检测结果的影响因子信息,即第一影响因子信息,从而在检测结果为异常时,可以确定造成产品异常的原因,实现产品数据的自动分析,并实现产品异常原因,即根因的自动确定,无需依靠人工进行确定,保证确定产品的影响因子的准确性,并提高了确定效率。
如图5所示，图5是本申请根据一示例性实施例示出的另一种异常根因分析方法的流程图，本实施方式在前述实施例的基础上，描述了如何确定第一影响因子的过程，下面将结合一个具体实施例对此过程进行详细说明。如图5所示，该方法包括以下步骤：
步骤501、获取目标产品对应的待处理产品数据。其中,待处理产品数据是根据第一预设参数对目标产品对应的生产数据和检测数据进行融合得到的。
在本实施例中,在得到目标产品对应的待处理产品数据后,对待处理产品数据中的离散型数据进行编码,以使编码后的数据更便于后续数据分析,该编码过程具体为:获取生产参数对应的类型。在生产参数对应的类型为预设离散类型的情况下,对待处理产品数据中生产参数对应的参数值进行编码,并将编码结果作为生产参数的新的参数值。
可选的，待处理产品数据中的生产参数包括工艺站点名称，获取工艺站点名称对应的类型为名称类型，该名称类型属于预设离散类型，则确定需对工艺站点名称对应的参数值进行编码，其中，工艺站点名称对应的参数值指示目标产品是否通过工艺站点名称对应的工艺站点，其具体过程为：从待处理产品数据中获取工艺站点名称对应的参数值。在工艺站点名称对应的参数值指示目标产品通过工艺站点的情况下，将工艺站点名称对应的参数值更新为第一编码值，也即将第一编码值作为工艺站点名称对应的参数值。在工艺站点名称对应的参数值指示目标产品未通过工艺站点的情况下，将工艺站点名称对应的参数值更新为第二编码值，也即将第二编码值作为工艺站点名称对应的参数值。
其中,第一编码值和第二编码值可以根据实际需求进行设置,例如,第一编码值为0,第二编码值为1。
可选的,在获取生产参数对应的类型时,可以从预先设置的相关表中进行获取。
可选的，还可以对检测参数对应的参数值进行编码处理，其处理过程与对生产参数对应的参数值进行编码的过程类似，在此不再对其进行赘述。
可选的,对于本申请中数据存储设备,可基于前文提到的数据管理服务器实现,也可以使用其他存储设备、数据库进行存储,本申请在此不作限定。
在一些实施例中,数据存储设备可以是分布式数据库。分布式数据库的并行处理可满足海量数据的存储和处理要求,用户可通过SQL查询处理简单数据,而复杂处理时可采用自定义函数来实现。因此,在对海量数据分析时,将数据抽取到分布式数据库中,一方面不会对原始数据造成破坏,另一方面提高了数据分析效率。
在本申请的实施例中，将数据抽取到存储设备中的方式即获取目标产品对应的待处理产品数据的方式包括如下几种方式中的一种或多种：1)手动导入，对于待分析的数据，用户可以通过交互界面一次性地完成数据的导入，从而使得存储设备获取待分析数据；2)批量导入，与手动导入类似，用户可以通过交互界面，调用分布式文件系统HDFS的API接口或地址，批量导入大量数据；3)实时导入，通过建立原始数据库与分析系统中存储设备的连接，基于如Kafka等技术，实现原始数据的实时导入。
步骤502、根据检测数据,确定待处理产品数据中的正常产品数据和异常产品数据。
步骤503、将正常产品数据和异常产品数据输入至树模型中,以对树模型进行训练。
步骤504、根据训练后的树模型,确定第一影响因子信息。
在本实施例中,将正常产品数据和异常产品数据输入至树模型中,以训练该树模型。训练后的树模型可以输出影响目标产品的检测结果的生产参数,也即得到第一影响因子信息。
其中，上述对树模型进行训练表示调整生产参数的个数以及生产参数对应的权重。第一影响因子信息是根据所述生产参数的权重大小确定的。具体的，在训练过程中，增加或减少生产参数的个数，以及调整生产参数的权重。然后，按照权重由高到低的顺序，选取一定数量的生产参数，并将选取的生产参数作为第一影响因子。或者将权重高于预设权重阈值的生产参数作为第一影响因子。
其中,生产参数的权重可以理解为生产参数对应的影响因子分数。当该生产参数为第一影响因子时,可以将该生产参数对应的影响因子分数作为该第一影响因子的影响分数。
在一些实施例中,上述树模型基于预设训练算法进行训练。预设训练算法包括后向搜索算法、前向搜索算法、双向搜索算法和随机搜索算法中的一个或多个。
示例性的，后向搜索算法包括递归特征消除法(Recursive Feature Elimination)，其使用一个树模型来进行多轮训练，每轮训练后，消除若干权值系数较小的特征，或设置阈值，剔除掉小于阈值的特征，再基于新的特征集进行下一轮训练，不断循环递归，直至剩余的特征数量达到所需的特征数量。这里通过后向搜索算法进行模型训练可以减少生产参数的个数。
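针对上述递归特征消除法，下面给出一个基于scikit-learn的最小草图(示意性质)，以单一树模型为基学习器；其中的数据为随机构造，特征数、保留特征数等参数均为假设值：

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                  # 假设的10个生产参数
y = (X[:, 2] + 0.5 * X[:, 7] > 0).astype(int)   # 仅第3、8个参数真正影响检测结果

# 以树模型为基学习器，每轮剔除1个特征，直至保留3个特征
selector = RFE(DecisionTreeClassifier(random_state=0),
               n_features_to_select=3, step=1)
selector.fit(X, y)
print(selector.support_)   # 被保留的生产参数(候选第一影响因子)
print(selector.ranking_)   # 各生产参数被剔除的轮次排名
```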
其中，前向搜索算法是首先选择一个最优的单特征子集作为第一轮特征子集，在此基础上，再加入一个特征，构成新的两个特征的子集，进行模型训练，选择最优的双特征子集，再不断增加特征并迭代训练更新，直到找到最优的特征子集。同样，该方法属于启发式搜索的贪心算法。这里，通过前向搜索算法进行模型训练可以增加生产参数的个数。
示例性的,双向搜索算法是指后向和前向搜索同时进行,直到两者搜寻到同一最优特征子集。
示例性的,随机搜索算法是随机产生特征子集,然后执行前向或后向搜索。
在本实施例中,在利用树模型确定第一影响因子时,在训练模型的同时,会得到相应的第一影响因子,即特征子集。
在一些实施例中，在对树模型进行训练的过程中，进行生产参数的融合处理。其中，融合处理指示对生产参数进行特征交叉和/或变异，以得到新的特征，即新的生产参数。
其中，该融合处理包括特征交叉处理和/或基于遗传算法(GA)的融合处理。
其中，基于遗传算法进行融合处理的过程为：首先随机产生一批特征集，每个特征集包括一个或多个特征(即生产参数)。树模型训练好后，根据模型测试效果作为评价指标进行首次评分。汇总特征选择结果，对各特征集进行交叉、变异等操作产生新的特征集，不断迭代更新，优胜劣汰，最终得到评价最高的特征集，也即合成参数。
其中，特征交叉处理过程为：将单独的特征进行组合(相乘或求笛卡尔积)以形成合成特征，该合成特征有助于表示非线性关系。通过采用随机梯度下降法，可以有效地训练线性模型。因此，在使用扩展的线性模型时，辅以特征组合是训练大规模数据集的有效方法，可以创建很多不同种类的特征组合。
其中，[A x B]：将两个特征的值相乘形成合成参数。[A x B x C x D x E]：将五个特征的值相乘形成合成参数。[A x A]：对单个特征的值求平方形成合成参数。
其中,变异处理指示采用取log,平方等方法基于生产参数本身进行变异以得到新的生产参数。
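针对上述特征交叉与变异处理，下面给出一个最小草图(示意性质)，其中的字段名A、B均为假设：

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]})

# 特征交叉：将两个生产参数的值相乘形成合成参数 [A x B]
df["A_x_B"] = df["A"] * df["B"]

# 变异：基于生产参数本身取log、平方，得到新的生产参数
df["log_A"] = np.log(df["A"])
df["A_squared"] = df["A"] ** 2
print(df)
```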
可选的，树模型是简单的机器学习模型，即复杂度较低，可以认为是基模型。可以通过多个基模型进行组合形成较复杂的集成模型。复杂度高低，可以理解为取决于基础模型的个数和/或模型参数的多少。
在一些实施例中，第一根因分析模型还可以指示至少一个集成树模型，该集成树模型由多个树模型通过集成的方式得到，也即集成树模型包括多个树模型。可以理解，集成树模型也是树模型的一种，只是相对于单一树模型其复杂度更高。将正常产品数据和异常产品数据输入至第一根因分析模型中，得到目标产品的检测结果的第一影响因子信息，包括：将正常产品数据和异常产品数据输入至集成树模型中，以对集成树模型进行训练。根据训练后的集成树模型，确定第一影响因子信息。
在本实施例中,在对集成树模型训练的过程中,调节各生产参数的权重。训练后的集成树模型可以输出影响目标产品的检测结果的生产参数,也即得到第一影响因子信息。该第一影响因子信息是根据生产参数的权重大小确定的。例如,按照权重由高到低的顺序,选取一定数量的生产参数,并将选取的生产参数作为第一影响因子。或者将权重高于预设权重阈值的生产参数作为第一影响因子。
其中,该权重可以为L1正则项。具体的,在训练过程中,可以基于L1正则化的调参实现特征选择。为了避免过拟合问题,一般会对损失函数引入惩罚项L1正则化,可以产生稀疏权值矩阵,即产生一个稀疏模型,以用于特征选择。具体的,只有少数特征对该稀疏模型有贡献,绝大部分特征是没有贡献的,或者贡献微小(因为它们前面的系数是0或者是很小的值,即使去掉对模型也没有什么影响),此时我们就可以只关注系数是非零值的特征,即生产参数。在一些实施例中,集成的方法包括boosting、bagging和stacking中的至少一种。
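在继续介绍具体集成方法之前，针对上述基于L1正则化的稀疏特征选择，给出一个基于scikit-learn的最小草图(示意性质)；其中的数据为随机构造，正则化强度C为假设值：

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))              # 假设的8个生产参数
y = (X[:, 1] - X[:, 5] > 0).astype(int)    # 仅2个参数对结果有贡献

# L1正则化产生稀疏权值矩阵：C越小惩罚越强，越多系数被压缩为0
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
clf.fit(X, y)

coef = clf.coef_.ravel()
selected = np.flatnonzero(coef != 0)       # 系数非零的生产参数即被选中的特征
print(selected, coef[selected])
```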
具体的，boosting方法通常针对同质弱学习器(即同质基模型)，以一种高度自适应的方法顺序地学习这些弱学习器(每个基模型都依赖于前一模型)，并按照某种确定性的策略将它们组合起来。具体包括自适应提升AdaBoost(Adaptive Boosting)和梯度提升Gradient Boosting。AdaBoost通过增加错误样本点的权重系数，同时减小正确样本点的权重系数来对误差函数进行影响，从而使得模型将学习重点放到权重系数更大的样本数据上。Gradient Boosting则是通过改变样本的目标值来进行计算，每次针对损失函数的负梯度来构建弱模型(也就是以该样本的负梯度值作为新的目标值)，然后将这个学习到的弱模型作为加法模型的最新一项集成到加法模型中来，顺序地构造弱模型，直到满足阈值或其它停止条件。
具体的，bagging的个体弱学习器的训练集是通过随机采样得到的。例如，通过3次随机采样，就可以得到3个采样集。对于这3个采样集，可以分别独立地训练出3个弱学习器，再对这3个弱学习器通过集合策略来得到最终的强学习器。对于随机采样，一般采用的是自助采样法(Bootstrap sampling)，即对于m个样本的原始训练集，每次先随机采集一个样本放入采样集，接着把该样本放回，也就是说下次采样时该样本仍有可能被采集到，这样采集m次，最终可以得到m个样本的采样集。由于是随机采样，每次的采样集和原始训练集不同，和其他采样集也不同，这样可以得到多个不同的弱学习器。
具体的,stacking方法通常考虑的是异质弱学习器(即异质基模型),通过并行地学习多个不同的基模型,并通过训练一个元模型将他们组合起来,根据不同弱模型的预测结果输出一个最终的预测结果。
在一些实施例中,集成树包括随机森林模型、LGBM模型、GBDT模型、XGBoost模型和CatBoost模型中的任意一种。
在一些实施例中，如果第一根因分析模型为基模型(如单一树模型)，由于模型较为简单，为保证训练得到较为准确的结果，需要在训练过程中调整生产参数的个数以及生产参数对应的权重，从而得到较为准确的影响因子结果。而在另一实施例中，如果第一根因分析模型为集成模型(如集成树模型)，由于其本身模型较为复杂，常规训练即可得到较为准确的训练效果，因此无需额外进行生产参数个数和权重的调整。可选的，在集成树模型为LGBM模型的情况下，集成树的决策树初始参数信息包括决策树叶子数的范围为2至500、决策树个数范围为25至325、决策树最大深度的范围为1至20、L1正则项系数为1.00E-10至1.00E-01和L2正则项系数为1.00E-10至1.00E-01中的一个或多个。
具体的，当LGBM模型的决策树信息为决策树叶子数的范围为2至500、决策树个数范围为25至325、决策树最大深度的范围为1至20、L1正则项系数为1.00E-10至1.00E-01和L2正则项系数为1.00E-10至1.00E-01时，相较于采用默认数值，预测效果更好。
在集成树模型为CATBoost模型的情况下,集成树模型的决策树初始参数信息包括决策树深度为1至16、最大树数为25至300和L2正则项系数为1至100中的一个或多个。
具体的，当CATBoost模型的决策树初始参数信息为决策树深度为1至16、最大树数为25至300和L2正则项系数为1至100时，相较于采用默认数值，预测效果更好。
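针对上述LGBM与CATBoost模型的初始参数设置，下面给出一个示意性草图(假设环境中已安装lightgbm与catboost库；数据为随机构造，各参数取值仅为落在上述推荐范围内的假设值，并非最优配置)：

```python
import numpy as np
from catboost import CatBoostClassifier
from lightgbm import LGBMClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 12))              # 假设的12个生产参数
y = (X[:, 0] + X[:, 3] > 0).astype(int)     # 正常/异常标签

# LGBM模型：初始参数取自上述推荐范围内的假设值
lgbm = LGBMClassifier(num_leaves=63,        # 决策树叶子数：2至500
                      n_estimators=100,     # 决策树个数：25至325
                      max_depth=8,          # 决策树最大深度：1至20
                      reg_alpha=1e-3,       # L1正则项系数
                      reg_lambda=1e-3)      # L2正则项系数
lgbm.fit(X, y)

# CATBoost模型：初始参数取自上述推荐范围内的假设值
cat = CatBoostClassifier(depth=6,           # 决策树深度：1至16
                         iterations=100,    # 最大树数：25至300
                         l2_leaf_reg=3,     # L2正则项系数：1至100
                         verbose=False)
cat.fit(X, y)

# 各生产参数的权重(重要性)，可据此确定第一影响因子
print(lgbm.feature_importances_)
print(cat.get_feature_importance())
```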
在一些实施例中,产品数据中的生产参数个数也会影响集成树模型的初始参数。输入至第一根因分析模型的生产参数个数越多(即维度越高),集成树模型中决策树叶子数、决策树个数、决策树深度、最大树数的数量也越大,这样预测效果更好。
在一些实施例中,第一根因分析模型指示至少两个集成树模型。
其中,每个集成树模型对应一个权重。通过至少两个集成树模型可以进一步增加模型的复杂度,从而得到较好的结果。
在一些实施例中,对于多个模型(基模型或者集成模型)输出的结果,可以通过平均法或投票法得到最终的输出结果。
具体的，平均法即对各模型的输出结果进行平均，可为每个模型设置不同权重以进行加权平均；投票法即多个模型分别输出结果，根据类似少数服从多数的投票规则对结果进行预测。
在一些实施例中，将一种模型算法的计算结果(树模型(训练或非训练)、集成树模型、多个基础树模型等)作为最终第一影响因子的影响分数，得到的第一影响因子的最终结果可能仍不够准确。因此，可以将至少两种影响因子分数计算方法(纯度指标计算、生产参数的权重计算)通过加权得到生产参数的影响因子分数。当确定生产参数为第一影响因子时，该生产参数的影响因子分数便为该第一影响因子的影响分数，从而实现第一影响因子信息的确定。例如，按照影响因子分数由高到低的顺序，选取一定数量的生产参数，该选取的生产参数便为第一影响因子，相应的，该选取的生产参数的影响因子分数便为第一影响因子的影响分数。
其中,生产参数的影响因子分数(即第一影响因子的影响分数)计算方法还包括相关性分析指标、距离指标、一致性指标等。
在一些实施例中，在将所述正常产品数据和所述异常产品数据输入至第一根因分析模型中时，可以先对正常产品数据和异常产品数据中的生产参数进行升维或降维处理。然后，将升维或降维处理后的生产参数输入至第一根因分析模型。
其中,在对生产参数进行升维处理时,可以基于升维算法对生产参数进行因子合成处理,以得到新的合成参数,即生产参数。在对生产参数进行降维处理时,可以基于降维算法,对生产参数进行相关因子组合处理。其中,相关因子组合处理指示对存在相关性的生产参数进行降维处理。
可选的，升维算法包括独热编码、特征交叉(Feature Cross)等算法，其通过进一步挖掘数据自身的规律，得到新的参数。
可选的，降维算法主要是将原高维空间中的数据点映射到低维度的空间中，降低计算成本，同时又考虑了特征之间的相关性，即降维算法用于对存在相关性的生产参数进行降维处理，也即将存在相关性的多个生产参数变换得到一个代表参数，从而在确定第一影响因子时，不再利用该多个生产参数，而仅利用该多个生产参数对应的代表参数即可。
可选的，降维算法包括主成分分析(Principal Component Analysis,PCA)算法、线性判别分析(Linear Discriminant Analysis,LDA)算法、多维尺度分析(Multidimensional Scaling,MDS)算法、流形学习算法中的一个或多个。
可选的，主成分分析在降维之后能够最大化保持数据的内在信息，并通过衡量在投影方向上的数据方差的大小来衡量该方向的重要性。主成分是通过降维的方法对代表性指标进行筛选，将多个特征变量(即生产参数)组合成少数几个主成分，这几个新的综合指标包含了大部分的原始信息，即将n维特征映射到k维上(k<n)，这k维特征称为主成分，是重新组合出来的k维特征。这种多因子组合方法既可以达到降维的目的，又考虑到了特征之间的相关性和共同影响。
例如,在OLED(有机发光二极管,Organic Light-Emitting Diode)产品的不良诊断分析,即在确定OLED产品异常的第一影响因子时,将整个工厂的所有制造工序或设备相关数据作为生产数据,通过决策树等方法建立结果变量与原因变量之间的关系,并将它们转化为支持决策的有效数据,从而快速定位到OLED产品异常的原因,即第一影响因子。然而,多变量大生产数据虽然提供了大量的丰富信息,但是也增加了分析的复杂性,而且更重要的是许多特征变量之间存在交互影响,单单分析每个特征变量是孤立的,不是综合的,因此利用PCA进行多因子组合分析。具体过程如下:
假设有 $M$ 个目标产品 $\{X_1, X_2, \ldots, X_M\}$，每个目标产品有 $N$ 维生产参数：
$$X_i = (x_{i1}, x_{i2}, \ldots, x_{iN})^{\mathrm{T}}, \quad i = 1, \ldots, M$$
具体计算流程如下:
第一步,去均值,对所有特征进行中心化。
对每个生产参数(即生产参数对应的参数值)求平均值，如
$$\mu_j = \frac{1}{M}\sum_{i=1}^{M} x_{ij}, \quad j = 1, \ldots, N$$
然后对于每个生产参数,将每个目标产品对应的该生产参数所对应的参数值减去该生产参数的平均值,因此,得到的是去中心化后的新的参数值。
第二步,求协方差矩阵。
对于上述N维特征，即生产参数，求协方差矩阵。比如，若 $N=2$ 时，对 $x_1$ 和 $x_2$ 求其协方差矩阵：
$$C = \begin{pmatrix} \operatorname{cov}(x_1, x_1) & \operatorname{cov}(x_1, x_2) \\ \operatorname{cov}(x_2, x_1) & \operatorname{cov}(x_2, x_2) \end{pmatrix}$$
对角线上是各个生产参数对应的方差，非对角线是协方差。协方差衡量两个生产参数(即特征变量)同时变化的程度：协方差绝对值越大，两者对彼此的影响越大，反之越小，从而可以确定两个生产参数之间的相关性。
其中，协方差求解公式为：
$$\operatorname{cov}(x_j, x_k) = \frac{1}{M-1}\sum_{i=1}^{M}\left(x_{ij} - \mu_j\right)\left(x_{ik} - \mu_k\right)$$
第三步,求协方差矩阵的特征值和特征向量。
求协方差矩阵的特征值和特征向量：$Cu = \lambda u$。特征值 $\lambda$ 共有 $N$ 个，即每一个 $\lambda_i$ 对应一个特征向量 $u_i$。
第四步,投影降维成新特征。
将特征值按从大到小的顺序排序，选取最大的前 $k$ 个及其对应的特征向量 $\{(\lambda_1, u_1), \ldots, (\lambda_k, u_k)\}$。接下来进行投影，即降维的过程：对于 $M$ 个目标产品中的每个目标产品 $X_i$，其原来所对应的 $N$ 维特征为 $X_i = (x_{i1}, \ldots, x_{iN})^{\mathrm{T}}$，投影之后的新特征为：
$$Y_i = \left(u_1^{\mathrm{T}} X_i,\ u_2^{\mathrm{T}} X_i,\ \ldots,\ u_k^{\mathrm{T}} X_i\right)^{\mathrm{T}}$$
而且，对于选取的 $k$ 维新特征，每个特征向量 $u_j$ 的各个分量即为 $N$ 个原始特征在该维新特征中所占的载荷。选取载荷较高的原始特征，将其组合起来，即表示该维新特征代表了这些原始特征的大部分信息，且这些原始特征彼此之间有较高的相似性。
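按照上述四个步骤，下面给出一个基于numpy的主成分分析最小草图(示意性质)，其中的样本矩阵为随机构造：

```python
import numpy as np

def pca(X, k):
    """按上述四步对生产参数矩阵 X (M x N) 做主成分分析，降到 k 维。"""
    # 第一步：去均值，对所有特征进行中心化
    X_centered = X - X.mean(axis=0)
    # 第二步：求协方差矩阵(N x N，分母为 M-1)
    C = np.cov(X_centered, rowvar=False)
    # 第三步：求协方差矩阵的特征值和特征向量 Cu = λu
    eigvals, eigvecs = np.linalg.eigh(C)
    # 第四步：按特征值从大到小排序，取前k个特征向量并投影
    order = np.argsort(eigvals)[::-1][:k]
    U = eigvecs[:, order]              # 每列的分量即原始特征的载荷
    return X_centered @ U, U

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))          # 假设：100个目标产品、5维生产参数
Y, loadings = pca(X, k=2)
print(Y.shape, loadings.shape)         # (100, 2) (5, 2)
```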
可选的，线性判别分析算法是一种监督学习的降维技术，即它的数据集的每个样本(即生产参数)是有类别输出的。主要原理是投影后类内方差最小，类间方差最大。
可选的，多维尺度分析算法是根据具有很多维度的生产参数之间的相似性(距离近)或非相似性(距离远，即通过计算其距离)对其进行分类的一种统计学研究方法。将研究对象在一个低维(二维或三维)的空间用感知图形象地表示出来，简单明了地说明各研究对象(例如，生产参数)之间的相对关系。
可选的，流形学习算法是一种非线性的维数约简方法，通过保持高维数据与低维数据的某个"不变特征量"而找到低维特征表示。其中，不变特征量包括Isomap测地距离、LLE局部重构系数、LE数据邻域关系、LTSA局部切空间对齐中的一个或多个。
在一些实施例中,升维处理可以增加生产参数的数量,降维处理可以减少生产参数的数量。因此,上述对生产参数进行升维还是降维处理,可以根据待处理产品数据中的生产参数的数量确定。在生产参数的数量小于第一预设阈值的情况下,使用升维算法对生产参数进行因子合成处理,即对生产参数进行升维处理。在生产参数的数量大于或等于第一预设阈值的情况下,对生产参数进行降维处理。
具体的,当生产参数的数量小于第一预设阈值时,表明可利用的生产参数较少,从而影响确定的第一影响因子的数量、准确性,则第一根因分析模型通过相应的升维算法对生产参数进行组合,得到新的参数,也即合成参数,并在确定第一影响因子时,根据生产参数和合成参数共同确定。
当生产参数的数量大于或等于第一预设阈值的情况下，表明可利用的生产参数过多，可能会影响确定第一影响因子的效率，则第一根因分析模型利用降维算法对生产参数进行相关因子组合处理，以减少确定第一影响因子所用的生产参数的数量。
在一些实施例中，在得到待处理产品数据后，可以利用过滤算法，对待处理产品数据中的生产参数进行筛选，以去除对检测结果影响不大的生产参数，即去除成为第一影响因子的概率较低的生产参数，从而提高后续确定第一影响因子的效率。
可选的，过滤算法根据各个特征(即生产参数)与结果变量(即第一影响因子)之间的相关性及评价指标进行选择(Filter)。将相关系数作为每一维特征的重要性程度，可根据设置的阈值或特征个数来剔除掉部分不重要的生产参数。评价指标包括相关性分析指标(例如，皮尔森相关系数(Pearson correlation coefficient)、Spearman相关系数、最大信息系数(Maximal Information Coefficient,MIC)等)、距离指标、纯度指标和一致性指标等中的一种或多种。
具体的，皮尔森相关系数用于衡量特征与结果变量之间的线性相关性，取值区间为[-1,1]，越接近1表示正线性相关性越强，越接近-1表示负线性相关性越强，适用于特征与结果变量均为连续型数值变量的情况。对于指数函数等非线性单调关系，采用Spearman相关系数来计算；对于周期函数等复杂非线性关系，一般采用最大信息系数来衡量两组变量的关联程度，从而去除相关性较小的特征，即生产参数。
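针对上述相关性分析指标，下面给出一个基于scipy的最小草图(示意性质；最大信息系数需额外的库支持，此处从略)，其中的数据为随机构造：

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

rng = np.random.default_rng(0)
feature = rng.normal(size=200)                              # 假设的某一生产参数
target = 2.0 * feature + rng.normal(scale=0.1, size=200)    # 与之线性相关的结果变量

# 皮尔森相关系数：衡量线性相关性，取值区间[-1, 1]
r, _ = pearsonr(feature, target)

# Spearman相关系数：衡量单调(含指数等非线性)相关性
rho, _ = spearmanr(np.exp(feature), target)

# 过滤：将相关系数绝对值低于阈值(假设0.3)的生产参数剔除
keep_feature = abs(r) >= 0.3
print(r, rho, keep_feature)
```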
具体的,好的特征集(该特征集包括至少一个生产参数)应该使得属于同一类的产品距离尽可能小,属于不同类的产品之间的距离尽可能远,因此,基于分类问题,即结果变量为离散型定类数据,常用的距离指标,即相似度指标对应的计算方法有欧式距离等。
具体的，在特征与结果变量中的一项为离散型数值变量的情况下，或者将连续型数值变量离散化之后，采用ANOVA方差分析、T检验，或者非参数版本的Kruskal-Wallis检验、Wilcoxon符号秩检验，即进行显著性检验，以确定相应的一致性指标对应的指标值。原假设为两组或多组变量(即生产参数)来自相同的分布，即检验是否有显著性差异，得到结果pValue，取值区间为[0,1]。pValue越小，认为差异越大，该特征对结果变量的影响程度越大，即特征重要性越强。
当特征与结果变量均为定类数据时,即均为离散型变量时,用卡方检验。该方法统计样本(即目标产品)的实际观测值与理论推断值之间的偏离程度,卡方值越小,偏差越小,越趋于符合。同样,pValue越小,拒绝原假设,证明了显著的相关性。
可选的,根据特征划分后结果的纯度作为特征重要性,即影响因子分数,适用于特征与结果变量均为定类数据的情况。使用某特征对目标产品对应的待处理产品数据划分后,各数据子集的纯度越高,不确定性越低,一致性越高,说明该特征更重要。
在本实施例中,在利用过滤算法对生产参数进行筛选得到第一影响因子的同时,也实现对第一影响因子进行打分,得到第一影响因子对应的影响分数。
在一些实施例中,在得到第一影响因子后,为了更加直观地展示影响产品结果的原因,根据第一影响因子对应的影响分数,对第一影响因子进行排序。
在一些实施例中,还可以显示第一影响因子和生产参数之间的映射关系。
在本实施例中，在得到第一影响因子后，对于每个第一影响因子，确定第一影响因子对应的生产参数，也即确定与该第一影响因子相关的生产参数，并将该第一影响因子和其对应的生产参数之间的映射关系进行显示，以使相关人员可以直观快速地确定如何通过生产参数得到相应的第一影响因子。
其中,映射关系表示如何通过第一影响因子对应的生产参数确定该第一影响因子的,该映射关系可以是数学关系,例如,笛卡尔积等数学公式,也可以是相关代码,即在执行相关代码时,可以对生产参数进行相应的处理,得到对应的第一影响因子。
下面将结合一个具体实例对上述过程进行描述，其具体为：
目标产品包括液晶显示面板。在制造液晶显示面板时，显示面板的制造阶段至少包括阵列(Array)阶段、彩膜(CF)阶段、成盒(cell)阶段和模组(module)阶段。在阵列阶段，制造薄膜晶体管阵列基板。在一个示例中，在阵列阶段，沉积材料层，使所述材料层经受光刻，例如，将光刻胶沉积在所述材料层上，使光刻胶经受曝光且随后显影。随后，蚀刻材料层并去除剩余的光刻胶("剥离")。在CF阶段，制造彩膜基板，涉及以下几个步骤，包括：涂覆、曝光和显影。在成盒阶段，组装阵列基板和彩膜基板，以形成单元。成盒阶段包括几个步骤，包括涂覆和摩擦取向层、注入液晶材料、单元密封剂涂覆、在真空下对盒、切割、研磨和单元检查。在模组阶段，系统部件和电路被组装到面板上。在一个示例中，模组阶段包括若干步骤，包括背光的组装、印刷电路板的组装、偏光片附接、膜上芯片的组装、集成电路的组装、老化和最终检查。
目标产品包括有机发光二极管显示面板。在制造有机发光二极管显示面板时,显示面板的制造包括至少四个设备工艺,包括阵列阶段、OLED阶段、EAC2阶段和模组阶段。在阵列阶段,制造显示面板的背板,例如,包括制造多个薄膜晶体管。在OLED阶段中,制造多个发光元件(例如,有机发光二极管),形成封装层以封装多个发光元件,并且可选地,在封装层上形成保护膜。在EAC2阶段,大玻璃(glass)首先被切割成半片玻璃(hglass),然后进一步切割成面板(panel)。此外,在EAC2阶段,检查设备用于检查面板以检测其中的缺陷,例如暗点和亮线。在模组阶段,例如,使用膜上芯片技术将柔性印刷电路接合到面板。在面板的表面上形成盖玻璃。可选地,执行进一步检查以检测面板中的缺陷。
相应的,上述生产步骤中产出的数据可分为生产数据和检测数据。生产数据为目标产品在生产加工过程中的履历参数数据和加工参数数据。履历参数包括生产过程中产品ID、产品基本属性、产品经历的工艺段、站点信息、设备型号等;加工参数包括目标产品在不同工艺段或设备中的加工参数,如压力、温度、保压时间等。在各工艺段结束时会对产品进行光学或电学检测,检测产品质量是否达标,基于检测设备的检测结果,会形成检测数据,识别出产品是否产生不良以及产生何种不良。
可选的,定义包含缺陷的数据为负样本,即异常产品数据,不包含缺陷的数据为正样本,即正常产品数据。可以理解的是,当选定某一类型缺陷数据时,其他缺陷类型的数据也为正样本。
可选的，对于包含较大面积的显示面板来说，一定数量的缺陷点是可以容忍的。因此，通过计算不良程度表征值来区分正负样本。例如，在样本为显示面板母板的情况下，显示面板母板的多个显示面板中的属于不良类型的不良显示面板的总数与多个显示面板的总数的比值，作为样本的相关生产参数中的不良程度表征值，该比值可以称为样本的不良比例；或者，显示面板母板的多个显示面板中的属于不良类型的不良显示面板的总数作为样本的生产参数中的不良程度表征值。在此情况下，样本的生产参数中的不良程度表征值越大，表征的属于不良类型的不良程度越大。又示例性地，在样本为显示面板母板的情况下，显示面板母板的多个显示面板中除了属于不良类型的不良显示面板之外的显示面板的总数，与多个显示面板的总数的比值，作为样本的生产参数中的不良程度表征值；或者，显示面板母板的多个显示面板中除了属于不良类型的不良显示面板之外的显示面板的总数，作为样本的生产参数中的不良程度表征值。在此情况下，样本的生产参数中的不良程度表征值越小，表征的属于不良类型的不良程度越大。
可以理解的是，许多产品(例如显示面板)都是通过生产线生产的，每条生产线包括多个工艺站点，每个工艺站点用于对产品(包括半成品)进行一定的处理(如清洗、沉积、曝光、刻蚀、对盒、检测等)。同时，每个工艺站点通常有多个用于进行同样处理的样本生产设备(也即工艺设备)；当然，虽然理论上进行的处理相同，但不同工艺设备由于型号、状态等的不同，故实际的处理效果并不完全相同。在此情况下，每个样本的生产过程需要经过多个工艺站点，且不同样本在生产过程中经过的工艺站点可能不同；而经过同一工艺站点的样本也可能由其中的不同样本生产设备处理。因此，在一条生产线中，每个样本生产设备都会参与部分样本的生产过程，但不是参与全部样本的生产过程，即每个样本生产设备都参与且仅参与部分样本的生产过程。
可选的,生产参数,即待分析的生产参数可以是融合数据表中除标记列的其他列维度属性,在工厂生产中,包括经过的站点、设备、设备参数等。对于生产参数,可以是融合数据表中的全部维度属性,也可以根据用户的选择进行初步筛选。
可选的，可以直接将生产参数作为评判事件根因的第一影响因子。当生产参数较多时，某些生产参数之间存在相关性，可能会影响最终显著性评价，因此，可以通过对生产参数进行组合形成合成参数，以利用其确定第一影响因子。当生产参数较少时，可以通过取log、平方等方法对生产参数进行变异，以利用处理得到的参数确定第一影响因子。
其中,第一影响因子的数量可以比生产参数多,也可以跟生产参数相同,还可以比生产参数的数量少。
如图6所示,图6是本申请根据一示例性实施例示出的又一种异常根因分析方法的流程图,包括以下步骤:
步骤601、获取目标对象对应的待处理样本数据。
步骤602、确定待处理样本数据中的正样本和负样本。其中,正样本和负样本均包括第一参数。
步骤603、将正样本和负样本输入至第二根因分析模型中,得到目标对象的判定结果的第二影响因子信息。
在本实施例中，可以根据目标对象对应的待处理样本数据确定导致目标对象的判定结果的第二影响因子信息，其中，目标对象的判定结果与目标对象相对应，例如，当目标对象为设备(比如，生产设备、检测设备)时，则判定结果指示设备的运行状态是否正常；又例如，当目标对象为某个商品时，则判定结果指示该商品的销售量是否正常。
其中,目标对象可以根据实际使用场景进行确定。目标对象对应的待处理样本数据所包含的数据也可以根据实际使用情况进行确定。
其中,确定第二影响因子信息的过程与确定第一影响因子信息的过程类似,在此,不再对其进行赘述。
与前述方法的实施例相对应,本申请还提供了异常根因分析装置及其所应用的电子设备的实施例。
本申请异常根因分析装置的实施例可以应用在电子设备上,例如服务器或终端设备。异常根因分析装置实施例可以通过软件实现,也可以通过硬件或者软硬件结合的方式实现。以软件实现为例,作为一个逻辑意义上的装置,是通过其所在的处理器将非易失性存储器中对应的计算机程序指令读取到内存中运行形成的。从硬件层面而言,如图7所示,为本申请异常根因分析装置所在电子设备的一种硬件结构图,除了图7所示的处理器710、内存730、网络接口720、以及非易失性存储器740之外,实施例中异常根因分析装置731所在的电子设备,通常根据该电子设备的实际功能,还可以包括其他硬件,对此不再赘述。
如图8所示,图8是本申请根据一示例性实施例示出的一种异常根因分析装置的框图,所述装置包括:
第一数据获取模块810,用于获取目标产品对应的待处理产品数据。其中,待处理产品数据是根据第一预设参数对目标产品对应的生产数据和检测数据进行融合得到的。
第一数据处理模块820,用于根据检测数据,确定待处理产品数据中的正常产品数据和异常产品数据。
第一根因确定模块830,用于将正常产品数据和异常产品数据输入至第一根因分析模型中,得到目标产品的检测结果的第一影响因子信息,第一影响因子包括生产数据中的一个或多个,其中,第一根因分析模型指示树模型。
可选的,第一根因确定模块还用于:
将正常产品数据和异常产品数据输入至第一根因分析模型中,计算得到生产数据的纯度指标。
基于纯度指标,确定第一影响因子信息。
可选的,第一根因确定模块还用于:
将正常产品数据和异常产品数据输入至树模型中,以对树模型进行训练。
根据训练后的树模型,确定第一影响因子信息。
可选的,生产数据包括生产参数。对树模型进行训练指示调整生产参数的个数以及生产参数对应的权重。
第一影响因子信息是根据生产参数的权重大小确定的。
可选的，第一根因分析模型指示至少一个集成树模型，集成树模型由多个树模型通过集成的方式得到。
可选的,第一根因确定模块还用于:
将正常产品数据和异常产品数据输入至集成树模型中,以对集成树模型进行训练。
根据训练后的集成树模型,确定第一影响因子信息。
可选的,第一根因分析模型指示至少两个集成树模型。
可选的，在集成树模型为LGBM模型的情况下，集成树模型的决策树信息包括决策树叶子数的范围为2至500、决策树个数范围为25至325、决策树最大深度的范围为1至20、L1正则项系数为1.00E-10至1.00E-01和L2正则项系数为1.00E-10至1.00E-01中的一个或多个。
可选的，在集成树模型为CATBoost模型的情况下，集成树模型的决策树信息包括决策树深度为1至16、最大树数为25至300和L2正则项系数为1至100中的一个或多个。
可选的,第一根因确定模块还用于:
对正常产品数据和异常产品数据中的生产参数进行升维或降维处理,并将升维或降维处理后的生产参数输入至第一根因分析模型。
可选的,第一根因确定模块还用于:在对树模型训练过程中进行生产参数的融合处理。
可选的,融合处理指示对生产参数进行特征交叉和/或变异,以得到新的生产参数。
可选的,第一数据获取模块具体用于:
手动导入、批量导入、实时导入中的一种或者多种。
可选的,第一数据获取模块具体用于:
通过基于Apache Beam模型构建的数据管道,获取目标产品对应的待处理数据。
如图9所示,图9是本申请根据一示例性实施例示出的另一种异常根因分析装置的框图,装置包括:
第二数据获取模块910,用于获取目标对象对应的待处理样本数据。
第二数据处理模块920,用于确定待处理样本数据中的正样本和负样本。其中,正样本和负样本均包括第一参数。
第二根因确定模块930,用于将正样本和负样本输入至第二根因分析模型中,得到目标对象的判定结果的第二影响因子信息。
在另外一个实施例中，本申请提供一种异常根因分析系统，包括数据管理服务器、分析服务器和显示器。
数据管理服务器,被配置为存储数据,并且抽取、转换或加载数据。数据包括生产数据和检测数据中的至少一种。
分析服务器,被配置为接收到任务请求时从数据管理服务器获取目标产品对应的待处理产品数据,根据待处理产品数据中的检测数据,确定待处理产品数据中的正常产品数据和异常产品数据。以及将正常产品数据和异常产品数据输入至第一根因分析模型中,得到目标产品的检测结果的第一影响因子信息,第一影响因子包括生产数据中的一个或多个,其中,第一根因分析模型指示树模型。待处理产品数据是根据第一预设参数对目标产品对应的生产数据和检测数据进行融合得到的。
显示器,被配置为通过可视化界面显示第一影响因子信息。
可选的,数据管理服务器包括数据湖、数据仓库、NoSQL数据库和ETL模块。
ETL模块被配置为抽取、转换或加载数据。
数据湖被配置为存储第一组数据,第一组数据通过由ETL模块从至少一个数据源抽取原始数据而形成,第一组数据具有与原始数据相同内容。
数据仓库被配置为存储第二组数据,第二组数据通过由ETL模块对第一组数据进行清洗和标准化而形成。
NoSQL数据库被配置为存储第三组数据,第三组数据通过由ETL模块转换第二组数据而形成。
可选的,第三组数据的数据表包括第二组数据的数据表拆分形成的具有索引关系的多个子数据表。
可选的,多个子数据表包括第一子表、第二子表和第三子表。
第一子表包括可视化界面所呈现的数据筛选选项。
第二子表包括产品序列号。
第三子表包括产品序列号对应的数据。
可选的,多个子数据表还包括第四子表,第四子表包括制造站点信息和/或设备信息,第三子表包括制造站点和/或设备的代码或缩写。
在另外一个实施例中,本申请提供一种计算机可读存储介质,所述计算机可读存储介质中存储有计算机执行指令,当处理器执行所述计算机执行指令时,实现如上所述的异常根因分析方法。
在另外一个实施例中,本申请提供一种计算机程序产品,包括计算机程序,所述计算机程序被处理器执行时,实现如上所述的异常根因分析方法。
上述装置中各个模块的功能和作用的实现过程具体详见上述方法中对应步骤的实现过程,在此不再赘述。
对于装置实施例而言,由于其基本对应于方法实施例,所以相关之处参见方法实施例的部分说明即可。以上所描述的装置实施例仅仅是示意性的,其中所述作为分离部件说明的模块可以是或者也可以不是物理上分开的,作为模块显示的部件可以是或者也可以不是物理模块,即可以位于一个地方,或者也可以分布到多个网络模块上。可以根据实际的需要选择其中的部分或者全部模块来实现本申请方案的目的。本领域普通技术人员在不付出创造性劳动的情况下,即可以理解并实施。
上述对本申请特定实施例进行了描述。其它实施例在所附权利要求书的范围内。在一些情况下,在权利要求书中记载的动作或步骤可以按照不同于实施例中的顺 序来执行并且仍然可以实现期望的结果。另外,在附图中描绘的过程不一定要求示出的特定顺序或者连续顺序才能实现期望的结果。在某些实施方式中,多任务处理和并行处理也是可以的或者可能是有利的。
本领域技术人员在考虑说明书及实践这里申请的发明后,将容易想到本申请的其它实施方案。本申请旨在涵盖本申请的任何变型、用途或者适应性变化,这些变型、用途或者适应性变化遵循本申请的一般性原理并包括本申请未申请的本技术领域中的公知常识或惯用技术手段。说明书和实施例仅被视为示例性的,本申请的真正范围和精神由下面的权利要求指出。
应当理解的是,本申请并不局限于上面已经描述并在附图中示出的精确结构,并且可以在不脱离其范围进行各种修改和改变。本申请的范围仅由所附的权利要求来限制。
以上所述仅为本申请的较佳实施例而已,并不用以限制本申请,凡在本申请的精神和原则之内,所做的任何修改、等同替换、改进等,均应包含在本申请保护的范围之内。

Claims (24)

  1. 一种异常根因分析方法,其特征在于,包括:
    获取目标产品对应的待处理产品数据;其中,所述待处理产品数据是根据第一预设参数对所述目标产品对应的生产数据和检测数据进行融合得到的;
    根据所述检测数据,确定所述待处理产品数据中的正常产品数据和异常产品数据;
    将所述正常产品数据和所述异常产品数据输入至第一根因分析模型中,得到所述目标产品的检测结果的第一影响因子信息,所述第一影响因子包括所述生产数据中的一个或多个,其中,所述第一根因分析模型指示树模型。
  2. 根据权利要求1所述的方法,其特征在于,将所述正常产品数据和所述异常产品数据输入至第一根因分析模型中,得到所述目标产品的检测结果的第一影响因子信息包括:
    将所述正常产品数据和所述异常产品数据输入至所述第一根因分析模型中,计算得到所述生产数据的纯度指标;
    基于所述纯度指标,确定所述第一影响因子信息。
  3. 根据权利要求1或2所述的方法,其特征在于,将所述正常产品数据和所述异常产品数据输入至第一根因分析模型中,得到所述目标产品的检测结果的第一影响因子信息包括:
    将所述正常产品数据和所述异常产品数据输入至所述树模型中,以对所述树模型进行训练;
    根据训练后的树模型,确定所述第一影响因子信息。
  4. 根据权利要求3所述的方法,其特征在于,所述生产数据包括生产参数;所述对所述树模型进行训练指示调整所述生产参数的个数以及所述生产参数对应的权重,
    所述第一影响因子信息是根据所述生产参数的权重大小确定的。
  5. 根据权利要求1所述的方法，其特征在于，所述第一根因分析模型指示至少一个集成树模型，所述集成树模型由多个树模型通过集成的方式得到。
  6. 根据权利要求5所述的方法,其特征在于,将所述正常产品数据和所述异常产品数据输入至第一根因分析模型中,得到所述目标产品的检测结果的第一影响因子信息,包括:
    将所述正常产品数据和所述异常产品数据输入至所述集成树模型中,以对所述集成树模型进行训练;
    根据训练后的集成树模型,确定所述第一影响因子信息。
  7. 根据权利要求5所述的方法,其特征在于,所述第一根因分析模型指示至少两个集成树模型。
  8. 根据权利要求5至7中任一项所述的方法，其特征在于，在所述集成树模型为LGBM模型的情况下，所述集成树模型的决策树信息包括决策树叶子数的范围为2至500、决策树个数范围为25至325、决策树最大深度的范围为1至20、L1正则项系数为1.00E-10至1.00E-01和L2正则项系数为1.00E-10至1.00E-01中的一个或多个。
  9. 根据权利要求5至7中任一项所述的方法,其特征在于,在所述集成树模型为CATBoost模型的情况下,所述集成树模型的决策树信息包括决策树深度为1至16、最大树数为25至300和L2正则项系数为1至100中的一个或多个。
  10. 根据权利要求1所述的方法,其特征在于,将所述正常产品数据和所述异常产品数据输入至第一根因分析模型中,包括:
    对所述正常产品数据和所述异常产品数据中的生产参数进行升维或降维处理,并将升维或降维处理后的生产参数输入至所述第一根因分析模型。
  11. 根据权利要求1所述的方法,其特征在于,所述生产数据包括生产参数;所述方法还包括:
    在对所述树模型训练过程中进行生产参数的融合处理。
  12. 根据权利要求11所述的方法,其特征在于,所述融合处理指示对生产参数进行特征交叉和/或变异,以得到新的生产参数。
  13. 根据权利要求1所述的方法,其特征在于,所述获取目标产品对应的待处理产品数据包括:
    手动导入、批量导入、实时导入中的一种或者多种。
  14. 根据权利要求13所述的方法,其特征在于,所述获取目标产品对应的待处理产品数据,包括:
    通过基于Apache Beam模型构建的数据管道,获取目标产品对应的待处理数据。
  15. 一种异常根因分析方法,其特征在于,包括:
    获取目标对象对应的待处理样本数据;
    确定所述待处理样本数据中的正样本和负样本;其中,所述正样本和所述负样本均包括第一参数;
    将所述正样本和所述负样本输入至第二根因分析模型中,得到所述目标对象的判定结果的第二影响因子信息。
  16. 一种异常根因分析系统，其特征在于，包括数据管理服务器、分析服务器和显示器；
    所述数据管理服务器,被配置为存储数据,并且抽取、转换或加载数据;所述数据包括生产数据和检测数据中的至少一种;
    分析服务器,被配置为接收到任务请求时从所述数据管理服务器获取目标产品对应的待处理产品数据,根据待处理产品数据中的检测数据,确定所述待处理产品数据中的正常产品数据和异常产品数据;以及将所述正常产品数据和所述异常产品数据输入至第一根因分析模型中,得到所述目标产品的检测结果的第一影响因子信息,所述第一影响因子包括所述生产数据中的一个或多个,其中,所述第一根因分析模型指示树模型;所述待处理产品数据是根据第一预设参数对所述目标产品对应的生产数据和检测数据进行融合得到的;
    所述显示器,被配置为通过可视化界面显示所述第一影响因子信息。
  17. 根据权利要求16所述的系统，其特征在于，所述数据管理服务器包括数据湖、数据仓库、NoSQL数据库和ETL模块；
    所述ETL模块被配置为抽取、转换或加载数据;
    所述数据湖被配置为存储第一组数据,所述第一组数据通过由所述ETL模块从至少一个数据源抽取原始数据而形成,所述第一组数据具有与所述原始数据相同内容;
    所述数据仓库被配置为存储第二组数据,所述第二组数据通过由所述ETL模块对所述第一组数据进行清洗和标准化而形成;
    所述NoSQL数据库被配置为存储第三组数据,所述第三组数据通过由所述ETL模块转换所述第二组数据而形成。
  18. 根据权利要求17所述的系统，其特征在于，所述第三组数据的数据表包括所述第二组数据的数据表拆分形成的具有索引关系的多个子数据表。
  19. 根据权利要求18所述的系统，其特征在于，所述多个子数据表包括第一子表、第二子表和第三子表；
    所述第一子表包括所述可视化界面所呈现的数据筛选选项;
    所述第二子表包括产品序列号;
    所述第三子表包括所述产品序列号对应的数据。
  20. 根据权利要求19所述的系统，其特征在于，所述多个子数据表还包括第四子表，所述第四子表包括制造站点信息和/或设备信息，所述第三子表包括所述制造站点和/或设备的代码或缩写。
  21. 一种异常根因分析装置,其特征在于,包括:
    第一数据获取模块,用于获取目标产品对应的待处理产品数据;其中,所述待处理产品数据是根据第一预设参数对所述目标产品对应的生产数据和检测数据进行融合得到的;
    第一数据处理模块,用于根据所述检测数据,确定所述待处理产品数据中的正常产品数据和异常产品数据;
    第一根因确定模块,用于将所述正常产品数据和所述异常产品数据输入至第一根因分析模型中,得到所述目标产品的检测结果的第一影响因子信息,所述第一影响因子包括所述生产数据中的一个或多个,其中,所述第一根因分析模型指示树模型。
  22. 一种异常根因分析装置,其特征在于,包括:
    第二数据获取模块,用于获取目标对象对应的待处理样本数据;
    第二数据处理模块,用于确定所述待处理样本数据中的正样本和负样本;其中,所述正样本和所述负样本均包括第一参数;
    第二根因确定模块,用于将所述正样本和所述负样本输入至第二根因分析模型中,得到所述目标对象的判定结果的第二影响因子信息。
  23. 一种电子设备,其特征在于,包括存储器、处理器及存储在存储器上并可在处理器上运行的计算机程序,其中,所述处理器执行所述程序时实现如权利要求1至15中任一项所述的异常根因分析方法。
  24. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质中存储有计算机执行指令,当处理器执行所述计算机执行指令时,实现如权利要求1至15中任一项所述的异常根因分析方法。
PCT/CN2022/119262 2022-09-16 2022-09-16 异常根因分析方法及装置 WO2024055281A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2022/119262 WO2024055281A1 (zh) 2022-09-16 2022-09-16 异常根因分析方法及装置
CN202280003212.8A CN118056189A (zh) 2022-09-16 2022-09-16 异常根因分析方法及装置

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/119262 WO2024055281A1 (zh) 2022-09-16 2022-09-16 异常根因分析方法及装置

Publications (1)

Publication Number Publication Date
WO2024055281A1 (zh)

Family ID: 90273936

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/119262 WO2024055281A1 (zh) 2022-09-16 2022-09-16 异常根因分析方法及装置

Country Status (2)

Country Link
CN (1) CN118056189A (zh)
WO (1) WO2024055281A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111970157A (zh) * 2020-08-27 2020-11-20 广州华多网络科技有限公司 网络故障根因检测方法、装置、计算机设备及存储介质
CN112019932A (zh) * 2020-08-27 2020-12-01 广州华多网络科技有限公司 网络故障根因定位方法、装置、计算机设备及存储介质
CN113570000A (zh) * 2021-09-08 2021-10-29 南开大学 一种基于多模型融合的海洋单要素观测质量控制方法
CN114490303A (zh) * 2022-04-07 2022-05-13 阿里巴巴达摩院(杭州)科技有限公司 故障根因确定方法、装置和云设备


Also Published As

Publication number Publication date
CN118056189A (zh) 2024-05-17


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22958472

Country of ref document: EP

Kind code of ref document: A1