WO2018166113A1 - Method for training a random forest model, electronic device and storage medium - Google Patents

Method for training a random forest model, electronic device and storage medium

Info

Publication number
WO2018166113A1
WO2018166113A1 (application PCT/CN2017/091362)
Authority
WO
WIPO (PCT)
Prior art keywords
training
model
random forest
forest model
variable
Prior art date
Application number
PCT/CN2017/091362
Other languages
English (en)
French (fr)
Inventor
金戈
徐亮
肖京
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Priority to US 16/084,232 (published as US20210081847A1)
Priority to JP 2018-530893 (published as JP6587330B2)
Priority to AU 2017404119 (published as AU2017404119A1)
Priority to EP 17897210.5 (published as EP3413212A4)
Priority to KR 10-2018-7017282 (published as KR102201919B1)
Priority to SG 11201809890P (published as SG11201809890PA)
Publication of WO2018166113A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 30/00 Commerce
    • G06Q 30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q 30/0241 Advertisements
    • G06Q 30/0251 Targeted advertisements
    • G06Q 30/0269 Targeted advertisements based on user profile or attribute
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/243 Classification techniques relating to the number of classes
    • G06F 18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G06N 20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00 Computing arrangements using knowledge-based models
    • G06N 5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 10/00 Administration; Management
    • G06Q 10/10 Office automation; Time management
    • G06Q 10/103 Workflow collaboration or project management

Definitions

  • the present invention relates to the field of machine learning technology, and in particular, to a method, an electronic device and a storage medium for training a random forest model.
  • a random forest is a classifier that uses multiple trees to train on sample data and make predictions; it is a classifier containing multiple decision trees.
  • a decision tree classifies data through a series of rules.
  • more and more enterprises that provide online services (for example, remote insurance, remote claims, and online wealth management) use random forests in their business systems to identify users' classification labels, and then make accurate business recommendations for users based on the identification results.
  • the object of the present invention is to provide a method, an electronic device, and a storage medium for training random forest models, aiming to reduce the number of random forest model training runs, lighten the system load, and improve system efficiency.
  • a first aspect of the present invention provides a method for training a random forest model, the method including:
  • the model training control system analyzes whether the model training condition has been satisfied;
  • if the model training condition has been satisfied, determining whether reconstructive training needs to be performed on the random forest model; if reconstructive training is needed, performing reconstructive training on the random forest model using sample data; if reconstructive training is not needed, performing corrective training on the random forest model using sample data.
  • a second aspect of the present invention provides an electronic device including a processing device, a storage device, and a model training control system stored in the storage device, the system including at least one computer readable instruction executable by the processing device to:
  • analyze whether the model training condition has been satisfied;
  • if the condition has been satisfied, determine whether reconstructive training needs to be performed on the random forest model; if so, perform reconstructive training on the random forest model using sample data; if not, perform corrective training on the random forest model using sample data.
  • a third aspect of the invention provides a computer readable storage medium storing at least one computer readable instruction executable by a processing device to:
  • analyze whether the model training condition has been satisfied;
  • if the condition has been satisfied, determine whether reconstructive training needs to be performed on the random forest model; if so, perform reconstructive training on the random forest model using sample data; if not, perform corrective training on the random forest model using sample data.
  • the invention has the following beneficial effects: when a random forest model is used to classify users of an online service, the invention can set or limit the conditions under which the random forest model is trained, reducing the number of model training runs without affecting the development of the online service. The type of model training can be further selected, that is, when the model training condition is satisfied, it is further confirmed whether the random forest model should currently receive reconstructive training or corrective training. This selective training of the random forest model can greatly lighten the burden on the system, improve the performance of the online business system, and facilitate the effective development of the online business.
  • FIG. 1 is a schematic diagram of an application environment of a preferred embodiment of a method for training a random forest model according to the present invention
  • FIG. 2 is a schematic flow chart of a preferred embodiment of a method for training a random forest model according to the present invention
  • FIG. 3 is a schematic diagram showing the refinement process of step S4 shown in FIG. 2;
  • FIG. 4 is a schematic structural diagram of a preferred embodiment of a model training control system according to the present invention.
  • FIG. 5 is a schematic structural diagram of the second training module shown in FIG. 4.
  • referring to FIG. 1, it is a schematic diagram of the application environment of a preferred embodiment of the method for training a random forest model according to the present invention.
  • the application environment diagram includes an electronic device 1 and a terminal device 2.
  • the electronic device 1 can perform data interaction with the terminal device 2 through a suitable technology such as a network or a near field communication technology.
  • the terminal device 2 includes, but is not limited to, any electronic product that can interact with a user through a keyboard, a mouse, a remote controller, a touch pad, or a voice control device, for example, a personal computer, a tablet computer, a smartphone, a personal digital assistant (PDA), a game console, an Internet Protocol television (IPTV), a smart wearable device, and the like.
  • the electronic device 1 is a device capable of automatically performing numerical calculation and/or information processing according to instructions set or stored in advance.
  • the electronic device 1 may be a computer, a single network server, a server group composed of multiple network servers, or a cloud-computing-based cloud composed of a large number of hosts or network servers, where cloud computing is a type of distributed computing: a super virtual computer composed of a group of loosely coupled computers.
  • the electronic device 1 includes, but is not limited to, a storage device 11, a processing device 12, and a network interface 13 that are communicably connected to each other through a system bus. It should be noted that FIG. 1 only shows the electronic device 1 having the components 11-13, but it should be understood that not all illustrated components are required to be implemented, and more or fewer components may be implemented instead.
  • the storage device 11 includes a memory and at least one type of readable storage medium.
  • the memory provides a cache for the operation of the electronic device 1;
  • the readable storage medium may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card, or a card-type memory.
  • in some embodiments, the readable storage medium may be an internal storage unit of the electronic device 1, such as a hard disk of the electronic device 1; in other embodiments, the non-volatile storage medium may also be an external storage device of the electronic device 1, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the electronic device 1.
  • the readable storage medium of the storage device 11 is generally used to store an operating system installed in the electronic device 1 and various types of application software, such as program code of the model training control system 10 in the preferred embodiment of the present application. Further, the storage device 11 can also be used to temporarily store various types of data that have been output or are to be output.
  • Processing device 12 may, in some embodiments, include one or more microprocessors, microcontrollers, digital processors, and the like.
  • the processing device 12 is generally used to control the operation of the electronic device 1, for example, to perform control and processing related to data interaction or communication with the terminal device 2.
  • the processing device 12 is configured to run program code or process data stored in the storage device 11, such as running the model training control system 10 and the like.
  • the network interface 13 may comprise a wireless network interface or a wired network interface, which is typically used to establish a communication connection between the electronic device 1 and other electronic devices.
  • the network interface 13 is mainly used to connect the electronic device 1 with one or more terminal devices 2, and establish a data transmission channel and a communication connection between the electronic device 1 and one or more terminal devices 2.
  • the model training control system 10 includes at least one computer readable instruction stored in the storage device 11, which is executable by the processing device 12 to implement the method of random forest model training of the various embodiments of the present application. As described later, the at least one computer readable instruction can be divided into different logical modules according to the functions implemented by its various parts.
  • in an embodiment, when the model training control system 10 is executed by the processing device 12, the following operations are performed: the model training control system analyzes whether the model training condition has been satisfied; if the condition is satisfied, it determines whether reconstructive training needs to be performed on the random forest model; if reconstructive training is needed, it performs reconstructive training on the random forest model using sample data; if reconstructive training is not needed, it performs corrective training on the random forest model using sample data.
  • FIG. 2 is a schematic flowchart of a preferred embodiment of a method for training a random forest model according to the present invention.
  • the method for training a random forest model in this embodiment is not limited to the steps shown in the flowchart; some of the steps shown may be omitted, and the order between the steps may be changed.
  • the method of training the random forest model includes the following steps:
  • Step S1 the model training control system analyzes whether the condition of the model training has been satisfied
  • Model training includes reconstructive training and corrective training.
  • the model training condition is set on the model training control system; it can be set flexibly by hand, or the default condition preset in the model training control system can be used.
  • for an online service, the model training condition takes user service data as its criterion (for example, model training is performed when the user service data reaches a certain quantity), or takes actual needs as its criterion (for example, staff of the model training control system send a model training instruction to the model training control system according to actual needs, so that model training is performed), or is timed by a timer so that, after one round of model training ends, the random forest model is trained again after every predetermined interval, and so on.
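As an illustration only (not part of the claimed method), the alternative triggers just described can be sketched as follows; the function name `training_condition_met`, the threshold constant, and the parameter names are our own assumptions:

```python
import time

# Hypothetical threshold: the condition is met once enough new user
# service records have accumulated since the previous training run
# (the text's example uses service data from 200 users).
FIRST_THRESHOLD = 200


def training_condition_met(new_record_count, last_trained_at=None,
                           interval_seconds=None, instruction_received=False):
    """Return True when any of the example conditions holds: a data-volume
    threshold, an explicit staff instruction, or an expired timer."""
    if new_record_count > FIRST_THRESHOLD:
        return True      # enough new user service data has accumulated
    if instruction_received:
        return True      # staff issued a model training instruction
    if last_trained_at is not None and interval_seconds is not None:
        return time.time() - last_trained_at >= interval_seconds  # timer expired
    return False
```

In practice the three triggers would be checked by the model training control system on a schedule; this sketch only shows the decision itself.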
  • Step S2: if the model training condition has been satisfied, determine whether reconstructive training needs to be performed on the random forest model;
  • determining whether reconstructive training needs to be performed on the random forest model may take as its criterion the quantity of user service data between two rounds of reconstructive training (for example, reconstructive training is performed when the quantity of user service data between two rounds of reconstructive training exceeds a certain quantity), or may take actual needs as its criterion (for example, staff of the model training control system send an instruction for reconstructive training to the model training control system according to actual needs, so that reconstructive training is performed), and so on.
  • determining whether reconstructive training needs to be performed on the random forest model includes:
  • obtaining a second quantity of user service data newly added in the business system in the period from the end of the previous reconstructive training to the current time (for example, the second quantity is the amount of service data from 500 users); if the second quantity is greater than a second preset threshold, determining that reconstructive training needs to be performed on the random forest model; if the second quantity is greater than the first preset threshold and less than the second threshold, performing corrective training on the random forest model; or
  • sending, to a predetermined terminal (for example, a suitable electronic terminal such as a mobile phone, a tablet computer, or a computer), an inquiry request asking whether reconstructive training needs to be performed on the random forest model. After the "model training" instruction is issued, the model training control system further sends the predetermined terminal a message asking whether reconstructive training should be performed; if a confirmation instruction fed back by the terminal in response to the inquiry request is received, it is determined that reconstructive training needs to be performed on the random forest model; if a negative instruction fed back by the terminal is received, or no feedback from the terminal is received within a predetermined time (for example, 3 minutes), corrective training is performed on the random forest model.
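A compact sketch of the two-threshold branch described above. The threshold values reuse the text's examples (200 and 500 users' data); the function name is our own, and this is an illustration rather than the claimed implementation:

```python
# Example thresholds from the text: the first gates whether any training
# happens at all, the second gates full reconstructive training.
FIRST_THRESHOLD = 200   # e.g. service data from 200 users
SECOND_THRESHOLD = 500  # e.g. service data from 500 users


def choose_training_type(second_quantity):
    """Decide between reconstructive and corrective training based on the
    quantity of user service data added since the last reconstructive run."""
    if second_quantity > SECOND_THRESHOLD:
        return "reconstructive"   # rebuild the decision-tree structure
    if second_quantity > FIRST_THRESHOLD:
        return "corrective"       # only retune the variable coefficients
    return "none"                 # training condition not met
```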
  • Step S3: if reconstructive training needs to be performed on the random forest model, perform reconstructive training on the random forest model using sample data;
  • Step S4: if reconstructive training does not need to be performed on the random forest model, perform corrective training on the random forest model using sample data.
  • the sample data includes old sample data and newly added sample data.
  • Reconstructive training includes determining both the variables of the random forest model and the variable coefficients; corrective training includes only determining the variable coefficients of the random forest model.
  • the variables of the random forest model include, for example, the type of algorithm, the number of decision trees, the maximum depth of the decision trees, and various data of the leaf nodes of the decision trees. Reconstructive training uses more system resources than corrective training.
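To make the distinction concrete, one possible representation of the "variables" named above and of the two kinds of training is sketched below. This is our own illustration, not the patent's code; the field names, starting values, and update formulas are assumptions chosen only to show that the corrective path leaves the variables (the tree structure) untouched:

```python
# Illustrative model "variables" as listed in the text.
variables = {
    "algorithm": "CART",   # type of algorithm used to grow each tree (assumed)
    "n_trees": 100,        # number of decision trees
    "max_depth": 8,        # maximum depth of each decision tree
}
# Per-variable coefficients, the only thing corrective training adjusts.
coefficients = {"n_trees": 1.0, "max_depth": 1.0}


def reconstructive_training(sample_data):
    """Expensive path: re-determine the variables (tree structure)
    AND the variable coefficients from the full sample data."""
    variables["n_trees"] = max(10, len(sample_data) // 10)
    variables["max_depth"] = 4 + len(sample_data) // 500
    corrective_training(sample_data)  # coefficient determination is included
    return variables


def corrective_training(sample_data):
    """Cheap path: leave the variables untouched and only retune
    the variable coefficients."""
    for name in coefficients:
        coefficients[name] = round(1.0 + len(sample_data) / 10_000, 3)
    return coefficients
```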
  • the present embodiment can set or limit the conditions under which the random forest model is trained, reducing the number of model training runs without affecting the development of the online service. The type of model training can be further selected, that is, when the model training condition is satisfied, it is further confirmed whether the random forest model should currently receive reconstructive training or corrective training, so that the random forest model is trained selectively.
  • the foregoing step S4 includes:
  • S41: determining, according to a predetermined mapping between the variables of the random forest model and the value ranges of the variable coefficients, the value range of the variable coefficient corresponding to each variable;
  • S42: selecting a value for the variable coefficient of each variable within the corresponding value range, and performing corrective training on the random forest model according to the selected variable coefficients.
  • the variables of the random forest model may be mapped to the value ranges of the variable coefficients in advance, and the mapping may be stored (for example, in the form of a list).
  • before the random forest model is trained, after the variables of the random forest model are determined, the stored mapping is retrieved to obtain the value range of the corresponding variable coefficient; the variable coefficient of each variable then takes values only within the obtained range, which guarantees the accuracy of model training while effectively increasing its speed and prevents the coefficients of the variables of the random forest model from being trained over the full numerical range.
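The stored association mapping might be applied as in the following sketch; the particular variable names and numeric ranges are assumptions for illustration, not values from the patent:

```python
# Hypothetical stored mapping from each model variable to the value range
# its coefficient may take during corrective training.
coefficient_ranges = {
    "n_trees": (0.8, 1.2),
    "max_depth": (0.5, 1.5),
}


def clamp_coefficient(variable, proposed):
    """Restrict a proposed coefficient to the stored range, so corrective
    training never searches the full numerical range."""
    low, high = coefficient_ranges[variable]
    return min(max(proposed, low), high)
```

Restricting each coefficient to a pre-associated range is what keeps the corrective pass cheap relative to a search over the full numerical range.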
  • FIG. 4 is a functional block diagram of a preferred embodiment of the model training control system 10 of the present invention.
  • in this embodiment, the model training control system 10 may be divided into one or more modules, which are stored in a memory and executed by one or more processors to complete the present invention.
  • for example, in FIG. 4, the model training control system 10 can be divided into a detection module 21, an identification module 22, a replication module 23, an installation module 24, and a startup module 25.
  • a module in the present invention refers to a series of computer program instruction segments capable of completing a specific function, and is more suitable than a program for describing the execution process of the model training control system 10 in the electronic device, wherein:
  • the analyzing module 101 is configured to analyze whether the condition of the model training has been met;
  • Model training includes reconstructive training and corrective training.
  • the model training condition is set on the model training control system 10; it can be set flexibly by hand, or the default condition preset in the model training control system can be used.
  • for an online service, the model training condition takes user service data as its criterion (for example, model training is performed when the user service data reaches a certain quantity), or takes actual needs as its criterion (for example, staff of the model training control system send a model training instruction to the model training control system according to actual needs, so that model training is performed), or is timed by a timer so that, after one round of model training ends, the random forest model is trained again after every predetermined interval, and so on.
  • the analysis module 101 is specifically configured to obtain a first quantity of user service data newly added in the business system in the period from the end of the previous model training to the current time (for example, the first quantity is the amount of service data from 200 users); if the first quantity is greater than a first preset threshold, the model training condition has been satisfied (reconstructive training or corrective training may be performed); if the first quantity is less than or equal to the first preset threshold, the model training condition is not satisfied (neither reconstructive training nor corrective training is performed); or to detect, in real time or at intervals (for example, every 10 minutes), whether a model training instruction has been received, the condition being satisfied when such an instruction is received.
  • the determination module 102 is configured to determine, if the model training condition has been satisfied, whether reconstructive training needs to be performed on the random forest model;
  • determining whether reconstructive training needs to be performed on the random forest model may take as its criterion the quantity of user service data between two rounds of reconstructive training (for example, reconstructive training is performed when the quantity of user service data between two rounds of reconstructive training exceeds a certain quantity), or may take actual needs as its criterion (for example, staff of the model training control system send an instruction for reconstructive training to the model training control system according to actual needs, so that reconstructive training is performed), and so on.
  • the determination module 102 is specifically configured to obtain a second quantity of user service data newly added in the business system in the period from the end of the previous reconstructive training to the current time (for example, the second quantity is the amount of service data from 500 users); if the second quantity is greater than a second preset threshold, to determine that reconstructive training needs to be performed on the random forest model; if the second quantity is greater than the first preset threshold and less than the second threshold, to perform corrective training on the random forest model; or
  • to send, to a predetermined terminal (for example, a suitable electronic terminal such as a mobile phone, a tablet computer, or a computer), an inquiry request asking whether reconstructive training needs to be performed on the random forest model; after the "model training" instruction is issued, the model training control system further sends the predetermined terminal a message asking whether reconstructive training should be performed; if a confirmation instruction fed back by the terminal in response to the inquiry request is received, it is determined that reconstructive training needs to be performed on the random forest model; if a negative instruction fed back by the terminal is received, or no feedback from the terminal is received within a predetermined time (for example, 3 minutes), corrective training is performed on the random forest model.
  • the first training module 103 is configured to perform reconstructive training on the random forest model using sample data if reconstructive training needs to be performed on the random forest model;
  • the second training module 104 is configured to perform corrective training on the random forest model using sample data if reconstructive training does not need to be performed on the random forest model.
  • the sample data includes old sample data and newly added sample data.
  • Reconstructive training includes determining both the variables of the random forest model and the variable coefficients; corrective training includes only determining the variable coefficients of the random forest model.
  • the variables of the random forest model include, for example, the type of algorithm, the number of decision trees, the maximum depth of the decision trees, and various data of the leaf nodes of the decision trees. Reconstructive training uses more system resources than corrective training.
  • the present embodiment can set or limit the conditions under which the random forest model is trained, reducing the number of model training runs without affecting the development of the online service. The type of model training can be further selected, that is, when the model training condition is satisfied, it is further confirmed whether the random forest model should currently receive reconstructive training or corrective training, so that the random forest model is trained selectively.
  • the second training module 104 includes:
  • a determining unit 1041, configured to determine, according to a predetermined mapping between the variables of the random forest model and the value ranges of the variable coefficients, the value range of the variable coefficient corresponding to each variable;
  • the training unit 1042 is configured to select a value for the variable coefficient of each variable within the corresponding value range, and to perform corrective training on the random forest model according to the selected variable coefficients.
  • the variables of the random forest model may be mapped to the value ranges of the variable coefficients in advance, and the mapping may be stored (for example, in the form of a list). Before the random forest model is trained, after the variables of the random forest model are determined, the stored mapping is retrieved to obtain the value range of the corresponding variable coefficient; the variable coefficient of each variable then takes values only within the obtained range, which guarantees the accuracy of model training while effectively increasing its speed and prevents the coefficients of the variables of the random forest model from being trained over the full numerical range.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Human Resources & Organizations (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Game Theory and Decision Science (AREA)
  • Medical Informatics (AREA)
  • Computational Linguistics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Sewing Machines And Sewing (AREA)

Abstract

A method for training a random forest model, an electronic device, and a storage medium. The method for training a random forest model includes: a model training control system analyzes whether a model training condition has been satisfied (S1); if the model training condition has been satisfied, determining whether reconstructive training needs to be performed on the random forest model (S2); if reconstructive training needs to be performed on the random forest model, performing reconstructive training on the random forest model using sample data (S3); if reconstructive training does not need to be performed on the random forest model, performing corrective training on the random forest model using sample data (S4). The method can reduce the number of random forest model training runs, lighten the system burden, and improve system efficiency.

Description

Method for training a random forest model, electronic device and storage medium
Priority Claim
This application claims, under the Paris Convention, the priority of Chinese patent application No. CN201710147698.3, filed on March 13, 2017 and entitled "Method for training a random forest model and model training control system"; the entire content of that Chinese patent application is incorporated herein by reference.
Technical Field
The present invention relates to the field of machine learning technology, and in particular to a method, an electronic device, and a storage medium for training a random forest model.
Background
In machine learning, a random forest is a classifier that trains on sample data and makes predictions using multiple trees; it is a classifier containing multiple decision trees, and a decision tree classifies data through a series of rules. At present, more and more enterprises providing online services (for example, remote insurance, remote claims, and online wealth management) use random forests in their business systems to identify users' classification labels, and then make accurate business recommendations and transactions for users based on the identification results.
However, when new data is available as sample data for iterative training to improve the accuracy of model identification, the existing technical solution is to perform reconstructive training on the random forest model anew using both the old sample data and the new sample data; reconstructive training refers to training that changes the decision tree structure of the random forest model. This training scheme usually performs reconstructive training as soon as new sample data arrives, so training happens many times. Especially when the data of an online service changes frequently, training becomes too frequent and the system burden too heavy, which affects the performance of the online business system and the effective development of the online business.
Summary of the Invention
The object of the present invention is to provide a method, an electronic device, and a storage medium for training a random forest model, aiming to reduce the number of random forest model training runs, lighten the system burden, and improve system efficiency.
A first aspect of the present invention provides a method for training a random forest model, the method including:
S1: a model training control system analyzes whether a model training condition has been satisfied;
S2: if the model training condition has been satisfied, determining whether reconstructive training needs to be performed on the random forest model;
S3: if reconstructive training needs to be performed on the random forest model, performing reconstructive training on the random forest model using sample data;
S4: if reconstructive training does not need to be performed on the random forest model, performing corrective training on the random forest model using sample data.
A second aspect of the present invention provides an electronic device including a processing device, a storage device, and a model training control system stored in the storage device, the model training control system including at least one computer readable instruction executable by the processing device to implement the following operations:
S1: a model training control system analyzes whether a model training condition has been satisfied;
S2: if the model training condition has been satisfied, determining whether reconstructive training needs to be performed on the random forest model;
S3: if reconstructive training needs to be performed on the random forest model, performing reconstructive training on the random forest model using sample data;
S4: if reconstructive training does not need to be performed on the random forest model, performing corrective training on the random forest model using sample data.
A third aspect of the present invention provides a computer readable storage medium storing at least one computer readable instruction executable by a processing device to implement the following operations:
S1: a model training control system analyzes whether a model training condition has been satisfied;
S2: if the model training condition has been satisfied, determining whether reconstructive training needs to be performed on the random forest model;
S3: if reconstructive training needs to be performed on the random forest model, performing reconstructive training on the random forest model using sample data;
S4: if reconstructive training does not need to be performed on the random forest model, performing corrective training on the random forest model using sample data.
The beneficial effects of the present invention are as follows: when a random forest model is used to classify users of an online service, the present invention can set or limit the conditions under which the random forest model is trained, reducing the number of model training runs without affecting the development of the online service; furthermore, the type of model training can be selected, that is, when the model training condition is satisfied, it is further confirmed whether the random forest model should currently receive reconstructive training or corrective training. This selective training of the random forest model can greatly lighten the burden on the system, improve the performance of the online business system, and facilitate the effective development of the online business.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of the application environment of a preferred embodiment of the method for training a random forest model according to the present invention;
FIG. 2 is a schematic flowchart of a preferred embodiment of the method for training a random forest model according to the present invention;
FIG. 3 is a schematic flowchart of the refinement of step S4 shown in FIG. 2;
FIG. 4 is a schematic structural diagram of a preferred embodiment of the model training control system according to the present invention;
FIG. 5 is a schematic structural diagram of the second training module shown in FIG. 4.
Detailed Description
The principles and features of the present invention are described below with reference to the accompanying drawings. The examples given are used only to explain the present invention and are not intended to limit its scope.
Referring to FIG. 1, it is a schematic diagram of the application environment of a preferred embodiment of the method for training a random forest model according to the present invention. The application environment includes an electronic device 1 and a terminal device 2. The electronic device 1 can exchange data with the terminal device 2 through a suitable technology such as a network or near field communication.
The terminal device 2 includes, but is not limited to, any electronic product that can interact with a user through a keyboard, a mouse, a remote controller, a touch pad, a voice control device, or the like, for example, a personal computer, a tablet computer, a smartphone, a personal digital assistant (PDA), a game console, an Internet Protocol television (IPTV), a smart wearable device, and so on.
The electronic device 1 is a device capable of automatically performing numerical calculation and/or information processing according to instructions set or stored in advance. The electronic device 1 may be a computer, a single network server, a server group composed of multiple network servers, or a cloud-computing-based cloud composed of a large number of hosts or network servers, where cloud computing is a type of distributed computing: a super virtual computer composed of a group of loosely coupled computers.
In this embodiment, the electronic device 1 includes, but is not limited to, a storage device 11, a processing device 12, and a network interface 13 that can be communicably connected to each other through a system bus. It should be noted that FIG. 1 shows only the electronic device 1 with components 11-13, but it should be understood that not all of the illustrated components are required; more or fewer components may be implemented instead.
The storage device 11 includes a memory and at least one type of readable storage medium. The memory provides a cache for the operation of the electronic device 1; the readable storage medium may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card, or a card-type memory. In some embodiments, the readable storage medium may be an internal storage unit of the electronic device 1, for example a hard disk of the electronic device 1; in other embodiments, the non-volatile storage medium may also be an external storage device of the electronic device 1, for example a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the electronic device 1. In this embodiment, the readable storage medium of the storage device 11 is generally used to store the operating system and various application software installed on the electronic device 1, for example the program code of the model training control system 10 in the preferred embodiment of the present application. In addition, the storage device 11 may also be used to temporarily store various types of data that have been output or are to be output.
The processing device 12 may, in some embodiments, include one or more microprocessors, microcontrollers, digital processors, and the like. The processing device 12 is generally used to control the operation of the electronic device 1, for example to perform control and processing related to data exchange or communication with the terminal device 2. In this embodiment, the processing device 12 is used to run the program code stored in the storage device 11 or to process data, for example to run the model training control system 10.
The network interface 13 may include a wireless network interface or a wired network interface and is generally used to establish a communication connection between the electronic device 1 and other electronic devices. In this embodiment, the network interface 13 is mainly used to connect the electronic device 1 with one or more terminal devices 2 and to establish a data transmission channel and a communication connection between the electronic device 1 and the one or more terminal devices 2.
The model training control system 10 includes at least one computer readable instruction stored in the storage device 11, and the at least one computer readable instruction is executable by the processing device 12 to implement the method for training a random forest model of the embodiments of the present application. As described later, the at least one computer readable instruction can be divided into different logical modules according to the functions implemented by its parts.
In an embodiment, when the model training control system 10 is executed by the processing device 12, the following operations are implemented: the model training control system analyzes whether a model training condition has been satisfied; if the model training condition has been satisfied, it determines whether reconstructive training needs to be performed on the random forest model; if reconstructive training needs to be performed on the random forest model, it performs reconstructive training on the random forest model using sample data; if reconstructive training does not need to be performed on the random forest model, it performs corrective training on the random forest model using sample data.
As shown in FIG. 2, FIG. 2 is a schematic flowchart of a preferred embodiment of the method for training a random forest model according to the present invention. The method of this embodiment is not limited to the steps shown in the flowchart; some of the steps shown may be omitted, and the order between steps may be changed. The method for training a random forest model includes the following steps:
Step S1: the model training control system analyzes whether the model training condition has been satisfied.
Model training includes reconstructive training and corrective training. The model training condition is set on the model training control system; it can be set flexibly by hand, or the default condition preset in the model training control system can be used.
For an online service, the model training condition takes user service data as its criterion (for example, model training is performed when the user service data reaches a certain quantity), or takes actual needs as its criterion (for example, staff of the model training control system send a model training instruction to the model training control system according to actual needs, so that model training is performed), or is timed by a timer so that, after one round of model training ends, the random forest model is trained again after every predetermined interval, and so on.
Preferably, analyzing whether the model training condition has been satisfied includes:
obtaining a first quantity of user service data newly added in the business system in the period from the end of the previous model training to the current time (for example, the first quantity is the amount of service data from 200 users); if the first quantity is greater than a first preset threshold, the model training condition has been satisfied (reconstructive training or corrective training may be performed); if the first quantity is less than or equal to the first preset threshold, the model training condition is not satisfied (neither reconstructive training nor corrective training is performed); or
detecting, in real time or at intervals (for example, every 10 minutes), whether a model training instruction has been received. For example, a member of staff of the model training control system logs into the system, enters the model training operation interface, and clicks or triggers the "model training" button on that interface, thereby issuing a model training instruction. When the model training control system receives the model training instruction, the model training condition has been satisfied (reconstructive training or corrective training may be performed); if no model training instruction is received, the model training condition is not satisfied (neither reconstructive training nor corrective training is performed).
Step S2: if the model training condition has been satisfied, determining whether reconstructive training needs to be performed on the random forest model.
If the model training condition has been satisfied, it is further determined whether the random forest model should receive reconstructive training or corrective training. Determining whether reconstructive training needs to be performed on the random forest model may take as its criterion the quantity of user service data between two rounds of reconstructive training (for example, reconstructive training is performed when the quantity of user service data between two rounds of reconstructive training exceeds a certain quantity), or may take actual needs as its criterion (for example, staff of the model training control system send an instruction for reconstructive training to the model training control system according to actual needs, so that reconstructive training is performed), and so on.
Preferably, determining whether reconstructive training needs to be performed on the random forest model includes:
obtaining a second quantity of user service data newly added in the business system in the period from the end of the previous reconstructive training to the current time (for example, the second quantity is the amount of service data from 500 users); if the second quantity is greater than a second preset threshold, determining that reconstructive training needs to be performed on the random forest model; if the second quantity is greater than the first preset threshold and less than the second threshold, performing corrective training on the random forest model; or
sending, to a predetermined terminal (for example, a suitable electronic terminal such as a mobile phone, a tablet computer, or a computer), an inquiry request asking whether reconstructive training needs to be performed on the random forest model. For example, after the model training operation interface is entered and a "model training" instruction is issued, the model training control system further sends the predetermined terminal a message asking whether reconstructive training should be performed; if a confirmation instruction fed back by the terminal in response to the inquiry request is received, it is determined that reconstructive training needs to be performed on the random forest model; if a negative instruction fed back by the terminal in response to the inquiry request is received, or no feedback from the terminal is received within a predetermined time (for example, 3 minutes), corrective training is performed on the random forest model.
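The terminal-inquiry fallback described above, where silence or a negative reply within the predetermined time leads to corrective training, can be sketched as follows. This is an illustration only; the queue-based feedback channel, the reply strings, and the helper names are our own assumptions:

```python
import queue


def decide_by_inquiry(feedback_queue, timeout_seconds=180.0):
    """Wait for the predetermined terminal's reply to the reconstruction
    inquiry; fall back to corrective training on a negative reply or when
    no feedback arrives within the predetermined time (3 minutes in the
    text's example)."""
    try:
        reply = feedback_queue.get(timeout=timeout_seconds)
    except queue.Empty:
        return "corrective"      # no feedback within the predetermined time
    return "reconstructive" if reply == "confirm" else "corrective"


def simulate(replies, timeout_seconds=0.01):
    """Small demo helper: preload the terminal's replies into a queue."""
    q = queue.Queue()
    for r in replies:
        q.put(r)
    return decide_by_inquiry(q, timeout_seconds)
```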
Step S3: if reconstructive training needs to be performed on the random forest model, performing reconstructive training on the random forest model using sample data.
Step S4: if reconstructive training does not need to be performed on the random forest model, performing corrective training on the random forest model using sample data.
In this embodiment, the sample data includes old sample data and newly added sample data. Reconstructive training includes determining both the variables of the random forest model and the variable coefficients; corrective training includes only determining the variable coefficients of the random forest model. The variables of the random forest model include, for example, the type of algorithm, the number of decision trees, the maximum depth of the decision trees, and various data of the leaf nodes of the decision trees. Reconstructive training uses more system resources than corrective training.
Compared with the prior art, when a random forest model is used to classify users of an online service, this embodiment can set or limit the conditions under which the random forest model is trained, reducing the number of model training runs without affecting the development of the online service; furthermore, the type of model training can be selected, that is, when the model training condition is satisfied, it is further confirmed whether the random forest model should currently receive reconstructive training or corrective training. This selective training of the random forest model can greatly lighten the burden on the system, improve the performance of the online business system, and facilitate the effective development of the online business.
In a preferred embodiment, as shown in FIG. 3, on the basis of the embodiment of FIG. 2 above, step S4 includes:
S41: determining, according to a predetermined mapping between the variables of the random forest model and the value ranges of the variable coefficients, the value range of the variable coefficient corresponding to each variable;
S42: selecting a value for the variable coefficient of each variable within the corresponding value range, and performing corrective training on the random forest model according to the selected variable coefficients.
In this embodiment, the variables of the random forest model can be mapped to the value ranges of the variable coefficients in advance, and the mapping can be stored (for example, in the form of a list). Before the random forest model is trained, after the variables of the random forest model are determined, the stored mapping is retrieved to obtain the value range of the corresponding variable coefficient, and the variable coefficient of each variable then takes values only within the obtained range. This guarantees the accuracy of model training while effectively increasing the speed of model training, preventing the coefficients of the variables of the random forest model from being trained over the full numerical range.
Referring to FIG. 4, FIG. 4 is a functional block diagram of a preferred embodiment of the model training control system 10 of the present invention. In this embodiment, the model training control system 10 may be divided into one or more modules, which are stored in a memory and executed by one or more processors to carry out the present invention. For example, in FIG. 4 the model training control system 10 may be divided into an analysis module 101, a determination module 102, a first training module 103 and a second training module 104. A module, as the term is used in the present invention, is a series of computer program instruction segments capable of performing a specific function, and is better suited than a whole program to describing how the model training control system 10 executes in the electronic device, wherein:
the analysis module 101 is configured to analyze whether the condition for model training is satisfied;
model training includes reconstructive training and corrective training. The conditions for model training are set on the model training control system 10; they may be set flexibly by hand or taken from defaults preset in the model training control system.
For an online service, the condition for model training may take the user business data as its criterion (for example, model training is performed once the user business data reaches a certain amount), may be based on actual needs (for example, staff of the model training control system send the system a model-training instruction according to actual requirements), may be timed by a timer so that, after one round of model training ends, the random forest model is trained again every predetermined interval, and so on.
Preferably, the analysis module 101 is specifically configured to acquire a first quantity of user business data newly added to the business system in the period from the moment the previous model training ended to the current moment (for example, the first quantity is the amount of business data of 200 users); if the first quantity is greater than a first preset threshold, the condition for model training is satisfied (reconstructive or corrective training may be performed); if the first quantity is less than or equal to the first preset threshold, the condition for model training is not satisfied (neither reconstructive nor corrective training is performed); or
to detect, in real time or at regular intervals (for example, every 10 minutes), whether a model-training instruction has been received. For example, a staff member of the model training control system logs in, enters the model-training interface and clicks or triggers the "model training" button on that interface, thereby issuing a model-training instruction. When the model training control system receives the instruction, the condition for model training is satisfied (reconstructive or corrective training may be performed); if no instruction is received, the condition is not satisfied (neither reconstructive nor corrective training is performed).
The determination module 102 is configured to, if the condition for model training is satisfied, determine whether reconstructive training of the random forest model is required;
if the condition for model training is satisfied, it is further determined whether the random forest model should undergo reconstructive training or corrective training. Whether reconstructive training is required may be decided from the amount of user business data accumulated between two reconstructive trainings (for example, reconstructive training is performed when that amount exceeds a certain quantity), or from actual needs (for example, staff of the model training control system send the system a reconstructive-training instruction according to actual requirements), and so on.
Preferably, the determination module 102 is specifically configured to acquire a second quantity of user business data newly added to the business system in the period from the moment the previous reconstructive training ended to the current moment (for example, the second quantity is the amount of business data of 500 users); if the second quantity is greater than a second preset threshold, it is determined that reconstructive training of the random forest model is required; if the second quantity is greater than the first preset threshold but not greater than the second preset threshold, corrective training is performed on the random forest model; or
to send, to a predetermined terminal (for example, a mobile phone, tablet computer, computer or other suitable electronic terminal), an inquiry request as to whether reconstructive training of the random forest model is required. For example, after the model-training interface is entered and the "model training" instruction is issued, the model training control system further sends the predetermined terminal a message asking whether reconstructive training should be performed. If a confirmation instruction fed back by the terminal in response to the inquiry request is received, it is determined that reconstructive training is required; if a negative instruction is received, or no feedback arrives within a predetermined time (for example, 3 minutes), corrective training is performed on the random forest model instead.
The first training module 103 is configured to, if reconstructive training of the random forest model is required, perform reconstructive training on the random forest model using sample data;
the second training module 104 is configured to, if reconstructive training of the random forest model is not required, perform corrective training on the random forest model using sample data.
In this embodiment, the sample data includes both the old sample data and the newly added sample data. Reconstructive training comprises determining the variables of the random forest model as well as determining their coefficients, whereas corrective training comprises determining the variable coefficients only. The variables of the random forest model include, for example, the type of algorithm, the number of decision trees, the maximum depth of the decision trees, various data of the decision trees' leaf nodes, and so on. Reconstructive training consumes more system resources than corrective training.
Compared with the prior art, when a random forest model is used to classify the users of an online service, this embodiment can set or restrict the conditions under which the model is trained, reducing the number of training runs without disrupting the online service, and can further select the type of training: once the training condition is met, it further confirms whether the random forest model currently requires reconstructive or corrective training. By training the model selectively in this way, the burden on the system is greatly reduced, the efficiency of the online business system is improved, and the effective operation of the online service is facilitated.
In a preferred embodiment, as shown in FIG. 5 and building on the embodiment of FIG. 4 above, the second training module 104 comprises:
a determination unit 1041, configured to determine, according to a predetermined mapping between the variables of the random forest model and coefficient value ranges, the coefficient value range corresponding to each of the variables;
a training unit 1042, configured to take, for each of the variables, a coefficient value within the corresponding coefficient value range, and to perform corrective training on the random forest model according to the coefficient values taken.
In this embodiment, the variables of the random forest model may be mapped to coefficient value ranges in advance and the mapping stored (for example, in the form of a list). Before the random forest model is trained, once its variables are determined, the stored mapping is retrieved to obtain the corresponding coefficient value ranges, and each variable's coefficient then takes values only within the retrieved range. This guarantees the accuracy of model training while effectively speeding it up, because the coefficients of the model's variables no longer have to be searched over the entire numeric domain.
The above are merely preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent substitution, improvement or the like made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (20)

  1. A method of training a random forest model, characterized in that the method comprises:
    S1, analyzing, by a model training control system, whether a condition for model training is satisfied;
    S2, if the condition for model training is satisfied, determining whether reconstructive training of a random forest model is required;
    S3, if reconstructive training of the random forest model is required, performing reconstructive training on the random forest model using sample data;
    S4, if reconstructive training of the random forest model is not required, performing corrective training on the random forest model using sample data.
  2. The method of training a random forest model according to claim 1, characterized in that step S1 comprises:
    acquiring a first quantity of user business data newly added to the business system in the period from the moment the previous model training ended to the current moment, wherein if the first quantity is greater than a first preset threshold, the condition for model training is satisfied; or
    detecting, in real time or at regular intervals, whether a model-training instruction has been received, wherein if a model-training instruction is received, the condition for model training is satisfied.
  3. The method of training a random forest model according to claim 1, characterized in that step S2 comprises:
    acquiring a second quantity of user business data newly added to the business system in the period from the moment the previous reconstructive training ended to the current moment, wherein if the second quantity is greater than a second preset threshold, it is determined that reconstructive training of the random forest model is required; or
    sending, to a predetermined terminal, an inquiry request as to whether reconstructive training of the random forest model is required, wherein if a confirmation instruction fed back by the terminal in response to the inquiry request is received, it is determined that reconstructive training of the random forest model is required.
  4. The method of training a random forest model according to claim 1, characterized in that the reconstructive training comprises determining the variables of the random forest model and determining their variable coefficients, and the corrective training comprises determining the variable coefficients of the random forest model.
  5. The method of training a random forest model according to claim 4, characterized in that step S1 comprises:
    acquiring a first quantity of user business data newly added to the business system in the period from the moment the previous model training ended to the current moment, wherein if the first quantity is greater than a first preset threshold, the condition for model training is satisfied; or
    detecting, in real time or at regular intervals, whether a model-training instruction has been received, wherein if a model-training instruction is received, the condition for model training is satisfied.
  6. The method of training a random forest model according to claim 4, characterized in that step S2 comprises:
    acquiring a second quantity of user business data newly added to the business system in the period from the moment the previous reconstructive training ended to the current moment, wherein if the second quantity is greater than a second preset threshold, it is determined that reconstructive training of the random forest model is required; or
    sending, to a predetermined terminal, an inquiry request as to whether reconstructive training of the random forest model is required, wherein if a confirmation instruction fed back by the terminal in response to the inquiry request is received, it is determined that reconstructive training of the random forest model is required.
  7. The method of training a random forest model according to claim 4, characterized in that step S4 comprises:
    S41, determining, according to a predetermined mapping between the variables of the random forest model and coefficient value ranges, the coefficient value range corresponding to each of the variables;
    S42, taking, for each of the variables, a coefficient value within the corresponding coefficient value range, and performing corrective training on the random forest model according to the coefficient values taken.
  8. An electronic device, characterized by comprising a processing device, a storage device and a model training control system, the model training control system being stored in the storage device and comprising at least one computer-readable instruction executable by the processing device to implement the following operations:
    S1, analyzing, by the model training control system, whether a condition for model training is satisfied;
    S2, if the condition for model training is satisfied, determining whether reconstructive training of a random forest model is required;
    S3, if reconstructive training of the random forest model is required, performing reconstructive training on the random forest model using sample data;
    S4, if reconstructive training of the random forest model is not required, performing corrective training on the random forest model using sample data.
  9. The electronic device according to claim 8, characterized in that step S1 comprises:
    acquiring a first quantity of user business data newly added to the business system in the period from the moment the previous model training ended to the current moment, wherein if the first quantity is greater than a first preset threshold, the condition for model training is satisfied; or
    detecting, in real time or at regular intervals, whether a model-training instruction has been received, wherein if a model-training instruction is received, the condition for model training is satisfied.
  10. The electronic device according to claim 8, characterized in that step S2 comprises:
    acquiring a second quantity of user business data newly added to the business system in the period from the moment the previous reconstructive training ended to the current moment, wherein if the second quantity is greater than a second preset threshold, it is determined that reconstructive training of the random forest model is required; or
    sending, to a predetermined terminal, an inquiry request as to whether reconstructive training of the random forest model is required, wherein if a confirmation instruction fed back by the terminal in response to the inquiry request is received, it is determined that reconstructive training of the random forest model is required.
  11. The electronic device according to claim 8, characterized in that the reconstructive training comprises determining the variables of the random forest model and determining their variable coefficients, and the corrective training comprises determining the variable coefficients of the random forest model.
  12. The electronic device according to claim 11, characterized in that step S1 comprises:
    acquiring a first quantity of user business data newly added to the business system in the period from the moment the previous model training ended to the current moment, wherein if the first quantity is greater than a first preset threshold, the condition for model training is satisfied; or
    detecting, in real time or at regular intervals, whether a model-training instruction has been received, wherein if a model-training instruction is received, the condition for model training is satisfied.
  13. The electronic device according to claim 11, characterized in that step S2 comprises:
    acquiring a second quantity of user business data newly added to the business system in the period from the moment the previous reconstructive training ended to the current moment, wherein if the second quantity is greater than a second preset threshold, it is determined that reconstructive training of the random forest model is required; or
    sending, to a predetermined terminal, an inquiry request as to whether reconstructive training of the random forest model is required, wherein if a confirmation instruction fed back by the terminal in response to the inquiry request is received, it is determined that reconstructive training of the random forest model is required.
  14. The electronic device according to claim 11, characterized in that step S4 comprises:
    S41, determining, according to a predetermined mapping between the variables of the random forest model and coefficient value ranges, the coefficient value range corresponding to each of the variables;
    S42, taking, for each of the variables, a coefficient value within the corresponding coefficient value range, and performing corrective training on the random forest model according to the coefficient values taken.
  15. A computer-readable storage medium, characterized in that it stores at least one computer-readable instruction executable by a processing device to implement the following operations:
    S1, analyzing, by a model training control system, whether a condition for model training is satisfied;
    S2, if the condition for model training is satisfied, determining whether reconstructive training of a random forest model is required;
    S3, if reconstructive training of the random forest model is required, performing reconstructive training on the random forest model using sample data;
    S4, if reconstructive training of the random forest model is not required, performing corrective training on the random forest model using sample data.
  16. The storage medium according to claim 15, characterized in that step S1 comprises:
    acquiring a first quantity of user business data newly added to the business system in the period from the moment the previous model training ended to the current moment, wherein if the first quantity is greater than a first preset threshold, the condition for model training is satisfied; or
    detecting, in real time or at regular intervals, whether a model-training instruction has been received, wherein if a model-training instruction is received, the condition for model training is satisfied.
  17. The storage medium according to claim 15, characterized in that step S2 comprises:
    acquiring a second quantity of user business data newly added to the business system in the period from the moment the previous reconstructive training ended to the current moment, wherein if the second quantity is greater than a second preset threshold, it is determined that reconstructive training of the random forest model is required; or
    sending, to a predetermined terminal, an inquiry request as to whether reconstructive training of the random forest model is required, wherein if a confirmation instruction fed back by the terminal in response to the inquiry request is received, it is determined that reconstructive training of the random forest model is required.
  18. The storage medium according to claim 15, characterized in that the reconstructive training comprises determining the variables of the random forest model and determining their variable coefficients, and the corrective training comprises determining the variable coefficients of the random forest model.
  19. The storage medium according to claim 18, characterized in that step S1 comprises:
    acquiring a first quantity of user business data newly added to the business system in the period from the moment the previous model training ended to the current moment, wherein if the first quantity is greater than a first preset threshold, the condition for model training is satisfied; or
    detecting, in real time or at regular intervals, whether a model-training instruction has been received, wherein if a model-training instruction is received, the condition for model training is satisfied.
  20. The storage medium according to claim 18, characterized in that step S4 comprises:
    S41, determining, according to a predetermined mapping between the variables of the random forest model and coefficient value ranges, the coefficient value range corresponding to each of the variables;
    S42, taking, for each of the variables, a coefficient value within the corresponding coefficient value range, and performing corrective training on the random forest model according to the coefficient values taken.
PCT/CN2017/091362 2017-03-13 2017-06-30 Method of training random forest model, electronic device and storage medium WO2018166113A1 (zh)

Priority Applications (6)

Application Number Priority Date Filing Date Title
US16/084,232 US20210081847A1 (en) 2017-03-13 2017-06-30 Method of training random forest model, electronic device and storage medium
JP2018530893A JP6587330B2 (ja) 2017-03-13 2017-06-30 ランダムフォレストモデルの訓練方法、電子装置及び記憶媒体
AU2017404119A AU2017404119A1 (en) 2017-03-13 2017-06-30 Random forest model training method, electronic apparatus and storage medium
EP17897210.5A EP3413212A4 (en) 2017-03-13 2017-06-30 RANDOM FOREST MODEL LEARNING METHOD, ELECTRONIC APPARATUS, AND INFORMATION CARRIER
KR1020187017282A KR102201919B1 (ko) 2017-03-13 2017-06-30 랜덤 포레스트 모델의 훈련 방법, 전자장치 및 저장매체
SG11201809890PA SG11201809890PA (en) 2017-03-13 2017-06-30 Method of training random forest model, electronic device and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710147698.3A 2017-03-13 Method of training random forest model and model training control system
CN201710147698.3 2017-03-13

Publications (1)

Publication Number Publication Date
WO2018166113A1 true WO2018166113A1 (zh) 2018-09-20

Family

ID=61099137

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/091362 WO2018166113A1 (zh) Method of training random forest model, electronic device and storage medium

Country Status (8)

Country Link
US (1) US20210081847A1 (zh)
EP (1) EP3413212A4 (zh)
JP (1) JP6587330B2 (zh)
KR (1) KR102201919B1 (zh)
CN (1) CN107632995B (zh)
AU (1) AU2017404119A1 (zh)
SG (1) SG11201809890PA (zh)
WO (1) WO2018166113A1 (zh)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110070128A * 2019-04-22 2019-07-30 深圳市绘云生物科技有限公司 Chronic liver disease risk assessment system based on a random forest model
CN111091408A * 2019-10-30 2020-05-01 北京天元创新科技有限公司 User identification model creation method and apparatus, and user identification method and apparatus
CN111767958A * 2020-07-01 2020-10-13 武汉楚精灵医疗科技有限公司 Real-time monitoring method for colonoscope withdrawal time based on the random forest algorithm
CN113466713A * 2021-07-15 2021-10-01 北京工业大学 Random-forest-based lithium battery safety-degree estimation method and device
CN116759014A * 2023-08-21 2023-09-15 启思半导体(杭州)有限责任公司 Random-forest-based gas type and concentration prediction method, system and device

Families Citing this family (10)

Publication number Priority date Publication date Assignee Title
CN109377388B * 2018-09-13 2023-08-18 深圳平安医疗健康科技服务有限公司 Medical insurance enrollment method, apparatus, computer device and storage medium
US11625640B2 * 2018-10-05 2023-04-11 Cisco Technology, Inc. Distributed random forest training with a predictor trained to balance tasks
CN109886544A * 2019-01-17 2019-06-14 新奥数能科技有限公司 Method, apparatus, medium and electronic device for building an equipment energy-efficiency curve model
CN110175677A * 2019-04-16 2019-08-27 平安普惠企业管理有限公司 Automatic update method, apparatus, computer device and storage medium
CN110532445A * 2019-04-26 2019-12-03 长佳智能股份有限公司 Cloud transaction system for providing neural-network training models, and method thereof
CN110232154B * 2019-05-30 2023-06-09 平安科技(深圳)有限公司 Random-forest-based product recommendation method, apparatus and medium
KR102223161B1 * 2019-10-11 2021-03-03 노주현 System and method for predicting seasonal goods on the basis of weather data
KR20210050362A 2019-10-28 2021-05-07 주식회사 모비스 Ensemble-model pruning method, and method and apparatus for generating an ensemble model for detecting gene scissors
KR102092684B1 * 2020-01-23 2020-03-24 주식회사 두두아이티 Apparatus and method for cyber-security training using the random forest technique
WO2022059207A1 * 2020-09-18 2022-03-24 日本電信電話株式会社 Determination device, determination method and determination program

Citations (4)

Publication number Priority date Publication date Assignee Title
CN102508907A * 2011-11-11 2012-06-20 北京航空航天大学 Dynamic recommendation method for a recommendation system based on training-set optimization
CN105045819A * 2015-06-26 2015-11-11 深圳市腾讯计算机系统有限公司 Model training method and device for training data
CN105678567A * 2015-12-31 2016-06-15 宁波领视信息科技有限公司 Accurate prediction system based on big-data deep learning
CN105912500A * 2016-03-30 2016-08-31 百度在线网络技术(北京)有限公司 Machine learning model generation method and device

Family Cites Families (16)

Publication number Priority date Publication date Assignee Title
US20080133434A1 (en) * 2004-11-12 2008-06-05 Adnan Asar Method and apparatus for predictive modeling & analysis for knowledge discovery
CN102221655B * 2011-06-16 2013-08-07 河南省电力公司济源供电公司 Power transformer fault diagnosis method based on a random forest model
EP2562690A3 (en) * 2011-08-22 2014-01-22 Siemens Aktiengesellschaft Assigning a number of reference measurement data sets to an input measurement data set
JP5953151B2 * 2012-07-13 2016-07-20 日本放送協会 Learning device and program
JP5946364B2 * 2012-08-21 2016-07-06 株式会社Nttドコモ Time-series data processing system and time-series data processing method
AU2013327396B2 (en) * 2012-10-03 2017-01-05 Iselect Ltd Systems and methods for use in marketing
US20140279754A1 (en) * 2013-03-15 2014-09-18 The Cleveland Clinic Foundation Self-evolving predictive model
US9324022B2 (en) * 2014-03-04 2016-04-26 Signal/Sense, Inc. Classifying data with deep learning neural records incrementally refined through expert input
CN104155596B * 2014-08-12 2017-01-18 北京航空航天大学 Random-forest-based analog circuit fault diagnosis system
US9836701B2 (en) * 2014-08-13 2017-12-05 Microsoft Technology Licensing, Llc Distributed stage-wise parallel machine learning
CN105718490A * 2014-12-04 2016-06-29 阿里巴巴集团控股有限公司 Method and device for updating a classification model
CN105809707B * 2014-12-30 2018-11-27 江苏慧眼数据科技股份有限公司 Pedestrian tracking method based on the random forest algorithm
CN106156809A * 2015-04-24 2016-11-23 阿里巴巴集团控股有限公司 Method and device for updating a classification model
US20160350675A1 (en) * 2015-06-01 2016-12-01 Facebook, Inc. Systems and methods to identify objectionable content
CN105844300A * 2016-03-24 2016-08-10 河南师范大学 Optimized classification method and device based on the random forest algorithm
CN105931224A * 2016-04-14 2016-09-07 浙江大学 Method for identifying lesions in plain-scan CT images of the liver based on the random forest algorithm


Non-Patent Citations (1)

Title
See also references of EP3413212A4 *


Also Published As

Publication number Publication date
AU2017404119A1 (en) 2018-10-11
SG11201809890PA (en) 2018-12-28
US20210081847A1 (en) 2021-03-18
EP3413212A4 (en) 2019-04-03
JP6587330B2 (ja) 2019-10-09
CN107632995B (zh) 2018-09-11
EP3413212A1 (en) 2018-12-12
KR102201919B1 (ko) 2021-01-12
JP2019513246A (ja) 2019-05-23
KR20190022431A (ko) 2019-03-06
CN107632995A (zh) 2018-01-26
AU2017404119A9 (en) 2019-06-06

Similar Documents

Publication Publication Date Title
WO2018166113A1 (zh) Method of training random forest model, electronic device and storage medium
CN108829581B (zh) Application program testing method, apparatus, computer device and storage medium
CN106980623B (zh) Method and device for determining a data model
TW201820165A (zh) Server for cloud big-data computing architecture and cloud computing resource optimization method thereof
WO2018120720A1 (zh) Test error locating method for a client program, electronic device and storage medium
US8832143B2 (en) Client-side statement cache
US11132362B2 (en) Method and system of optimizing database system, electronic device and storage medium
US11055454B1 (en) Configuring and deploying Monte Carlo simulation pipelines
US20200394448A1 (en) Methods for more effectively moderating one or more images and devices thereof
AU2021244852B2 (en) Offloading statistics collection
CN111177113A (zh) Data migration method, apparatus, computer device and storage medium
WO2018120726A1 (zh) Data-mining-based modeling method, system, electronic device and storage medium
CN115631273A (zh) Big-data deduplication method, apparatus, device and medium
CN111104214A (zh) Workflow application method and device
WO2019024238A1 (zh) Range-value data statistics method, system, electronic device and computer-readable storage medium
US11226885B1 (en) Monte Carlo simulation monitoring and optimization
WO2023077815A1 (zh) Method and device for processing sensitive data
US11003690B1 (en) Aggregator systems for storage of data segments
US20120323840A1 (en) Data flow cost modeling
CN104182522A (zh) Auxiliary indexing method and device based on a circular bitmap model
US20240193432A1 (en) Systems and methods for federated validation of models
US20240202587A1 (en) Un-learning of training data for machine learning models
US10965659B2 (en) Real-time cookie format validation and notification
CN109977221B (zh) Big-data-based user verification method and device, storage medium and electronic device
CN117411922A (zh) Requester identification method, apparatus, device and storage medium

Legal Events

Date Code Title Description
ENP Entry into the national phase: ref document number 2018530893; country of ref document: JP; kind code of ref document: A
ENP Entry into the national phase: ref document number 20187017282; country of ref document: KR; kind code of ref document: A
WWE Wipo information: entry into national phase: ref document number 2017897210; country of ref document: EP
ENP Entry into the national phase: ref document number 2017404119; country of ref document: AU; date of ref document: 20170630; kind code of ref document: A
ENP Entry into the national phase: ref document number 2017897210; country of ref document: EP; effective date: 20180828
121 Ep: the epo has been informed by wipo that ep was designated in this application: ref document number 17897210; country of ref document: EP; kind code of ref document: A1
NENP Non-entry into the national phase: ref country code: DE