WO2023123913A1 - Medical condition test model training method and apparatus, test method and apparatus, and electronic device - Google Patents

Medical condition test model training method and apparatus, test method and apparatus, and electronic device

Info

Publication number
WO2023123913A1
Authority
WO
WIPO (PCT)
Prior art keywords
linear
data
sign data
multiple regression
regression model
Prior art date
Application number
PCT/CN2022/100159
Other languages
French (fr)
Chinese (zh)
Inventor
余晓填
蚁韩羚
王爱波
王孝宇
陈宁
Original Assignee
深圳云天励飞技术股份有限公司
Priority date
Filing date
Publication date
Application filed by 深圳云天励飞技术股份有限公司
Publication of WO2023123913A1

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00: ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60: ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Definitions

  • This application relates to the field of big data analysis, and in particular to a disease detection model training method, a disease detection method, and a corresponding apparatus and device.
  • the inspection results often rest on subjective human judgment, which is very inaccurate.
  • more and more researchers have begun to study the big-data detection of target diseases based on sign data.
  • a large-capacity neural network is usually required for training on the data.
  • the detection process amounts to running each data item through the entire neural network, so the detection efficiency is likewise low. Therefore, how to improve the efficiency with which medical staff identify the condition of a target person from physical sign data, while maintaining detection accuracy, is an urgent problem to be solved.
  • the embodiments of the present application provide a disease detection model training method, a detection method, an apparatus, and a device, thereby improving the efficiency of identifying the disease of a target person based on sign data.
  • the present application provides a disease detection model training method, the method comprising: obtaining training samples used to characterize the target disease, the training samples including the sign data of a plurality of people and a data label indicating whether each person is sick; training a preset multiple regression model with the training samples, the multiple regression model including linear parameters that perform linear dimensionality reduction on a person's sign data to calculate whether the person is sick; and, when a preset condition is satisfied, outputting the multiple regression model and constructing the disease detection model according to the multiple regression model.
  • acquiring the training samples used to characterize the target disease includes: acquiring a plurality of first sign data marked with a disease label, and using the first sign data as positive samples; acquiring a plurality of second sign data marked with a no-disease label, and using the second sign data as negative samples; standardizing each positive sample and negative sample to the same sampling frequency; and combining the standardized positive samples and negative samples into a sample set to generate the training samples.
  • before training the preset multiple regression model, the method further includes: taking the sum of a first variable and a first noise parameter as a first linear expression, the first variable being the product of the sign data and a preset first linear parameter, and the first noise parameter characterizing the noise of the sign data; and taking the sum of a second variable and a second noise parameter as a second linear expression, the second variable being the product of the output result of the first linear expression and a preset second linear parameter, and the second noise parameter characterizing the noise of the output result of the first linear expression.
  • the output result of the second linear expression is used to predict whether the person corresponding to the sign data is sick, and the preset first linear parameter and the preset second linear parameter are the linear parameters of the multiple regression model; the combination of the first linear expression and the second linear expression constitutes the multiple regression model.
  • using the training samples to train the preset multiple regression model includes: transforming the first linear expression and the second linear expression into forms that express the first noise parameter and the second noise parameter, respectively; establishing an objective function based on the transformed first linear expression and second linear expression; substituting the sign data and the data labels into the objective function for iterative calculation, and adjusting the preset first linear parameter and the preset second linear parameter so that the norm of the first noise parameter and the norm of the second noise parameter decrease; and outputting at least one set of adjusted linear parameters and generating at least one corresponding multiple regression model from the adjusted linear parameters to complete the training.
  • constructing the disease detection model according to the multiple regression model includes: if training the preset multiple regression model with the training samples yields multiple multiple regression models, using the expression that computes the mean of the output results of those multiple regression models as the disease detection model.
  • the present application provides a disease detection method, the method comprising: acquiring the sign data of a target person; substituting the sign data into the disease detection model generated according to any optional implementation manner of the first aspect for calculation; and judging whether the target person suffers from the target disease according to the output result of the disease detection model.
  • the present application provides a disease detection model training device, the device including: a data collection module, used to obtain training samples used to characterize the target disease, the training samples including the sign data of a plurality of people and a data label indicating whether each person is sick; a model training module, used to train a preset multiple regression model with the training samples, the multiple regression model including linear parameters that perform linear dimensionality reduction on a person's sign data to calculate whether the person is sick; and a model output module, used to output the multiple regression model when a preset condition is met and to construct the disease detection model according to the multiple regression model.
  • the present application provides a disease detection device, the device including: a second data collection module, used to acquire the physical sign data of the target person; a calculation module, used to substitute the physical sign data into the disease detection model generated according to any optional implementation manner of the first aspect for calculation; and a result output module, used to judge whether the target person suffers from the target disease according to the output result of the disease detection model.
  • the present application provides an electronic device, including a memory and a processor that are communicatively connected to each other, wherein computer instructions are stored in the memory and the processor executes the computer instructions to perform the method described in the first aspect, the second aspect, or any optional implementation manner of the first aspect and the second aspect.
  • an embodiment of the present application provides a computer-readable storage medium storing computer instructions, the computer instructions being used to cause a computer to execute the method described in the first aspect, the second aspect, or any optional implementation manner of the first aspect and the second aspect.
  • the technical solution provided by this application obtains a large amount of physical sign data, including but not limited to body temperature data, blood oxygen saturation data, and heart rate data, as training samples. Then, based on the data noise that can be measured from the collected data, it establishes a multiple regression model that uses linear parameters to reduce the multidimensional data to a one-dimensional recognition result.
  • the multiple regression model is built so as to reduce the data noise of the body temperature data, the blood oxygen saturation data, and the heart rate data, and the training samples are used to train its linear parameters. Once the linear parameters are trained, the multiple regression model, which relies only on simple linear calculation, can be applied to the sign data of the target person to determine whether the target person is sick.
  • in actual use, the amount of calculation is far less than that of machine learning models such as large-capacity neural networks, so using the model for preliminary diagnosis of diseases can greatly improve diagnostic efficiency.
  • Fig. 1 shows a schematic diagram of the steps of a disease detection model training method in an embodiment of the present application
  • Figure 2 shows a schematic diagram of the steps of a disease detection method in an embodiment of the present application
  • Fig. 3 shows a schematic structural diagram of a disease detection model training device in an embodiment of the present application
  • Fig. 4 shows a schematic structural diagram of a disease detection device in an embodiment of the present application
  • Fig. 5 shows a schematic structural diagram of an electronic device in an embodiment of the present application.
  • a disease detection model training method specifically comprises the following steps:
  • Step S101: Obtain training samples used to characterize the target disease; the training samples include the sign data of multiple people and a data label indicating whether each person is sick. Specifically, in this embodiment, the sign data of a large number of people (including but not limited to body temperature data, blood oxygen saturation data, and heart rate data), together with a label indicating whether each person is sick (for example, "1" for a sick person and "0" for a person without the disease), serve as training samples spanning multiple symptom dimensions. When the big data analysis model established in the subsequent steps is trained, the outward manifestations of the target person's disease are therefore considered from several aspects, which improves the training accuracy of the model.
  • Step S102: Use the training samples to train the preset multiple regression model. The multiple regression model includes linear parameters, which perform linear dimensionality reduction on a person's sign data to calculate whether the person is sick.
  • Step S103: When the preset condition is satisfied, output the multiple regression model and construct a disease detection model according to it.
  • this embodiment builds the objective function used to optimize the model parameters on the data noise, which is easy to measure in the various types of sign data. Through the above steps, a machine learning model based on the multiple regression algorithm is established, so that subsequent detection of a person's disease from physical sign data is essentially a simple linear calculation, which can greatly improve the efficiency of disease detection.
  • the preset conditions that end the training process include but are not limited to: ending the training when the number of training iterations reaches a preset number; and ending the training when the linear parameters of the multiple regression model become stable, that is, when their change is less than a preset threshold.
  • the trained multiple regression model is used as the disease detection model, so that the target disease can be detected with the multiple regression model.
  • step S101 specifically includes the following steps:
  • Step 1 Obtain multiple first sign data marked with disease labels, and use the first sign data as positive samples.
  • Step 2 Obtain a plurality of second sign data marked with undiseased labels, and use the second sign data as negative samples.
  • Step 3 Standardize each positive sample and negative sample to the same sampling frequency.
  • Step 4 Combine the standardized positive samples and negative samples into a sample set to generate training samples.
  • the collected sign data may differ in sparsity because the signals are sampled at different frequencies. If data of different sparsity is left unprocessed, the training result will be inaccurate, so the data is resampled to standardize the sampling frequency. The standardized positive and negative samples are then used to train the multiple regression model on both classes, which makes the multiple regression model more accurate.
  • physical sign data includes the following three dimensions: (1) minute-level body temperature $x_t$ measured by the thermometer of a wearable device; (2) ten-minute-level heart rate $x_h$ measured by a smart bracelet; (3) hour-level blood oxygen saturation $x_b$ measured by a smart bracelet. Since the three dimensions have different sampling frequencies, averaging is used to standardize the sampling.
  • the hour-level body temperature is the average of the temperatures over the past sixty minutes, denoted $x_T$.
  • the hour-level heart rate is the average of the heart rates over the past sixty minutes, denoted $x_H$.
  • the sign data sample is a three-dimensional vector collected every hour.
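The averaging described above can be sketched in a few lines of Python. This is a minimal illustration rather than the applicant's implementation: the function names `hourly_mean` and `build_hourly_samples`, the choice to drop a trailing partial hour, and the sample readings are all assumptions made for the example.

```python
from statistics import mean


def hourly_mean(readings, per_hour):
    """Average consecutive groups of `per_hour` readings into hour-level values."""
    if per_hour <= 0:
        raise ValueError("per_hour must be positive")
    full_hours = len(readings) // per_hour  # assumption: drop a trailing partial hour
    return [mean(readings[h * per_hour:(h + 1) * per_hour]) for h in range(full_hours)]


def build_hourly_samples(temp_per_minute, heart_per_10min, spo2_per_hour):
    """Return one [x_T, x_H, x_b] vector per hour covered by all three signals."""
    x_T = hourly_mean(temp_per_minute, 60)  # 60 one-minute readings per hour
    x_H = hourly_mean(heart_per_10min, 6)   # 6 ten-minute readings per hour
    hours = min(len(x_T), len(x_H), len(spo2_per_hour))
    return [[x_T[h], x_H[h], spo2_per_hour[h]] for h in range(hours)]


# Hypothetical readings covering two hours.
temps = [36.5] * 60 + [37.5] * 60
hearts = [70, 72, 74, 76, 78, 80] + [90] * 6
spo2 = [98, 95]
samples = build_hourly_samples(temps, hearts, spo2)
```

Each element of `samples` is one hour-level three-dimensional vector; blood oxygen saturation is already hourly and is passed through unchanged.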
  • before step S102, the following steps are further included:
  • Step 5 The sum of the first variable and the first noise parameter is used as the first linear expression, the first variable is the product of the sign data and the preset first linear parameter, and the first noise parameter is used to represent the noise of the sign data.
  • Step 6 The sum of the second variable and the second noise parameter is used as the second linear expression. The second variable is the product of the output result of the first linear expression and the preset second linear parameter, and the second noise parameter characterizes the noise of the output result of the first linear expression. The output result of the second linear expression is used to predict whether the person corresponding to the sign data is sick. The preset first linear parameter and the preset second linear parameter are the linear parameters of the multiple regression model.
  • Step 7 Combine the first linear expression and the second linear expression to form a multiple regression model.
  • the first linear expression is $Y = AX + \varepsilon_Y$ and the second linear expression is $y = YB + \varepsilon_y$, where:
  • $Y \in \mathbb{R}^{1 \times 3}$ represents the hidden variable (that is, the output result of the first linear expression)
  • $X \in \mathbb{R}^{W \times 3}$ represents the input training samples (that is, sign data), $A \in \mathbb{R}^{1 \times W}$ is the first linear parameter, and $\varepsilon_Y \in \mathbb{R}^{1 \times 3}$ is the data noise parameter of the body temperature data, blood oxygen saturation data and heart rate data (that is, the first noise parameter).
  • $y \in \mathbb{R}$ indicates whether the person is sick (that is, the output result of the second linear expression)
  • $B \in \mathbb{R}^{3 \times 1}$ is the second linear parameter
  • $\varepsilon_y \in \mathbb{R}$ characterizes the noise remaining after the first linear calculation (that is, the second noise parameter)
  • $AX$ is the first variable
  • $YB$ is the second variable.
  • the actually collected sign data $X$ contains data noise, which can be calculated by, but not limited to, the variance formula. Suppose there exist linear parameters $A$ and $B$
  • such that the useful components in the sign data $X$ can be extracted, so that the dimension-reduced one-dimensional data contains the most useful components and the least noise. After the two linear processes, the noise component at the input and the noise component at the output can be expressed as $\varepsilon_Y = Y - AX$ and $\varepsilon_y = y - YB$.
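The two linear expressions can be illustrated with a short Python sketch. It assumes the shapes given above (a length-W row vector A, a W x 3 matrix X, a length-3 vector B) and, as a simplification that is not stated in the application, takes the hidden variable to be exactly AX when computing the output noise. The helper names and the numbers are hypothetical.

```python
def hidden(A, X):
    """First variable: row vector A (length W) times X (W x 3) gives a length-3 row."""
    return [sum(A[w] * X[w][j] for w in range(len(A))) for j in range(len(X[0]))]


def output(Y, B):
    """Second variable: length-3 row Y times column B gives a scalar."""
    return sum(Yj * Bj for Yj, Bj in zip(Y, B))


def output_noise(A, B, X, label):
    """Second noise term, label - Y B, with Y taken as A X (a simplifying assumption)."""
    return label - output(hidden(A, X), B)


# Hypothetical window of W = 2 hourly samples: [temperature, heart rate, SpO2].
X = [[36.5, 70.0, 98.0],
     [37.0, 72.0, 97.0]]
A = [0.5, 0.5]          # averages the two hours
B = [0.0, 0.0, 0.01]    # looks only at the SpO2 column
Y = hidden(A, X)
y_hat = output(Y, B)
eps_y = output_noise(A, B, X, label=1)
```

Because both stages are plain products, evaluating the model for one person costs only W x 3 + 3 multiplications, which is the efficiency argument the text makes.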
  • step S102 specifically includes the following steps:
  • Step 8 Transform the first linear expression and the second linear expression into forms that express the first noise parameter and the second noise parameter, respectively.
  • Step 9 Establish an objective function based on the transformed first linear expression and second linear expression.
  • Step 10 Substitute the sign data and data labels into the objective function for iterative calculation, adjusting the preset first linear parameter and the preset second linear parameter so that the norm of the first noise parameter and the norm of the second noise parameter decrease.
  • Step 11 Output at least one set of adjusted linear parameters, and generate at least one corresponding multiple regression model based on the adjusted linear parameters to complete the training.
  • the hyperparameter in the objective function takes a value in (0, 1) and is manually adjustable
  • N is the number of negative samples
  • M is the number of positive samples
  • the output y of positive samples is 1, indicating disease
  • the output y of negative samples is 0, indicating no disease
  • $\lVert \varepsilon_Y \rVert_2$ is the norm of the first noise parameter
  • $\lVert \varepsilon_y \rVert_2$ is the norm of the second noise parameter.
  • the optimization of the first linear parameter and the second linear parameter is realized through the above-mentioned optimization objective function.
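The iterative adjustment of the linear parameters can be sketched as follows. The application's exact objective function is not reproduced here; as a plainly simplified stand-in, this example runs gradient descent on the mean squared output residual of the collapsed model (AX)B. The learning rate, tolerance, seed handling, and standardized toy data are all assumptions. Both stopping rules mentioned earlier are shown: a maximum iteration count and a threshold on the parameter change.

```python
import random


def forward(A, B, X):
    """Return hidden Y = A X (length D) and the output y_hat = Y B."""
    D = len(B)
    Y = [sum(A[w] * X[w][j] for w in range(len(A))) for j in range(D)]
    return Y, sum(Y[j] * B[j] for j in range(D))


def mean_sq_residual(A, B, data):
    """Mean of (label - prediction)^2 over all samples."""
    return sum((y - forward(A, B, X)[1]) ** 2 for X, y in data) / len(data)


def train(data, lr=0.05, max_iters=2000, tol=1e-9, seed=0):
    """Gradient descent on the squared output residual (stand-in objective)."""
    rng = random.Random(seed)              # random initial linear parameters
    W, D = len(data[0][0]), len(data[0][0][0])
    A = [rng.uniform(-0.5, 0.5) for _ in range(W)]
    B = [rng.uniform(-0.5, 0.5) for _ in range(D)]
    for _ in range(max_iters):             # stop rule 1: iteration budget
        gA, gB = [0.0] * W, [0.0] * D
        for X, y in data:
            Y, y_hat = forward(A, B, X)
            r = 2.0 * (y_hat - y) / len(data)
            for w in range(W):
                gA[w] += r * sum(X[w][j] * B[j] for j in range(D))
            for j in range(D):
                gB[j] += r * Y[j]
        step = max(abs(lr * g) for g in gA + gB)
        A = [a - lr * g for a, g in zip(A, gA)]
        B = [b - lr * g for b, g in zip(B, gB)]
        if step < tol:                     # stop rule 2: parameters have stabilized
            break
    return A, B


# Hypothetical standardized samples (W = 2 hours, 3 signs); label 1 = sick, 0 = not sick.
data = [
    ([[1.0, 0.8, -0.9], [0.9, 1.0, -1.0]], 1),
    ([[0.8, 0.9, -0.7], [1.0, 0.7, -0.8]], 1),
    ([[-0.1, 0.0, 0.1], [0.1, -0.1, 0.0]], 0),
    ([[0.0, 0.1, -0.1], [-0.1, 0.0, 0.1]], 0),
]
A_init, B_init = train(data, max_iters=0)  # untrained starting point for comparison
A_fit, B_fit = train(data)
```

Calling `train` with different `seed` values starts from different random parameters and may end at different parameter sets that fit the same data, which is one way to obtain several models from a single training set.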
  • the linear parameters in the objective function are randomly assigned at the start. Even for the same training samples, multiple rounds of training may therefore yield multiple different sets of linear parameters that satisfy the constraints of the objective function.
  • the linear parameters of the first training may be more focused on body temperature data, and the linear parameters of the second training may be more focused on blood oxygen saturation data.
  • multiple rounds of training can be performed to generate multiple sets of linear parameters, and the resulting sets are substituted into the multiple regression model to obtain multiple trained models that focus on different dimensions. The sign data is then input into each model to obtain multiple prediction results, which are averaged to give the final prediction result. In this way, disease prediction from sign data takes the influence of every dimension into account more thoroughly, making the predicted probability of whether the target person is sick more accurate.
  • step S103 specifically includes the following steps:
  • Step 12 If training the preset multiple regression model with the training samples yields multiple multiple regression models, the expression that computes the average of their output results is used as the disease detection model.
  • the disease detection model does not directly use a single multiple regression model, but averages the output values of multiple multiple regression models.
  • for a detailed description of the principle of this step, refer to the descriptions of steps 10 to 11 above; it is not repeated here.
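Averaging the outputs of several trained models can be sketched as below. Each model is represented here as a hypothetical pair (A, B) of linear parameters; the function names and the toy values are assumptions for illustration only.

```python
def predict_one(A, B, X):
    """Output of one two-stage linear model: (A X) B."""
    Y = [sum(A[w] * X[w][j] for w in range(len(A))) for j in range(len(B))]
    return sum(Y[j] * B[j] for j in range(len(B)))


def ensemble_predict(models, X):
    """Mean of the outputs of several (A, B) parameter sets for the same input."""
    if not models:
        raise ValueError("at least one model is required")
    return sum(predict_one(A, B, X) for A, B in models) / len(models)


# Two hypothetical models: one looks only at the first sign, one only at the third.
X = [[1.0, 2.0, 3.0]]
models = [([1.0], [1.0, 0.0, 0.0]),
          ([1.0], [0.0, 0.0, 1.0])]
score = ensemble_predict(models, X)
```

Here the first model outputs 1.0 and the second 3.0, so the averaged score is 2.0.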
  • a disease detection method specifically includes the following steps:
  • Step S201 Obtain the physical sign data of the target person.
  • Step S202 Substituting the sign data into the disease detection model generated by the above training method for calculation.
  • Step S203 Determine whether the target person suffers from the target disease according to the output result of the disease detection model.
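Steps S201 to S203 can be sketched as a single function. The text does not say how the model output is turned into a decision, so the threshold of 0.5 (the midpoint between the labels 0 and 1) is an assumption, as are the function name `detect` and the toy model.

```python
def detect(model, sign_data, threshold=0.5):
    """Run the detection model on the sign data and compare its score to a threshold.

    `model` is any callable that maps sign data to a number; the 0.5 default
    threshold is an assumption, not a value taken from the application.
    """
    score = model(sign_data)
    return score >= threshold, score


def toy_model(sign_data):
    """Hypothetical stand-in model: a fixed weighted sum of three standardized signs."""
    weights = [0.4, 0.3, -0.3]
    return sum(w * x for w, x in zip(weights, sign_data))


is_sick, score = detect(toy_model, [1.0, 1.0, -1.0])
is_sick_low, score_low = detect(toy_model, [0.0, 0.1, 0.1])
```

In practice the threshold would be tuned on validation data to trade off missed cases against false alarms.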
  • the technical solution provided by this application obtains a large amount of physical sign data, including but not limited to body temperature data, blood oxygen saturation data, and heart rate data, as training samples. Then, based on the data noise that can be measured from the collected data, it establishes a multiple regression model that uses linear parameters to reduce the multidimensional data to a one-dimensional recognition result.
  • the multiple regression model is built so as to reduce the data noise of the body temperature data, the blood oxygen saturation data, and the heart rate data, and the training samples are used to train its linear parameters. Once the linear parameters are trained, the multiple regression model, which relies only on simple linear calculation, can be applied to the sign data of the target person to determine whether the target person is sick.
  • in actual use, the amount of calculation is far less than that of a large-capacity neural network, so using the model for the initial diagnosis of diseases can greatly improve the efficiency of diagnosis.
  • the present embodiment also provides a disease detection model training device, the device includes:
  • the data acquisition module 101 is configured to acquire training samples used to characterize the target disease, the training samples include the sign data of multiple persons and the data label of whether each person is sick. For details, refer to the relevant description of step S101 in the above method embodiment, and details are not repeated here.
  • the model training module 102 is used to train the preset multiple regression model with the training samples. The multiple regression model includes linear parameters, which perform linear dimensionality reduction on a person's sign data to calculate whether the person is sick.
  • the model output module 103 is configured to output a multiple regression model when the preset condition is satisfied, and construct a disease detection model according to the multiple regression model. For details, refer to the relevant description of step S103 in the above method embodiment, and details are not repeated here.
  • the disease detection model training device provided in the embodiment of the present application is used to implement the disease detection model training method provided in the above embodiment, and its implementation and principle are the same as those of the method. For details, refer to the relevant description of the above method embodiment; they are not repeated here.
  • this embodiment also provides a disease detection device, which includes:
  • the second data acquisition module 201 acquires the physical sign data of the target person. For details, refer to the relevant description of step S201 in the above method embodiment, and details are not repeated here.
  • the calculation module 202 is used to substitute the sign data into the disease detection model generated by the above disease detection model training method for calculation. For details, refer to the relevant description of step S202 in the above method embodiment, and details are not repeated here.
  • the result output module 203 is used for judging whether the target person suffers from the target disease according to the output result of the disease detection model. For details, refer to the relevant description of step S203 in the above method embodiment, and details are not repeated here.
  • the disease detection device provided in the embodiment of the present application is used to implement the disease detection method provided in the above embodiment.
  • its implementation and principle are the same as those of the method. For details, refer to the relevant description of the above method embodiment; they are not repeated here.
  • the technical solution provided by this application obtains a large amount of physical sign data, including but not limited to body temperature data, blood oxygen saturation data, and heart rate data, as training samples. Then, based on the data noise that can be measured from the collected data, it establishes a multiple regression model that uses linear parameters to reduce the multidimensional data to a one-dimensional recognition result, so as to reduce the data noise of the body temperature data, the blood oxygen saturation data, and the heart rate data, and it uses the training samples to train the linear parameters of the multiple regression model. Once the linear parameters are trained, the multiple regression model, which relies only on simple linear calculation, can be applied to the sign data of the target person to determine whether the target person is sick. In actual use, the amount of calculation is far less than that of a large-capacity neural network, so using the model for the initial diagnosis of diseases can greatly improve the efficiency of diagnosis.
  • FIG. 5 shows an electronic device according to an embodiment of the present application.
  • the device includes a processor 901 and a memory 902, which may be connected through a bus or in other ways.
  • connection through a bus is taken as an example.
  • the processor 901 may be a central processing unit (Central Processing Unit, CPU).
  • the processor 901 can also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA), or another chip such as a programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, or a combination of the above types of chips.
  • the memory 902 as a non-transitory computer-readable storage medium, can be used to store non-transitory software programs, non-transitory computer-executable programs and modules, such as program instructions/modules corresponding to the methods in the above method embodiments.
  • the processor 901 executes various functional applications and data processing of the processor by running the non-transitory software programs, instructions and modules stored in the memory 902, that is, implements the methods in the above method embodiments.
  • the memory 902 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required by at least one function; the data storage area may store data created by the processor 901 and the like.
  • the memory 902 may include a high-speed random access memory, and may also include a non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or other non-transitory solid-state storage devices.
  • the memory 902 may optionally include memories that are remotely located relative to the processor 901, and these remote memories may be connected to the processor 901 through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • One or more modules are stored in the memory 902, and when executed by the processor 901, the methods in the foregoing method embodiments are executed.
  • the storage medium can be a magnetic disk, an optical disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a flash memory (Flash Memory), a hard disk drive (Hard Disk Drive, HDD), or a solid-state drive (Solid-State Drive, SSD), etc.; the storage medium may also include a combination of the above-mentioned types of memory.

Abstract

The present application discloses a medical condition test model training method and apparatus, a test method and apparatus, and a device. The medical condition test model training method comprises: acquiring training samples for representing a target disease, wherein the training samples comprise physical sign data of a plurality of persons and data labels about whether the persons are diseased; training a preset multivariate regression model by using the training samples, wherein the multivariate regression model comprises linear parameters, and the linear parameters are used to perform linear dimension reduction on physical sign data of a person so as to calculate a result about whether the person is diseased; and when a preset condition is satisfied, outputting a multivariate regression model, and constructing a medical condition test model according to the multivariate regression model. The technical solution provided in the present application improves the efficiency of performing medical condition identification on a target person on the basis of physical sign data.

Description

一种病情检测模型训练、检测方法、装置和电子设备A disease detection model training, detection method, device and electronic equipment
本申请要求于2021年12月31日提交中国专利局,申请号为202111683219.2、申请名称为“一种病情检测模型训练、检测方法、装置和电子设备”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。This application claims the priority of the Chinese patent application with the application number 202111683219.2 and the title of "a disease detection model training, detection method, device and electronic equipment" submitted to the China Patent Office on December 31, 2021, the entire content of which Incorporated in this application by reference.
Technical Field
This application relates to the field of big data analysis, and in particular to a disease detection model training method, a detection method, an apparatus, and a device.
Background
With the development of big data technology, techniques that use big data for face recognition and sentence classification have emerged across many industries. In essence, they combine large amounts of data with machine learning to uncover the intrinsic differences between targets, achieving recognition and prediction that the naked eye and other senses cannot. In the medical field, a patient undergoing a series of examinations typically needs blood draws, chest X-rays, and similar procedures; examination efficiency is often low and results are slow to arrive. Efficiency is especially critical for infectious diseases, where large numbers of people must be examined. To improve efficiency, medical staff usually perform a preliminary check on the target person using physical sign data such as body temperature before the examination, but the result often relies on subjective human judgment and is inaccurate. Researchers have since begun to study big-data detection of target diseases based on physical sign data. However, to improve detection accuracy, a large-capacity neural network is usually needed for training, and each detection amounts to running one piece of data through the entire network, so detection efficiency is likewise low. Therefore, how to improve the efficiency with which medical staff identify a target person's condition from physical sign data, while maintaining detection accuracy, is a problem to be solved urgently.
Summary of the Application
In view of this, embodiments of the present application provide a disease detection model training method, a detection method, an apparatus, and a device, thereby improving the efficiency of identifying a target person's condition based on physical sign data.
According to a first aspect, the present application provides a disease detection model training method, the method comprising: acquiring training samples for characterizing a target disease, the training samples comprising physical sign data of a plurality of persons and a data label indicating whether each person is diseased; training a preset multiple regression model with the training samples, the multiple regression model comprising linear parameters, the linear parameters being used to perform linear dimensionality reduction on a person's physical sign data so as to calculate a result indicating whether the person is diseased; and, when a preset condition is satisfied, outputting the multiple regression model and constructing the disease detection model according to the multiple regression model.
Optionally, acquiring the training samples for characterizing the target disease comprises: acquiring a plurality of pieces of first physical sign data labeled as diseased, and using the first physical sign data as positive samples; acquiring a plurality of pieces of second physical sign data labeled as not diseased, and using the second physical sign data as negative samples; standardizing each positive sample and negative sample to the same sampling frequency; and combining the standardized positive samples and negative samples into a sample set to generate the training samples.
Optionally, before the preset multiple regression model is trained, the method further comprises: taking the sum of a first variable and a first noise parameter as a first linear expression, where the first variable is the product of the physical sign data and a preset first linear parameter, and the first noise parameter characterizes the noise of the physical sign data; taking the sum of a second variable and a second noise parameter as a second linear expression, where the second variable is the product of the output of the first linear expression and a preset second linear parameter, the second noise parameter characterizes the noise of the output of the first linear expression, the output of the second linear expression is used to predict whether the person corresponding to the physical sign data is diseased, and the preset first linear parameter and the preset second linear parameter are the linear parameters of the multiple regression model; and combining the first linear expression and the second linear expression to form the multiple regression model.
Optionally, training the preset multiple regression model with the training samples comprises: rewriting the first linear expression and the second linear expression so that they express the first noise parameter and the second noise parameter, respectively; establishing an objective function based on the rewritten first and second linear expressions; substituting the physical sign data and the data labels into the objective function for iterative calculation, and adjusting the preset first linear parameter and the preset second linear parameter so that the norm of the first noise parameter and the norm of the second noise parameter decrease; and outputting at least one set of adjusted linear parameters and generating at least one corresponding multiple regression model based on the adjusted linear parameters, to complete the training.
Optionally, constructing the disease detection model according to the multiple regression model comprises: if training the preset multiple regression model with the training samples yields a plurality of multiple regression models, taking the expression that calculates the average of the outputs of the plurality of multiple regression models as the disease detection model.
According to a second aspect, the present application provides a disease detection method, the method comprising: acquiring physical sign data of a target person; substituting the physical sign data into a disease detection model generated according to the first aspect or any optional implementation thereof, for calculation; and determining, according to the output of the disease detection model, whether the target person has the target disease.
According to a third aspect, the present application provides a disease detection model training apparatus, the apparatus comprising: a data acquisition module, configured to acquire training samples for characterizing a target disease, the training samples comprising physical sign data of a plurality of persons and a data label indicating whether each person is diseased; a model training module, configured to train a preset multiple regression model with the training samples, the multiple regression model comprising linear parameters, the linear parameters being used to perform linear dimensionality reduction on a person's physical sign data so as to calculate a result indicating whether the person is diseased; and a model output module, configured to output the multiple regression model when a preset condition is satisfied and to construct the disease detection model according to the multiple regression model.
According to a fourth aspect, the present application provides a disease detection apparatus, the apparatus comprising: a second data acquisition module, configured to acquire physical sign data of a target person; a calculation module, configured to substitute the physical sign data into a disease detection model generated according to the first aspect or any optional implementation thereof, for calculation; and a result output module, configured to determine, according to the output of the disease detection model, whether the target person has the target disease.
According to a fifth aspect, the present application provides an electronic device, comprising a memory and a processor that are communicatively connected to each other, the memory storing computer instructions, and the processor executing the computer instructions so as to perform the method described in the first aspect, the second aspect, or any optional implementation of the first aspect or the second aspect.
According to a sixth aspect, an embodiment of the present application provides a computer-readable storage medium storing computer instructions, the computer instructions being used to cause a computer to perform the method described in the first aspect, the second aspect, or any optional implementation of the first aspect or the second aspect.
The technical solution provided by the present application has the following advantages:
The technical solution acquires a large amount of physical sign data, including but not limited to body temperature data, blood oxygen saturation data, and heart rate data, as training samples. Based on the data noise that can be measured from the collected data, it then builds a multiple regression model that uses linear parameters to reduce the multidimensional data to a one-dimensional recognition result, and trains the linear parameters of the model on the training samples with the goal of reducing the data noise of the body temperature, blood oxygen saturation, and heart rate data. Once the linear parameters are trained, the multiple regression model, which relies only on simple linear calculation, can be applied to a target person's physical sign data to determine whether the target person is diseased. In actual use its computational cost is far lower than that of machine learning models such as large-capacity neural networks, so as a model for preliminary diagnosis it can greatly improve diagnostic efficiency.
Brief Description of the Drawings
The features and advantages of the present application will be understood more clearly with reference to the accompanying drawings, which are schematic and should not be construed as limiting the application in any way. In the drawings:
Fig. 1 is a schematic diagram of the steps of a disease detection model training method in an embodiment of the present application;
Fig. 2 is a schematic diagram of the steps of a disease detection method in an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a disease detection model training apparatus in an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a disease detection apparatus in an embodiment of the present application;
Fig. 5 is a schematic structural diagram of an electronic device in an embodiment of the present application.
Detailed Description
Referring to Fig. 1, in one embodiment, a disease detection model training method specifically comprises the following steps:
Step S101: acquire training samples for characterizing a target disease, the training samples comprising physical sign data of a plurality of persons and a data label indicating whether each person is diseased. Specifically, in this embodiment, the physical sign data of a large number of persons, including but not limited to body temperature data, blood oxygen saturation data, and heart rate data, together with a label indicating whether each person is diseased (for example, "1" for a diseased person and "0" for a person who is not diseased), serve as training samples spanning multiple sign dimensions. When the big data analysis model built in the subsequent steps is trained, the external manifestations of the target person's disease are therefore considered from multiple aspects, which improves the training accuracy of the model.
Step S102: train a preset multiple regression model with the training samples, the multiple regression model comprising linear parameters, the linear parameters being used to perform linear dimensionality reduction on a person's physical sign data so as to calculate a result indicating whether the person is diseased.
Step S103: when a preset condition is satisfied, output the multiple regression model, and construct the disease detection model according to the multiple regression model.
Specifically, the capacity of existing neural network models (the number of layers and neurons) is usually made large in order to reach a given accuracy. The model computation at detection time can essentially be regarded as one training pass without parameter adjustment, so the computation takes a long time, which seriously reduces detection efficiency when a large number of people need sign-based screening. Therefore, in the embodiments of the present application, a multiple regression model is built for the purpose of converting multidimensional physical sign data into a one-dimensional diseased-or-not result. Linear multiple regression is realized by setting linear parameters for dimensionality reduction (multidimensional parameters in matrix form), so that a one-dimensional result is obtained from multidimensional data through what is essentially linear calculation. To ensure that suitable linear parameters are chosen, this embodiment optimizes the model parameters using the data noise that is easy to detect in each kind of physical sign data; for example, an optimization objective function is built with the goal of making the noise of each kind of data as small as possible and the valid portion of the data as large as possible. The above steps thus establish a machine learning model based on a multiple regression algorithm, so that subsequent detection of a person's disease from physical sign data is essentially a simple linear calculation, which greatly improves detection efficiency. In this embodiment, the preset conditions for ending training include but are not limited to: ending training when the number of training rounds reaches a preset count; and ending training when the linear parameters of the multiple regression model stabilize, that is, when their change is less than a preset threshold. In this embodiment, the trained multiple regression model is used as the disease detection model, which enables detection of the target disease based on the multiple regression model.
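The two stopping conditions named above can be sketched as a small check run after each training round. This is a minimal illustration rather than the applicant's implementation: the function name `should_stop`, the flat-list representation of the linear parameters, and the default values of `max_rounds` and `tol` are all assumptions.

```python
def should_stop(round_idx, prev_params, params, max_rounds=1000, tol=1e-6):
    """Return True when either preset condition is met.

    round_idx   -- number of training rounds completed so far
    prev_params -- flat list of linear-parameter values from the previous round
    params      -- flat list of linear-parameter values from the current round
    max_rounds and tol are hypothetical defaults, not values from the source.
    """
    # Condition 1: the number of training rounds reached the preset count.
    if round_idx >= max_rounds:
        return True
    # Condition 2: the linear parameters have stabilized, i.e. the largest
    # absolute change of any parameter is below the preset threshold.
    change = max(abs(p - q) for p, q in zip(params, prev_params))
    return change < tol
```

A training loop would call `should_stop` once per round and output the current linear parameters as soon as it returns `True`.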
Specifically, in one embodiment, step S101 comprises the following steps:
Step 1: acquire a plurality of pieces of first physical sign data labeled as diseased, and use the first physical sign data as positive samples.
Step 2: acquire a plurality of pieces of second physical sign data labeled as not diseased, and use the second physical sign data as negative samples.
Step 3: standardize each positive sample and negative sample to the same sampling frequency.
Step 4: combine the standardized positive samples and negative samples into a sample set to generate the training samples.
Specifically, different kinds of physical sign data may be collected at different sampling frequencies, so the data differ in sparsity. Training on data of differing sparsity without preprocessing gives inaccurate results, so in this embodiment the sampling frequency of the multidimensional sampled data is standardized. The standardized positive and negative samples then provide complete positive and negative training for the multiple regression model, which makes the model more accurate. For example, suppose the physical sign data has the following three dimensions: (1) minute-level body temperature $x_t$ measured by the thermometer of a wearable device; (2) ten-minute-level heart rate $x_h$ measured by a smart wristband; (3) hour-level blood oxygen saturation $x_b$ measured by a smart wristband. Because the three dimensions are sampled at different frequencies, averaging is used to standardize the sampling, with hour-level data as the unified collection standard. The hour-level body temperature is therefore the average body temperature over the past sixty minutes, denoted $x_T$, and the hour-level heart rate is the average heart rate over the past sixty minutes, denoted $x_H$. After standardization, with one hour as the sampling unit, each physical sign data sample is
$$x = [x_T;\ x_H;\ x_b]$$
That is, a physical sign data sample is a three-dimensional vector collected once per hour. Arranging the physical sign data as a time series gives the sample set $D = \{x_1, x_2, \ldots\}$.
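The hourly averaging described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `hourly_sample`, its argument names, and the readings are made up for the example, and each call is assumed to receive exactly the readings that fall within one hour.

```python
def hourly_sample(temp_per_min, hr_per_10min, spo2_hourly):
    """Build one hourly sign-data vector x = [x_T, x_H, x_b].

    temp_per_min -- minute-level temperature readings for the hour (60 values)
    hr_per_10min -- ten-minute-level heart-rate readings for the hour (6 values)
    spo2_hourly  -- the single hour-level blood-oxygen-saturation reading
    """
    x_T = sum(temp_per_min) / len(temp_per_min)  # hourly mean temperature
    x_H = sum(hr_per_10min) / len(hr_per_10min)  # hourly mean heart rate
    x_b = float(spo2_hourly)                     # already hour-level
    return [x_T, x_H, x_b]

# Two hours of made-up readings, collected into the sample set D = {x_1, x_2}.
D = [
    hourly_sample([36.5] * 60, [70, 72, 74, 72, 70, 74], 98.0),
    hourly_sample([38.0] * 60, [90, 92, 94, 92, 90, 94], 95.0),
]
```

Averaging to the coarsest rate (hourly) keeps every dimension aligned on the same time grid without interpolating values that were never measured.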
Specifically, in one embodiment, the following steps are performed before step S102:
Step 5: take the sum of a first variable and a first noise parameter as a first linear expression, where the first variable is the product of the physical sign data and a preset first linear parameter, and the first noise parameter characterizes the noise of the physical sign data.
Step 6: take the sum of a second variable and a second noise parameter as a second linear expression, where the second variable is the product of the output of the first linear expression and a preset second linear parameter, the second noise parameter characterizes the noise of the output of the first linear expression, the output of the second linear expression is used to predict whether the person corresponding to the physical sign data is diseased, and the preset first linear parameter and the preset second linear parameter are the linear parameters of the multiple regression model.
Step 7: combine the first linear expression and the second linear expression to form the multiple regression model.
Specifically, the multiple regression model of steps 5 to 7 is expressed as follows:
$$Y = AX + \varepsilon_Y$$
$$y = YB + \varepsilon_y$$
Here $Y \in \mathbb{R}^{1\times 3}$ is the hidden variable (the output of the first linear expression), $X \in \mathbb{R}^{W\times 3}$ is the input training sample (the physical sign data), $A \in \mathbb{R}^{1\times W}$ is the first linear parameter, and $\varepsilon_Y \in \mathbb{R}^{1\times 3}$ is the data noise parameter of the body temperature, blood oxygen saturation, and heart rate data (the first noise parameter). $y \in \mathbb{R}$ is the output indicating whether the person is diseased (the output of the second linear expression), $B \in \mathbb{R}^{3\times 1}$ is the second linear parameter, and $\varepsilon_y \in \mathbb{R}$ is the noise parameter of the output after the first linear calculation (the second noise parameter). $AX$ is the first variable and $YB$ is the second variable.
Specifically, to reduce the dimensionality of the multidimensional physical sign data $X$ in matrix form, the usual linear approach is to multiply it on the left and on the right by corresponding dimensionality-reduction matrices (the first linear parameter $A$ and the second linear parameter $B$), that is, $y = AXB$, which converts the high-dimensional sign data into a one-dimensional output $y$. The physical sign data $X$ actually collected contains data noise, which can be calculated by, but not limited to, a variance formula. Suppose there exist linear parameters $A$ and $B$ that extract the useful components of the sign data $X$, so that the reduced one-dimensional data contains the most useful components and the least noise. Then, after the two linear operations, the noise component at the input and the noise component at the output can be expressed respectively as $\varepsilon_Y = Y - AX$ and $\varepsilon_y = y - YB$, that is, the components remaining after the training sample is multiplied by the linear parameters to extract the useful components. Taking minimum noise as the goal makes it possible to optimize the linear parameters $A$ and $B$, and thereby to train the multiple regression model by machine learning to find the most suitable linear parameters, so that a binary result indicating whether the target person is diseased can be calculated from the input multi-sign data.
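The two linear expressions and their noise terms can be written out directly with matrix products. The sketch below uses NumPy and only illustrates the shapes involved. The function names are hypothetical, and `W` is taken here to be the number of hourly samples stacked in `X`, which is an assumption because the source does not define it.

```python
import numpy as np

def hidden(A, X):
    """Noise-free first linear expression: Y = A X, shape (1, 3)."""
    return A @ X

def predict(A, X, B):
    """Noise-free model y = A X B, returned as a Python float."""
    return (A @ X @ B).item()

def residuals(A, X, B, Y, y):
    """Noise terms eps_Y = Y - A X (shape (1, 3)) and eps_y = y - Y B (scalar)."""
    eps_Y = Y - A @ X
    eps_y = y - (Y @ B).item()
    return eps_Y, eps_y

W = 4                                             # assumed: 4 hourly samples
X = np.arange(W * 3, dtype=float).reshape(W, 3)   # made-up sign data, W x 3
A = np.full((1, W), 1.0 / W)                      # averages the W rows
B = np.full((3, 1), 1.0 / 3)                      # averages the 3 sign dimensions
```

With these example parameters `A @ X` has shape `(1, 3)` and `A @ X @ B` collapses to a single number, which is the dimensionality reduction the paragraph describes.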
Specifically, in one embodiment, building on steps 5 to 7, step S102 comprises the following steps:
Step 8: rewrite the first linear expression and the second linear expression so that they express the first noise parameter and the second noise parameter, respectively.
Step 9: establish an objective function based on the rewritten first and second linear expressions.
Step 10: substitute the physical sign data and the data labels into the objective function for iterative calculation, and adjust the preset first linear parameter and the preset second linear parameter so that the norm of the first noise parameter and the norm of the second noise parameter decrease.
Step 11: output at least one set of adjusted linear parameters, and generate at least one corresponding multiple regression model based on the adjusted linear parameters, to complete the training.
Specifically, the objective function of steps 8 and 9 is expressed as follows:
[Figure PCTCN2022100159-appb-000001: objective function]
Here $\lambda \in (0, 1)$ is a manually tunable hyperparameter, $N$ is the number of negative samples, and $M$ is the number of positive samples. The output $y$ of a positive sample is 1, indicating diseased, and the output $y$ of a negative sample is 0, indicating not diseased. $\|Y - AX\|_2$ is the norm of the first noise parameter, and $\|y - YB\|_2$ is the norm of the second noise parameter.
In this embodiment, the first linear parameter and the second linear parameter are optimized through the above objective function. The calculation proceeds as follows. First, the hidden variable $Y$ and the second linear parameter $B$ are assigned random values. Because the actual output $y$ of each training sample is known (it is the data label), $\varepsilon_y = y - YB$ can be computed. Next, $\varepsilon_Y = Y - AX$ is computed from the randomly assigned hidden variable and a randomly assigned first linear parameter $A$, where the physical sign data $X$ is known. Because the actual noise $\varepsilon_Y$ of the physical sign data can be measured, the error between the measured input noise component and the computed input noise component indicates whether the randomly chosen first linear parameter and hidden variable are suitable. If the error is large (for example, above a preset threshold), the first linear parameter and hidden variable are unsuitable. If the error is small (for example, below the preset threshold), the current first linear parameter, hidden variable, and second linear parameter are considered acceptable, and the current first and second linear parameters are recorded together with the two current data-noise values. This calculation is then iterated multiple times in alternation, recording the data noise each time. Finally, the smallest pair of data-noise values among all recorded cases is selected, which gives the corresponding first and second linear parameters and completes the optimization of the linear parameters. The optimized linear parameters are then substituted into the noise-free model $y = AXB$, which yields a one-dimensional diagnostic result from the high-dimensional physical sign data; the resulting first and second linear parameters extract as much of the useful component of the sign data as possible while ignoring as much of the noise component as possible. For example, suppose the physical sign data of a target person is input into the trained multiple regression model $y = AXB$ and the output is $y = 0.7$. In this embodiment, an output greater than the preset threshold 0.5 means the target person is considered to have the target disease, and an output less than 0.5 means the target person is considered not to have it. Through the above steps, the judgment of a target person's condition from physical sign data can be computed quickly, which greatly improves the efficiency with which medical staff perform preliminary screening.
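The random-assignment loop described above can be sketched as a simple random search that keeps the candidate with the smallest noise. This is a heavily simplified illustration, not the applicant's algorithm. The exact objective function appears only as a figure in the source, so the weighted score `lam * sum ||Y - A X||_2 + (1 - lam) * sum |y - Y B|` used here is an assumption, as are the function names, the normal sampling distribution, and the omission of the comparison against measured noise.

```python
import numpy as np

def noise_score(A, B, Ys, Xs, ys, lam=0.5):
    """Assumed score: lam * sum ||Y - A X||_2 + (1 - lam) * sum |y - Y B|."""
    eps_Y = sum(np.linalg.norm(Y - A @ X) for Y, X in zip(Ys, Xs))
    eps_y = sum(abs(y - (Y @ B).item()) for Y, y in zip(Ys, ys))
    return lam * eps_Y + (1.0 - lam) * eps_y

def random_search(Xs, ys, rounds=200, lam=0.5, seed=0):
    """Draw random (A, Y, B) each round and keep the lowest-noise (A, B)."""
    rng = np.random.default_rng(seed)
    W = Xs[0].shape[0]
    best = None
    for _ in range(rounds):
        A = rng.normal(size=(1, W))                  # first linear parameter
        B = rng.normal(size=(3, 1))                  # second linear parameter
        Ys = [rng.normal(size=(1, 3)) for _ in Xs]   # random hidden variables
        score = noise_score(A, B, Ys, Xs, ys, lam)
        if best is None or score < best[0]:
            best = (score, A, B)                     # record the best so far
    return best

def is_diseased(A, X, B, threshold=0.5):
    """Apply y = A X B and compare with the 0.5 threshold from the text."""
    return (A @ X @ B).item() > threshold
```

Pure random search converges slowly, so a practical implementation would likely replace the sampling step with a gradient-based or least-squares update while keeping the same score and the same 0.5 decision threshold.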
Specifically, with the objective function of steps 8 and 9, one round of training by random assignment and alternating iteration yields a set of linear parameters that satisfies the objective function. However, because the linear parameters in the objective function are assigned randomly, multiple rounds of training on the same training samples may yield several different sets of linear parameters that all satisfy the constraints of the objective function. In essence, the different sets of linear parameters emphasize different dimensions; for example, the linear parameters from the first round may emphasize the body temperature data, while those from the second round may emphasize the blood oxygen saturation data. Therefore, in this embodiment, to further improve the prediction accuracy of the multiple regression model, steps 10 and 11 may be run for multiple rounds to generate multiple sets of linear parameters. Each set is substituted into the multiple regression model to obtain several trained models that emphasize different dimensions. The physical sign data is then input into each model to obtain several prediction results, which are averaged to give the final prediction result. In this way, disease prediction based on physical sign data takes the influence of each data dimension into account more fully, making the predicted probability that the target person is diseased more accurate.
Specifically, in an embodiment, based on steps 8 to 11 above, step S103 specifically includes the following step:
Step 12: If training the preset multiple regression model with the training samples yields several multiple regression models, the expression that computes the average of the outputs of these multiple regression models is used as the disease detection model. Specifically, in this embodiment, the disease detection model does not use a single multiple regression model directly, but averages the output values of several multiple regression models. For the detailed principle of this step, refer to the description of steps 10 and 11 above, which is not repeated here.
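Step 12 can be illustrated with a short sketch, offered under the same assumptions about shapes as y=AXB above (A a row vector over sign channels, B a column vector over time samples). The two parameter sets and the sign data are invented so that one model emphasizes the first channel and the other emphasizes the second; none of the names or numbers come from the application.

```python
from typing import Sequence, Tuple

Matrix = Sequence[Sequence[float]]
Params = Tuple[Sequence[float], Sequence[float]]


def score(a: Sequence[float], x: Matrix, b: Sequence[float]) -> float:
    """y = A X B for row vector A, m-by-n matrix X and column vector B."""
    return sum(a[i] * x[i][j] * b[j] for i in range(len(a)) for j in range(len(b)))


def averaged_score(models: Sequence[Params], x: Matrix) -> float:
    """Disease detection model of step 12: the mean output of all trained models."""
    if not models:
        raise ValueError("at least one trained multiple regression model is required")
    return sum(score(a, x, b) for a, b in models) / len(models)


# Two invented parameter sets: the first looks only at channel 0,
# the second only at channel 1; both average over the two time samples.
models = [([1.0, 0.0], [0.5, 0.5]),
          ([0.0, 1.0], [0.5, 0.5])]
sign_data = [[0.8, 1.0],   # channel 0 (for example, standardized body temperature)
             [0.4, 0.6]]   # channel 1 (for example, standardized blood oxygen saturation)

final = averaged_score(models, sign_data)   # individual scores are about 0.9 and 0.5
print(final > 0.5)  # True: the averaged score of about 0.7 exceeds the 0.5 threshold
```

Averaging keeps the detection model a simple linear expression in the sign data, so the cost grows only linearly with the number of trained models.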
Referring to FIG. 2, in one embodiment, a disease detection method specifically includes the following steps:
Step S201: Acquire the sign data of a target person.
Step S202: Substitute the sign data into the disease detection model generated by the above training method for calculation.
Step S203: Determine, according to the output of the disease detection model, whether the target person suffers from the target disease.
Specifically, for the detailed principle of identifying the target disease from the sign data, refer to the description of the above embodiment of the disease detection model training method, which is not repeated here.
Through the above steps, the technical solution provided by this application acquires a large amount of sign data, including but not limited to body temperature data, blood oxygen saturation data and heart rate data, as training samples. Then, based on the data noise that can be measured in the collected data, a multiple regression model is established that uses linear parameters to reduce the multidimensional data to a one-dimensional recognition result, and the linear parameters of the multiple regression model are trained on the training samples with the goal of reducing the data noise of the body temperature data, the blood oxygen saturation data and the heart rate data. Once the linear parameters have been trained, the multiple regression model, which relies only on simple linear calculation, can be applied to the sign data of a target person to determine whether the target person is ill. In actual use its computational cost is far lower than that of a large-capacity neural network, so as a model for the preliminary diagnosis of a disease it can greatly improve diagnostic efficiency.
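To make the cost of the linear calculation concrete, the following small sketch counts the scalar multiplications needed to evaluate y=AXB once. It assumes, for illustration only, that A is 1-by-m, X is m-by-n and B is n-by-1, and that the product is evaluated as (AX)B; the function name and the example sizes are not taken from the application.

```python
def multiplications_for_score(m: int, n: int) -> int:
    """Scalar multiplications for y = (A X) B with A 1-by-m, X m-by-n, B n-by-1.

    A X needs m multiplications for each of the n columns (m * n in total),
    and multiplying the resulting 1-by-n row by B needs n more.
    """
    return m * n + n


# Example: three sign channels (body temperature, blood oxygen saturation,
# heart rate) with 60 time samples each.
print(multiplications_for_score(3, 60))  # 240
```

The count grows only with the size of the sign data itself, since the model has no hidden layers.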
As shown in FIG. 3, this embodiment further provides a disease detection model training apparatus, the apparatus including:
A data acquisition module 101, configured to acquire training samples for characterizing a target disease, the training samples including sign data of a plurality of persons and a data label indicating whether each person is ill. For details, refer to the description of step S101 in the above method embodiment, which is not repeated here.
A model training module 102, configured to train a preset multiple regression model with the training samples, the multiple regression model including linear parameters that are used to perform linear dimensionality reduction on the sign data of a person so as to calculate a result indicating whether the person is ill. For details, refer to the description of step S102 in the above method embodiment, which is not repeated here.
A model output module 103, configured to output the multiple regression model when a preset condition is satisfied, and to construct the disease detection model from the multiple regression model. For details, refer to the description of step S103 in the above method embodiment, which is not repeated here.
The disease detection model training apparatus provided in this embodiment of the present application is used to execute the disease detection model training method provided in the above embodiment, and its implementation and principle are the same. For details, refer to the description of the above method embodiment, which is not repeated here.
As shown in FIG. 4, this embodiment further provides a disease detection apparatus, the apparatus including:
A second data acquisition module 201, configured to acquire the sign data of a target person. For details, refer to the description of step S201 in the above method embodiment, which is not repeated here.
A calculation module 202, configured to substitute the sign data into the disease detection model generated by the above disease detection model training method for calculation. For details, refer to the description of step S202 in the above method embodiment, which is not repeated here.
A result output module 203, configured to determine, according to the output of the disease detection model, whether the target person suffers from the target disease. For details, refer to the description of step S203 in the above method embodiment, which is not repeated here.
The disease detection apparatus provided in this embodiment of the present application is used to execute the disease detection method provided in the above embodiment, and its implementation and principle are the same. For details, refer to the description of the above method embodiment, which is not repeated here.
Through the cooperation of the above components, the technical solution provided by this application acquires a large amount of sign data, including but not limited to body temperature data, blood oxygen saturation data and heart rate data, as sign data training samples. Then, based on the data noise that can be measured in the collected data, a multiple regression model is established that uses linear parameters to reduce the multidimensional data to a one-dimensional recognition result, and the linear parameters of the multiple regression model are trained on the training samples with the goal of reducing the data noise of the body temperature data, the blood oxygen saturation data and the heart rate data. Once the linear parameters have been trained, the multiple regression model, which relies only on simple linear calculation, can be applied to the sign data of a target person to determine whether the target person is ill. In actual use its computational cost is far lower than that of a large-capacity neural network, so as a model for the preliminary diagnosis of a disease it can greatly improve diagnostic efficiency.
FIG. 5 shows an electronic device according to an embodiment of the present application. The device includes a processor 901 and a memory 902, which may be connected by a bus or in another manner; FIG. 5 takes connection by a bus as an example.
The processor 901 may be a central processing unit (CPU). The processor 901 may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component or a similar chip, or a combination of the above types of chips.
The memory 902, as a non-transitory computer-readable storage medium, can be used to store non-transitory software programs, non-transitory computer-executable programs and modules, such as the program instructions/modules corresponding to the methods in the above method embodiments. By running the non-transitory software programs, instructions and modules stored in the memory 902, the processor 901 executes its various functional applications and data processing, that is, implements the methods in the above method embodiments.
The memory 902 may include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required by at least one function, and the data storage area may store data created by the processor 901 and the like. In addition, the memory 902 may include a high-speed random access memory, and may also include a non-transitory memory, such as at least one magnetic disk storage device, a flash memory device or another non-transitory solid-state storage device. In some embodiments, the memory 902 may optionally include memories located remotely from the processor 901, and these remote memories may be connected to the processor 901 through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks and combinations thereof.
One or more modules are stored in the memory 902 and, when executed by the processor 901, perform the methods in the above method embodiments.
For specific details of the above electronic device, refer to the corresponding descriptions and effects in the above method embodiments, which are not repeated here.
Those skilled in the art will understand that all or part of the processes of the methods in the above embodiments can be completed by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), a flash memory, a hard disk drive (HDD), a solid-state drive (SSD) or the like; the storage medium may also include a combination of the above types of memory.
Although the embodiments of the present application have been described with reference to the accompanying drawings, those skilled in the art may make various modifications and variations without departing from the spirit and scope of the present application, and all such modifications and variations fall within the scope defined by the appended claims.

Claims (10)

  1. A method for training a disease detection model, wherein the method comprises:
    acquiring training samples for characterizing a target disease, the training samples comprising sign data of a plurality of persons and a data label indicating whether each person is ill;
    training a preset multiple regression model with the training samples, the multiple regression model comprising linear parameters, the linear parameters being used to perform linear dimensionality reduction on the sign data of a person so as to calculate a result indicating whether the person is ill;
    when a preset condition is satisfied, outputting the multiple regression model, and constructing the disease detection model from the multiple regression model.
  2. The method according to claim 1, wherein acquiring the training samples for characterizing the target disease comprises:
    acquiring a plurality of pieces of first sign data labeled as ill, and using the first sign data as positive samples;
    acquiring a plurality of pieces of second sign data labeled as not ill, and using the second sign data as negative samples;
    standardizing each positive sample and each negative sample at the same sampling frequency;
    combining the standardized positive samples and negative samples into a sample set to generate the training samples.
  3. The method according to claim 1, wherein, before the preset multiple regression model is trained, the method further comprises:
    taking the sum of a first variable and a first noise parameter as a first linear expression, the first variable being the product of the sign data and a preset first linear parameter, and the first noise parameter being used to characterize the noise of the sign data;
    taking the sum of a second variable and a second noise parameter as a second linear expression, the second variable being the product of the output of the first linear expression and a preset second linear parameter, the second noise parameter being used to characterize the noise of the output of the first linear expression, the output of the second linear expression being used to predict whether the person corresponding to the sign data is ill, and the preset first linear parameter and the preset second linear parameter being the linear parameters of the multiple regression model;
    combining the first linear expression and the second linear expression to form the multiple regression model.
  4. The method according to claim 3, wherein training the preset multiple regression model with the training samples comprises:
    transforming the first linear expression and the second linear expression into forms that express the first noise parameter and the second noise parameter, respectively;
    establishing an objective function based on the transformed first linear expression and second linear expression;
    substituting the sign data and the data labels into the objective function for iterative calculation, and adjusting the preset first linear parameter and the preset second linear parameter so that the norm of the first noise parameter and the norm of the second noise parameter decrease;
    outputting at least one set of adjusted linear parameters, and generating at least one corresponding multiple regression model based on the adjusted linear parameters to complete the training.
  5. The method according to claim 4, wherein constructing the disease detection model from the multiple regression model comprises:
    if training the preset multiple regression model with the training samples yields a plurality of the multiple regression models, using the expression that computes the average of the outputs of the plurality of multiple regression models as the disease detection model.
  6. A disease detection method, wherein the method comprises:
    acquiring sign data of a target person;
    substituting the sign data into a disease detection model generated by the method according to any one of claims 1-5 for calculation;
    determining, according to the output of the disease detection model, whether the target person suffers from the target disease.
  7. A disease detection model training apparatus, wherein the apparatus comprises:
    a data acquisition module, configured to acquire training samples for characterizing a target disease, the training samples comprising sign data of a plurality of persons and a data label indicating whether each person is ill;
    a model training module, configured to train a preset multiple regression model with the training samples, the multiple regression model comprising linear parameters, the linear parameters being used to perform linear dimensionality reduction on the sign data of a person so as to calculate a result indicating whether the person is ill;
    a model output module, configured to output the multiple regression model when a preset condition is satisfied, and to construct the disease detection model from the multiple regression model.
  8. A disease detection apparatus, wherein the apparatus comprises:
    a second data acquisition module, configured to acquire sign data of a target person;
    a calculation module, configured to substitute the sign data into a disease detection model generated by the method according to any one of claims 1-5 for calculation;
    a result output module, configured to determine, according to the output of the disease detection model, whether the target person suffers from the target disease.
  9. An electronic device, comprising:
    a memory and a processor, the memory and the processor being communicatively connected to each other, the memory storing computer instructions, and the processor executing the computer instructions so as to perform the method according to any one of claims 1-6.
  10. A computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions, and the computer instructions are used to cause a computer to perform the method according to any one of claims 1-6.
PCT/CN2022/100159 2021-12-31 2022-06-21 Medical condition test model training method and apparatus, test method and apparatus, and electronic device WO2023123913A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111683219.2A CN116434897A (en) 2021-12-31 2021-12-31 Illness state detection model training and detecting method and device and electronic equipment
CN202111683219.2 2021-12-31

Publications (1)

Publication Number Publication Date
WO2023123913A1 true WO2023123913A1 (en) 2023-07-06

Family

ID=86997341

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/100159 WO2023123913A1 (en) 2021-12-31 2022-06-21 Medical condition test model training method and apparatus, test method and apparatus, and electronic device

Country Status (2)

Country Link
CN (1) CN116434897A (en)
WO (1) WO2023123913A1 (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100174152A1 (en) * 2009-01-02 2010-07-08 Cerner Innovation, Inc. Predicting neonatal hyperbilirubinemia
CN106202988A (en) * 2016-10-11 2016-12-07 哈尔滨工业大学深圳研究生院 The Stepwise multiple-regression model of a kind of predictive disease life cycle and application
CN110957043A (en) * 2018-09-26 2020-04-03 金敏 Disease prediction system
CN109872011A (en) * 2019-03-18 2019-06-11 重庆邮电大学 Livestock physiological status prediction technique and system based on multivariate logistic regression model
CN112801224A (en) * 2021-03-26 2021-05-14 平安科技(深圳)有限公司 Diabetes typing probability prediction method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN116434897A (en) 2023-07-14


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22913202

Country of ref document: EP

Kind code of ref document: A1