WO2021228148A1 - Feature extraction method, model training method and hardware for protecting personal data privacy - Google Patents


Info

Publication number
WO2021228148A1
WO2021228148A1 · PCT/CN2021/093367 · CN2021093367W
Authority
WO
WIPO (PCT)
Prior art keywords
image sequence
frame image
sample object
feature data
encrypted feature
Prior art date
Application number
PCT/CN2021/093367
Other languages
English (en)
French (fr)
Inventor
杨成平
赵凯
Original Assignee
支付宝(杭州)信息技术有限公司 (Alipay (Hangzhou) Information Technology Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 支付宝(杭州)信息技术有限公司 (Alipay (Hangzhou) Information Technology Co., Ltd.)
Publication of WO2021228148A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G06N 20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 1/00 Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N 1/44 Secrecy systems

Definitions

  • This document relates to the field of data processing technology, and in particular to a feature extraction method, a model training method, and hardware for protecting personal data privacy.
  • Deep learning models are increasingly widely used by virtue of their ability to process information automatically. Face recognition is a common business application in the field of deep learning: a deep learning model matches the facial features of the user to be identified against sample facial features, so as to determine the identity of the user to be identified. Clearly, the sample facial features, which are personal data, are at risk of leakage, so privacy cannot be effectively protected.
  • The purpose of the embodiments of this specification is to provide a feature extraction method, a model training method, and hardware for protecting personal data privacy, which can protect personal data privacy in the field of deep learning.
  • A feature extraction method for protecting personal data privacy includes: acquiring a multi-frame image sequence presenting a sample object; performing feature representation on the acquired multi-frame image sequence using a nonlinear transformation as the encryption method, to obtain initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, where the feature data of the sample object presented in the multi-frame image sequence belongs to the personal data of the sample object; and performing ensemble learning on the initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, to obtain target encrypted feature data corresponding to the sample object.
  • A model training method for protecting personal data privacy includes: acquiring a multi-frame image sequence presenting a sample object; performing feature representation on the acquired multi-frame image sequence using a nonlinear transformation as the encryption method, to obtain initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, where the feature data of the sample object presented in the multi-frame image sequence belongs to the personal data of the sample object; performing ensemble learning on the initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, to obtain target encrypted feature data corresponding to the sample object; and training a preset learning model based on the target encrypted feature data corresponding to the sample object and the model classification label of the sample object.
  • A feature extraction device for protecting data privacy includes: an image sequence acquisition module, which acquires a multi-frame image sequence presenting a sample object; a feature encryption representation module, which performs feature representation on the acquired multi-frame image sequence using a nonlinear transformation as the encryption method, to obtain initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, where the feature data of the sample object presented in the multi-frame image sequence belongs to the personal data of the sample object; and a feature ensemble learning module, which performs ensemble learning on the initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, to obtain target encrypted feature data corresponding to the sample object.
  • An electronic device includes: a memory, a processor, and a computer program stored on the memory and runnable on the processor; when executed by the processor, the computer program performs: acquiring a multi-frame image sequence presenting a sample object; performing feature representation on the acquired multi-frame image sequence using a nonlinear transformation as the encryption method, to obtain initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, where the feature data of the sample object presented in the multi-frame image sequence belongs to the personal data of the sample object; and performing ensemble learning on the initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, to obtain target encrypted feature data corresponding to the sample object.
  • A computer-readable storage medium is provided, on which a computer program is stored. When the computer program is executed by a processor, the following steps are implemented: acquiring a multi-frame image sequence presenting a sample object; performing feature representation on the acquired multi-frame image sequence using a nonlinear transformation as the encryption method, to obtain initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, where the feature data of the sample object presented in the multi-frame image sequence belongs to the personal data of the sample object; and performing ensemble learning on the initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, to obtain target encrypted feature data corresponding to the sample object.
  • The feature ensemble learning module performs ensemble learning on the initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, to obtain target encrypted feature data corresponding to the sample object; the model training module trains a preset learning model based on the target encrypted feature data corresponding to the sample object and the model classification label of the sample object.
  • An electronic device includes: a memory, a processor, and a computer program stored in the memory and runnable on the processor; when executed by the processor, the computer program performs: acquiring a multi-frame image sequence presenting a sample object; performing feature representation on the acquired multi-frame image sequence using a nonlinear transformation as the encryption method, to obtain initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, where the feature data of the sample object presented in the multi-frame image sequence belongs to the personal data of the sample object; performing ensemble learning on the initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, to obtain target encrypted feature data corresponding to the sample object; and training a preset learning model based on the target encrypted feature data corresponding to the sample object and the model classification label of the sample object.
  • A computer-readable storage medium is provided, on which a computer program is stored. When the computer program is executed by a processor, the following steps are implemented: acquiring a multi-frame image sequence presenting a sample object; performing feature representation on the acquired multi-frame image sequence using a nonlinear transformation as the encryption method, to obtain initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, where the feature data of the sample object presented in the multi-frame image sequence belongs to the personal data of the sample object; performing ensemble learning on the initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, to obtain target encrypted feature data corresponding to the sample object; and training a preset learning model based on the target encrypted feature data corresponding to the sample object and the model classification label of the sample object.
  • The solution of the embodiments of this specification uses a nonlinear-transformation encryption method to perform encrypted feature extraction on a multi-frame image sequence presenting a sample object, obtaining initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, and then uses ensemble learning to integrate the initial encrypted feature data of the multi-frame image sequence into higher-level target encrypted feature data. Since the entire scheme retains only encrypted image feature data, even if the retained encrypted image feature data is leaked, the personal data of the sample object will not be exposed, thereby achieving privacy protection.
  • At the same time, obtaining the target encrypted feature data by integrating the initial encrypted feature data of the multi-frame image sequence can effectively compensate for the loss caused by image feature encryption, so better model performance can be obtained when the data is subsequently used for model training.
  • Fig. 1 is a schematic flowchart of a feature extraction method provided by an embodiment of the specification.
  • Fig. 2 is a schematic flow chart of a model training method provided in an embodiment of this specification.
  • Fig. 3 is a schematic structural diagram of a feature extraction device provided by an embodiment of the specification.
  • Fig. 4 is a schematic structural diagram of a model training device provided by an embodiment of this specification.
  • Fig. 5 is a schematic structural diagram of an electronic device provided by an embodiment of the specification.
  • The principle of face recognition is to use a deep learning model to match the facial features of the user to be identified against sample facial features, so as to determine the identity of the user to be identified.
  • The training of the deep learning model relies on sample face images. These sample facial images belong to the user's personal data, and retaining them on file risks privacy leakage.
  • This document therefore proposes a technical solution that can protect personal data privacy in the field of deep learning.
  • Fig. 1 is a flowchart of a feature extraction method for protecting personal data privacy according to an embodiment of this specification.
  • The method shown in FIG. 1 may be executed by the corresponding device described below, and includes: step S102, acquiring a multi-frame image sequence presenting a sample object.
  • For example, a multi-frame image sequence can be extracted from a video presenting the sample object.
  • For example, the sample object is recorded by the camera of a terminal device, and a multi-frame image sequence showing the sample object is extracted from the video at a preset frame rate.
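The embodiment only says frames are cut out "according to a preset frame rate", not how. As a minimal sketch of one plausible index-selection scheme (the function name and parameters below are illustrative, not from the source):

```python
def sample_frame_indices(total_frames, video_fps, target_fps):
    """Pick frame indices so the kept frames approximate `target_fps`.

    Hypothetical helper: selects one frame per `video_fps / target_fps`
    source frames, starting at index 0.
    """
    if target_fps >= video_fps:
        return list(range(total_frames))       # keep every frame
    step = video_fps / target_fps              # one kept frame per `step` source frames
    return [int(i * step) for i in range(int(total_frames / step))]

# A 10-second clip at 30 fps, sampled down to a preset rate of 5 fps.
indices = sample_frame_indices(300, 30.0, 5.0)
```

The selected indices would then be used to read the corresponding frames from the decoded video.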
  • Step S104: using a nonlinear transformation as the encryption method, perform feature representation on the acquired multi-frame image sequence to obtain initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, where the feature data of the sample object presented in the multi-frame image sequence belongs to the personal data of the sample object.
  • In a nonlinear transformation, the ratio of the change in the output value (the initial encrypted feature data) to the change in the corresponding input value (the multi-frame image sequence) is not constant, which provides an encryption effect.
  • For example, a locality-sensitive hashing (LSH) algorithm may be used to perform a hash transformation on the acquired multi-frame image sequence to obtain the initial encrypted feature data corresponding to the multi-frame image sequence of the sample object.
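As a concrete illustration of the locality-sensitive hashing idea, here is a random-projection (SimHash-style) sketch; the signature length and frame size are arbitrary choices for illustration, not values disclosed in the patent:

```python
import numpy as np

def lsh_signature(frame, planes):
    """Random-projection LSH: each random hyperplane contributes one sign
    bit. Similar frames yield similar bit signatures, while the original
    pixels cannot be reconstructed from the bits alone."""
    v = frame.reshape(-1).astype(np.float64)
    return (planes @ v > 0).astype(np.uint8)

rng = np.random.default_rng(0)
planes = rng.standard_normal((64, 32 * 32))  # 64-bit signature for 32x32 frames
frame = rng.random((32, 32))
sig = lsh_signature(frame, planes)           # one frame's "initial encrypted feature"
```

The hyperplanes play the role of the (fixed) nonlinear encryption transform: the same planes must be used for every frame so that signatures remain comparable.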
  • Alternatively, a convolutional neural network model can be used to perform encrypted feature extraction on the multi-frame image sequence.
  • The convolutional neural network model may include: a convolutional layer, which performs convolution processing on the acquired multi-frame image sequence to obtain the output feature set of the convolutional layer; a pooling layer, which performs pooling processing on the output feature set of the convolutional layer based on a maximum pooling algorithm and/or an average pooling algorithm, to obtain the output feature set of the pooling layer; and a fully connected layer, which converts the output feature set of the pooling layer into initial encrypted feature data of a specified dimension.
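The three layers described above can be sketched end to end in plain NumPy; the kernel, pooling window, and 8-dimensional output are illustrative choices rather than parameters disclosed in the patent:

```python
import numpy as np

rng = np.random.default_rng(42)

def conv2d(img, kernel):
    """Valid-mode 2-D convolution over a single-channel frame."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(x, size=2):
    """Non-overlapping max pooling, truncating ragged edges."""
    h, w = x.shape
    h, w = h - h % size, w - w % size
    return x[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

def encrypt_features(frame, kernel, fc_weights):
    """Conv -> ReLU -> max-pool -> fully connected projection to a
    fixed-dimension vector (the 'initial encrypted feature data').
    The nonlinearity is what gives the non-constant input/output ratio."""
    x = np.maximum(conv2d(frame, kernel), 0.0)
    x = max_pool(x).reshape(-1)
    return fc_weights @ x

frame = rng.random((16, 16))
kernel = rng.standard_normal((3, 3))
fc = rng.standard_normal((8, 49))   # 14x14 conv output -> 7x7 pooled -> 49 -> 8 dims
feat = encrypt_features(frame, kernel, fc)
```

In practice the weights would come from a trained network rather than a random generator; the point of the sketch is the layer ordering and the fixed output dimension.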
  • the initial encrypted feature data corresponding to the sample object output by the convolutional neural network model can be obtained.
  • Step S106: perform ensemble learning on the initial encrypted feature data corresponding to the multi-frame image sequence of the sample object to obtain target encrypted feature data corresponding to the sample object.
  • Ensemble learning is an established machine learning approach: it completes a learning task by constructing and combining multiple individual learners. Individual learners are usually generated from training data by an existing learning algorithm, such as the C4.5 decision tree algorithm or the BP neural network algorithm. When an ensemble contains only individual learners of the same type (for example, a "decision tree ensemble" consisting entirely of decision trees, or a "neural network ensemble" consisting entirely of neural networks), the ensemble is "homogeneous"; its individual learners are called "base learners" and the corresponding algorithm is called the "base learning algorithm". An ensemble may also contain different types of individual learners, for example a decision tree and a neural network together; such an ensemble is "heterogeneous".
  • In a heterogeneous ensemble, the individual learners are generated by different learning algorithms; there is no single base learning algorithm, and the learners are often called "component learners" or simply individual learners.
  • In this step, individual learners can be combined through conventional ensemble strategies (for example, the averaging method, the voting method, or the learning method) to select and integrate the initial encrypted feature data, so as to obtain higher-level target encrypted feature data.
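Of the strategies just listed, the averaging method is the simplest to illustrate. The sketch below combines per-frame encrypted feature vectors into one target vector; uniform weights are the default, and the optional `weights` argument is an assumed hook for a learned weighting, not a detail from the patent:

```python
import numpy as np

def integrate_features(per_frame_feats, weights=None):
    """Averaging-style ensemble: combine N per-frame encrypted feature
    vectors (shape N x D) into a single D-dimensional target vector."""
    per_frame_feats = np.asarray(per_frame_feats, dtype=np.float64)
    if weights is None:
        weights = np.full(len(per_frame_feats), 1.0 / len(per_frame_feats))
    return weights @ per_frame_feats

# Three frames, two feature dimensions each.
feats = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
target = integrate_features(feats)
```

A voting strategy would instead take an element-wise majority over quantized features, and a learning strategy would train a second-level model on the stacked per-frame vectors.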
  • The initial encrypted feature data and/or target encrypted feature data obtained in this step can be used as model training data, so only the initial encrypted feature data and/or the target encrypted feature data need to be retained as samples.
  • The previously acquired multi-frame image sequence can be deleted once feature representation is complete, that is, "burn after use".
  • The feature extraction method in the embodiments of this specification uses a nonlinear-transformation encryption method to perform encrypted feature extraction on a multi-frame image sequence presenting a sample object, obtaining initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, and then integrates the initial encrypted feature data of the multi-frame image sequence into higher-level target encrypted feature data. Since the entire scheme retains only encrypted image feature data, even if the retained encrypted image feature data is leaked, the personal data of the sample object will not be exposed, thereby achieving privacy protection.
  • At the same time, obtaining the target encrypted feature data by integrating the initial encrypted feature data of the multi-frame image sequence can effectively compensate for the loss caused by image feature encryption, so better model performance can be obtained when the data is subsequently used for model training.
  • the process includes the following steps:
  • Step 1: Record a video of the face of the sample object, and extract from the video a multi-frame face image sequence showing the face of the sample object.
  • Step 2: Using a nonlinear transformation as the encryption method, perform feature representation on the extracted multi-frame face image sequence to obtain the initial encrypted feature data.
  • Step 3: Delete the multi-frame face image sequence of the sample object extracted from the video.
  • Step 4: Store the initial encrypted feature data corresponding to the sample object in association in a feature database.
  • Step 5: Perform ensemble learning on the initial encrypted feature data corresponding to the multi-frame image sequence of the sample object to obtain the target encrypted feature data corresponding to the sample object.
  • Step 6: Store the target encrypted feature data corresponding to the sample object in association in the feature database.
  • Afterwards, the initial encrypted feature data and/or target encrypted feature data corresponding to an object can be retrieved from the feature database to train the user identification model.
  • FIG. 2 is a flowchart of a model training method according to an embodiment of this specification. The method shown in Figure 2 can be executed by the following corresponding devices, including:
  • Step S202 Obtain a multi-frame image sequence presenting sample objects.
  • Step S204: using a nonlinear transformation as the encryption method, perform feature representation on the acquired multi-frame image sequence to obtain initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, where the feature data of the sample object presented in the multi-frame image sequence belongs to the personal data of the sample object.
  • Step S206: perform ensemble learning on the initial encrypted feature data corresponding to the multi-frame image sequence of the sample object to obtain target encrypted feature data corresponding to the sample object.
  • Step S208: train a preset learning model based on the target encrypted feature data corresponding to the sample object and the model classification label of the sample object.
  • Specifically, the target encrypted feature data corresponding to the sample object serves as the input data of the preset learning model, and the model classification label of the sample user serves as the expected output data of the preset learning model.
  • the training result given by the preset learning model can be obtained.
  • This training result is the predicted classification result of the preset learning model for the sample user, and may differ from the ground-truth classification result indicated by the model classification label of the sample user.
  • The embodiments of this specification can compute the error value between the predicted classification result and the ground-truth classification result based on a loss function derived from maximum likelihood estimation, and, with the goal of reducing this error value, adjust the parameters of the preset learning model (for example, the weight values of the underlying vectors) to achieve the training effect.
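The adjust-parameters-to-reduce-the-error loop can be sketched with a logistic-regression stand-in for the preset learning model; the cross-entropy gradient below is the loss derived from maximum likelihood, while the toy feature vectors, learning rate, and epoch count are purely illustrative:

```python
import numpy as np

def train_classifier(X, y, lr=0.5, epochs=200):
    """Gradient descent on the cross-entropy (maximum-likelihood) loss.
    X: target encrypted feature vectors; y: 0/1 classification labels."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted class probability
        grad_w = X.T @ (p - y) / len(y)          # d(loss)/dw
        grad_b = np.mean(p - y)                  # d(loss)/db
        w -= lr * grad_w                         # adjust weights to reduce error
        b -= lr * grad_b
    return w, b

# Toy "target encrypted features" with linearly separable labels.
X = np.array([[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]])
y = np.array([0.0, 0.0, 1.0, 1.0])
w, b = train_classifier(X, y)
preds = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(float)
```

A production model would be a deeper network with minibatched optimization, but the error-driven parameter update is the same in shape.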
  • The model training method in the embodiments of this specification uses a nonlinear-transformation encryption method to perform encrypted feature extraction on a multi-frame image sequence presenting a sample object, obtaining initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, and then uses ensemble learning to integrate the initial encrypted feature data of the multi-frame image sequence into higher-level target encrypted feature data. Since the entire scheme retains only encrypted image feature data, even if the retained encrypted image feature data is leaked, the personal data of the sample object will not be exposed, thereby achieving privacy protection. At the same time, obtaining the target encrypted feature data by integrating the initial encrypted feature data of the multi-frame image sequence can effectively compensate for the loss caused by image feature encryption, so better model performance can be obtained after the preset learning model is trained.
  • The trained learning model can then be used for prediction and recognition, so as to provide data support for related business decisions.
  • the preset learning model in the embodiment of this specification can be applied to a face payment service.
  • Specifically, a multi-frame image sequence presenting the payment object to be verified can be collected; then the same nonlinear transformation as described above is used as the encryption method to perform feature representation on the multi-frame image sequence of the payment object, obtaining the initial encrypted feature data of the payment object.
  • Ensemble learning is then performed on the initial encrypted feature data of the payment object to obtain the target encrypted feature data of the payment object, which is input into the preset learning model; the preset learning model determines whether the payment object is the authorized payment user (the target user).
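The patent leaves the model's accept/reject decision abstract. As a stand-in for that decision, a cosine-similarity check against the enrolled user's stored target encrypted features illustrates the verification flow; the threshold value is an assumed deployment parameter, not a figure from the source:

```python
import numpy as np

def is_authorized(candidate_feat, enrolled_feat, threshold=0.9):
    """Hypothetical decision rule: compare the payer's target encrypted
    features with the enrolled user's via cosine similarity."""
    cos = float(candidate_feat @ enrolled_feat /
                (np.linalg.norm(candidate_feat) * np.linalg.norm(enrolled_feat)))
    return cos >= threshold

enrolled = np.array([0.2, 0.9, 0.4])   # stored target encrypted features
same_person = enrolled * 1.05          # slight capture-to-capture variation
impostor = np.array([0.9, -0.2, 0.1])
```

Note that the comparison operates entirely on encrypted feature vectors: no raw face image needs to be stored or transmitted for the payment decision.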
  • the recognition result of the preset learning model is used to determine whether to initiate facial payment.
  • the embodiment of this specification also provides a feature extraction device for protecting private data.
  • FIG. 3 is a schematic structural diagram of the feature extraction device 300 of the embodiments of this specification, which includes: an image sequence acquisition module 310, which acquires a multi-frame image sequence presenting a sample object; a feature encryption representation module 320, which performs feature representation on the acquired multi-frame image sequence using a nonlinear transformation as the encryption method, to obtain initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, where the feature data of the sample object presented in the multi-frame image sequence belongs to the personal data of the sample object; and a feature ensemble learning module 330, which performs ensemble learning on the initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, to obtain target encrypted feature data corresponding to the sample object.
  • The feature extraction device of the embodiments of this specification uses a nonlinear-transformation encryption method to perform encrypted feature extraction on a multi-frame image sequence presenting a sample object, obtaining initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, and then uses ensemble learning to integrate the initial encrypted feature data of the multi-frame image sequence into higher-level target encrypted feature data. Since the entire scheme retains only encrypted image feature data, even if the retained encrypted image feature data is leaked, the personal data of the sample object will not be exposed, thereby achieving privacy protection.
  • At the same time, obtaining the target encrypted feature data by integrating the initial encrypted feature data of the multi-frame image sequence can effectively compensate for the loss caused by image feature encryption, so better model performance can be obtained when the data is subsequently used for model training.
  • the feature encryption representation module 320 specifically inputs the obtained multi-frame image sequence into a preset convolutional neural network model to obtain the initial encrypted feature data of the sample object corresponding to the multi-frame image sequence.
  • The convolutional neural network model includes: a convolutional layer, which performs convolution processing on the acquired multi-frame image sequence to obtain the output feature set of the convolutional layer; a pooling layer, which performs pooling processing on the output feature set of the convolutional layer based on a maximum pooling algorithm and/or an average pooling algorithm, to obtain the output feature set of the pooling layer; and a fully connected layer, which converts the output feature set of the pooling layer into initial encrypted feature data of a specified dimension.
  • Optionally, the feature encryption representation module 320 may also perform a hash transformation on the acquired multi-frame image sequence based on a locality-sensitive hashing algorithm, to obtain the initial encrypted feature data corresponding to the multi-frame image sequence of the sample object.
  • Optionally, the feature extraction device 300 of the embodiments of this specification may further include: a storage module, which stores the sample object in association with the corresponding initial encrypted feature data and/or target encrypted feature data.
  • the feature extraction device 300 of the embodiment of the present specification may further include: a deletion module, which deletes the obtained multi-frame image sequence after obtaining the initial encrypted feature data corresponding to the multi-frame image sequence of the sample object.
  • the embodiment of this specification also provides a model training device for protecting the privacy of personal data.
  • FIG. 4 is a schematic structural diagram of the model training device 400 according to the embodiments of this specification, which includes: an image sequence acquisition module 410, which acquires a multi-frame image sequence presenting a sample object; a feature encryption representation module 420, which performs feature representation on the acquired multi-frame image sequence using a nonlinear transformation as the encryption method, to obtain initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, where the feature data of the sample object presented in the multi-frame image sequence belongs to the personal data of the sample object; a feature ensemble learning module 430, which performs ensemble learning on the initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, to obtain target encrypted feature data corresponding to the sample object; and a model training module 440, which trains a preset learning model based on the target encrypted feature data corresponding to the sample object and the model classification label of the sample object.
  • Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present specification.
  • the electronic device includes a processor, and optionally an internal bus, a network interface, and a memory.
  • The memory may include high-speed random-access memory (RAM) and may also include non-volatile memory, such as at least one disk storage.
  • the electronic device may also include hardware required by other services.
  • the processor, network interface, and memory can be connected to each other through an internal bus.
  • The internal bus may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, etc.
  • The bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one bidirectional arrow is shown in FIG. 5, but this does not mean that there is only one bus or only one type of bus.
  • the program may include program code, and the program code includes computer operation instructions.
  • The memory may include RAM and non-volatile memory, and provides instructions and data to the processor.
  • the processor reads the corresponding computer program from the non-volatile memory into the memory and then runs it to form the above-mentioned feature extraction device on a logical level.
  • The processor executes the program stored in the memory, and is specifically configured to perform the following operations: acquire a multi-frame image sequence presenting a sample object; perform feature representation on the acquired multi-frame image sequence using a nonlinear transformation as the encryption method, to obtain initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, where the feature data of the sample object presented in the multi-frame image sequence belongs to the personal data of the sample object; and perform ensemble learning on the initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, to obtain target encrypted feature data corresponding to the sample object.
  • the processor reads the corresponding computer program from the non-volatile memory to the memory and then runs it to form the aforementioned model training device on a logical level.
  • The processor executes the program stored in the memory, and is specifically configured to perform the following operations: acquire a multi-frame image sequence presenting a sample object; perform feature representation on the acquired multi-frame image sequence using a nonlinear transformation as the encryption method, to obtain initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, where the feature data of the sample object presented in the multi-frame image sequence belongs to the personal data of the sample object; perform ensemble learning on the initial encrypted feature data corresponding to the multi-frame image sequence of the sample object, to obtain target encrypted feature data corresponding to the sample object; and train a preset learning model based on the target encrypted feature data corresponding to the sample object and the model classification label of the sample object.
  • the above-mentioned feature extraction method disclosed in the embodiment shown in FIG. 1 of this specification or the model training method disclosed in the embodiment shown in FIG. 2 may be applied to a processor or implemented by the processor.
  • the processor may be an integrated circuit chip with signal processing capabilities.
  • each step of the above method can be completed by an integrated logic circuit of hardware in the processor or instructions in the form of software.
  • the above-mentioned processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the general-purpose processor may be a microprocessor, or any conventional processor.
  • the steps of the methods disclosed in the embodiments of this specification can be directly embodied as being executed and completed by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor.
  • the software module can be located in a storage medium mature in the field, such as random-access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or a register.
  • the storage medium is located in the memory; the processor reads the information in the memory and completes the steps of the above methods in combination with its hardware.
  • the electronic device of the embodiments of this specification can realize the functions of the above-mentioned feature extraction apparatus in the embodiment shown in FIG. 1, or the functions of the above-mentioned model training apparatus in the embodiment shown in FIG. 2; since the principle is the same, the details are not repeated here.
  • the electronic device of this specification does not exclude other implementations, such as a logic device or a combination of software and hardware; that is to say, the execution body of the processing flow is not limited to individual logic units, and may also be a hardware or logic device.
  • the embodiment of this specification also proposes a computer-readable storage medium that stores one or more programs, and the one or more programs include instructions.
  • the above instructions, when executed by a portable electronic device that includes multiple application programs, can cause the portable electronic device to execute the method of the embodiment shown in FIG. 1, and specifically to perform the following steps: acquiring a multi-frame image sequence presenting a sample object.
  • performing feature representation on the obtained multi-frame image sequence to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object.
  • the above instructions, when executed by a portable electronic device that includes multiple application programs, can cause the portable electronic device to execute the method of the embodiment shown in FIG. 2, and specifically to perform the following steps: acquiring a multi-frame image sequence presenting a sample object.
  • performing feature representation on the obtained multi-frame image sequence to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object.
  • this specification can be provided as a method, a system, or a computer program product. Therefore, this specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, this specification can take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of this specification provide a feature extraction method, a model training method, and hardware for protecting the privacy of personal data. The feature extraction method includes: acquiring a multi-frame image sequence presenting a sample object; performing feature representation on the obtained multi-frame image sequence with a nonlinear transformation as the encryption method, to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object; and performing ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, to obtain target encrypted feature data corresponding to the sample object, wherein the target encrypted feature data can serve as model training data.

Description

Feature extraction method, model training method, and hardware for protecting the privacy of personal data

Technical Field

This document relates to the field of data processing technologies, and in particular to a feature extraction method, a model training method, and hardware for protecting the privacy of personal data.
Background

Deep learning models, with their capability to process information mechanically, have come into increasingly wide use, and face recognition is a common form of business in the deep learning field. The principle of face recognition is to approximately match, based on a deep learning model, the facial features of a user to be identified against sample facial features, thereby determining the identity of the user to be identified. Clearly, retaining sample facial features, which belong to personal data, carries a risk of leakage and cannot effectively protect privacy.

In view of this, a technical solution that can protect the privacy of personal data in the deep learning field is urgently needed.
Summary

An object of the embodiments of this specification is to provide a feature extraction method, a model training method, and hardware for protecting the privacy of personal data, capable of protecting the privacy of personal data in the deep learning field.
To achieve the above object, the embodiments of this specification are implemented as follows. In a first aspect, a feature extraction method for protecting the privacy of personal data is provided, including: acquiring a multi-frame image sequence presenting a sample object; performing feature representation on the obtained multi-frame image sequence with a nonlinear transformation as the encryption method, to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object; and performing ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, to obtain target encrypted feature data corresponding to the sample object.
In a second aspect, a model training method for protecting the privacy of personal data is provided, including: acquiring a multi-frame image sequence presenting a sample object; performing feature representation on the obtained multi-frame image sequence with a nonlinear transformation as the encryption method, to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object; performing ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, to obtain target encrypted feature data corresponding to the sample object; and training a preset learning model based on the target encrypted feature data corresponding to the sample object and a model classification label of the sample object.

In a third aspect, a feature extraction apparatus for protecting private data is provided, including: an image sequence acquisition module, configured to acquire a multi-frame image sequence presenting a sample object; a feature encryption representation module, configured to perform feature representation on the obtained multi-frame image sequence with a nonlinear transformation as the encryption method, to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object; and a feature ensemble learning module, configured to perform ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, to obtain target encrypted feature data corresponding to the sample object.

In a fourth aspect, an electronic device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, the computer program being executed by the processor to perform: acquiring a multi-frame image sequence presenting a sample object; performing feature representation on the obtained multi-frame image sequence with a nonlinear transformation as the encryption method, to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object; and performing ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, to obtain target encrypted feature data corresponding to the sample object.

In a fifth aspect, a computer-readable storage medium is provided, the computer-readable storage medium storing a computer program that, when executed by a processor, implements the following steps: acquiring a multi-frame image sequence presenting a sample object; performing feature representation on the obtained multi-frame image sequence with a nonlinear transformation as the encryption method, to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object; and performing ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, to obtain target encrypted feature data corresponding to the sample object.

In a sixth aspect, a model training apparatus for protecting the privacy of personal data is provided, including: an image sequence acquisition module, configured to acquire a multi-frame image sequence presenting a sample object; a feature encryption representation module, configured to perform feature representation on the obtained multi-frame image sequence with a nonlinear transformation as the encryption method, to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object; a feature ensemble learning module, configured to perform ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, to obtain target encrypted feature data corresponding to the sample object; and a model training module, configured to train a preset learning model based on the target encrypted feature data corresponding to the sample object and a model classification label of the sample object.

In a seventh aspect, an electronic device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, the computer program being executed by the processor to perform: acquiring a multi-frame image sequence presenting a sample object; performing feature representation on the obtained multi-frame image sequence with a nonlinear transformation as the encryption method, to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object; performing ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, to obtain target encrypted feature data corresponding to the sample object; and training a preset learning model based on the target encrypted feature data corresponding to the sample object and a model classification label of the sample object.

In an eighth aspect, a computer-readable storage medium is provided, the computer-readable storage medium storing a computer program that, when executed by a processor, implements the following steps: acquiring a multi-frame image sequence presenting a sample object; performing feature representation on the obtained multi-frame image sequence with a nonlinear transformation as the encryption method, to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object; performing ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, to obtain target encrypted feature data corresponding to the sample object; and training a preset learning model based on the target encrypted feature data corresponding to the sample object and a model classification label of the sample object.
The solutions of the embodiments of this specification use a nonlinear transformation as the encryption method to perform encrypted feature extraction on a multi-frame image sequence presenting a sample object, obtaining initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, and then integrate the initial encrypted feature data of the multi-frame image sequence by way of ensemble learning to obtain higher-order target encrypted feature data. Since the whole solution relies on encrypted image feature data, even if the retained encrypted image feature data is leaked, the personal data of the sample object is not exposed, thereby achieving privacy protection. Meanwhile, because the target encrypted feature data is obtained by integrating the initial encrypted feature data of the multi-frame image sequence, the loss introduced by image feature encryption can be effectively compensated, and better model performance can be obtained when the data is subsequently used for model training.
Brief Description of the Drawings

To describe the technical solutions in the embodiments of this specification or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Apparently, the drawings described below are merely some of the embodiments recorded in this specification, and a person of ordinary skill in the art may obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic flowchart of the feature extraction method provided by an embodiment of this specification.

FIG. 2 is a schematic flowchart of the model training method provided by an embodiment of this specification.

FIG. 3 is a schematic structural diagram of the feature extraction apparatus provided by an embodiment of this specification.

FIG. 4 is a schematic structural diagram of the model training apparatus provided by an embodiment of this specification.

FIG. 5 is a schematic structural diagram of the electronic device provided by an embodiment of this specification.
Detailed Description

To enable those skilled in the art to better understand the technical solutions in this specification, the technical solutions in the embodiments of this specification are described clearly and completely below with reference to the accompanying drawings of the embodiments. Apparently, the described embodiments are merely some, rather than all, of the embodiments of this specification. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this specification without creative effort shall fall within the protection scope of this specification.
As stated above, the principle of face recognition is to approximately match, based on a deep learning model, the facial features of a user to be identified against sample facial features, thereby determining the identity of the user to be identified. The training of the deep learning model relies on sample face images, and these sample face images belong to the users' personal data, so retaining them carries a risk of privacy leakage. To this end, this document aims to propose a technical solution capable of protecting the privacy of personal data in the deep learning field.
FIG. 1 is a flowchart of the feature extraction method for protecting the privacy of personal data according to an embodiment of this specification. The method shown in FIG. 1 may be performed by the corresponding apparatus described below, and includes the following. Step S102: acquire a multi-frame image sequence presenting a sample object.

Specifically, this step may extract a multi-frame image sequence from a video presenting the sample object. For example, the sample object is filmed through the camera of a terminal device, and a multi-frame image sequence presenting the sample object is extracted at a preset frame rate.
Step S104: using a nonlinear transformation as the encryption method, perform feature representation on the obtained multi-frame image sequence to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object.

A nonlinear transformation is one in which the ratio of the change in the output value (the initial encrypted feature data) to the change in the corresponding input value (the multi-frame image sequence) is not constant; it therefore has an encrypting effect.
In practical applications, the nonlinear transformation can be realized in more than one way, and this specification sets no specific limitation. As an exemplary introduction, this step may use a locality-sensitive hashing algorithm to hash the obtained multi-frame image sequence, obtaining the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object.
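The locality-sensitive hashing option above can be sketched with random hyperplane projections, where the sign of each projection is the nonlinear, hard-to-invert step. This is a minimal illustration only, assuming flattened frame vectors as input; the function and parameter names are not from the patent.

```python
import numpy as np

def lsh_encrypt(frames, n_bits=128, seed=7):
    """Hash each flattened frame to an n_bits sign code (random hyperplane LSH)."""
    rng = np.random.default_rng(seed)
    dim = frames.shape[1]
    planes = rng.standard_normal((dim, n_bits))  # random hyperplanes
    # the sign of the projection is a nonlinear transform of the input
    return (frames @ planes > 0).astype(np.uint8)

frames = np.random.rand(5, 64)   # 5 frames, each flattened to 64 dimensions
codes = lsh_encrypt(frames)
print(codes.shape)               # (5, 128)
```

Similar frames map to codes with small Hamming distance, so the hashed features remain usable for matching while the original pixels are not directly recoverable.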
Alternatively, a convolutional neural network model may be used to perform encrypted feature extraction on the multi-frame image sequence. The convolutional neural network model may include: a convolutional layer, which performs convolution on the obtained multi-frame image sequence to obtain a convolutional-layer output feature set; a pooling layer, which pools the convolutional-layer output feature set based on a max-pooling algorithm and/or a mean-pooling algorithm to obtain a pooling-layer output feature set; and a fully connected layer, which converts the pooling-layer output feature set into initial encrypted feature data of a specified dimension. Clearly, by inputting the obtained multi-frame image sequence into the convolutional neural network model, the initial encrypted feature data corresponding to the sample object, output by the model, can be obtained.
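The convolution-pool-fully-connected pipeline described above can be sketched for a single frame as follows. This is an assumption-laden toy (one channel, one filter, random weights, tanh nonlinearity) meant only to show the data flow, not the patent's actual network.

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2-D convolution of a single-channel image (illustrative only)."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling."""
    h, w = fmap.shape
    h, w = h - h % size, w - w % size
    f = fmap[:h, :w].reshape(h // size, size, w // size, size)
    return f.max(axis=(1, 3))

def encrypt_frame(img, kernel, weights):
    """conv -> pool -> fully connected projection to a fixed-dimension code."""
    feat = max_pool(conv2d(img, kernel)).ravel()
    return np.tanh(feat @ weights)  # nonlinear output = initial encrypted feature

rng = np.random.default_rng(0)
img = rng.random((8, 8))                      # one frame
kernel = rng.standard_normal((3, 3))          # conv layer weights
weights = rng.standard_normal((9, 16))        # 6x6 conv map pools to 3x3 = 9
code = encrypt_frame(img, kernel, weights)
print(code.shape)                             # (16,)
```

Running each frame of the sequence through `encrypt_frame` yields the per-frame initial encrypted feature data of the specified dimension.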
Step S106: perform ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object to obtain target encrypted feature data corresponding to the sample object.

Here, ensemble learning combines individual learners through combination strategies (for example averaging, voting, and learning-based methods) to select and integrate the initial encrypted feature data, so as to obtain higher-order target encrypted feature data.

It should be understood that the initial encrypted feature data and/or the target encrypted feature data obtained in this step can serve as model training data. Therefore, only the initial encrypted feature data and/or the target encrypted feature data need to be retained, while the previously obtained multi-frame image sequence can be deleted once the feature representation is complete, that is, "burn after use".
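The averaging and voting combination strategies named above can be sketched on per-frame encrypted features as follows; the function names are illustrative, and these two simple rules stand in for whatever ensemble learner an implementation actually uses.

```python
import numpy as np

def average_combine(frame_feats):
    """Averaging strategy: mean of the per-frame encrypted feature vectors."""
    return np.mean(frame_feats, axis=0)

def vote_combine(frame_codes):
    """Voting strategy for binary codes: majority vote per bit."""
    return (np.sum(frame_codes, axis=0) * 2 >= frame_codes.shape[0]).astype(np.uint8)

feats = np.array([[0.2, 0.8], [0.4, 0.6], [0.6, 0.4]])          # 3 frames, real-valued
codes = np.array([[1, 0, 1], [1, 1, 0], [1, 0, 0]], dtype=np.uint8)  # 3 frames, binary
print(average_combine(feats))   # [0.4 0.6]
print(vote_combine(codes))      # [1 0 0]
```

Combining across frames smooths out per-frame noise, which is one way the integration step can compensate for the loss introduced by encrypting each frame individually.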
The feature extraction method of the embodiments of this specification uses a nonlinear transformation as the encryption method to perform encrypted feature extraction on a multi-frame image sequence presenting a sample object, obtaining initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, and then integrates the initial encrypted feature data of the multi-frame image sequence by way of ensemble learning to obtain higher-order target encrypted feature data. Since the whole solution relies on encrypted image feature data, even if the retained encrypted image feature data is leaked, the personal data of the sample object is not exposed, thereby achieving privacy protection. Meanwhile, because the target encrypted feature data is obtained by integrating the initial encrypted feature data of the multi-frame image sequence, the loss introduced by image feature encryption can be effectively compensated, and better model performance can be obtained when the data is subsequently used for model training.
The method of the embodiments of this specification is illustrated below with reference to a practical application scenario.

This application scenario is used to obtain the feature data needed for face recognition, and the flow includes the following steps:

Step 1: film the face of the sample object, and extract from the video a multi-frame face image sequence presenting the face of the sample object.

Step 2: using a nonlinear transformation as the encryption method, perform feature representation on the obtained multi-frame face image sequence to obtain initial encrypted feature data.

Step 3: delete the multi-frame face image sequence of the sample object extracted from the video.

Step 4: store the initial encrypted feature data corresponding to the sample object in a feature library in association with the sample object.

Step 5: perform ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object to obtain the target encrypted feature data corresponding to the sample object.

Step 6: store the target encrypted feature data corresponding to the sample object in the feature library in association with the sample object.

Based on the above flow, in the subsequent process of training a user recognition model, the initial encrypted feature data and/or target encrypted feature data corresponding to the sample object can be retrieved from the feature library and used to train the user recognition model.
Correspondingly, the embodiments of this specification further provide a model training method for protecting the privacy of personal data. FIG. 2 is a flowchart of the model training method of an embodiment of this specification. The method shown in FIG. 2 may be performed by the corresponding apparatus described below, and includes:

Step S202: acquire a multi-frame image sequence presenting a sample object.

Step S204: using a nonlinear transformation as the encryption method, perform feature representation on the obtained multi-frame image sequence to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object.

Step S206: perform ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object to obtain target encrypted feature data corresponding to the sample object.

Step S208: train a preset learning model based on the target encrypted feature data corresponding to the sample object and a model classification label of the sample object.

In the specific training process, the target encrypted feature data corresponding to the sample object serves as the input data of the preset learning model, and the model classification label of the sample user serves as its output data. After the target encrypted feature data is input into the preset learning model, a training result given by the model can be obtained. This training result is the model's predicted classification result for the sample user, and it may differ from the ground-truth classification result indicated by the sample user's model classification label. The embodiments of this specification may calculate, based on a loss function derived from maximum-likelihood estimation, the error between the predicted classification result and the ground-truth classification result, and adjust the parameters of the preset learning model (for example, the weight values of the underlying vectors) with the aim of reducing that error, thereby achieving the training effect.
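The loop described above (predict, compare with the label, reduce the loss) can be sketched with a logistic-regression stand-in for the preset learning model, using the cross-entropy loss that maximum-likelihood estimation yields for binary labels. The model form, data, and learning rate are assumptions for illustration, not the patent's configuration.

```python
import numpy as np

def train_step(w, x, y, lr=0.1):
    """One gradient step on the cross-entropy (maximum-likelihood) loss."""
    p = 1.0 / (1.0 + np.exp(-x @ w))   # predicted class probability
    grad = x.T @ (p - y) / len(y)      # gradient of the log-loss w.r.t. w
    return w - lr * grad

def loss(w, x, y):
    p = 1.0 / (1.0 + np.exp(-x @ w))
    return -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))

rng = np.random.default_rng(1)
x = rng.random((32, 16))               # target encrypted features (toy data)
y = (x[:, 0] > 0.5).astype(float)      # model classification labels (toy rule)
w = np.zeros(16)                       # model parameters ("underlying weights")

before = loss(w, x, y)
for _ in range(200):
    w = train_step(w, x, y)
after = loss(w, x, y)
print(after < before)                  # True: the error was reduced
```

Each step moves the parameters in the direction that lowers the gap between the predicted and ground-truth classification, which is exactly the "adjust to reduce the error" behavior the paragraph describes.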
The model training method of the embodiments of this specification uses a nonlinear transformation as the encryption method to perform encrypted feature extraction on a multi-frame image sequence presenting a sample object, obtaining initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, and then integrates the initial encrypted feature data of the multi-frame image sequence by way of ensemble learning to obtain higher-order target encrypted feature data. Since the whole solution relies on encrypted image feature data, even if the retained encrypted image feature data is leaked, the personal data of the sample object is not exposed, thereby achieving privacy protection. Meanwhile, because the target encrypted feature data is obtained by integrating the initial encrypted feature data of the multi-frame image sequence, the loss introduced by image feature encryption can be effectively compensated, and better model performance can be obtained after the preset learning model is trained.

It should be understood that the trained preset learning model can be used for prediction and recognition, thereby providing data support for related business decisions.

For example, the preset learning model of the embodiments of this specification can be applied to face-payment business. During user identity verification for face payment, a multi-frame image sequence presenting the payment object to be verified can be collected; then, with the same nonlinear transformation as above serving as the encryption method, feature representation is performed on the multi-frame image sequence of the payment object to obtain the payment object's initial encrypted feature data. Similarly, ensemble learning is performed on the payment object's initial encrypted feature data to obtain its target encrypted feature data, which is then input into the preset learning model; the preset learning model determines whether the payment object is the payment-authorized user (the target user). Finally, whether to initiate the face payment is decided according to the recognition result of the preset learning model.
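The verification flow above (encrypt per frame, combine, classify, decide) can be strung together in one sketch. All names, the sign-projection encryption, the voting combiner, and the threshold classifier are illustrative stand-ins for the trained components, not the patent's actual implementation.

```python
import numpy as np

def verify_payment(frames, planes, w, threshold=0.5):
    """Encrypt each frame, combine by majority vote, classify with the model."""
    codes = (frames @ planes > 0).astype(np.uint8)                 # per-frame encryption
    target = (codes.sum(axis=0) * 2 >= len(codes)).astype(float)   # ensemble (voting)
    score = 1.0 / (1.0 + np.exp(-(target @ w)))                    # preset-model output
    return bool(score >= threshold)                                # authorize payment?

rng = np.random.default_rng(2)
frames = rng.random((5, 32))             # captured sequence of the payment object
planes = rng.standard_normal((32, 64))   # shared encryption transform
w = rng.standard_normal(64)              # trained model weights (random here)
decision = verify_payment(frames, planes, w)
```

The same transform used at enrollment must be reused here, so the stored templates and the live capture are compared only in the encrypted feature space.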
Corresponding to the above feature extraction method, the embodiments of this specification further provide a feature extraction apparatus for protecting private data. FIG. 3 is a schematic structural diagram of the feature extraction apparatus 300 of an embodiment of this specification, which includes: an image sequence acquisition module 310, configured to acquire a multi-frame image sequence presenting a sample object; a feature encryption representation module 320, configured to perform feature representation on the obtained multi-frame image sequence with a nonlinear transformation as the encryption method, to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object; and a feature ensemble learning module 330, configured to perform ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object to obtain target encrypted feature data corresponding to the sample object.

The feature extraction apparatus of the embodiments of this specification uses a nonlinear transformation as the encryption method to perform encrypted feature extraction on a multi-frame image sequence presenting a sample object, obtaining initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, and then integrates the initial encrypted feature data of the multi-frame image sequence by way of ensemble learning to obtain higher-order target encrypted feature data. Since the whole solution relies on encrypted image feature data, even if the retained encrypted image feature data is leaked, the personal data of the sample object is not exposed, thereby achieving privacy protection. Meanwhile, because the target encrypted feature data is obtained by integrating the initial encrypted feature data of the multi-frame image sequence, the loss introduced by image feature encryption can be effectively compensated, and better model performance can be obtained when the data is subsequently used for model training.

Optionally, the feature encryption representation module 320 specifically inputs the obtained multi-frame image sequence into a preset convolutional neural network model to obtain the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object. Here, the convolutional neural network model includes: a convolutional layer, which performs convolution on the obtained multi-frame image sequence to obtain a convolutional-layer output feature set; a pooling layer, which pools the convolutional-layer output feature set based on a max-pooling algorithm and/or a mean-pooling algorithm to obtain a pooling-layer output feature set; and a fully connected layer, which converts the pooling-layer output feature set into initial encrypted feature data of a specified dimension.

Optionally, the feature encryption representation module 320 may also hash the obtained multi-frame image sequence based on a locality-sensitive hashing algorithm to obtain the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object.

Optionally, the feature extraction apparatus 300 of the embodiments of this specification may further include: a storage module, configured to store the sample object in association with the corresponding initial encrypted feature data and/or target encrypted feature data.

Optionally, the feature extraction apparatus 300 of the embodiments of this specification may further include: a deletion module, configured to delete the obtained multi-frame image sequence after the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object is obtained.
Corresponding to the above model training method, the embodiments of this specification further provide a model training apparatus for protecting the privacy of personal data. FIG. 4 is a schematic structural diagram of the model training apparatus 400 of an embodiment of this specification, which includes: an image sequence acquisition module 410, configured to acquire a multi-frame image sequence presenting a sample object; a feature encryption representation module 420, configured to perform feature representation on the obtained multi-frame image sequence with a nonlinear transformation as the encryption method, to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object; a feature ensemble learning module 430, configured to perform ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object to obtain target encrypted feature data corresponding to the sample object; and a model training module 440, configured to train a preset learning model based on the target encrypted feature data corresponding to the sample object and a model classification label of the sample object.
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of this specification. Referring to FIG. 5, at the hardware level the electronic device includes a processor and, optionally, an internal bus, a network interface, and a memory. The memory may include internal memory, for example high-speed random-access memory (RAM), and may further include non-volatile memory, for example at least one disk storage. Of course, the electronic device may further include hardware required by other business.

The processor, the network interface, and the memory may be interconnected through the internal bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one double-headed arrow is used in FIG. 5, but this does not mean that there is only one bus or one type of bus.

The memory is used to store a program. Specifically, the program may include program code, and the program code includes computer operation instructions. The memory may include internal memory and non-volatile memory, and provides instructions and data to the processor.
Optionally, the processor reads the corresponding computer program from the non-volatile memory into the internal memory and runs it, forming the above feature extraction apparatus at the logical level. The processor executes the program stored in the memory and is specifically configured to perform the following operations: acquiring a multi-frame image sequence presenting a sample object;

performing feature representation on the obtained multi-frame image sequence with a nonlinear transformation as the encryption method, to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object;

performing ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, to obtain target encrypted feature data corresponding to the sample object.

Alternatively, the processor reads the corresponding computer program from the non-volatile memory into the internal memory and runs it, forming the above model training apparatus at the logical level. The processor executes the program stored in the memory and is specifically configured to perform the following operations: acquiring a multi-frame image sequence presenting a sample object;

performing feature representation on the obtained multi-frame image sequence with a nonlinear transformation as the encryption method, to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object;

performing ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, to obtain target encrypted feature data corresponding to the sample object;

training a preset learning model based on the target encrypted feature data corresponding to the sample object and a model classification label of the sample object.
The feature extraction method disclosed in the embodiment shown in FIG. 1 of this specification, or the model training method disclosed in the embodiment shown in FIG. 2, may be applied to a processor or implemented by a processor. The processor may be an integrated circuit chip with signal processing capability. During implementation, each step of the above methods may be completed by an integrated logic circuit of hardware in the processor or by instructions in the form of software. The above processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of this specification. The general-purpose processor may be a microprocessor, or any conventional processor. The steps of the methods disclosed in the embodiments of this specification may be directly embodied as being executed and completed by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium mature in the field, such as random-access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or a register. The storage medium is located in the memory; the processor reads the information in the memory and completes the steps of the above methods in combination with its hardware.
It should be understood that the electronic device of the embodiments of this specification can realize the functions of the above feature extraction apparatus in the embodiment shown in FIG. 1, or the functions of the above model training apparatus in the embodiment shown in FIG. 2. Since the principle is the same, the details are not repeated here.

Of course, besides the software implementation, the electronic device of this specification does not exclude other implementations, such as a logic device or a combination of software and hardware; that is to say, the execution body of the processing flow is not limited to individual logic units, and may also be a hardware or logic device.
In addition, the embodiments of this specification further propose a computer-readable storage medium storing one or more programs, the one or more programs including instructions. When executed by a portable electronic device that includes multiple application programs, the instructions can cause the portable electronic device to execute the method of the embodiment shown in FIG. 1, and specifically to perform the following steps: acquiring a multi-frame image sequence presenting a sample object;

performing feature representation on the obtained multi-frame image sequence with a nonlinear transformation as the encryption method, to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object;

performing ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, to obtain target encrypted feature data corresponding to the sample object.

Alternatively, when executed by a portable electronic device that includes multiple application programs, the instructions can cause the portable electronic device to execute the method of the embodiment shown in FIG. 2, and specifically to perform the following steps: acquiring a multi-frame image sequence presenting a sample object;

performing feature representation on the obtained multi-frame image sequence with a nonlinear transformation as the encryption method, to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object;

performing ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, to obtain target encrypted feature data corresponding to the sample object;

training a preset learning model based on the target encrypted feature data corresponding to the sample object and a model classification label of the sample object.
Those skilled in the art should understand that the embodiments of this specification may be provided as a method, a system, or a computer program product. Therefore, this specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, this specification may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.

Specific embodiments of this specification have been described above. Other embodiments fall within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the accompanying drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some implementations, multitasking and parallel processing are also possible or may be advantageous.

The above are merely embodiments of this specification and are not intended to limit this specification. For those skilled in the art, various modifications and variations of this specification are possible. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of this specification shall be included within the scope of the claims of this specification. In addition, all other embodiments obtained by a person of ordinary skill in the art without creative effort shall fall within the protection scope of this document.

Claims (13)

  1. A feature extraction method for protecting the privacy of personal data, comprising:
    acquiring a multi-frame image sequence presenting a sample object;
    performing feature representation on the obtained multi-frame image sequence with a nonlinear transformation as the encryption method, to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object;
    performing ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, to obtain target encrypted feature data corresponding to the sample object.
  2. The method according to claim 1, wherein
    performing feature representation on the obtained multi-frame image sequence with a nonlinear transformation as the encryption method, to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, comprises:
    inputting the obtained multi-frame image sequence into a preset convolutional neural network model to obtain the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object; wherein the convolutional neural network model comprises:
    a convolutional layer, which performs convolution on the obtained multi-frame image sequence to obtain a convolutional-layer output feature set;
    a pooling layer, which pools the convolutional-layer output feature set based on a max-pooling algorithm and/or a mean-pooling algorithm to obtain a pooling-layer output feature set;
    a fully connected layer, which converts the pooling-layer output feature set into initial encrypted feature data of a specified dimension.
  3. The method according to claim 1, wherein
    performing feature representation on the obtained multi-frame image sequence with a nonlinear transformation as the encryption method, to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, comprises:
    hashing the obtained multi-frame image sequence based on a locality-sensitive hashing algorithm to obtain the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object.
  4. The method according to any one of claims 1-3, further comprising:
    storing the sample object in association with the corresponding initial encrypted feature data and/or target encrypted feature data.
  5. The method according to any one of claims 1-3, further comprising:
    after the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object is obtained, deleting the obtained multi-frame image sequence.
  6. A model training method for protecting the privacy of personal data, comprising:
    acquiring a multi-frame image sequence presenting a sample object;
    performing feature representation on the obtained multi-frame image sequence with a nonlinear transformation as the encryption method, to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object;
    performing ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, to obtain target encrypted feature data corresponding to the sample object;
    training a preset learning model based on the target encrypted feature data corresponding to the sample object and a model classification label of the sample object.
  7. The method according to claim 6, further comprising:
    after the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object is obtained, deleting the obtained multi-frame image sequence.
  8. A feature extraction apparatus for protecting private data, comprising:
    an image sequence acquisition module, configured to acquire a multi-frame image sequence presenting a sample object;
    a feature encryption representation module, configured to perform feature representation on the obtained multi-frame image sequence with a nonlinear transformation as the encryption method, to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object;
    a feature ensemble learning module, configured to perform ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, to obtain target encrypted feature data corresponding to the sample object.
  9. An electronic device, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, the computer program being executed by the processor to perform:
    acquiring a multi-frame image sequence presenting a sample object;
    performing feature representation on the obtained multi-frame image sequence with a nonlinear transformation as the encryption method, to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object;
    performing ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, to obtain target encrypted feature data corresponding to the sample object.
  10. A computer-readable storage medium storing a computer program that, when executed by a processor, implements the following steps:
    acquiring a multi-frame image sequence presenting a sample object;
    performing feature representation on the obtained multi-frame image sequence with a nonlinear transformation as the encryption method, to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object;
    performing ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, to obtain target encrypted feature data corresponding to the sample object.
  11. A model training apparatus for protecting the privacy of personal data, comprising:
    an image sequence acquisition module, configured to acquire a multi-frame image sequence presenting a sample object;
    a feature encryption representation module, configured to perform feature representation on the obtained multi-frame image sequence with a nonlinear transformation as the encryption method, to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object;
    a feature ensemble learning module, configured to perform ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, to obtain target encrypted feature data corresponding to the sample object;
    a model training module, configured to train a preset learning model based on the target encrypted feature data corresponding to the sample object and a model classification label of the sample object.
  12. An electronic device, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, the computer program being executed by the processor to perform:
    acquiring a multi-frame image sequence presenting a sample object;
    performing feature representation on the obtained multi-frame image sequence with a nonlinear transformation as the encryption method, to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object;
    performing ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, to obtain target encrypted feature data corresponding to the sample object;
    training a preset learning model based on the target encrypted feature data corresponding to the sample object and a model classification label of the sample object.
  13. An electronic device, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, the computer program being executed by the processor to perform:
    acquiring a multi-frame image sequence presenting a sample object;
    performing feature representation on the obtained multi-frame image sequence with a nonlinear transformation as the encryption method, to obtain initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, wherein the feature data of the sample object presented by the multi-frame image sequence belongs to the personal data of the sample object;
    performing ensemble learning on the initial encrypted feature data of the multi-frame image sequence corresponding to the sample object, to obtain target encrypted feature data corresponding to the sample object;
    training a preset learning model based on the target encrypted feature data corresponding to the sample object and a model classification label of the sample object.
PCT/CN2021/093367 2020-05-14 2021-05-12 保护个人数据隐私的特征提取方法、模型训练方法及硬件 WO2021228148A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010409389.0 2020-05-14
CN202010409389.0A CN111553320B (zh) 2020-05-14 2020-05-14 保护个人数据隐私的特征提取方法、模型训练方法及硬件

Publications (1)

Publication Number Publication Date
WO2021228148A1 true WO2021228148A1 (zh) 2021-11-18

Family

ID=72006412

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/093367 WO2021228148A1 (zh) 2020-05-14 2021-05-12 保护个人数据隐私的特征提取方法、模型训练方法及硬件

Country Status (2)

Country Link
CN (2) CN114419712A (zh)
WO (1) WO2021228148A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114419712A (zh) * 2020-05-14 2022-04-29 支付宝(杭州)信息技术有限公司 保护个人数据隐私的特征提取方法、模型训练方法及硬件
CN114676396B (zh) * 2022-05-30 2022-08-30 山东极视角科技有限公司 深度神经网络模型的保护方法、装置、电子设备和介质
CN116055651B (zh) * 2023-01-06 2023-11-10 广东电网有限责任公司 多中心能源经济数据的共享访问方法、装置、设备及介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108681698A (zh) * 2018-04-28 2018-10-19 武汉大学 一种具有隐私保护功能的大规模虹膜识别方法
CN108764486A (zh) * 2018-05-23 2018-11-06 哈尔滨工业大学 一种基于集成学习的特征选择方法及装置
CN109871749A (zh) * 2019-01-02 2019-06-11 上海高重信息科技有限公司 一种基于深度哈希的行人重识别方法和装置、计算机***
CN110110120A (zh) * 2018-06-11 2019-08-09 北方工业大学 一种基于深度学习的图像检索方法和装置
CN111553320A (zh) * 2020-05-14 2020-08-18 支付宝(杭州)信息技术有限公司 保护个人数据隐私的特征提取方法、模型训练方法及硬件

Family Cites Families (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20100097861A (ko) * 2009-02-27 2010-09-06 홍익대학교 산학협력단 자동 배경 제거를 이용한 얼굴 인식 시스템 성능 향상
US9396354B1 (en) * 2014-05-28 2016-07-19 Snapchat, Inc. Apparatus and method for automated privacy protection in distributed images
CN105631296B (zh) * 2015-12-30 2018-07-31 北京工业大学 一种基于cnn特征提取器的安全人脸认证***设计方法
CN106447625A (zh) * 2016-09-05 2017-02-22 北京中科奥森数据科技有限公司 基于人脸图像序列的属性识别方法及装置
CN106682650A (zh) * 2017-01-26 2017-05-17 北京中科神探科技有限公司 基于嵌入式深度学习技术的移动终端人脸识别方法和***
CN107958244B (zh) * 2018-01-12 2020-07-10 成都视观天下科技有限公司 一种基于视频多帧人脸特征融合的人脸识别方法及装置
CN108596056A (zh) * 2018-04-10 2018-09-28 武汉斑马快跑科技有限公司 一种出租车运营行为动作识别方法及***
CN108960119B (zh) * 2018-06-28 2021-06-08 武汉市哈哈便利科技有限公司 一种用于无人售货柜的多角度视频融合的商品识别算法
CN108960207B (zh) * 2018-08-08 2021-05-11 广东工业大学 一种图像识别的方法、***及相关组件
CN109359210A (zh) * 2018-08-09 2019-02-19 中国科学院信息工程研究所 双盲隐私保护的人脸检索方法与***
US10915995B2 (en) * 2018-09-24 2021-02-09 Movidius Ltd. Methods and apparatus to generate masked images based on selective privacy and/or location tracking
CN108898191A (zh) * 2018-09-26 2018-11-27 苏州米特希赛尔人工智能有限公司 卷积神经网络特征提取图像传感器
CN110087099B (zh) * 2019-03-11 2020-08-07 北京大学 一种保护隐私的监控方法和***
CN110427972B (zh) * 2019-07-09 2022-02-22 众安信息技术服务有限公司 证件视频特征提取方法、装置、计算机设备和存储介质
CN110378092B (zh) * 2019-07-26 2020-12-04 北京积加科技有限公司 身份识别***及客户端、服务器和方法
CN110363183B (zh) * 2019-07-30 2020-05-08 贵州大学 基于生成式对抗网络的服务机器人视觉图片隐私保护方法
CN110633650A (zh) * 2019-08-22 2019-12-31 首都师范大学 基于隐私保护的卷积神经网络人脸识别方法及装置
CN110598606B (zh) * 2019-09-02 2022-05-27 南京邮电大学 一种具有视觉隐私保护优势的室内跌倒行为检测方法
CN110991462B (zh) * 2019-10-31 2023-04-07 福建师范大学 基于隐私保护cnn的密态图像识别方法及***
CN111080593B (zh) * 2019-12-07 2023-06-16 上海联影智能医疗科技有限公司 一种图像处理装置、方法及存储介质
CN111091102B (zh) * 2019-12-20 2022-05-24 华中科技大学 一种视频分析装置、服务器、***及保护身份隐私的方法

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108681698A (zh) * 2018-04-28 2018-10-19 武汉大学 一种具有隐私保护功能的大规模虹膜识别方法
CN108764486A (zh) * 2018-05-23 2018-11-06 哈尔滨工业大学 一种基于集成学习的特征选择方法及装置
CN110110120A (zh) * 2018-06-11 2019-08-09 北方工业大学 一种基于深度学习的图像检索方法和装置
CN109871749A (zh) * 2019-01-02 2019-06-11 上海高重信息科技有限公司 一种基于深度哈希的行人重识别方法和装置、计算机***
CN111553320A (zh) * 2020-05-14 2020-08-18 支付宝(杭州)信息技术有限公司 保护个人数据隐私的特征提取方法、模型训练方法及硬件

Also Published As

Publication number Publication date
CN111553320A (zh) 2020-08-18
CN114419712A (zh) 2022-04-29
CN111553320B (zh) 2021-12-21

Similar Documents

Publication Publication Date Title
WO2021228148A1 (zh) 保护个人数据隐私的特征提取方法、模型训练方法及硬件
US11444774B2 (en) Method and system for biometric verification
WO2021068616A1 (zh) 身份验证方法、装置、计算机设备和存储介质
CN109783338A (zh) 基于业务信息的录制处理方法、装置和计算机设备
CN111522996B (zh) 视频片段的检索方法和装置
WO2021114585A1 (zh) 模型训练方法、装置和电子设备
CN112784670A (zh) 基于像素差异的对象检测
US20210004587A1 (en) Image detection method, apparatus, device and storage medium
WO2021184852A1 (zh) 动作区域提取方法、装置、设备及计算机可读存储介质
WO2021104097A1 (zh) 表情包生成方法、装置及终端设备
US20220328050A1 (en) Adversarially robust voice biometrics, secure recognition, and identification
CN109299276B (zh) 一种将文本转化为词嵌入、文本分类方法和装置
WO2023173686A1 (zh) 检测方法、装置、电子设备及存储介质
CN114663871A (zh) 图像识别方法、训练方法、装置、***及存储介质
US20160351185A1 (en) Voice recognition device and method
US10880604B2 (en) Filter and prevent sharing of videos
WO2021051602A1 (zh) 基于唇语密码的人脸识别方法、***、装置及存储介质
CN113609900B (zh) 局部生成人脸定位方法、装置、计算机设备和存储介质
WO2024094086A1 (zh) 图像处理方法、装置、设备、介质及产品
KR102254037B1 (ko) 영상분석장치 및 그 장치의 구동방법
CN104580109A (zh) 生成点选验证码的方法及装置
CN112041847A (zh) 提供具有隐私标签的图像
CN111539382A (zh) 一种图像识别模型隐私风险的评估方法、装置及电子设备
CN109003190B (zh) 一种核保方法、计算机可读存储介质及终端设备
CN113283978B (zh) 基于生物基础与行为特征及业务特征的金融风险评估方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21803077

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21803077

Country of ref document: EP

Kind code of ref document: A1