WO2021237907A1 - Multi-classifier-based risk identification method, apparatus, computer device and storage medium - Google Patents

Multi-classifier-based risk identification method, apparatus, computer device and storage medium

Info

Publication number
WO2021237907A1
WO2021237907A1 PCT/CN2020/103795 CN2020103795W WO2021237907A1 WO 2021237907 A1 WO2021237907 A1 WO 2021237907A1 CN 2020103795 W CN2020103795 W CN 2020103795W WO 2021237907 A1 WO2021237907 A1 WO 2021237907A1
Authority
WO
WIPO (PCT)
Prior art keywords
training data
video data
classifier
user
current video
Prior art date
Application number
PCT/CN2020/103795
Other languages
English (en)
French (fr)
Inventor
熊玮
Original Assignee
深圳壹账通智能科技有限公司
Priority date
Filing date
Publication date
Application filed by 深圳壹账通智能科技有限公司
Publication of WO2021237907A1

Classifications

    • G06V20/40 Scenes; scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G06F18/214 Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/24 Classification techniques
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G06Q10/103 Workflow collaboration or project management
    • G06Q40/00 Finance; insurance; tax strategies; processing of corporate or income taxes
    • G06V40/176 Dynamic facial expression recognition
    • G10L25/63 Speech or voice analysis for estimating an emotional state

Definitions

  • This application relates to the technical field of intelligent decision-making, and in particular to a method, device, computer equipment, and storage medium for risk identification based on multiple classifiers.
  • Classifiers that can be used for this classification task in the prior art include: linear classifiers based on logistic regression, classifiers based on clustering algorithms, decision tree models, and non-linear classifiers based on kernel functions, namely support vector machines (SVM).
  • Sentiment information and content information are high-dimensional data.
  • Linear classifiers based on logistic regression, classifiers based on clustering algorithms, and decision tree models cannot handle non-linear classification problems. Their generalization ability after training is therefore poor, and they are easily affected by abnormal points.
  • support vector machines can handle nonlinear classification problems.
  • However, support vector machines require very long processing time and have low efficiency when processing larger sample data with higher-dimensional features (a large number of non-support vectors are added to the convex quadratic programming problem).
  • In addition, when using a support vector machine, a specific kernel function needs to be selected. Different kernel functions perform differently on different types of sample data, but there is no effective kernel-function selection method; selection depends largely on the experience or intuition of the technicians.
  • Emotional information and content information are relatively complex and exhibit no particularly distinctive data-type characteristics. It is difficult to obtain good accuracy using only one kernel function, so a support vector machine alone cannot achieve a good recognition effect.
  • The embodiments of this application provide a multi-classifier-based risk identification method, device, computer equipment and storage medium, aiming to solve the prior-art problem that user risk identification during the handling of matters in an online matter-handling system relies on a single classifier, leading to a low recognition rate.
  • an embodiment of the present application provides a method for risk identification based on multiple classifiers, which includes:
  • the connection with the user terminal is terminated and a prompt that handling of the matter has been terminated is issued.
  • an embodiment of the present application provides a multi-classifier-based risk identification device, which includes:
  • the user video data acquisition unit is used to acquire user video data if the transaction processing instruction sent by the user terminal is detected;
  • the user identity verification unit is configured to obtain user identity information corresponding to the user video data to determine whether the user identity information passes the user identity verification;
  • the current video data obtaining unit is configured to obtain the current video data in the current transaction process if the user identity information passes the user identity verification;
  • a video information extraction unit configured to extract emotion information in the current video data, and extract content information in the current video data
  • the combined classifier calling unit is used to call a pre-built combined classifier composed of several non-linear classifiers and linear classifiers;
  • the user classification unit is used to input the emotional information and content information of the current video data into the combined classifier to obtain the corresponding user risk category;
  • a category judging unit for judging whether the user risk category corresponding to the current video data belongs to a high-risk category
  • the high-risk category processing unit is configured to, if the user risk category corresponding to the current video data belongs to the high-risk category, terminate the connection with the user terminal and issue a prompt that handling of the matter has been terminated.
  • an embodiment of the present application provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and running on the processor; when the processor executes the computer program, it implements the multi-classifier-based risk identification method described in the first aspect above.
  • an embodiment of the present application also provides a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program; when the computer program is executed by a processor, the processor executes the multi-classifier-based risk identification method described in the first aspect above.
  • The embodiment of the application provides a multi-classifier-based risk identification method, device, computer equipment, and storage medium, including: obtaining user video data if a transaction handling instruction sent by a user terminal is detected; obtaining user identity information corresponding to the user video data to determine whether the user identity information passes user identity verification; if the user identity information passes user identity verification, obtaining the current video data in the current transaction process; extracting the emotional information in the current video data, and extracting the content information in the current video data; calling a pre-built combined classifier composed of several non-linear classifiers and linear classifiers; inputting the emotional information and content information of the current video data into the combined classifier to obtain the corresponding user risk category; determining whether the user risk category corresponding to the current video data belongs to a high-risk category; and if so, terminating the connection with the user terminal and issuing a prompt that handling of the matter has been terminated.
  • This method realizes user risk category judgment on the user video data based on a combined classifier composed of several non-linear classifiers and linear classifiers, and improves recognition accuracy.
  • FIG. 1 is a schematic diagram of an application scenario of a multi-classifier-based risk identification method provided by an embodiment of the application;
  • FIG. 2 is a schematic flowchart of a method for risk identification based on multiple classifiers provided by an embodiment of the application;
  • FIG. 3 is a schematic block diagram of a multi-classifier-based risk identification device provided by an embodiment of the application
  • Fig. 4 is a schematic block diagram of a computer device provided by an embodiment of the application.
  • the risk identification method based on multiple classifiers is applied to a server, and the method is executed by application software installed in the server.
  • the method includes steps S110 to S180.
  • After the server receives the transaction handling instruction sent by the client, it can trigger the user's camera and other related interactive devices to conduct video interaction between the client and the server.
  • the verification of user identity information can be completed in a variety of ways, including but not limited to ID card information, face recognition, fingerprint and iris information, and so on.
  • the customer can be prompted to enter ID card information (it can also be obtained directly by photographing the ID card). Then, according to the ID card information, retrieve the corresponding biometric verification information (including face, iris, fingerprint, etc.), and match to determine whether it is the customer himself.
  • a series of actions such as blinking and nodding may be used to prompt and guide the customer to determine the authenticity and effectiveness of the customer's identity.
  • the server obtains the user's video data and completes the verification of the user's identity, at this time, when obtaining the current video data of the user when handling matters, the current video data is used as the data basis for analyzing the user's risk level.
  • the server can realize guidance in the form of broadcasting voice information, thereby simulating the operation of the customer service staff.
  • broadcast options can also be provided, such as the type of broadcast sound, the broadcast rate, and so on. Users can choose to broadcast in their own language or speaking speed according to their needs.
  • S140 Extract emotion information in the current video data, and extract content information in the current video data.
  • emotional information refers to the emotional state of the customer in the process of handling matters, and can reflect the state of the customer when handling matters.
  • emotional information can be represented by tagging, for example, it can be set to five different types of emotional tags such as "happy”, “disgust”, “suppression”, “surprise” and “other”.
  • Content information refers to the specific feedback information provided by customers in the video.
  • the specifics are determined according to the voice information broadcast by the business handling process. For different types of customer feedback information, corresponding methods can be used to obtain specific content information from the answer video.
  • If the voice information that the server pushes to the client for broadcast is a judgment question (e.g. "Have you ever handled this service?"), voice recognition can be used to extract whether the user's answer is affirmative or negative (yes or no).
  • If the voice information broadcast by the system is a descriptive question (e.g. "Please explain your current income situation"), semantic analysis technology can be used to convert the semantics of the customer's answer into text information, and natural language processing is used to obtain the true meaning expressed by the text information (salary income xxx, additional rental income xxx).
  • extracting the emotional information in the current video data in step S140 includes:
  • the emotional information in the current video data is obtained through the micro-expression recognition model.
  • Feature extraction based on optical flow or on the LBP-TOP operator can be used to obtain the image frames containing micro-expressions in the current video data, and the image frames are then combined with the micro-expression recognition model to obtain the emotional information in the current video data.
  • the optical flow algorithm is to estimate the optical flow in the video image sequence under certain constraints to identify the subtle movements of the customer's face, and realize the feature extraction of the micro-expression.
  • The LBP-TOP operator extends the LBP (local binary pattern) operator, which describes spatial local texture, to three orthogonal planes so that temporal texture changes can also be captured.
  • extracting content information in the current video data in step S140 includes:
  • the content information in the video data is obtained according to the keyword corresponding to yes or no in the text information.
  • the N-gram model is a pre-trained N-gram model, which is a commonly used speech recognition model. After obtaining the text information corresponding to the audio data in the current video data through the N-gram model, it can be determined whether the keyword corresponding to "Yes” or "No" is included in the text information to obtain the content information in the video data.
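  • As a minimal illustration of the keyword check described above, the following Python sketch classifies a transcribed judgment-question answer as affirmative or negative. The keyword lists and the simple substring matching are illustrative assumptions, not part of this application; a real system would operate on the N-gram speech-recognition output:

```python
# Hypothetical keyword sets for judgment-question answers (illustrative only).
NO_WORDS = {"no", "never", "i have not"}
YES_WORDS = {"yes", "yeah", "correct", "i have"}

def extract_judgment_answer(text: str) -> str:
    """Return 'yes', 'no', or 'unknown' for a transcribed answer."""
    t = text.lower()
    # Check negatives first, since e.g. "i have not" also contains "i have".
    if any(w in t for w in NO_WORDS):
        return "no"
    if any(w in t for w in YES_WORDS):
        return "yes"
    return "unknown"
```

Plain substring matching is deliberately naive (e.g. "know" would match "no"); the sketch only shows where the content information comes from.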
  • a pre-built combined classifier composed of several non-linear classifiers and linear classifiers can be called at this time.
  • the specific choice of non-linear classifier and linear classifier can be determined according to the actual process.
  • before step S150, the method further includes: constructing an initial combined classifier composed of several non-linear classifiers and linear classifiers; and optimizing, through a training data set, the weight coefficients of each non-linear classifier and linear classifier in the initial combined classifier, so as to obtain the corresponding combined classifier through training.
  • Specifically, a number of non-linear classifiers and linear classifiers are first assigned preset default weight coefficients to form the initial combined classifier; the weight coefficients of each non-linear classifier and linear classifier in the initial combined classifier are then continuously optimized to train the corresponding combined classifier. Optimizing the weight coefficients makes the combined classifier more conducive to accurately classifying the user's risk level.
  • In one embodiment, the initial combined classifier includes five non-linear classifiers based on support vector machines and one linear classifier based on logistic regression; the five non-linear classifiers based on support vector machines are respectively denoted as U1 to U5, and the linear classifier based on logistic regression is denoted as U6;
  • the non-linear classifier corresponding to U1 is used to recognize the first type of emotion, the non-linear classifier corresponding to U2 is used to recognize the second type of emotion, the non-linear classifier corresponding to U3 is used to recognize the third type of emotion, the non-linear classifier corresponding to U4 is used to recognize the fourth type of emotion, and the non-linear classifier corresponding to U5 is used to recognize the fifth type of emotion;
  • the linear classifier corresponding to U6 is used to recognize judgment questions.
  • the combined classifier is composed of 5 non-linear classifiers based on SVM and 1 linear classifier based on logistic regression.
  • the five non-linear classifiers use different kernel functions, which are respectively obtained by training data subsets with corresponding emotion labels ("happy”, “disgust”, “suppression”, “surprise” and “other”). That is, each non-linear classifier pays attention to a certain item of emotional information for risk identification.
  • The linear classifier based on logistic regression is trained through the subset of training data whose content information is the answer to a judgment question (that is, a question whose answer is "yes"/"no" or that contains a unique option).
  • the training data set to optimize the weight coefficients of each of the nonlinear classifiers and linear classifiers in the initial combined classifier to obtain the corresponding combined classifier includes:
  • the training data in the first training data subset Y1 corresponds to the first total quantity K 1
  • the training data in the second training data subset Y2 corresponds to the second total quantity K 2
  • the training data in the third training data subset Y3 corresponds to the third total quantity K 3
  • the training data in the fourth training data subset Y4 corresponds to the fourth total quantity K 4
  • the training data in the fifth training data subset Y5 corresponds to the fifth total quantity K 5
  • the training data in the sixth training data subset Y6 corresponds to the sixth total quantity K 6 ;
  • Loss j represents the classification loss corresponding to the j-th classifier in the initial combined classifier;
  • W i is the weight of the i-th training data;
  • I i is the pointer (indicator) function corresponding to the i-th training data, and its value is 0 or 1;
  • Loss' j represents the current classification loss corresponding to the j-th classifier in the initial combined classifier;
  • W' i is the current weight of the i-th training data;
  • I' i is the current pointer (indicator) function corresponding to the i-th training data, and its value is 0 or 1;
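  • Read together, the definitions above suggest that the classification loss is the indicator-weighted sum over the training data, i.e. Loss j is the sum of W i multiplied by I i, with I i equal to 1 when the j-th sub-classifier misclassifies the i-th sample. This reading is an assumption (the patent names the symbols, but the formula itself is not reproduced here); a minimal Python sketch:

```python
def classification_loss(weights, predictions, labels):
    """Weighted classification loss: sum of the weights of misclassified samples.

    Assumes the indicator I_i is 1 exactly when prediction != label.
    """
    return sum(w for w, p, y in zip(weights, predictions, labels) if p != y)
```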
  • U1 to U6, together with the current weight coefficients respectively corresponding to U1 to U6, form the combined classifier.
  • the non-linear classifier and the linear classifier are collectively referred to as sub-classifiers in the combined classifier.
  • Each sub-classifier is assigned a corresponding weight coefficient.
  • the final output of the combined classifier is the weighted sum of the outputs of all sub-classifiers.
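  • The weighted-sum combination described above can be sketched as follows, with sub-classifiers modelled as callables returning a risk score in [0, 1]. The normalisation and the 0.5 decision threshold are illustrative assumptions, not values given by the patent:

```python
def combined_predict(sub_classifiers, weight_coeffs, emotion, content, threshold=0.5):
    """Weighted sum of sub-classifier outputs, mapped to a risk category."""
    score = sum(a * clf(emotion, content)
                for clf, a in zip(sub_classifiers, weight_coeffs))
    total = sum(weight_coeffs)          # normalise so the score stays in [0, 1]
    return "high" if score / total >= threshold else "normal"
```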
  • In the formula corresponding to the weight coefficient adjustment strategy, Loss optimal sub-classifier denotes the minimum classification loss among the classification losses corresponding to U1 to U6;
  • The training data set Y contains multiple different sample data Xi. Each sample data has a corresponding emotion label and content information, and the user risk category is known (marked by a user risk category field: for example, the field is set to 1 if the user risk is high, and to 0 if the user risk is normal).
  • The training data corresponding to the five emotion labels in the training data set Y is selected as the training data subsets Y1 to Y5 (for example, the training data subset Y1 is the set of all sample data with the emotion label "happy"), and the training data whose content information is the answer to a judgment question forms the training data subset Y6.
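  • The subset selection described above can be sketched as follows; the sample record layout (dicts with "emotion" and "content" keys) is an assumption made purely for illustration:

```python
EMOTION_LABELS = ["happy", "disgust", "suppression", "surprise", "other"]

def partition_training_set(samples):
    """Split Y into Y1..Y5 by emotion label, plus Y6 for judgment answers."""
    by_emotion = {label: [] for label in EMOTION_LABELS}
    judgment = []  # Y6: samples whose content information is a yes/no answer
    for s in samples:
        by_emotion[s["emotion"]].append(s)
        if s.get("content") in ("yes", "no"):
            judgment.append(s)
    return by_emotion, judgment
```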
  • Sub-classifiers U1 to U6 are trained through the training data subsets Y1 to Y6, respectively.
  • U1 to U5 are non-linear classifiers based on SVM, the type of kernel function used is Gaussian kernel function, and U6 is a linear classifier based on logistic regression.
  • U1 to U6 are combined to form a combined classifier U.
  • the initialization weight 1/K is first assigned to each training data in the training data set Y, where K is the total number of training data in the training data set. That is, the initial weight of each training data is the same.
  • the sub-classifier with the smallest classification loss is regarded as the optimal sub-classifier, and the weight coefficient of the sub-classifier is determined accordingly.
  • the calculation idea of the weight coefficient is: when the classification loss is smaller, the weight coefficient is higher, and when the classification loss is larger, the weight coefficient is smaller.
  • After the weight coefficient corresponding to the optimal sub-classifier is obtained, the pre-stored weight update strategy is called, and the initialization weight of each training data in the training data set is adjusted accordingly to obtain the current weight of each training data. Updating the weights of the training data means that training sample data misclassified by the optimal sub-classifier in the previous round receives more attention (its weight becomes larger), while training sample data correctly classified by the optimal sub-classifier in the previous round has its importance reduced (its weight becomes smaller).
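  • The passages above describe the behaviour of the updates (a smaller classification loss gives a larger weight coefficient; misclassified samples gain weight and correctly classified samples lose weight) without reproducing the exact formulas. The classical AdaBoost update has exactly this behaviour and is assumed in the sketch below purely for illustration; it is not necessarily the patent's formula:

```python
import math

def adaboost_round(weights, predictions, labels):
    """One AdaBoost-style round: weight coefficient plus sample reweighting."""
    loss = sum(w for w, p, y in zip(weights, predictions, labels) if p != y)
    loss = min(max(loss, 1e-10), 1 - 1e-10)      # guard against log(0) / div-by-zero
    alpha = 0.5 * math.log((1 - loss) / loss)    # smaller loss -> larger coefficient
    new_w = [w * math.exp(alpha if p != y else -alpha)  # up-weight mistakes
             for w, p, y in zip(weights, predictions, labels)]
    total = sum(new_w)
    return alpha, [w / total for w in new_w]     # renormalise to sum to 1
```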
  • the current weight coefficients of each sub-classifier are adjusted to form a combined classifier from the current weight coefficients corresponding to U1 to U6 and U1 to U6.
  • the emotional information and content information of the current video data can be input into the combined classifier to obtain the corresponding user risk category .
  • The combined classifier used in this embodiment integrates non-linear classifiers with different kernel functions and a linear classifier, making full use of the advantages of each sub-classifier (each constructed sub-classifier has its own recognition direction). This improves the performance and recognition accuracy of the classifier and comprehensively reflects the relationship between emotional information, content information and business risks.
  • In addition, the sample data is divided into multiple types and processed separately, reducing the amount of sample data that each support vector machine must process during training and optimization and thereby improving the practical application efficiency of the support vector machines.
  • S170 Determine whether the user risk category corresponding to the current video data belongs to a high risk category.
  • If the user risk category corresponding to the current video data belongs to the high-risk category, it means that the user may be handling matters in an abnormal situation (for example, involuntarily, under threat from others). At this time, in order to ensure the safety of user data, the connection with the user terminal is terminated and a prompt that handling of the matter has been terminated is issued.
  • the server can continue to push the process data to the user end for interaction according to the item handling process.
  • the method realizes the judgment of user risk categories on user video data based on a combined classifier including several non-linear classifiers and linear classifiers, and improves the recognition accuracy.
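  • The overall S110 to S180 control flow described above can be sketched as follows. All helper callables are hypothetical stubs standing in for the units described in this application, not actual APIs:

```python
def handle_transaction(acquire_video, verify_identity, get_current_video,
                       extract_features, classify_risk, terminate, proceed):
    user_video = acquire_video()                    # S110: obtain user video data
    if not verify_identity(user_video):             # S120: identity verification
        return terminate("identity verification failed")
    current_video = get_current_video()             # S130: current transaction video
    emotion, content = extract_features(current_video)  # S140: emotion + content info
    risk = classify_risk(emotion, content)          # S150/S160: combined classifier
    if risk == "high":                              # S170: high-risk judgment
        return terminate("high-risk category")      # S180: terminate and prompt
    return proceed()
```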
  • the embodiment of the present application also provides a multi-classifier-based risk identification device, and the multi-classifier-based risk identification device is used to execute any embodiment of the aforementioned multi-classifier-based risk identification method.
  • FIG. 3 is a schematic block diagram of a multi-classifier-based risk identification device provided by an embodiment of the present application.
  • the risk identification device 100 based on multiple classifiers can be configured in a server.
  • the multi-classifier-based risk identification device 100 includes: a user video data acquisition unit 110, a user identity verification unit 120, a current video data acquisition unit 130, a video information extraction unit 140, a combined classifier calling unit 150, The user classification unit 160, the category judgment unit 170, and the high-risk category processing unit 180.
  • the user video data obtaining unit 110 is configured to obtain user video data if the transaction processing instruction sent by the user terminal is detected.
  • After the server receives the transaction handling instruction sent by the client, it can trigger the user's camera and other related interactive devices to conduct video interaction between the client and the server.
  • the user identity verification unit 120 is configured to obtain user identity information corresponding to the user video data to determine whether the user identity information passes the user identity verification.
  • the verification of user identity information can be completed in a variety of ways, including but not limited to ID card information, face recognition, fingerprint and iris information, and so on.
  • the customer can be prompted to enter ID card information (it can also be obtained directly by photographing the ID card). Then, according to the ID card information, retrieve the corresponding biometric verification information (including face, iris, fingerprint, etc.), and match to determine whether it is the customer himself.
  • a series of actions such as blinking and nodding may be used to prompt and guide the customer to determine the authenticity and effectiveness of the customer's identity.
  • the current video data obtaining unit 130 is configured to obtain current video data in the current transaction process if the user identity information passes the user identity verification.
  • the server obtains the user's video data and completes the verification of the user's identity, at this time, when obtaining the current video data of the user when handling matters, the current video data is used as the data basis for analyzing the user's risk level.
  • the server can realize guidance in the form of broadcasting voice information, thereby simulating the operation of the customer service staff.
  • broadcast options can also be provided, such as the type of broadcast sound, the broadcast rate, and so on. Users can choose to broadcast in their own language or speaking speed according to their needs.
  • the video information extraction unit 140 is configured to extract emotion information in the current video data, and extract content information in the current video data.
  • emotional information refers to the emotional state of the customer in the process of handling matters, and can reflect the state of the customer when handling matters.
  • emotional information can be represented by tagging, for example, it can be set to five different types of emotional tags such as "happy”, “disgust”, “suppression”, “surprise” and “other”.
  • Content information refers to the specific feedback information provided by customers in the video.
  • the specifics are determined according to the voice information broadcast by the business handling process. For different types of customer feedback information, corresponding methods can be used to obtain specific content information from the answer video.
  • If the voice information that the server pushes to the client for broadcast is a judgment question (e.g. "Have you ever handled this service?"), voice recognition can be used to extract whether the user's answer is affirmative or negative (yes or no).
  • If the voice information broadcast by the system is a descriptive question (e.g. "Please explain your current income situation"), semantic analysis technology can be used to convert the semantics of the customer's answer into text information, and natural language processing is used to obtain the true meaning expressed by the text information (salary income xxx, additional rental income xxx).
  • the video information extraction unit 140 includes:
  • the emotion information extraction unit is used to obtain the emotion information in the current video data through the micro-expression recognition model.
  • Feature extraction based on optical flow or on the LBP-TOP operator can be used to obtain the image frames containing micro-expressions in the current video data, and the image frames are then combined with the micro-expression recognition model to obtain the emotional information in the current video data.
  • the optical flow algorithm is to estimate the optical flow in the video image sequence under certain constraints to identify the subtle movements of the customer's face, and realize the feature extraction of the micro-expression.
  • The LBP-TOP operator extends the LBP (local binary pattern) operator, which describes spatial local texture, to three orthogonal planes so that temporal texture changes can also be captured.
  • the video information extraction unit 140 further includes:
  • the text information extraction unit is used to obtain the text information corresponding to the audio data in the current video data through the N-gram model;
  • The content information extraction unit is configured to obtain the content information in the video data according to the keywords corresponding to "yes" or "no" in the text information.
  • The N-gram model is a pre-trained N-gram model, a commonly used speech recognition model. After the text corresponding to the audio data in the current video data is obtained through the N-gram model, it can be determined whether the text contains keywords corresponding to "yes" or "no", yielding the content information in the video data.
  • the combined classifier calling unit 150 is used for calling a pre-built combined classifier composed of several non-linear classifiers and linear classifiers.
  • a pre-built combined classifier composed of several non-linear classifiers and linear classifiers can be called at this time.
  • the specific choice of non-linear classifier and linear classifier can be determined according to the actual process.
  • the device 100 for risk identification based on multiple classifiers further includes:
  • the initial classifier building unit is used to construct an initial combined classifier composed of several non-linear classifiers and linear classifiers;
  • the classifier optimization unit is configured to optimize the weight coefficients of each of the nonlinear classifiers and linear classifiers in the initial combined classifier through the training data set, so as to train to obtain the corresponding combined classifier.
  • a number of nonlinear classifiers and linear classifiers are respectively assigned values according to preset default weight coefficients to form an initial combined classifier.
  • The weight coefficients of each non-linear classifier and the linear classifier in the initial combined classifier are continuously optimized so as to train the corresponding combined classifier. The combined classifier with optimized weight coefficients is better able to classify the user's risk level accurately.
  • The initial combined classifier includes five non-linear classifiers based on support vector machines and one linear classifier based on logistic regression; the five SVM-based non-linear classifiers are denoted U1 to U5, and the logistic-regression-based linear classifier is denoted U6.
  • The non-linear classifier U1 is used to recognize the first type of emotion, U2 the second type, U3 the third type, U4 the fourth type, and U5 the fifth type of emotion; the linear classifier U6 is used to recognize judgment questions.
  • the combined classifier consists of 5 non-linear classifiers based on SVM and 1 linear classifier based on logistic regression.
  • The five non-linear classifiers use different kernel functions and are each trained on the training data subset carrying the corresponding emotion label ("happy", "disgust", "suppression", "surprise" or "other"); that is, each non-linear classifier attends to one item of emotion information for risk identification.
  • The linear classifier based on logistic regression is trained on the subset of training data whose content information is the answer to a judgment question (that is, a yes/no question or a question with a single option).
  • the classifier optimization unit includes:
  • The first subset dividing unit is used to obtain the subsets of the training data set whose emotion information corresponds to each type of emotion, denoted the first training data subset Y1, the second training data subset Y2, the third training data subset Y3, the fourth training data subset Y4 and the fifth training data subset Y5;
  • The second subset dividing unit is used to obtain the subset of the training data set whose content information corresponds to judgment questions, to obtain the sixth training data subset Y6;
  • The weight initialization assignment unit is used to assign an initialization weight 1/K to each training data item in the training data set, where K is the total number of training data items in the training data set; the training data in the first training data subset Y1 has a first total quantity K1, the second training data subset Y2 a second total quantity K2, the third training data subset Y3 a third total quantity K3, the fourth training data subset Y4 a fourth total quantity K4, the fifth training data subset Y5 a fifth total quantity K5, and the sixth training data subset Y6 a sixth total quantity K6;
  • The initial classification loss acquisition unit is used to calculate the classification losses corresponding to U1 to U6, where Loss_j denotes the classification loss of the j-th classifier in the initial combined classifier, W_i is the weight of the i-th training data item, and I_i is the indicator function for the i-th training data item, taking the value 0 or 1;
  • the optimal sub-classifier obtaining unit is configured to obtain the sub-classifier corresponding to the minimum value of the classification losses respectively corresponding to U1 to U6 as the optimal sub-classifier, and call a preset weight coefficient adjustment strategy to obtain the optimal sub-classifier The weight coefficient corresponding to the classifier;
  • the current weight obtaining unit is configured to adjust the initialization weight of each training data in the training data set according to the weight coefficient corresponding to the optimal sub-classifier and call the pre-stored weight update strategy to obtain each training data The corresponding current weight;
  • The current classification loss acquisition unit is used to obtain the current weight of each training data item and calculate the current classification losses corresponding to U1 to U6, where Loss'_j denotes the current classification loss of the j-th classifier in the initial combined classifier, W'_i is the current weight of the i-th training data item, and I'_i is the current indicator function for the i-th training data item, taking the value 0 or 1;
  • the current weight coefficient obtaining unit is configured to call a preset weight coefficient obtaining strategy to obtain the current weight coefficients corresponding to U1 to U6;
  • the classifier combination unit is used to form a combined classifier from the current weight coefficients corresponding to U1 to U6 and U1 to U6 respectively.
  • the non-linear classifier and the linear classifier are collectively referred to as sub-classifiers in the combined classifier.
  • Each sub-classifier is assigned a corresponding weight coefficient.
  • the final output of the combined classifier is the weighted sum of the outputs of all sub-classifiers.
  • In the formula corresponding to the weight coefficient adjustment strategy, Loss_optimal is the minimum classification loss among the classification losses corresponding to U1 to U6;
  • The training data set Y contains multiple different sample data items Xi. Each sample data item has a corresponding emotion label and content information, and its user risk category is known (marked by a user risk category field, e.g. 1 for an excessively high risk category and 0 for a normal one).
  • According to the emotion labels and content information, the training data corresponding to the five emotion labels is selected from the training data set Y as training data subsets Y1 to Y5 (for example, training data subset Y1 is the set of all sample data whose emotion label is "happy"), together with training data subset Y6 (whose content information is answers to judgment questions).
  • Sub-classifiers U1 to U6 are trained on training data subsets Y1 to Y6, respectively.
  • U1 to U5 are non-linear classifiers based on SVM using a Gaussian kernel function, and U6 is a linear classifier based on logistic regression.
  • U1 to U6 are combined to form a combined classifier U.
  • the initialization weight 1/K is first assigned to each training data in the training data set Y, where K is the total number of training data in the training data set. That is, the initial weight of each training data is the same.
  • the sub-classifier with the smallest classification loss is regarded as the optimal sub-classifier, and the weight coefficient of the sub-classifier is determined accordingly.
  • the calculation idea of the weight coefficient is: when the classification loss is smaller, the weight coefficient is higher, and when the classification loss is larger, the weight coefficient is smaller.
  • After the weight coefficient of the optimal sub-classifier is obtained, the pre-stored weight update strategy is called and the initialization weight of each training data item in the training data set is adjusted accordingly to obtain its current weight. The point of updating the training data weights is that a training sample misclassified by the optimal sub-classifier in the previous round receives more attention (its weight increases), while a sample classified correctly in the previous round receives less (its weight decreases).
  • Finally, the current weight coefficients of the sub-classifiers are adjusted, and the combined classifier is formed from U1 to U6 and their corresponding current weight coefficients.
  • the user classification unit 160 is configured to input the emotional information and content information of the current video data into the combined classifier to obtain the corresponding user risk category.
  • the emotional information and content information of the current video data can be input into the combined classifier to obtain the corresponding user risk category .
  • The combined classifier used in this embodiment integrates non-linear classifiers with different kernel functions and a linear classifier, making full use of the strengths of each sub-classifier (each constructed sub-classifier has its own recognition direction). This improves the performance and recognition accuracy of the classifier and comprehensively reflects the relationship between emotion information, content information and business risk.
  • In addition, the sample data is split into multiple types that are processed separately, which reduces the amount of sample data each support vector machine must process during training and optimization and thus improves the practical efficiency of the support vector machines.
  • the category determining unit 170 is configured to determine whether the user risk category corresponding to the current video data belongs to a high risk category.
  • the high-risk category processing unit 180 is configured to, if the user risk category corresponding to the current video data belongs to the high-risk category, terminate the connection with the user terminal and perform a prompt for handling termination matters.
  • If the user risk category corresponding to the current video data belongs to the high-risk category, the user may be handling the matter under abnormal circumstances (for example, involuntarily, under threat from another person). In that case, to ensure the user's data security, the connection with the user terminal is terminated and a prompt that the matter handling is terminated is issued.
  • the server can continue to push the process data to the user end for interaction according to the item handling process.
  • the device realizes the user risk category judgment on the user video data based on the combined classifier including several non-linear classifiers and linear classifiers, and improves the recognition accuracy.
  • the above-mentioned multi-classifier-based risk identification device can be implemented in the form of a computer program, and the computer program can be run on a computer device as shown in FIG. 4.
  • FIG. 4 is a schematic block diagram of a computer device according to an embodiment of the present application.
  • the computer device 500 is a server, and the server may be an independent server or a server cluster composed of multiple servers.
  • the computer device 500 includes a processor 502, a memory, and a network interface 505 connected through a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
  • the non-volatile storage medium 503 can store an operating system 5031 and a computer program 5032.
  • the processor 502 can execute a risk identification method based on multiple classifiers.
  • the processor 502 is used to provide calculation and control capabilities, and support the operation of the entire computer device 500.
  • the internal memory 504 provides an environment for the operation of the computer program 5032 in the non-volatile storage medium 503.
  • the processor 502 can execute the risk identification method based on multiple classifiers.
  • the network interface 505 is used for network communication, such as providing data information transmission.
  • the structure shown in FIG. 4 is only a block diagram of part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device 500 to which the solution of the present application is applied.
  • the specific computer device 500 may include more or fewer components than shown in the figure, or combine certain components, or have a different component arrangement.
  • the processor 502 is configured to run a computer program 5032 stored in a memory to implement the risk identification method based on multiple classifiers disclosed in the embodiments of the present application.
  • the embodiment of the computer device shown in FIG. 4 does not constitute a limitation on the specific configuration of the computer device.
  • The computer device may include more or fewer components than those shown in the figure, combine certain components, or arrange components differently.
  • The computer device may include only a memory and a processor. In such embodiments, the structures and functions of the memory and the processor are consistent with the embodiment shown in FIG. 4 and are not repeated here.
  • the processor 502 may be a central processing unit (Central Processing Unit, CPU), and the processor 502 may also be other general-purpose processors, digital signal processors (Digital Signal Processors, DSPs), Application Specific Integrated Circuit (ASIC), Field-Programmable Gate Array (FPGA) or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, etc.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor.
  • a computer-readable storage medium may be a non-volatile computer-readable storage medium, or may be a volatile computer-readable storage medium.
  • the computer-readable storage medium stores a computer program, where the computer program is executed by a processor to realize the risk identification method based on the multi-classifier disclosed in the embodiments of the present application.
  • the disclosed equipment, device, and method may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • The division into units is only a logical functional division. In actual implementation there may be other division methods: units with the same function may be combined into one unit, multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may also be electrical, mechanical or other forms of connection.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of the present application.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a storage medium.
  • The technical solution of this application, in essence, or the part contributing over the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage media include: USB flash drives, removable hard disks, read-only memory (ROM), magnetic disks, optical disks and other media that can store program code.


Abstract

A multi-classifier-based risk identification method and apparatus, a computer device and a storage medium, relating to the technical field of intelligent decision-making. The method comprises: if a matter-handling instruction sent by a user terminal is detected, acquiring user video data (S110); acquiring user identity information corresponding to the user video data to determine whether the user identity information passes user identity verification (S120); if it passes, acquiring current video data in the current matter-handling process (S130); extracting emotion information and content information from the current video data (S140); calling a pre-built combined classifier composed of several non-linear classifiers and a linear classifier (S150); inputting the emotion information and content information of the current video data into the combined classifier to obtain the corresponding user risk category (S160); determining whether the user risk category corresponding to the current video data belongs to the high-risk category (S170); and if so, terminating the connection with the user terminal and issuing a prompt that the matter handling is terminated (S180). The method realizes user risk category judgment on user video data based on a combined classifier and improves recognition accuracy.

Description

Multi-classifier-based risk identification method and apparatus, computer device and storage medium
This application claims priority to the Chinese patent application filed with the China Patent Office on May 26, 2020, with application number 202010457551.6 and entitled "Multi-classifier-based risk identification method, apparatus and computer device", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the technical field of intelligent decision-making, and in particular to a multi-classifier-based risk identification method and apparatus, a computer device, and a storage medium.
Background
With the continuous development of Internet technology, many existing financial institutions have begun to promote or trial handling financial business online, in order to improve the convenience and integrity of customers' financial transactions.
However, financial business is particular in carrying high risk, so a timely blocking mechanism must be provided during online handling to avoid risk and prevent losses to the customer or the financial institution (for example, blocking promptly when a customer is being defrauded or coerced). In traditional one-to-one communication between customer and service agent, blocking can rely on the agent's experience and judgment. But the inventor found this to be inefficient and unable to meet the requirement of 24-hour service.
How to enable a computer to automatically identify such risk through a computer program is a current research hotspot. For a computer, risk identification is in fact treated as a classification task: according to the emotion information and content information in the video and speech, determine whether the case belongs to the excessively-high-risk category or the normal-risk category.
Classifiers in the prior art that can be used for this classification task include: linear classifiers based on logistic regression, classifiers based on clustering algorithms, decision tree models, and the kernel-function-based non-linear classifier, the support vector machine (SVM).
Linear classifiers based on logistic regression, clustering-based classifiers and decision tree models skew the classification result toward the category with the larger number of samples (that is, the normal-risk category). Yet in practical use it is precisely the abnormal, excessively-high-risk cases that need to be detected, so these classifiers do not fit actual usage.
Moreover, emotion information and content information are high-dimensional data, and linear classifiers based on logistic regression, clustering-based classifiers and decision tree models cannot classify non-linear problems. Their generalization ability after training is therefore poor, and they are easily affected and disturbed by outliers.
The support vector machine can handle non-linear classification problems, but when processing large sample volumes and features with many dimensions it requires very long training time and is very inefficient (a large number of non-support vectors are added to the convex quadratic programming problem).
Furthermore, using a support vector machine requires choosing a specific kernel function. Different kernel functions work differently on different types of sample data, but there is no effective way to choose the kernel function; the choice relies largely on the technician's experience or inspiration. In business risk evaluation, the emotion information and content information are rather miscellaneous, with no very special or prominent data-type characteristics, so a single kernel function can hardly achieve very good accuracy, and the support vector machine cannot achieve a good recognition effect.
Therefore, given the characteristics of risk-identification data (large differences between samples, scattered distribution, frequent missing features, high-dimensional non-linearly separable data), providing a suitable classifier to perform the classification task and identify risk promptly and accurately is a technical problem that urgently needs to be solved.
Summary
Embodiments of this application provide a multi-classifier-based risk identification method and apparatus, a computer device and a storage medium, aiming to solve the problem in the prior art that online matter-handling systems identify user risk during matter handling with a single classifier, resulting in a low recognition rate.
In a first aspect, an embodiment of this application provides a multi-classifier-based risk identification method, comprising:
if a matter-handling instruction sent by a user terminal is detected, acquiring user video data;
acquiring user identity information corresponding to the user video data, to determine whether the user identity information passes user identity verification;
if the user identity information passes user identity verification, acquiring current video data in the current matter-handling process;
extracting emotion information from the current video data, and extracting content information from the current video data;
calling a pre-built combined classifier composed of several non-linear classifiers and a linear classifier;
inputting the emotion information and content information of the current video data into the combined classifier to obtain the corresponding user risk category;
determining whether the user risk category corresponding to the current video data belongs to the high-risk category; and
if the user risk category corresponding to the current video data belongs to the high-risk category, terminating the connection with the user terminal and issuing a prompt that the matter handling is terminated.
In a second aspect, an embodiment of this application provides a multi-classifier-based risk identification apparatus, comprising:
a user video data acquisition unit, configured to acquire user video data if a matter-handling instruction sent by a user terminal is detected;
a user identity verification unit, configured to acquire user identity information corresponding to the user video data, to determine whether the user identity information passes user identity verification;
a current video data acquisition unit, configured to acquire current video data in the current matter-handling process if the user identity information passes user identity verification;
a video information extraction unit, configured to extract emotion information from the current video data and extract content information from the current video data;
a combined classifier calling unit, configured to call a pre-built combined classifier composed of several non-linear classifiers and a linear classifier;
a user classification unit, configured to input the emotion information and content information of the current video data into the combined classifier to obtain the corresponding user risk category;
a category determining unit, configured to determine whether the user risk category corresponding to the current video data belongs to the high-risk category; and
a high-risk category processing unit, configured to terminate the connection with the user terminal and issue a prompt that the matter handling is terminated if the user risk category corresponding to the current video data belongs to the high-risk category.
In a third aspect, an embodiment of this application further provides a computer device, comprising a memory, a processor and a computer program stored in the memory and runnable on the processor, wherein the processor, when executing the computer program, implements the multi-classifier-based risk identification method of the first aspect.
In a fourth aspect, an embodiment of this application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the multi-classifier-based risk identification method of the first aspect.
The embodiments of this application provide a multi-classifier-based risk identification method and apparatus, a computer device and a storage medium, comprising: if a matter-handling instruction sent by a user terminal is detected, acquiring user video data; acquiring user identity information corresponding to the user video data, to determine whether it passes user identity verification; if it passes, acquiring current video data in the current matter-handling process; extracting emotion information and content information from the current video data; calling a pre-built combined classifier composed of several non-linear classifiers and a linear classifier; inputting the emotion information and content information of the current video data into the combined classifier to obtain the corresponding user risk category; determining whether the user risk category corresponding to the current video data belongs to the high-risk category; and if so, terminating the connection with the user terminal and issuing a prompt that the matter handling is terminated. The method realizes user risk category judgment on user video data based on a combined classifier and improves recognition accuracy.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of this application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of this application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a schematic diagram of an application scenario of the multi-classifier-based risk identification method provided by an embodiment of this application;
FIG. 2 is a schematic flowchart of the multi-classifier-based risk identification method provided by an embodiment of this application;
FIG. 3 is a schematic block diagram of the multi-classifier-based risk identification apparatus provided by an embodiment of this application;
FIG. 4 is a schematic block diagram of the computer device provided by an embodiment of this application.
Detailed Description
The technical solutions in the embodiments of this application will be described clearly and completely below with reference to the drawings in the embodiments of this application. Obviously, the described embodiments are some of the embodiments of this application, not all of them. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the scope of protection of this application.
It should be understood that when used in this specification and the appended claims, the terms "comprise" and "include" indicate the presence of the described features, wholes, steps, operations, elements and/or components, but do not exclude the presence or addition of one or more other features, wholes, steps, operations, elements, components and/or collections thereof.
It should also be understood that the terms used in this specification are only for the purpose of describing particular embodiments and are not intended to limit this application. As used in this specification and the appended claims, unless the context clearly indicates otherwise, the singular forms "a", "an" and "the" are intended to include the plural forms.
It should further be understood that the term "and/or" used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
Please refer to FIG. 1 and FIG. 2. FIG. 1 is a schematic diagram of an application scenario of the multi-classifier-based risk identification method provided by an embodiment of this application; FIG. 2 is a schematic flowchart of that method. The multi-classifier-based risk identification method is applied in a server and executed by application software installed in the server.
As shown in FIG. 2, the method includes steps S110 to S180.
S110: if a matter-handling instruction sent by a user terminal is detected, acquire user video data.
In this embodiment, the scenario is a user handling a matter online (such as purchasing a product online). Based on product information browsed on the user terminal (such as a smartphone or tablet) and their own needs, the customer may click to carry out the corresponding purchase operation. After the server receives the matter-handling instruction sent by the user terminal, it can trigger the user terminal's camera and related interactive devices to conduct video interaction between the customer and the server.
S120: acquire user identity information corresponding to the user video data, to determine whether the user identity information passes user identity verification.
In this embodiment, to ensure the user's data security, the customer's true identity must be verified before the matter is handled online. In practice, the verification of user identity information can be completed in many ways, including but not limited to ID card information, face recognition, fingerprint and iris information, and so on.
For example, during identity verification the customer can be prompted to enter ID card information (which can also be obtained directly by photographing the ID card). Then, according to the ID card information, the corresponding biometric verification information (including face, iris and fingerprint) is retrieved and matched to determine whether it is the customer in person.
In some embodiments, voice broadcasts or similar prompts can also guide the customer through a series of actions such as blinking and nodding, to confirm that the customer's identity is genuine and valid.
S130: if the user identity information passes user identity verification, acquire current video data in the current matter-handling process.
In this embodiment, after the server has acquired the user video data and completed identity verification, it acquires the current video data of the user while handling the matter; this current video data is the data basis for analyzing the user's risk level.
While acquiring the current video data, the server can guide the user by broadcasting voice prompts, thereby simulating the operation of a service agent. Of course, to meet the personalized needs of different customers, multiple broadcast options can also be provided, such as the broadcast voice type, broadcast speed, and so on. Users can choose the language or speech rate that suits them.
S140: extract emotion information from the current video data, and extract content information from the current video data.
In this embodiment, "emotion information" refers to the emotional state the customer exhibits during the matter-handling process, which can reflect the customer's state while handling the matter. Specifically, emotion information can be expressed through labels; for example, five different categories of emotion labels such as "happy", "disgust", "suppression", "surprise" and "other" can be set. Through micro-expression recognition, speech emotion recognition, or a combination of the two, the emotion category that the video data belongs to is output and the corresponding emotion label attached.
"Content information" refers to the specific feedback the customer gives in the video. It is determined by the voice prompts broadcast during the business-handling process. For different types of customer feedback, corresponding methods can be used to extract the specific content information from the answer video.
For example, when the voice prompt pushed by the server to the user terminal is a judgment question ("Have you ever handled this service before?"), speech recognition can be used to determine whether the user's answer is affirmative or negative (yes or no). When the broadcast voice prompt is a descriptive question ("Please describe your current income"), semantic analysis technology can convert the customer's answer into text, after which natural language processing obtains the true meaning expressed by the text (salary income xxx, additional rental income xxx).
In an embodiment, extracting the emotion information from the current video data in step S140 includes:
obtaining the emotion information in the current video data through a micro-expression recognition model.
In this embodiment, feature extraction based on optical flow or on the LBP-TOP operator can be used to obtain the image frames containing micro-expressions in the current video data, and the emotion information in the current video data is obtained by combining those frames with the micro-expression recognition model.
The optical flow algorithm estimates the optical flow in the video image sequence under certain constraints to recognize subtle movements of the customer's face, realizing feature extraction for micro-expressions. The LBP-TOP operator (spatio-temporal local texture) is developed from the local binary pattern (LBP operator) and reflects the spatial distribution of pixels across the video image sequence.
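The basic LBP computation that LBP-TOP builds on can be sketched briefly. The following is an illustrative NumPy sketch only (the function name and 3x3 patch layout are assumptions, not the filing's implementation); a real LBP-TOP feature would compute such codes on the XY, XT and YT planes of a video volume and histogram them.

```python
import numpy as np

def lbp_code(patch: np.ndarray) -> int:
    """Basic 8-neighbour LBP code for a 3x3 grayscale patch.

    Each neighbour is compared against the centre pixel; a neighbour
    that is >= the centre contributes one bit to the 8-bit code.
    LBP-TOP extends this idea to the three orthogonal planes
    (XY, XT, YT) of a video volume.
    """
    center = patch[1, 1]
    # neighbours taken clockwise starting from the top-left corner
    neighbours = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                  patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    code = 0
    for bit, value in enumerate(neighbours):
        if value >= center:
            code |= 1 << bit
    return code
```

A patch whose neighbours all equal the centre yields the all-ones code 255; mixed patches yield intermediate codes that, histogrammed over a region, describe its local texture.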
In an embodiment, extracting the content information from the current video data in step S140 includes:
obtaining, through an N-gram model, the text information corresponding to the audio data in the current video data;
obtaining the content information in the video data according to the keywords corresponding to "yes" or "no" in the text information.
In this embodiment, the N-gram model is a pre-trained N-gram model, a commonly used speech recognition model. After the text corresponding to the audio data in the current video data is obtained through the N-gram model, whether it contains keywords corresponding to "yes" or "no" can be determined, yielding the content information in the video data.
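The keyword step described above can be sketched minimally. This is an assumption-laden illustration, not the filing's method: it presumes the speech recognizer has already produced English text, and the keyword lists and function name are invented for the example.

```python
from typing import Optional

def extract_judgment_answer(text: str) -> Optional[str]:
    """Map transcribed answer text to 'yes'/'no' by keyword matching.

    A minimal stand-in for the content-information step: after speech
    recognition produces text, look for affirmative or negative
    keywords to decide the answer to a judgment question.
    """
    affirmative = ("yes", "yeah", "correct", "i have")
    negative = ("never", "i have not", "no,")
    lowered = text.lower()
    # check negatives first so "i have not" is not caught by "i have"
    if any(kw in lowered for kw in negative):
        return "no"
    if any(kw in lowered for kw in affirmative):
        return "yes"
    return None  # answer unclear; may need a follow-up prompt
```

Substring matching like this is crude (real systems would use token-level matching or a classifier), but it shows where the "yes"/"no" content information comes from.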
S150: call a pre-built combined classifier composed of several non-linear classifiers and a linear classifier.
In this embodiment, to improve the accuracy of risk-level identification based on the user's current video data, a pre-built combined classifier composed of several non-linear classifiers and a linear classifier can be called at this point. The specific choice of non-linear classifiers and linear classifier can be determined according to the actual matter-handling process.
In an embodiment, before step S150 the method further includes:
constructing an initial combined classifier composed of several non-linear classifiers and a linear classifier;
optimizing, through a training data set, the weight coefficient of each of the non-linear classifiers and the linear classifier in the initial combined classifier, so as to train the corresponding combined classifier.
In this embodiment, the several non-linear classifiers and the linear classifier are first assigned preset default weight coefficients to form the initial combined classifier. Then, through the training data set, the weight coefficients of each non-linear classifier and the linear classifier in the initial combined classifier are continuously optimized so as to train the corresponding combined classifier. The combined classifier with optimized weight coefficients is better able to classify the user's risk level accurately.
In an embodiment, the initial combined classifier includes five non-linear classifiers based on support vector machines and one linear classifier based on logistic regression; the five SVM-based non-linear classifiers are denoted U1 to U5, and the logistic-regression-based linear classifier is denoted U6. The non-linear classifier U1 is used to recognize the first type of emotion, U2 the second type, U3 the third type, U4 the fourth type, and U5 the fifth type of emotion; the linear classifier U6 is used to recognize judgment questions.
In this embodiment, the combined classifier consists of five SVM-based non-linear classifiers and one logistic-regression-based linear classifier.
The five non-linear classifiers use different kernel functions and are each trained on the training data subset carrying the corresponding emotion label ("happy", "disgust", "suppression", "surprise" or "other"); that is, each non-linear classifier attends to one item of emotion information for risk identification.
The logistic-regression-based linear classifier is trained on the subset of training data whose content information is the answer to a judgment question (that is, a yes/no question or a question with a single option).
In an embodiment, optimizing, through the training data set, the weight coefficient of each non-linear classifier and the linear classifier in the initial combined classifier so as to train the corresponding combined classifier includes:
obtaining the subsets of the training data set whose emotion information corresponds to each type of emotion, denoted the first training data subset Y1, the second training data subset Y2, the third training data subset Y3, the fourth training data subset Y4 and the fifth training data subset Y5;
obtaining the subset of the training data set whose content information corresponds to judgment questions, to obtain the sixth training data subset Y6;
assigning an initialization weight 1/K to each training data item in the training data set, where K is the total number of training data items in the training data set; the training data in the first training data subset Y1 has a first total quantity K1, the second training data subset Y2 a second total quantity K2, the third training data subset Y3 a third total quantity K3, the fourth training data subset Y4 a fourth total quantity K4, the fifth training data subset Y5 a fifth total quantity K5, and the sixth training data subset Y6 a sixth total quantity K6;
calculating the classification losses corresponding to U1 to U6 via the formula (PCTCN2020103795-appb-000001), where Loss_j denotes the classification loss of the j-th classifier in the initial combined classifier, W_i is the weight of the i-th training data item, and I_i is the indicator function for the i-th training data item, taking the value 0 or 1;
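Given the definitions above (weights W_i combined through a 0/1 indicator I_i), the loss can be read as a weighted error sum. A minimal sketch under the assumption that I_i = 1 exactly when sub-classifier j misclassifies sample i (the filing only states that I_i is 0 or 1):

```python
def classification_loss(weights, predictions, labels):
    """Weighted classification loss: the sum of the weights of the
    misclassified training samples, i.e. sum_i W_i * I_i with the
    indicator I_i assumed to be 1 when prediction i differs from
    label i and 0 otherwise.
    """
    return sum(w for w, p, y in zip(weights, predictions, labels) if p != y)
```

With uniform initial weights 1/K, this loss is simply the fraction of samples the sub-classifier gets wrong.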
obtaining the sub-classifier whose classification loss is the minimum among those of U1 to U6 as the optimal sub-classifier, and calling a preset weight coefficient adjustment strategy to obtain the weight coefficient corresponding to the optimal sub-classifier;
adjusting the initialization weight of each training data item in the training data set according to the weight coefficient of the optimal sub-classifier and a pre-stored weight update strategy, to obtain the current weight of each training data item;
calculating, according to the current weight of each training data item, the current classification losses corresponding to U1 to U6 via the formula (PCTCN2020103795-appb-000002), where Loss'_j denotes the current classification loss of the j-th classifier in the initial combined classifier, W'_i is the current weight of the i-th training data item, and I'_i is the current indicator function for the i-th training data item, taking the value 0 or 1;
calling a preset weight coefficient acquisition strategy to obtain the current weight coefficients corresponding to U1 to U6;
forming the combined classifier from U1 to U6 and their corresponding current weight coefficients.
In this embodiment, the non-linear classifiers and the linear classifier are collectively referred to as the sub-classifiers of the combined classifier. Each sub-classifier is assigned a corresponding weight coefficient, and the final output of the combined classifier is the weighted sum of the outputs of all sub-classifiers.
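The weighted-sum output stated above can be sketched directly. Note the decision threshold below is an assumption of this sketch (the filing does not specify how the weighted sum is mapped to a risk category):

```python
def combined_score(sub_outputs, coefficients):
    """Final output of the combined classifier: the weighted sum of
    the sub-classifier outputs, as described in the embodiment.
    """
    return sum(c * o for c, o in zip(coefficients, sub_outputs))

def risk_category(sub_outputs, coefficients, threshold=0.5):
    """Map the weighted sum to a category; the 0.5 threshold is an
    illustrative assumption, not taken from the filing."""
    if combined_score(sub_outputs, coefficients) >= threshold:
        return "high"
    return "normal"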
In an embodiment, the formula corresponding to the weight coefficient adjustment strategy is given by (PCTCN2020103795-appb-000003), where Loss_optimal is the minimum classification loss among the classification losses corresponding to U1 to U6;
the formula corresponding to the weight update strategy is given by (PCTCN2020103795-appb-000004), where Zk is a normalization factor and UpdateW_i is the current weight of the i-th training data item;
the formula corresponding to the weight coefficient acquisition strategy is given by (PCTCN2020103795-appb-000005).
Suppose there is a training data set Y containing multiple different sample data items Xi. Each sample data item has a corresponding emotion label and content information, and its user risk category is known (marked by a user risk category field, e.g. 1 for an excessively high risk category and 0 for a normal one).
According to the emotion labels and content information, the training data corresponding to the five emotion labels is selected from the training data set Y as training data subsets Y1 to Y5 (for example, training data subset Y1 is the set of all sample data whose emotion label is "happy"), together with training data subset Y6 (whose content information is answers to judgment questions).
Sub-classifiers U1 to U6 are trained on training data subsets Y1 to Y6, respectively. U1 to U5 are SVM-based non-linear classifiers using a Gaussian kernel function, and U6 is a logistic-regression-based linear classifier. U1 to U6 are combined to form a combined classifier U.
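The Gaussian (RBF) kernel named for U1 to U5 is the standard k(x, z) = exp(-||x - z||² / (2σ²)); a small sketch of its evaluation (σ is a free hyperparameter whose value here is only an illustrative default, not from the filing):

```python
import math

def gaussian_kernel(x, z, sigma=1.0):
    """Gaussian (RBF) kernel: exp(-||x - z||^2 / (2 * sigma^2)).

    Returns 1.0 for identical inputs and decays toward 0 as the
    squared Euclidean distance between x and z grows.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))
```

An SVM built on this kernel separates data that is not linearly separable in the original feature space, which is why the embodiment uses it for the high-dimensional emotion features.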
Once the training data subsets Y1 to Y6 are known, each training data item in the training data set Y is first assigned the initialization weight 1/K, K being the total number of training data items in the set; that is, every training data item starts with the same weight.
Then the classification losses corresponding to U1 to U6 are calculated, the sub-classifier with the minimum loss is taken as the optimal sub-classifier, and the preset weight coefficient adjustment strategy is called to obtain the weight coefficient of that sub-classifier. The idea behind the weight coefficient is: the smaller the classification loss, the higher the weight coefficient; the larger the classification loss, the smaller the weight coefficient.
After the weight coefficient of the optimal sub-classifier is obtained, the pre-stored weight update strategy is called and the initialization weight of each training data item in the training data set is adjusted accordingly to obtain its current weight. The point of updating the training data weights is that a training sample misclassified by the optimal sub-classifier in the previous round receives more attention (its weight increases), while a sample classified correctly in the previous round receives less (its weight decreases).
Finally, the current weight coefficients of the sub-classifiers are adjusted, and the combined classifier is formed from U1 to U6 and their corresponding current weight coefficients.
Through the above process of adjusting the weight coefficients of the sub-classifiers, more accurate identification of the user risk category can be achieved.
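The exact update formulas appear only as figure images not reproduced in this text, but the stated idea (smaller loss gives a larger coefficient; misclassified samples gain weight, correctly classified ones lose weight, normalized by a factor Zk) matches the classic AdaBoost rule. The sketch below therefore uses the AdaBoost update as an *assumed* concrete form, not as the filing's actual formula:

```python
import math

def adaboost_style_round(weights, predictions, labels):
    """One boosting-style weight-update round in the spirit described
    above, using the classic AdaBoost rule as a stand-in:

      loss  = sum of weights of misclassified samples (must be in (0, 1))
      alpha = 0.5 * ln((1 - loss) / loss)   # coefficient: small loss -> large alpha

    Misclassified samples are scaled by e^alpha, correct ones by
    e^-alpha, then all weights are renormalized to sum to 1.
    """
    loss = sum(w for w, p, y in zip(weights, predictions, labels) if p != y)
    alpha = 0.5 * math.log((1.0 - loss) / loss)
    updated = [w * math.exp(alpha if p != y else -alpha)
               for w, p, y in zip(weights, predictions, labels)]
    z = sum(updated)  # normalization factor, Zk in the text
    return alpha, [w / z for w in updated]
```

On four uniformly weighted samples with one misclassification, the misclassified sample's weight grows from 0.25 to 0.5 while the three correct ones shrink, exactly the "more attention to errors" behaviour the embodiment describes.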
S160: input the emotion information and content information of the current video data into the combined classifier to obtain the corresponding user risk category.
In this embodiment, once the combined classifier and the emotion information and content information corresponding to the current video data have been obtained, the emotion information and content information of the current video data can be input into the combined classifier to obtain the corresponding user risk category.
The combined classifier used in this embodiment integrates non-linear classifiers with different kernel functions and a linear classifier, making full use of the strengths of each sub-classifier (each constructed sub-classifier has its own recognition direction). This improves the performance and recognition accuracy of the classifier and comprehensively reflects the relationship between emotion information, content information and business risk. In addition, the sample data is split into multiple types that are processed separately, which reduces the amount of sample data each support vector machine must process during training and optimization and thus improves the practical efficiency of the support vector machines.
S170: determine whether the user risk category corresponding to the current video data belongs to the high-risk category.
In this embodiment, after the user risk category is obtained, it must be quickly determined whether it belongs to the high-risk category, in order to decide the subsequent matter-handling flow.
S180: if the user risk category corresponding to the current video data belongs to the high-risk category, terminate the connection with the user terminal and issue a prompt that the matter handling is terminated.
In this embodiment, if the user risk category corresponding to the current video data belongs to the high-risk category, the user may be handling the matter under abnormal circumstances (for example, involuntarily, under threat from another person). In that case, to ensure the user's data security, the connection with the user terminal must be terminated and a prompt that the matter handling is terminated issued.
If the user risk category corresponding to the current video data belongs to the normal-risk category, there is no need to terminate the connection; the server simply continues to push process data to the user terminal for interaction according to the matter-handling process.
The method realizes user risk category judgment on user video data based on a combined classifier comprising several non-linear classifiers and a linear classifier, improving recognition accuracy.
An embodiment of this application further provides a multi-classifier-based risk identification apparatus, configured to perform any embodiment of the foregoing multi-classifier-based risk identification method. Specifically, please refer to FIG. 3, a schematic block diagram of the multi-classifier-based risk identification apparatus provided by an embodiment of this application. The multi-classifier-based risk identification apparatus 100 can be configured in a server.
As shown in FIG. 3, the multi-classifier-based risk identification apparatus 100 includes: a user video data acquisition unit 110, a user identity verification unit 120, a current video data acquisition unit 130, a video information extraction unit 140, a combined classifier calling unit 150, a user classification unit 160, a category determining unit 170, and a high-risk category processing unit 180.
The user video data acquisition unit 110 is configured to acquire user video data if a matter-handling instruction sent by a user terminal is detected.
In this embodiment, the scenario is a user handling a matter online (such as purchasing a product online). Based on product information browsed on the user terminal (such as a smartphone or tablet) and their own needs, the customer may click to carry out the corresponding purchase operation. After the server receives the matter-handling instruction sent by the user terminal, it can trigger the user terminal's camera and related interactive devices to conduct video interaction between the customer and the server.
The user identity verification unit 120 is configured to acquire user identity information corresponding to the user video data, to determine whether the user identity information passes user identity verification.
In this embodiment, to ensure the user's data security, the customer's true identity must be verified before the matter is handled online. In practice, the verification can be completed in many ways, including but not limited to ID card information, face recognition, fingerprint and iris information, and so on.
For example, during identity verification the customer can be prompted to enter ID card information (which can also be obtained directly by photographing the ID card). Then, according to the ID card information, the corresponding biometric verification information (including face, iris and fingerprint) is retrieved and matched to determine whether it is the customer in person.
In some embodiments, voice broadcasts or similar prompts can also guide the customer through a series of actions such as blinking and nodding, to confirm that the customer's identity is genuine and valid.
The current video data acquisition unit 130 is configured to acquire current video data in the current matter-handling process if the user identity information passes user identity verification.
In this embodiment, after the server has acquired the user video data and completed identity verification, it acquires the current video data of the user while handling the matter; this current video data is the data basis for analyzing the user's risk level.
While acquiring the current video data, the server can guide the user by broadcasting voice prompts, thereby simulating the operation of a service agent. Of course, to meet the personalized needs of different customers, multiple broadcast options can also be provided, such as the broadcast voice type, broadcast speed, and so on. Users can choose the language or speech rate that suits them.
The video information extraction unit 140 is configured to extract emotion information from the current video data and extract content information from the current video data.
In this embodiment, "emotion information" refers to the emotional state the customer exhibits during the matter-handling process, which can reflect the customer's state while handling the matter. Specifically, emotion information can be expressed through labels; for example, five different categories of emotion labels such as "happy", "disgust", "suppression", "surprise" and "other" can be set. Through micro-expression recognition, speech emotion recognition, or a combination of the two, the emotion category that the video data belongs to is output and the corresponding emotion label attached.
"Content information" refers to the specific feedback the customer gives in the video. It is determined by the voice prompts broadcast during the business-handling process. For different types of customer feedback, corresponding methods can be used to extract the specific content information from the answer video.
For example, when the voice prompt pushed by the server to the user terminal is a judgment question ("Have you ever handled this service before?"), speech recognition can be used to determine whether the user's answer is affirmative or negative (yes or no). When the broadcast voice prompt is a descriptive question ("Please describe your current income"), semantic analysis technology can convert the customer's answer into text, after which natural language processing obtains the true meaning expressed by the text (salary income xxx, additional rental income xxx).
In an embodiment, the video information extraction unit 140 includes:
an emotion information extraction unit, configured to obtain the emotion information in the current video data through a micro-expression recognition model.
In this embodiment, feature extraction based on optical flow or on the LBP-TOP operator can be used to obtain the image frames containing micro-expressions in the current video data, and the emotion information in the current video data is obtained by combining those frames with the micro-expression recognition model.
The optical flow algorithm estimates the optical flow in the video image sequence under certain constraints to recognize subtle movements of the customer's face, realizing feature extraction for micro-expressions. The LBP-TOP operator (spatio-temporal local texture) is developed from the local binary pattern (LBP operator) and reflects the spatial distribution of pixels across the video image sequence.
In an embodiment, the video information extraction unit 140 further includes:
a text information extraction unit, configured to obtain, through an N-gram model, the text information corresponding to the audio data in the current video data;
a content information extraction unit, configured to obtain the content information in the video data according to the keywords corresponding to "yes" or "no" in the text information.
In this embodiment, the N-gram model is a pre-trained N-gram model, a commonly used speech recognition model. After the text corresponding to the audio data in the current video data is obtained through the N-gram model, whether it contains keywords corresponding to "yes" or "no" can be determined, yielding the content information in the video data.
组合分类器调用单元150,用于调用预先构建的由若干个非线性分类器和线性分类器组成的组合分类器。
在本实施例中,为了提高基于用户的当前视频数据来进行风险等级识别的准确率,此时可以调用预先构建的由若干个非线性分类器和线性分类器组成的组合分类器。具体选择使用的非线性分类器和线性分类器可以根据实际的事项流程所确定。
在一实施例中,基于多分类器的风险识别装置100还包括:
初始分类器构建单元,用于构建由若干个非线性分类器和线性分类器组成的初始组合分类器;
分类器优化单元,用于通过训练数据集合,优化每个所述非线性分类器和线性分类器在所述初始组合分类器中的权重系数,以训练得到对应的组合分类器。
在本实施例中,先是根据预先设置的默认权重系数对若干个非线性分类器和线性分类器分别赋值,以组成初始组合分类器。之后,通过训练数据集合,不断优化每个所述非线性分类器和线性分类器在所述初始组合分类器中的权重系数,以训练得到对应的组合分类器。优化权重系数后额组合分类器,更加有利于准确对用户风险等级进行分类。
在一实施例中,所述初始组合分类器包括5个基于支持向量机的非线性分类器和1个基于逻辑回归的线性分类器;其中,5个基于支持向量机的非线性分类器分别记为U1至U5,基于逻辑回归的线性分类器记为U6;U1对应的非线性分类器用于识别第一类型情感,U2对应的非线性分类器用于识别第二类型情感,U3对应的非线性分类器用于识别第三类型情感,U4对应的非线性分类器用于识别第是类型情感,U5对应的非线性分类器用于识别第五类型情感,U6对应的线性分类器用于识别判断性问题。
在本实施例中,该组合分类器由5个基于SVM的非线性分类器以及1个基于逻辑回归的线性分类器组成。
其中,5个非线性分类器使用不同的核函数,分别由具有相应情感标签(“开心”,“厌恶”,“压制”,“惊奇”以及“其他”)的训练数据子集训练获得。亦即,每个非线性分类器关注某一项情感信息来进行风险识别。
1个基于逻辑回归的线性分类器是通过内容信息为判断性问题的回答(亦即“是否”或者包含唯一选项的问题)的训练数据子集而训练获得。
在一实施例中,所述分类器优化单元,包括:
第一子集划分单元,用于获取所述训练数据集合中情感信息对应各类型情感的子集,分别记为第一训练数据子集Y1、第二训练数据子集Y2、第三训练数据子集Y3、第四训练数据子集Y4、第五训练数据子集Y5;
第二子集划分单元,用于获取所述训练数据集合中内容信息对应判断性问题的子集,以得到第六训练数据子集Y6;
权重初始化赋值单元，用于对训练数据集合中的每个训练数据赋予初始化权重1/K；其中，K为训练数据集合中训练数据的总数量，第一训练数据子集Y1中训练数据对应第一总数量K_1、第二训练数据子集Y2中训练数据对应第二总数量K_2、第三训练数据子集Y3中训练数据对应第三总数量K_3、第四训练数据子集Y4中训练数据对应第四总数量K_4、第五训练数据子集Y5中训练数据对应第五总数量K_5、第六训练数据子集Y6中训练数据对应第六总数量K_6；
初始分类损失获取单元，用于通过
$Loss_j=\sum_{i=1}^{K_j}W_iI_i$
计算获取U1至U6分别对应的分类损失；其中，Loss_j表示初始组合分类器中第j个分类器对应的分类损失，W_i是第i个训练数据的权重，I_i是第i个训练数据对应的指示函数且取值为0或1；
最优子分类器获取单元,用于获取U1至U6分别对应的分类损失中的最小值对应的子分类器以作为最优子分类器,调用预设的权重系数调整策略获取所述最优子分类器对应的权重系数;
当前权重获取单元,用于根据所述最优子分类器对应的权重系数及调用预先存储的权重更新策略,对应调整所述训练数据集合中每个训练数据的初始化权重,以得到每个训练数据对应的当前权重;
当前分类损失获取单元，用于根据每个训练数据对应的当前权重，及
$Loss'_j=\sum_{i=1}^{K_j}W'_iI'_i$
计算获取U1至U6分别对应的当前分类损失；其中，Loss'_j表示初始组合分类器中第j个分类器对应的当前分类损失，W'_i是第i个训练数据的当前权重，I'_i为第i个训练数据对应的当前指示函数且取值为0或1；
当前权重系数获取单元,用于调用预设的权重系数获取策略获取U1至U6分别对应的当前权重系数;
分类器组合单元,用于由U1至U6及U1至U6分别对应的当前权重系数,组成组合分类器。
在本实施例中,所述非线性分类器和线性分类器统称为组合分类器中的子分类器。每个子分类器都赋予对应的权重系数。组合分类器最终的输出为所有子分类器输出的加权求和值。
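“组合分类器最终的输出为所有子分类器输出的加权求和值”这一思路，可用如下最小示意表达（假设性示例：子分类器以任意可调用对象代替，输入特征的字段名、权重系数与判决阈值均为示例设定）：

```python
def combined_predict(sub_classifiers, alphas, x, threshold=0.0):
    """组合分类器输出：各子分类器输出(取±1)按权重系数加权求和，
    求和结果超过阈值判为高风险(1)，否则判为正常(0)。"""
    score = sum(a * clf(x) for clf, a in zip(sub_classifiers, alphas))
    return 1 if score > threshold else 0

# 三个示例子分类器：各自输出 +1(倾向高风险) 或 -1(倾向正常)
subs = [lambda x: 1 if x["fear"] > 0.5 else -1,
        lambda x: 1 if x["answer"] == "否" else -1,
        lambda x: -1]
alphas = [0.8, 0.5, 0.2]

print(combined_predict(subs, alphas, {"fear": 0.9, "answer": "否"}))  # 1
```

该结构与后文各权重系数的训练过程相衔接：训练阶段得到的各子分类器权重系数即此处的alphas。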
在一实施例中，所述权重系数调整策略对应的公式为：
$\alpha=\frac{1}{2}\ln\frac{1-Loss_{最优子分类器}}{Loss_{最优子分类器}}$
其中Loss_{最优子分类器}为U1至U6分别对应的分类损失中的最小值；
所述权重更新策略对应的公式为：
$UpdateW_i=\frac{W_i}{Z_k}e^{\alpha(2I_i-1)}$
其中Z_k为归一化因子，UpdateW_i为第i个训练数据的当前权重；即第i个训练数据被分类错误（I_i=1）时权重乘以e^α增大，被分类正确（I_i=0）时权重乘以e^{-α}减小；
所述权重系数获取策略对应的公式为：
$\alpha_j=\frac{1}{2}\ln\frac{1-Loss'_j}{Loss'_j}$
假设存在一个训练数据集合Y。该训练数据集合Y中包含有多个不同的样本数据Xi。每个样本数据具有对应的情感标签以及内容信息,并且已知用户风险类别(通过用户风险类别字段来标记,如用户风险类别过高的标记为1,用户风险类别正常的标记为0)。
根据情感标签以及内容信息，选取训练数据集合Y中分别与5种情感标签对应的训练数据作为训练数据子集Y1至Y5（例如训练数据子集Y1是情感标签为“开心”的所有样本数据组成的集合），以及内容信息为判断性问题的回答的训练数据作为训练数据子集Y6。
分别通过训练数据子集Y1至Y6,训练获得子分类器U1至U6。U1至U5为基于SVM的非线性分类器,使用的核函数种类为高斯核函数,U6为基于逻辑回归的线性分类器。U1至U6组合形成一个组合分类器U。
在已知了训练数据子集Y1至Y6后,先对训练数据集合Y中的每个训练数据赋予初始化权重1/K,K为训练数据集合中训练数据的总数量。即令每个训练数据的初始权重都相同。
然后计算出U1至U6分别对应的分类损失，以其中分类损失最小的子分类器作为最优子分类器，并调用预设的权重系数调整策略获取该最优子分类器对应的权重系数。权重系数的计算思路是：分类损失越小，权重系数越高；分类损失越大，权重系数越小。
之后，在获取了所述最优子分类器对应的权重系数后，调用预先存储的权重更新策略，对应调整所述训练数据集合中每个训练数据的初始化权重，以得到每个训练数据对应的当前权重。更新训练数据权重的目的是：训练样本数据在上一次被最优子分类器分类错误时，可以得到更多的重视（即权重变大）；而在上一次被最优子分类器分类正确时，则减少其重视程度（即权重变小）。
最后,再对各子分类器的当前权重系数进行调整,即可由U1至U6及U1至U6分别对应的当前权重系数,组成组合分类器。
通过上述调整各子分类器的权重系数的过程,能实现对用户风险类别更精准的识别。
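上述“初始化权重→计算分类损失→选出最优子分类器并求其权重系数→按分类对错指数更新样本权重并归一化”的迭代过程，与经典AdaBoost的训练思路一致。下面给出一个假设性的最小示意（子分类器以预先给定的预测结果代替，公式按α=½ln((1−Loss)/Loss)与指数更新取值，非本申请实施方式的限定）：

```python
import math

def train_combined(preds, labels, rounds):
    """preds: 每个子分类器对全部训练数据的预测(取值±1)的列表；
    labels: 真实标签(±1)。返回每轮选出的(子分类器下标, 权重系数α)。"""
    K = len(labels)
    w = [1.0 / K] * K  # 对每个训练数据赋予初始化权重1/K
    result = []
    for _ in range(rounds):
        # 各子分类器的加权分类损失 Loss_j = Σ w_i * I_i（I_i=1表示分错）
        losses = [sum(wi for wi, p, y in zip(w, pj, labels) if p != y)
                  for pj in preds]
        j = min(range(len(preds)), key=lambda k: losses[k])  # 最优子分类器
        loss = max(losses[j], 1e-10)  # 避免损失为0时取对数发散
        alpha = 0.5 * math.log((1 - loss) / loss)  # 权重系数调整策略
        # 分错的样本权重乘e^α增大、分对的乘e^{-α}减小，再用Z_k归一化
        w = [wi * math.exp(alpha if p != y else -alpha)
             for wi, p, y in zip(w, preds[j], labels)]
        z = sum(w)
        w = [wi / z for wi in w]
        result.append((j, alpha))
    return result

preds = [[1, 1, -1, -1], [1, -1, -1, -1]]  # 两个示例子分类器的预测
labels = [1, 1, -1, -1]
print(train_combined(preds, labels, 1))
```

示例中第一个子分类器全部分对，分类损失最小，故第一轮被选为最优子分类器并获得较大的权重系数。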
用户分类单元160,用于将所述当前视频数据的情感信息以及内容信息输入组合分类器,得到对应的用户风险类别。
在本实施例中,当获取了组合分类器和当前视频数据对应的情感信息以及内容信息后,即可将所述当前视频数据的情感信息以及内容信息输入组合分类器,得到对应的用户风险类别。
本实施例中采用的组合分类器,整合了不同核函数的非线性分类器和线性分类器,充分利用各个子分类器的优势(构造的每个子分类器都有自己较为擅长的识别方向),可以提升分类器的性能和识别准确率,能够全面的反映情感信息和内容信息与业务风险之间的关联。另外,将样本数据切分为多种分别进行处理,减少每个支持向量机训练优化时所需要处理的样本数据量,可以提升支持向量机的实际应用效率。
类别判断单元170,用于判断所述当前视频数据对应的用户风险类别是否属于高风险类别。
在本实施例中,获取了用户风险类别后,需要快速的判断该用户风险类别是否属于高风险类别,从而确定后续的事项办理流程。
高风险类别处理单元180,用于若所述当前视频数据对应的用户风险类别属于高风险类别,终止与用户端的连接并进行终止事项办理的提示。
在本实施例中,若所述当前视频数据对应的用户风险类别属于高风险类别,表示用户可能在非正常情况(例如在非自愿受他人威胁的情况下)办理事项,此时为了确保用户数据安全,需终止与用户端的连接并进行终止事项办理的提示。
若所述当前视频数据对应的用户风险类别属于正常风险类别,无需终止与用户端的连接,服务器继续根据事项办理流程对应向用户端推送流程数据进行交互即可。
该装置实现了基于包括若干个非线性分类器和线性分类器的组合分类器对用户视频数据进行用户风险类别判断，提升了识别准确率。
上述基于多分类器的风险识别装置可以实现为计算机程序的形式,该计算机程序可以在如图4所示的计算机设备上运行。
请参阅图4,图4是本申请实施例提供的计算机设备的示意性框图。该计算机设备500是服务器,服务器可以是独立的服务器,也可以是多个服务器组成的服务器集群。
参阅图4,该计算机设备500包括通过***总线501连接的处理器502、存储器和网络接口505,其中,存储器可以包括非易失性存储介质503和内存储器504。
该非易失性存储介质503可存储操作***5031和计算机程序5032。该计算机程序5032被执行时,可使得处理器502执行基于多分类器的风险识别方法。
该处理器502用于提供计算和控制能力,支撑整个计算机设备500的运行。
该内存储器504为非易失性存储介质503中的计算机程序5032的运行提供环境,该计算机程序5032被处理器502执行时,可使得处理器502执行基于多分类器的风险识别方法。
该网络接口505用于进行网络通信,如提供数据信息的传输等。本领域技术人员可以理解,图4中示出的结构,仅仅是与本申请方案相关的部分结构的框图,并不构成对本申请方案所应用于其上的计算机设备500的限定,具体的计算机设备500可以包括比图中所示更多或更少的部件,或者组合某些部件,或者具有不同的部件布置。
其中,所述处理器502用于运行存储在存储器中的计算机程序5032,以实现本申请实施例公开的基于多分类器的风险识别方法。
本领域技术人员可以理解，图4中示出的计算机设备的实施例并不构成对计算机设备具体构成的限定，在其他实施例中，计算机设备可以包括比图示更多或更少的部件，或者组合某些部件，或者不同的部件布置。例如，在一些实施例中，计算机设备可以仅包括存储器及处理器，在这样的实施例中，存储器及处理器的结构及功能与图4所示实施例一致，在此不再赘述。
应当理解,在本申请实施例中,处理器502可以是中央处理单元(Central Processing Unit,CPU),该处理器502还可以是其他通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现成可编程门阵列(Field-Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。其中,通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。
在本申请的另一实施例中提供计算机可读存储介质。该计算机可读存储介质可以为非易失性的计算机可读存储介质,也可以是易失性的计算机可读存储介质。该计算机可读存储介质存储有计算机程序,其中计算机程序被处理器执行时实现本申请实施例公开的基于多分类器的风险识别方法。
所属领域的技术人员可以清楚地了解到,为了描述的方便和简洁,上述描述的设备、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、计算机软件或者二者的结合来实现,为了清楚地说明硬件和软件的可互换性,在上述说明中已经按照功能一般性地描述了各示例的组成及步骤。这些功能究竟以硬件还是软件方式来执行取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
在本申请所提供的几个实施例中,应该理解到,所揭露的设备、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为逻辑功能划分,实际实现时可以有另外的划分方式,也可以将具有相同功能的单元集合成一个单元,例如多个单元或组件可以结合或者可以集成到另一个***,或一些特征可以忽略,或不执行。另外,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口、装置或单元的间接耦合或通信连接,也可以是电的,机械的或其它的形式连接。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本申请实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以是两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分,或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(ROM,Read-Only Memory)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到各种等效的修改或替换,这些修改或替换都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以权利要求的保护范围为准。

Claims (20)

  1. 一种基于多分类器的风险识别方法,其中,包括:
    若检测到用户端发送的事项办理指令,获取用户视频数据;
    获取所述用户视频数据对应的用户身份信息,以判断所述用户身份信息是否通过用户身份核验;
    若所述用户身份信息通过用户身份核验,获取当前办理事项流程中的当前视频数据;
    提取所述当前视频数据中的情感信息,并提取所述当前视频数据中的内容信息;
    调用预先构建的由若干个非线性分类器和线性分类器组成的组合分类器;
    将所述当前视频数据的情感信息以及内容信息输入组合分类器,得到对应的用户风险类别;
    判断所述当前视频数据对应的用户风险类别是否属于高风险类别;以及
    若所述当前视频数据对应的用户风险类别属于高风险类别,终止与用户端的连接并进行终止事项办理的提示。
  2. 根据权利要求1所述的基于多分类器的风险识别方法,其中,所述提取所述当前视频数据中的情感信息,包括:
    通过微表情识别模型获取所述当前视频数据中的情感信息;
    所述提取所述当前视频数据中的内容信息,包括:
    通过N-gram模型获取所述当前视频数据中音频数据对应的文本信息;
    根据所述文本信息中是或否对应的关键词获取所述视频数据中的内容信息。
  3. 根据权利要求1所述的基于多分类器的风险识别方法,其中,还包括:
    构建由若干个非线性分类器和线性分类器组成的初始组合分类器;
    通过训练数据集合,优化每个所述非线性分类器和线性分类器在所述初始组合分类器中的权重系数,以训练得到对应的组合分类器。
  4. 根据权利要求3所述的基于多分类器的风险识别方法，其中，所述初始组合分类器包括5个基于支持向量机的非线性分类器和1个基于逻辑回归的线性分类器；其中，5个基于支持向量机的非线性分类器分别记为U1至U5，基于逻辑回归的线性分类器记为U6；U1对应的非线性分类器用于识别第一类型情感，U2对应的非线性分类器用于识别第二类型情感，U3对应的非线性分类器用于识别第三类型情感，U4对应的非线性分类器用于识别第四类型情感，U5对应的非线性分类器用于识别第五类型情感，U6对应的线性分类器用于识别判断性问题。
  5. 根据权利要求4所述的基于多分类器的风险识别方法,其中,所述通过训练数据集合,优化每个所述非线性分类器和线性分类器在所述初始组合分类器中的权重系数,以训练得到对应的组合分类器,包括:
    获取所述训练数据集合中情感信息对应各类型情感的子集,分别记为第一训练数据子集Y1、第二训练数据子集Y2、第三训练数据子集Y3、第四训练数据子集Y4、第五训练数据子集Y5;
    获取所述训练数据集合中内容信息对应判断性问题的子集,以得到第六训练数据子集Y6;
    对训练数据集合中的每个训练数据赋予初始化权重1/K；其中，K为训练数据集合中训练数据的总数量，第一训练数据子集Y1中训练数据对应第一总数量K_1、第二训练数据子集Y2中训练数据对应第二总数量K_2、第三训练数据子集Y3中训练数据对应第三总数量K_3、第四训练数据子集Y4中训练数据对应第四总数量K_4、第五训练数据子集Y5中训练数据对应第五总数量K_5、第六训练数据子集Y6中训练数据对应第六总数量K_6；
    通过
    $Loss_j=\sum_{i=1}^{K_j}W_iI_i$
    计算获取U1至U6分别对应的分类损失；其中，Loss_j表示初始组合分类器中第j个分类器对应的分类损失，W_i是第i个训练数据的权重，I_i是第i个训练数据对应的指示函数且取值为0或1；
    获取U1至U6分别对应的分类损失中的最小值对应的子分类器以作为最优子分类器,调用预设的权重系数调整策略获取所述最优子分类器对应的权重系数;
    根据所述最优子分类器对应的权重系数及调用预先存储的权重更新策略,对应调整所述训练数据集合中每个训练数据的初始化权重,以得到每个训练数据对应的当前权重;
    根据每个训练数据对应的当前权重，及
    $Loss'_j=\sum_{i=1}^{K_j}W'_iI'_i$
    计算获取U1至U6分别对应的当前分类损失；其中，Loss'_j表示初始组合分类器中第j个分类器对应的当前分类损失，W'_i是第i个训练数据的当前权重，I'_i为第i个训练数据对应的当前指示函数且取值为0或1；
    调用预设的权重系数获取策略获取U1至U6分别对应的当前权重系数;
    由U1至U6及U1至U6分别对应的当前权重系数,组成组合分类器。
  6. 根据权利要求5所述的基于多分类器的风险识别方法，其中，所述权重系数调整策略对应的公式为：
    $\alpha=\frac{1}{2}\ln\frac{1-Loss_{最优子分类器}}{Loss_{最优子分类器}}$
    其中Loss_{最优子分类器}为U1至U6分别对应的分类损失中的最小值；
    所述权重更新策略对应的公式为：
    $UpdateW_i=\frac{W_i}{Z_k}e^{\alpha(2I_i-1)}$
    其中Z_k为归一化因子，UpdateW_i为第i个训练数据的当前权重；
    所述权重系数获取策略对应的公式为：
    $\alpha_j=\frac{1}{2}\ln\frac{1-Loss'_j}{Loss'_j}$
  7. 根据权利要求1所述的基于多分类器的风险识别方法,其中,所述判断所述当前视频数据对应的用户风险类别是否属于高风险类别之后,还包括:
    若所述当前视频数据对应的用户风险类别属于正常风险类别,根据事项办理流程对应向用户端推送流程数据进行交互。
  8. 根据权利要求2所述的基于多分类器的风险识别方法,其中,所述通过微表情识别模型获取所述当前视频数据中的情感信息,包括:
    使用基于光流的特征提取或者基于LBP-TOP算子的特征提取来获取所述当前视频数据中的包括微表情的图像帧,并结合微表情的图像帧及微表情识别模型获取所述当前视频数据中的情感信息。
  9. 一种基于多分类器的风险识别装置,其中,包括:
    用户视频数据获取单元,用于若检测到用户端发送的事项办理指令,获取用户视频数据;
    用户身份核验单元,用于获取所述用户视频数据对应的用户身份信息,以判断所述用户身份信息是否通过用户身份核验;
    当前视频数据获取单元,用于若所述用户身份信息通过用户身份核验,获取当前办理事项流程中的当前视频数据;
    视频信息提取单元,用于提取所述当前视频数据中的情感信息,并提取所述当前视频数据中的内容信息;
    组合分类器调用单元,用于调用预先构建的由若干个非线性分类器和线性分类器组成的组合分类器;
    用户分类单元,用于将所述当前视频数据的情感信息以及内容信息输入组合分类器,得到对应的用户风险类别;
    类别判断单元,用于判断所述当前视频数据对应的用户风险类别是否属于高风险类别;以及
    高风险类别处理单元,用于若所述当前视频数据对应的用户风险类别属于高风险类别,终止与用户端的连接并进行终止事项办理的提示。
  10. 一种计算机设备,包括存储器、处理器及存储在所述存储器上并可在所述处理器上运行的计算机程序,其中,所述处理器执行所述计算机程序时实现以下步骤:
    若检测到用户端发送的事项办理指令,获取用户视频数据;
    获取所述用户视频数据对应的用户身份信息，以判断所述用户身份信息是否通过用户身份核验；
    若所述用户身份信息通过用户身份核验,获取当前办理事项流程中的当前视频数据;
    提取所述当前视频数据中的情感信息,并提取所述当前视频数据中的内容信息;
    调用预先构建的由若干个非线性分类器和线性分类器组成的组合分类器;
    将所述当前视频数据的情感信息以及内容信息输入组合分类器,得到对应的用户风险类别;
    判断所述当前视频数据对应的用户风险类别是否属于高风险类别;以及
    若所述当前视频数据对应的用户风险类别属于高风险类别,终止与用户端的连接并进行终止事项办理的提示。
  11. 根据权利要求10所述的计算机设备,其中,所述提取所述当前视频数据中的情感信息,包括:
    通过微表情识别模型获取所述当前视频数据中的情感信息;
    所述提取所述当前视频数据中的内容信息,包括:
    通过N-gram模型获取所述当前视频数据中音频数据对应的文本信息;
    根据所述文本信息中是或否对应的关键词获取所述视频数据中的内容信息。
  12. 根据权利要求10所述的计算机设备,其中,还包括:
    构建由若干个非线性分类器和线性分类器组成的初始组合分类器;
    通过训练数据集合,优化每个所述非线性分类器和线性分类器在所述初始组合分类器中的权重系数,以训练得到对应的组合分类器。
  13. 根据权利要求12所述的计算机设备，其中，所述初始组合分类器包括5个基于支持向量机的非线性分类器和1个基于逻辑回归的线性分类器；其中，5个基于支持向量机的非线性分类器分别记为U1至U5，基于逻辑回归的线性分类器记为U6；U1对应的非线性分类器用于识别第一类型情感，U2对应的非线性分类器用于识别第二类型情感，U3对应的非线性分类器用于识别第三类型情感，U4对应的非线性分类器用于识别第四类型情感，U5对应的非线性分类器用于识别第五类型情感，U6对应的线性分类器用于识别判断性问题。
  14. 根据权利要求13所述的计算机设备,其中,所述通过训练数据集合,优化每个所述非线性分类器和线性分类器在所述初始组合分类器中的权重系数,以训练得到对应的组合分类器,包括:
    获取所述训练数据集合中情感信息对应各类型情感的子集,分别记为第一训练数据子集Y1、第二训练数据子集Y2、第三训练数据子集Y3、第四训练数据子集Y4、第五训练数据子集Y5;
    获取所述训练数据集合中内容信息对应判断性问题的子集,以得到第六训练数据子集Y6;
    对训练数据集合中的每个训练数据赋予初始化权重1/K；其中，K为训练数据集合中训练数据的总数量，第一训练数据子集Y1中训练数据对应第一总数量K_1、第二训练数据子集Y2中训练数据对应第二总数量K_2、第三训练数据子集Y3中训练数据对应第三总数量K_3、第四训练数据子集Y4中训练数据对应第四总数量K_4、第五训练数据子集Y5中训练数据对应第五总数量K_5、第六训练数据子集Y6中训练数据对应第六总数量K_6；
    通过
    $Loss_j=\sum_{i=1}^{K_j}W_iI_i$
    计算获取U1至U6分别对应的分类损失；其中，Loss_j表示初始组合分类器中第j个分类器对应的分类损失，W_i是第i个训练数据的权重，I_i是第i个训练数据对应的指示函数且取值为0或1；
    获取U1至U6分别对应的分类损失中的最小值对应的子分类器以作为最优子分类器,调用预设的权重系数调整策略获取所述最优子分类器对应的权重系数;
    根据所述最优子分类器对应的权重系数及调用预先存储的权重更新策略,对应调整所述训练数据集合中每个训练数据的初始化权重,以得到每个训练数据对应的当前权重;
    根据每个训练数据对应的当前权重，及
    $Loss'_j=\sum_{i=1}^{K_j}W'_iI'_i$
    计算获取U1至U6分别对应的当前分类损失；其中，Loss'_j表示初始组合分类器中第j个分类器对应的当前分类损失，W'_i是第i个训练数据的当前权重，I'_i为第i个训练数据对应的当前指示函数且取值为0或1；
    调用预设的权重系数获取策略获取U1至U6分别对应的当前权重系数;
    由U1至U6及U1至U6分别对应的当前权重系数,组成组合分类器。
  15. 根据权利要求14所述的计算机设备，其中，所述权重系数调整策略对应的公式为：
    $\alpha=\frac{1}{2}\ln\frac{1-Loss_{最优子分类器}}{Loss_{最优子分类器}}$
    其中Loss_{最优子分类器}为U1至U6分别对应的分类损失中的最小值；
    所述权重更新策略对应的公式为：
    $UpdateW_i=\frac{W_i}{Z_k}e^{\alpha(2I_i-1)}$
    其中Z_k为归一化因子，UpdateW_i为第i个训练数据的当前权重；
    所述权重系数获取策略对应的公式为：
    $\alpha_j=\frac{1}{2}\ln\frac{1-Loss'_j}{Loss'_j}$
  16. 根据权利要求10所述的计算机设备,其中,所述判断所述当前视频数据对应的用户风险类别是否属于高风险类别之后,还包括:
    若所述当前视频数据对应的用户风险类别属于正常风险类别,根据事项办理流程对应向用户端推送流程数据进行交互。
  17. 根据权利要求11所述的计算机设备,其中,所述通过微表情识别模型获取所述当前视频数据中的情感信息,包括:
    使用基于光流的特征提取或者基于LBP-TOP算子的特征提取来获取所述当前视频数据中的包括微表情的图像帧,并结合微表情的图像帧及微表情识别模型获取所述当前视频数据中的情感信息。
  18. 一种计算机可读存储介质,其中,所述计算机可读存储介质存储有计算机程序,所述计算机程序当被处理器执行时使所述处理器执行以下操作:
    若检测到用户端发送的事项办理指令,获取用户视频数据;
    获取所述用户视频数据对应的用户身份信息,以判断所述用户身份信息是否通过用户身份核验;
    若所述用户身份信息通过用户身份核验,获取当前办理事项流程中的当前视频数据;
    提取所述当前视频数据中的情感信息,并提取所述当前视频数据中的内容信息;
    调用预先构建的由若干个非线性分类器和线性分类器组成的组合分类器;
    将所述当前视频数据的情感信息以及内容信息输入组合分类器,得到对应的用户风险类别;
    判断所述当前视频数据对应的用户风险类别是否属于高风险类别;以及
    若所述当前视频数据对应的用户风险类别属于高风险类别,终止与用户端的连接并进行终止事项办理的提示。
  19. 根据权利要求18所述的计算机可读存储介质,其中,所述提取所述当前视频数据中的情感信息,包括:
    通过微表情识别模型获取所述当前视频数据中的情感信息;
    所述提取所述当前视频数据中的内容信息,包括:
    通过N-gram模型获取所述当前视频数据中音频数据对应的文本信息;
    根据所述文本信息中是或否对应的关键词获取所述视频数据中的内容信息。
  20. 根据权利要求18所述的计算机可读存储介质,其中,还包括:
    构建由若干个非线性分类器和线性分类器组成的初始组合分类器;
    通过训练数据集合,优化每个所述非线性分类器和线性分类器在所述初始组合分类器中的权重系数,以训练得到对应的组合分类器。
PCT/CN2020/103795 2020-05-26 2020-07-23 基于多分类器的风险识别方法、装置、计算机设备及存储介质 WO2021237907A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010457551.6A CN111639584A (zh) 2020-05-26 2020-05-26 基于多分类器的风险识别方法、装置及计算机设备
CN202010457551.6 2020-05-26

Publications (1)

Publication Number Publication Date
WO2021237907A1 true WO2021237907A1 (zh) 2021-12-02

Family

ID=72331057

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/103795 WO2021237907A1 (zh) 2020-05-26 2020-07-23 基于多分类器的风险识别方法、装置、计算机设备及存储介质

Country Status (2)

Country Link
CN (1) CN111639584A (zh)
WO (1) WO2021237907A1 (zh)


Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112465626B (zh) * 2020-11-24 2023-08-29 平安科技(深圳)有限公司 基于客户端分类聚合的联合风险评估方法及相关设备
CN112487295A (zh) * 2020-12-04 2021-03-12 ***通信集团江苏有限公司 5g套餐推送方法、装置、电子设备及计算机存储介质
CN112767967A (zh) * 2020-12-30 2021-05-07 深延科技(北京)有限公司 语音分类方法、装置及自动语音分类方法
CN113344581A (zh) * 2021-05-31 2021-09-03 中国工商银行股份有限公司 业务数据处理方法及装置
CN113313575B (zh) * 2021-06-08 2022-06-03 支付宝(杭州)信息技术有限公司 一种风险识别模型的确定方法及装置

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090287512A1 (en) * 2003-04-30 2009-11-19 Genworth Financial,Inc System And Process For Dominance Classification For Insurance Underwriting Suitable For Use By An Automated System
CN104200804A (zh) * 2014-09-19 2014-12-10 合肥工业大学 一种面向人机交互的多类信息耦合的情感识别方法
CN107766792A (zh) * 2017-06-23 2018-03-06 北京理工大学 一种遥感图像舰船目标识别方法
CN109165685A (zh) * 2018-08-21 2019-01-08 南京邮电大学 基于表情和动作的监狱服刑人员潜在性风险监测方法和***
CN110097020A (zh) * 2019-05-10 2019-08-06 山东大学 一种基于联合稀疏字典学习的微表情识别方法
CN110147321A (zh) * 2019-04-19 2019-08-20 北京航空航天大学 一种基于软件网络的缺陷高风险模块的识别方法
CN110309840A (zh) * 2018-03-27 2019-10-08 阿里巴巴集团控股有限公司 风险交易识别方法、装置、服务器及存储介质

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10146769B2 (en) * 2017-04-03 2018-12-04 Uber Technologies, Inc. Determining safety risk using natural language processing
CN109711297A (zh) * 2018-12-14 2019-05-03 深圳壹账通智能科技有限公司 基于面部图片的风险识别方法、装置、计算机设备及存储介质


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114723269A (zh) * 2022-03-31 2022-07-08 支付宝(杭州)信息技术有限公司 一种事件的风险防控方法、装置及设备
CN115730233A (zh) * 2022-10-28 2023-03-03 支付宝(杭州)信息技术有限公司 一种数据处理方法、装置、可读存储介质以及电子设备
CN117094184A (zh) * 2023-10-19 2023-11-21 上海数字治理研究院有限公司 基于内网平台的风险预测模型的建模方法、***及介质
CN117094184B (zh) * 2023-10-19 2024-01-26 上海数字治理研究院有限公司 基于内网平台的风险预测模型的建模方法、***及介质

Also Published As

Publication number Publication date
CN111639584A (zh) 2020-09-08


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20937799

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 13/03/2023)

122 Ep: pct application non-entry in european phase

Ref document number: 20937799

Country of ref document: EP

Kind code of ref document: A1