CN117437936A - Compliance recognition method and recognition device - Google Patents


Info

Publication number
CN117437936A
CN117437936A CN202311518310.8A CN202311518310A CN117437936A CN 117437936 A CN117437936 A CN 117437936A CN 202311518310 A CN202311518310 A CN 202311518310A CN 117437936 A CN117437936 A CN 117437936A
Authority
CN
China
Prior art keywords
voice
bank
marketing
semantics
compliance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311518310.8A
Other languages
Chinese (zh)
Inventor
王利华
李叶东
彭鹏
曾凡茂
谈日生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdian Yuntong Group Co ltd
Original Assignee
Guangdian Yuntong Group Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdian Yuntong Group Co ltd
Priority to CN202311518310.8A
Publication of CN117437936A
Legal status: Pending

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L15/18Speech classification or search using natural language modelling
    • G10L15/1815Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/63Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Business, Economics & Management (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Strategic Management (AREA)
  • Signal Processing (AREA)
  • Development Economics (AREA)
  • Artificial Intelligence (AREA)
  • Child & Adolescent Psychology (AREA)
  • Psychiatry (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Hospice & Palliative Care (AREA)
  • Game Theory and Decision Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Machine Translation (AREA)

Abstract

The application discloses a compliance recognition method and a compliance recognition device, belonging to the technical field of artificial intelligence. The compliance recognition method comprises: acquiring a first voice in a bank marketing process; in the case that the speaker corresponding to the first voice is a bank employee, determining a target stage to which the first voice belongs based on a second voice in the bank marketing process and a customer portrait corresponding to the bank marketing process; and comparing the semantics of the first voice with the semantics of the standard script of the target stage, and outputting at least one of alarm information and prompt information in the case that the two do not match. By drawing on the standard operating procedure (SOP) of the marketing scene, the method and the device detect whether the marketing behavior of bank employees conforms to the SOP, so that non-compliant behavior can be found and corrected through the output information and the marketing behavior of the bank can be managed more effectively and efficiently.

Description

Compliance recognition method and recognition device
Technical Field
The application belongs to the technical field of artificial intelligence, and particularly relates to a compliance recognition method and a recognition device.
Background
The service quality of staff at bank outlets (e.g., lobby managers, tellers or financial managers) is very important to both banks and their outlets. However, the marketing behavior of staff at bank outlets (hereinafter referred to as "bank employees") is difficult to monitor and manage while it is being carried out, and it is difficult to identify promptly and accurately whether that behavior is compliant. How to effectively recognize the compliance of the marketing behavior of bank employees has therefore become an urgent technical problem for the banking industry.
Disclosure of Invention
The present application aims to solve at least one of the technical problems existing in the prior art. To this end, the application provides a compliance recognition method and a recognition device that can recognize whether marketing behavior is compliant and manage the marketing behavior of the bank effectively and efficiently.
In a first aspect, the present application provides a compliance identification method, the method comprising:
acquiring a first voice in a bank marketing process;
in the case that the speaker corresponding to the first voice is a bank employee, determining a target stage to which the first voice belongs based on a second voice in the bank marketing process and a customer portrait corresponding to the bank marketing process; the target stage is a stage in the standard operating procedure of a target marketing scene; the target marketing scene is the marketing scene corresponding to the first voice; the second voice is the speech that precedes the first voice in the bank marketing process;
comparing the semantics of the first voice with the semantics of the standard script of the target stage, and outputting at least one of alarm information and prompt information in the case that the two do not match.
According to the compliance recognition method of the present application, the standard operating procedure of the marketing scene is used to judge whether the semantics of a voice segment of a bank employee match the semantics of the standard operating procedure of the marketing scene to which the segment belongs. This detects whether the marketing behavior of the bank employee conforms to the standard operating procedure, that is, whether the marketing behavior is compliant. Non-compliant behavior is found and corrected by outputting information, so that the marketing process can be supervised and managed, the marketing behavior of bank employees can be standardized, and the marketing behavior of the bank can be managed more effectively and efficiently.
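The three steps of the first aspect can be sketched as a single function. This is a minimal illustration only, not the applicant's implementation: the function name `check_compliance` and its callable parameters (`determine_stage`, `standard_script`, `semantics_match`) are hypothetical placeholders for components the application leaves open.

```python
from typing import Callable, Optional


def check_compliance(
    first_voice_text: str,
    speaker_is_employee: bool,
    determine_stage: Callable[[], str],
    standard_script: Callable[[str], str],
    semantics_match: Callable[[str, str], bool],
) -> Optional[str]:
    """Return an alarm/prompt message, or None when nothing needs to be output.

    All callables are hypothetical stand-ins:
    - determine_stage: picks the target stage from the second voice and the customer portrait
    - standard_script: looks up the standard script of a stage
    - semantics_match: compares the semantics of two texts
    """
    if not speaker_is_employee:
        return None  # only employee speech is checked against the SOP
    stage = determine_stage()
    script = standard_script(stage)
    if semantics_match(first_voice_text, script):
        return None  # speech matches the standard script of the target stage
    return f"Stage '{stage}': speech does not match the standard script"
```

Passing the components in as callables keeps the sketch independent of any particular speaker-identification or semantic-analysis technique.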
According to one embodiment of the present application, before the acquiring the first voice in the bank marketing process, the method further includes:
acquiring the second voice;
performing text analysis on the text corresponding to the second voice to obtain feature information of the customer among the speakers corresponding to the second voice;
and acquiring the customer portrait based on the feature information.
According to the compliance recognition method of the present application, acquiring the customer portrait allows the target marketing scene and the target stage to which the first voice belongs to be determined more accurately, so that whether the marketing behavior of bank employees conforms to the standard operating procedure, and hence whether it is compliant, can be detected more accurately. Non-compliant behavior is found and corrected by outputting information, so that the marketing process can be supervised and managed, the marketing behavior of bank employees can be standardized, and the marketing behavior of the bank can be managed more effectively and efficiently.
According to one embodiment of the present application, after the acquiring the first voice in the bank marketing process, the method further includes:
in the case that the speaker corresponding to the first voice is a bank employee, detecting the first voice based on at least one of keywords, regular expressions, semantics, emotion rules and speech-rate rules, and judging whether the bank employee has used non-compliant speech;
and outputting at least one of alarm information and prompt information in the case that non-compliant speech exists.
According to the compliance recognition method of the present application, whether a bank employee uses non-compliant speech during the marketing process is detected based on at least one of keywords, regular expressions, semantics, emotion rules and speech-rate rules, which recognizes whether the marketing behavior is compliant. Violations such as non-compliant speech are found and corrected by outputting information, so that violations in the marketing behavior can be supervised and managed, the marketing behavior of bank employees can be standardized, and the marketing behavior of the bank can be managed more effectively and efficiently.
According to one embodiment of the present application, in the case that the speaker corresponding to the first voice is a bank employee, the detecting the first voice based on at least one of keywords, regular expressions, semantics, emotion rules and speech-rate rules, and judging whether the bank employee has used non-compliant speech, includes:
performing text classification on the first voice based on a TextRCNN model to determine whether the speaker corresponding to the first voice is a bank employee;
in the case that the speaker corresponding to the first voice is a bank employee, detecting the first voice based on a rule template, and judging whether the bank employee has used non-compliant speech; the rule template represents at least one of keywords, regular expressions, semantics, emotion rules and speech-rate rules.
According to the compliance recognition method of the present application, text classification and key information extraction are performed based on the TextRCNN model, and whether a bank employee uses non-compliant speech during the marketing process is detected based on at least one of keywords, regular expressions, semantics, emotion rules and speech-rate rules, which recognizes whether the marketing behavior is compliant. Violations such as non-compliant speech are found and corrected by outputting information, so that violations in the marketing behavior can be supervised and managed, the marketing behavior of bank employees can be standardized, the marketing behavior of the bank can be managed more effectively and efficiently, and the operational compliance level of bank outlets and the management capability of the bank can be improved.
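A rule template of the kind described above might be sketched as follows. This is an assumption-laden illustration: the `RuleTemplate` fields, the example keywords and the characters-per-second threshold are all hypothetical, and the semantic and emotion rules are omitted because the application does not specify how they are expressed.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RuleTemplate:
    """Hypothetical rule template; field names are illustrative only."""
    name: str
    keywords: List[str] = field(default_factory=list)
    patterns: List[str] = field(default_factory=list)  # regular expressions
    max_chars_per_second: Optional[float] = None       # speech-rate rule


def check_utterance(text: str, duration_s: float, templates: List[RuleTemplate]) -> List[str]:
    """Return the names of all templates that the utterance violates."""
    rate = len(text) / duration_s if duration_s > 0 else 0.0
    violations = []
    for t in templates:
        hit = (
            any(k in text for k in t.keywords)
            or any(re.search(p, text) for p in t.patterns)
            or (t.max_chars_per_second is not None and rate > t.max_chars_per_second)
        )
        if hit:
            violations.append(t.name)
    return violations


# Example templates (contents are invented for illustration).
TEMPLATES = [
    RuleTemplate("guaranteed_return", keywords=["guaranteed return"], patterns=[r"\bno risk\b"]),
    RuleTemplate("too_fast", max_chars_per_second=25.0),
]
```

A non-empty result would correspond to the case in which alarm information or prompt information is output.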
According to one embodiment of the present application, the comparing the semantics of the first voice with the semantics of the standard script of the target stage includes:
performing voice recognition processing on the first voice to obtain a first text;
correcting the first text based on a corpus in the banking and finance field to obtain a second text;
inputting the second text and the text of the standard script into a Sentence-BERT model to obtain the similarity between the semantics of the first voice and the semantics of the standard script of the target stage.
According to the compliance recognition method of the present application, comparing the semantics of the first voice with the semantics of the standard script of the target stage based on the Sentence-BERT model makes it possible to judge more accurately whether the two are similar. This detects whether the marketing behavior of bank employees conforms to the standard operating procedure, that is, whether the marketing behavior is compliant. Non-compliant behavior is found and corrected by outputting information, so that the marketing process can be supervised and managed, the marketing behavior of bank employees can be standardized, and the marketing behavior of the bank can be managed more effectively and efficiently. Moreover, because terminology in the banking and finance field differs greatly from that of other fields, automatic error correction against a banking and finance corpus can greatly improve the accuracy of speech recognition and of the semantic analysis of the voice data, so that compliance recognition and the management of bank marketing can be carried out more accurately and effectively.
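The comparison step can be illustrated with cosine similarity between two sentence vectors. In the sketch below a toy bag-of-words vector stands in for the Sentence-BERT encoder so that the block runs without a model; `toy_embed`, `matches_script` and the 0.5 threshold are assumptions for illustration and are not taken from the application.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in for a Sentence-BERT encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def matches_script(utterance: str, script: str, threshold: float = 0.5) -> bool:
    """True when the utterance is similar enough to the standard script.

    The threshold is a hypothetical value; the application does not state one.
    """
    return cosine(toy_embed(utterance), toy_embed(script)) >= threshold
```

With a real sentence encoder only `toy_embed` would change; a word-overlap vector cannot capture paraphrases, which is the reason a semantic model is used in the application.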
According to one embodiment of the present application, the acquiring the first voice of the bank marketing process includes:
in the case that at least two persons are present in a target area or an abnormal sound occurs in the target area, starting a recording function to acquire voice data of the bank marketing process; the voice data includes the first voice.
According to the compliance recognition method of the present application, the voice data of the bank marketing process is acquired by starting the recording function upon at least one of human presence sensing and abnormal sound detection. This avoids recording at unreasonable times, improves the utilization of the pickup equipment while acquiring as much voice data as possible, and reduces the software and hardware resources later consumed by the various algorithms.
In a second aspect, the present application provides a compliance identification device, the device comprising:
the acquisition module is used for acquiring a first voice in the bank marketing process;
the determining module is used for determining, in the case that the speaker corresponding to the first voice is a bank employee, a target stage to which the first voice belongs based on a second voice in the bank marketing process and a customer portrait corresponding to the bank marketing process; the target stage is a stage in the standard operating procedure of a target marketing scene; the target marketing scene is the marketing scene corresponding to the first voice; the second voice is the speech that precedes the first voice in the bank marketing process;
the output module is used for comparing the semantics of the first voice with the semantics of the standard script of the target stage, and outputting at least one of alarm information and prompt information in the case that the two do not match.
According to the compliance recognition device of the present application, the standard operating procedure of the marketing scene is used to judge whether the semantics of a voice segment of a bank employee match the semantics of the standard operating procedure of the marketing scene to which the segment belongs. This detects whether the marketing behavior of the bank employee conforms to the standard operating procedure, that is, whether the marketing behavior is compliant. Non-compliant behavior is found and corrected by outputting information, so that the marketing process can be supervised and managed, the marketing behavior of bank employees can be standardized, and the marketing behavior of the bank can be managed more effectively and efficiently.
In a third aspect, the present application provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the compliance identification method as described in the first aspect when executing the computer program.
In a fourth aspect, the present application provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a compliance identification method as described in the first aspect above.
In a fifth aspect, the present application provides a chip comprising a processor and a communication interface, the communication interface and the processor being coupled, the processor being configured to execute a program or instructions to implement the compliance identification method according to the first aspect.
In a sixth aspect, the present application provides a computer program product comprising a computer program which, when executed by a processor, implements a compliance identification method as described in the first aspect above.
Additional aspects and advantages of the application will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, wherein:
FIG. 1 is a flow chart of a compliance identification method provided in an embodiment of the present application;
FIG. 2 is a schematic flow chart of SOP of a complaint handling scenario in a compliance identification method provided in an embodiment of the present application;
FIG. 3 is a schematic flow chart of SOP of a large amount transfer service scenario or a large amount withdrawal service scenario in a compliance identification method according to an embodiment of the present application;
FIG. 4 is a schematic diagram of the principle of detection of offending behaviors in a compliance identification method provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of the principles of the Sentence-BERT model in the compliance identification method provided by embodiments of the present application;
FIG. 6 is a flow chart illustrating the sub-steps constituting step 120 and step 130 in the compliance identification method provided in the embodiments of the present application;
FIG. 7 is a schematic structural diagram of a compliance identification device provided in an embodiment of the present application;
FIG. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Technical solutions in the embodiments of the present application will be clearly described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application are within the scope of the protection of the present application.
The terms "first", "second" and the like in the description and in the claims are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate, such that embodiments of the present application may be implemented in sequences other than those illustrated or described herein. Objects identified by "first", "second", etc. are generally of one type, and the number of such objects is not limited; e.g., there may be one or more first objects. Furthermore, in the description and claims, "and/or" means at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
The compliance recognition method, the compliance recognition device, the electronic equipment and the readable storage medium provided by the embodiment of the application are described in detail below through specific embodiments and application scenes thereof with reference to the accompanying drawings.
The compliance recognition method can be applied to the terminal, and can be specifically executed by hardware or software in the terminal.
The terminal includes, but is not limited to, a portable communication device such as a mobile phone or tablet having a touch sensitive surface (e.g., a touch screen display and/or a touch pad). It should also be appreciated that in some embodiments, the terminal may not be a portable communication device, but rather a desktop computer having a touch-sensitive surface (e.g., a touch screen display and/or a touch pad).
In the following various embodiments, a terminal including a display and a touch sensitive surface is described. However, it should be understood that the terminal may include one or more other physical user interface devices such as a physical keyboard, mouse, and joystick.
The compliance identification method provided in the embodiments of the present application may be executed by an electronic device, or by a functional module or functional entity in the electronic device that is capable of implementing the method. The electronic devices mentioned in the embodiments of the present application include, but are not limited to, mobile phones, tablet computers, cameras and wearable devices. The compliance identification method provided in the embodiments of the present application is described below taking the electronic device as the executing entity.
As shown in fig. 1, the compliance identification method includes: step 110, step 120 and step 130.
Step 110, a first voice of a bank marketing process is acquired.
In actual execution, the first voice in the bank marketing process can be obtained by recording the bank marketing process.
It should be noted that the bank marketing process in the embodiments of the present application may be a face-to-face, on-site marketing process conducted between a bank employee and a customer at a bank outlet, or a marketing process conducted between a bank employee and a customer by telephone.
Depending on whether the speaker changes and whether the time interval between sentences exceeds a preset duration threshold, a segment of speech from a single speaker in the bank marketing process can be obtained and used as the first voice.
It is understood that a speaker in a banking marketing process may include at least one customer and at least one banking employee.
In some embodiments, in the event that a change in speaker is detected, a segment of speech from the previous speaker may be used as the first speech.
Whether the speaker has changed may be detected based on voice features such as voiceprints. The embodiments of the present application do not specifically limit the method used to detect whether the speaker has changed.
In some embodiments, when no change of speaker is detected but no new sentence is detected within the duration threshold after the previous sentence, a segment of speech including the previous sentence may be used as the first voice.
The duration threshold may be set according to actual conditions, such as people's speaking habits. The embodiments of the present application do not specifically limit the value of the duration threshold.
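The two segmentation conditions above (a change of speaker, or a pause longer than the duration threshold) can be sketched as follows. The `Sentence` record, the speaker labels and the 2-second default threshold are assumptions for illustration; the application fixes neither the speaker-change detector nor the threshold value.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sentence:
    speaker: str   # speaker label, assumed to come from e.g. voiceprint comparison
    start: float   # start time in seconds
    end: float     # end time in seconds
    text: str


def segment(sentences: List[Sentence], gap_threshold: float = 2.0) -> List[List[Sentence]]:
    """Group consecutive sentences into single-speaker segments.

    A new segment starts when the speaker changes, or when the pause since
    the previous sentence exceeds gap_threshold seconds (hypothetical default).
    """
    segments: List[List[Sentence]] = []
    for s in sentences:
        if segments:
            last = segments[-1][-1]
            if s.speaker == last.speaker and s.start - last.end <= gap_threshold:
                segments[-1].append(s)
                continue
        segments.append([s])
    return segments
```

Each resulting segment is a candidate "first voice" in the sense of step 110.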
Step 120, in the case that the speaker corresponding to the first voice is a bank employee, determining a target stage to which the first voice belongs based on the second voice in the bank marketing process and the customer portrait corresponding to the bank marketing process; the target stage is a stage in the standard operating procedure of a target marketing scene; the target marketing scene is the marketing scene corresponding to the first voice; the second voice is the speech that precedes the first voice in the bank marketing process.
In actual implementation, for the first voice, it may be first determined whether the speaker corresponding to the first voice is a banking employee or a customer.
Whether the speaker corresponding to the first voice is a bank employee or a customer may be determined based on features such as the voiceprints of bank employees. The embodiments of the present application do not specifically limit the method used for this determination.
In the case that the speaker corresponding to the first voice is a bank employee, the contextual semantics of a plurality of consecutive sentences in the second voice, which occurs before the first voice in the bank marketing process, can be analyzed. Combined with the customer portrait, this determines the marketing scene corresponding to the first voice, and the stage to which the first voice belongs can then be determined based on the standard operating procedure (SOP) of that marketing scene. This stage is the target stage, and the marketing scene corresponding to the first voice is the target marketing scene.
In some alternative implementations, the semantics of a statement may be obtained by any common semantic analysis method (e.g., a semantic analysis method based on a pre-trained language model or a semantic analysis method for business-oriented modeling, etc.). The embodiment of the present application is not specifically limited to a specifically adopted semantic analysis method.
The marketing scenes may include, but are not limited to, complaint handling, financial product business, insurance business, loan business, credit card business, deposit business, large-amount transfer business and large-amount withdrawal business.
It is to be appreciated that the bank marketing process can include at least one marketing scene. For example, if a bank employee recommends a financial product to a customer and a complaint arises during the process, the bank marketing process includes two marketing scenes, financial product business and complaint handling, and the voice data of the bank marketing process can be divided into two data segments that correspond to these two scenes respectively.
For another example, if the customer first consults about a large-amount transfer and then, on the recommendation of the bank employee, purchases an insurance product and applies for a credit card, the bank marketing process includes three marketing scenes, large-amount transfer business, insurance business and credit card business, and the voice data of the bank marketing process can be divided into three data segments that correspond to these three scenes respectively.
For another example, if a suitable loan product is recommended to a customer who needs funds, the bank marketing process includes one marketing scene, loan business, and the voice data of the bank marketing process as a whole forms a single data segment corresponding to the loan business scene.
The customer portrait corresponding to a bank marketing process is a portrait of the customer participating in that process. The customer portrait may be derived from historical behavioral data of the customer or from the second voice.
Historical behavioral data refers to historical dialogue data between bank employees and the customer. The historical dialogue data can be at least one of surveillance video, audio recordings and text chat records from the bank's communication programs.
In some embodiments, at least one of the historical behavioral data and the second voice may be analyzed with any customer portrait method to obtain a description of the characteristics of the customer, thereby obtaining the customer portrait. The embodiments of the present application do not specifically limit the customer portrait method employed.
It should be noted that, according to different marketing scenarios of the bank marketing, the bank makes a standard operation program of the marketing scenario in advance. The standard job program for each marketing scenario may include a plurality of different phases.
Fig. 2 is a schematic flow chart of SOP of a complaint handling scenario in a compliance identification method according to an embodiment of the present application. Illustratively, as shown in FIG. 2, the SOP of a complaint handling scenario may include the following phases: a customer emotion comfort stage, a problem information collection stage, a solution proposal stage and a customer opinion solicitation stage.
Fig. 3 is a schematic flow chart of the SOP of a large-amount transfer service scenario or a large-amount withdrawal service scenario in a compliance identification method provided in an embodiment of the present application. For example, as shown in Fig. 3, the SOP of the large-amount transfer service scenario or the large-amount withdrawal service scenario may include the following phases: a reason inquiry phase; where the reason is telecommunication fraud, the phase following the reason inquiry phase is a risk warning and dissuasion phase; where dissuasion fails, the phase following the risk warning and dissuasion phase is a phase of contacting the police for assistance; in other cases, the phase following the reason inquiry phase is a phase of recommending other products or services of the bank to the customer, which is in turn followed by a phase of establishing customer contact.
For the first speech, it may be determined which phase of the standard job program the first speech belongs to based on the semantics of the second speech.
For example, if it is determined that the customer emotion comfort phase of the SOP of the complaint processing scene is completed according to the semantics of the second voice, it may be determined that the first voice belongs to the collecting problem information phase of the SOP of the complaint processing scene.
For another example, if it is determined according to the semantics of the second voice that the reason inquiry phase of the SOP of the large-amount transfer service is completed and that the reason is telecommunication fraud, it may be determined that the first voice belongs to the risk warning and dissuasion phase of the SOP of the large-amount transfer service.
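The phase logic of the large-amount transfer SOP described above can be sketched as a small transition function. This is only an illustration under stated assumptions: the phase identifiers, the function `next_phase`, and its two boolean inputs are hypothetical names, and the case where dissuasion succeeds is not described in the text, so the sketch simply returns `None` there.

```python
from typing import Optional

# Hypothetical identifiers for the phases of the large-amount transfer SOP.
INQUIRE = "inquire_reason"
WARN = "risk_warning_and_dissuasion"
POLICE = "contact_police_for_assistance"
RECOMMEND = "recommend_other_products"
CONTACT = "establish_customer_contact"


def next_phase(completed: str, fraud_suspected: bool = False,
               dissuasion_failed: bool = False) -> Optional[str]:
    """Return the phase that the employee's next utterance should belong to."""
    if completed == INQUIRE:
        # Telecommunication fraud leads to the warning phase, otherwise to recommendation.
        return WARN if fraud_suspected else RECOMMEND
    if completed == WARN:
        # Only the failure branch is described; success is left open (assumption).
        return POLICE if dissuasion_failed else None
    if completed == RECOMMEND:
        return CONTACT
    return None  # POLICE and CONTACT are treated as terminal in this sketch
```

In such a design, the two boolean inputs would be derived from the semantics of the second voice, and the returned value would be the target stage of the first voice.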
In some alternative implementations, the above process may include the steps of:
a section of voice of a certain speaker can be obtained as a first voice based on whether the continuous voice belongs to the same speaker or not;
determining that the speaker corresponding to the first voice is a bank employee;
under the condition that the marketing scene corresponding to the first voice is not determined, the semantics of the second voice can be analyzed, and the marketing scene corresponding to the first voice is determined by combining the customer portrait, namely, a target marketing scene is determined;
under the condition that the marketing scene corresponding to the first voice is determined, determining which stage of the SOP of the target marketing scene the first voice belongs to, namely determining the target stage, according to the semantics of the second voice and the SOP of the target marketing scene.
Step 130, comparing the semantics of the first voice with the semantics of the standard script (standard speaking operation) of the target stage, and outputting at least one of alarm information and prompt information in the case that the two do not match.
In actual execution, a pre-trained model may be used to compare the semantics of each sentence in the first voice with the semantics of each sentence in the standard script of the target stage to which the first voice belongs, so as to determine whether the semantics of the first voice match the semantics of that standard script. Here, two semantics match when they are similar.
The model can be used for acquiring the similarity of the semantics of two sentences. In the case that the similarity of the semantics of the two sentences is greater than or equal to the first threshold value, it may be determined that the semantics of the two sentences are matched; conversely, in the case where the similarity of the semantics of the two sentences is smaller than the first threshold value, it may be determined that the semantics of the two sentences are not matched.
It should be noted that, the first threshold may be predetermined according to the actual situation. The embodiment of the present application is not specifically limited with respect to the specific value of the first threshold.
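The sentence-level comparison against the first threshold can be illustrated as follows. The application relies on a pre-trained model to score similarity; the sketch below swaps in a plain cosine similarity over word counts purely as a stand-in, and the function names and the threshold value 0.6 are assumptions made for illustration.

```python
import math
from collections import Counter

FIRST_THRESHOLD = 0.6  # hypothetical value; the application leaves it open


def similarity(a: str, b: str) -> float:
    """Cosine similarity of word-count vectors (stand-in for a trained model)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def sentences_match(a: str, b: str, threshold: float = FIRST_THRESHOLD) -> bool:
    """Two sentences match when their similarity reaches the first threshold."""
    return similarity(a, b) >= threshold
```

A real deployment would replace `similarity` with the score of a sentence-embedding or sentence-pair model; only the thresholding step is taken from the text.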
By aggregating the sentence-level results, that is, whether the semantics of each sentence in the first voice match the semantics of each sentence in the standard script of the target stage to which the first voice belongs, it may be determined whether the semantics of the first voice match the semantics of that standard script.
In actual execution, when the number of sentences in the standard script of the target stage that have no semantically matching sentence in the first voice is greater than the second threshold, or the proportion of such sentences is greater than the third threshold, it may be determined that the semantics of the first voice do not match the semantics of the standard script of the target stage to which the first voice belongs.
When the number of sentences in the standard script of the target stage that have no semantically matching sentence in the first voice is smaller than the fourth threshold, or the proportion of such sentences is smaller than the fifth threshold, it may be determined that the semantics of the first voice match the semantics of the standard script of the target stage to which the first voice belongs.
For example, when every sentence in the standard script of the target stage has a semantically matching sentence in the first voice, it may be determined that the semantics of the first voice match the semantics of the standard script of the target stage to which the first voice belongs.
It should be noted that the second threshold value, the third threshold value, the fourth threshold value, and the fifth threshold value may be predetermined according to actual situations. Specific values of the second threshold, the third threshold, the fourth threshold, and the fifth threshold are not limited in the embodiments of the present application.
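The aggregation over sentences can be sketched as below. The function name, the parameter names, and the default values standing in for the second and third thresholds are hypothetical; the sentence-level matcher is passed in as a callable so that any similarity model can be used.

```python
from typing import Callable, List


def speech_mismatches(spoken: List[str], standard: List[str],
                      match: Callable[[str, str], bool],
                      max_unmatched: int = 1, max_ratio: float = 0.5) -> bool:
    """Return True if the spoken sentences fail to cover the standard script.

    max_unmatched plays the role of the second threshold and max_ratio the
    role of the third threshold; both default values are assumptions.
    """
    # Count standard-script sentences with no matching spoken sentence.
    unmatched = sum(1 for s in standard if not any(match(u, s) for u in spoken))
    ratio = unmatched / len(standard) if standard else 0.0
    return unmatched > max_unmatched or ratio > max_ratio
```

The matching decision of the fourth and fifth thresholds could be written symmetrically with "smaller than" comparisons.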
After it is determined whether the semantics of the first voice match the semantics of the standard script of the target stage to which the first voice belongs, if the two do not match, it may be determined that the marketing behavior corresponding to the first voice does not conform to the standard operation program, and alarm information may be output to warn the bank employee that the marketing behavior does not conform to the standard operation program and must be corrected immediately. In the case that the two do not match, prompt information may also be output to inform the relevant manager (for example, the bank employee's supervisor or a staff member of the compliance department) that the marketing behavior of the bank employee does not conform to the standard operation program.
It will be appreciated that a marketing behavior that does not conform to the standard operation program is identified as non-compliant.
Through the process, a monitoring and feedback mechanism can be established, and the possible behavior which does not accord with the standard operation program can be found and corrected by monitoring the dialogue content of the bank staff in real time. If the bank staff is found to have the behavior which does not accord with the standard operation program, corresponding measures can be immediately taken for processing and fed back to relevant management staff for timely improvement.
In some alternative embodiments, in the case that the semantics of the first voice do not match the semantics of the standard script of the target stage to which the first voice belongs, the first voice may further be marked to facilitate subsequent manual verification and handling. In that case, the voice data of the whole marketing process need not be listened to manually; only the marked voice segments need to be manually verified, which can greatly improve efficiency.
It will be appreciated that marking a speech segment may be implemented in a variety of ways, such as marking the start and stop points of the speech segment, or cutting the speech segment from speech data and adding a tag as "to be verified", etc.
According to the compliance recognition method provided by the embodiments of the present application, in combination with the standard operation program of the marketing scenario, it is judged whether the semantics of a bank employee's voice segment match the semantics of the standard script of the stage, within the standard operation program of the marketing scenario, to which the voice segment belongs. In this way it is detected whether the marketing behavior of the bank employee conforms to the standard operation program, and thus whether the marketing behavior is compliant. Non-compliant behavior that does not conform to the standard operation program is found and corrected by outputting information, so that the implementation of marketing behavior can be supervised and managed, the marketing behavior of bank employees is standardized, and the marketing behavior of the bank can be managed more effectively and efficiently.
In some embodiments, prior to obtaining the first voice in the banking marketing process, the method further comprises: a second voice is acquired.
In actual implementation, the method of acquiring the second voice is the same as the method of acquiring the first voice. Therefore, the specific process of obtaining the second voice can be referred to the foregoing embodiments, and will not be described herein.
And carrying out text analysis on the text corresponding to the second voice to obtain the characteristic information of the client in the speaker corresponding to the second voice.
In actual execution, text analysis can be carried out on the text corresponding to the second voice through a natural language processing technology, so that the client information of each client participating in the bank marketing process is obtained; key information in the customer information can be further extracted as characteristic information of the customer.
In some alternative embodiments, the customer information may include, but is not limited to, customer preferences, loan preferences, investment preferences, life attributes, consumption characteristics, job attributes, family conditions, suspected customer complaints, company information, relationship maps, industry maps, and business representations, among other dimensional information.
In some alternative embodiments, the characteristic information may include language style, topic interest point, emotion state, and the like of the client.
Based on the feature information, a customer representation is acquired.
In actual execution, for each customer, the portrait tag of the customer may be determined based on the feature information of the customer; based on all portrait tags of the customer, a portrait of the customer can be obtained.
It should be noted that, according to different tasks, the characteristic information of the sample customer can be learned and predicted by different algorithms, so as to build a customer portrait tag system which accords with the characteristics of banking and finance industry. The customer portrait tag architecture may include a portrait tag of multiple dimensions. The number of portrait tags per dimension may be multiple.
Portrayal tags for the customer preference dimension may include business preferences, business transaction channels, customer intent, and the like.
The portrait tags of the loan preference dimension may include funding requirements, loan types, and loan intents, among others.
Portrait labels of the investment preference dimension may include investment amounts, deposit preferences, fund preferences, gold preferences, foreign exchange preferences, bond preferences, investment deadlines, investment years, investment experiences, and risk preferences, among others.
The portrait tags of the life attribute dimension may include income source, vehicle ownership, whether there is a car loan, housing situation, annual income, and the like.
Portrayal tags for the consumption feature dimension may include major economic overheads, consumption channels, brand preferences, consumption hotspots, work areas, entertainment modes, and the like.
Portrayal labels for the job attribute dimension may include job units, social security, industry categories, job functions, and the like.
The portrait tag for the family status dimension may include family status, child status, and the like.
Portrait labels of suspected customer complaint dimensions may include suspected customer complaints, and the like.
The portrait tags for the corporate information dimension may include corporate names, business scales, corporate industry and annual revenue, and the like.
Portrait labels of the relationship graph dimension can include equity relationships, group relationships, trade relationships, litigation relationships, guarantee relationships, actual controllers, and the like.
Portrayal labels for industry map dimensions may include industry upstream and downstream, policy planning, industry presence, industry architecture, enterprise distribution, and customer base, among others.
Portrayal tags for enterprise portrayal dimensions may include credit load, benefit case, industry classification, risk seafood, all system classification, financial status, and the like.
The customer portrait tags and the customer portrait can improve the utilization rate and value of the voice data, and allow the marketing scenario corresponding to the first voice to be judged more accurately.
According to the compliance recognition method, the target marketing scene and the target stage to which the first voice belongs can be more accurately determined by acquiring the customer portrait, so that whether the marketing behaviors of the bank staff accord with the standard operation program can be more accurately detected, whether the marketing behaviors are compliance is recognized, the non-compliance behaviors which do not accord with the standard operation program are found and corrected in a mode of outputting information, the implementation process of the marketing behaviors can be supervised and managed, the marketing behaviors of the bank staff are standardized, and the marketing behaviors of the bank can be more effectively and efficiently managed.
In some embodiments, after obtaining the first voice of the banking marketing process, the method further comprises: and under the condition that the speaker corresponding to the first voice is a bank employee, detecting the first voice based on at least one of the keyword, the regular expression, the semantic meaning, the emotion rule and the speech speed rule, and judging whether the bank employee has illegal speaking behaviors.
In actual execution, in addition to detecting whether the marketing behavior of the bank employee conforms to the standard operation program, it is also possible to detect whether the bank employee commits illegal operation behaviors during the marketing process. Such illegal behaviors may include, but are not limited to, off-book product sales, operating on a customer's behalf, improper promises, evasion of dual (audio and video) recording, and coerced transactions.
It will be appreciated that marketing activities that do not conform to standard business procedures are one type of non-compliance activities, and that illicit activities are another type of non-compliance activities.
After determining that the speaker corresponding to the first voice is a bank employee, at least one violation detection method can be flexibly adopted to detect the first voice. The above-mentioned violation detection method may include at least one of a keyword-based detection method, a regular expression-based detection method, a semantic-based detection method, an emotion rule-based detection method, and a speech rate rule-based detection method.
The step of detecting the first voice based on the keywords is to judge whether the bank staff has illegal speaking behaviors by detecting whether the speaking operation of the bank staff in the first voice includes specific keywords. Under the condition that any specific keyword is included in the conversation of the bank staff, the fact that the bank staff has illegal conversation behaviors can be initially judged; in the case that no specific keyword is included in the conversation of the banking staff, it may be determined that the banking staff does not have the illegal conversation.
For example, the keywords may be "cashback" and "absolute security", etc., and in the case that "cashback" or "absolute security" is included in the conversation of the bank employee is detected, it may be determined that the bank employee has illegal conversation.
It is understood that at least one keyword may be preset in the step of detecting the voice data based on the keyword. The embodiments of the present application are not limited to specific keywords.
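Keyword-based detection can be illustrated with a simple substring search over the transcribed script. The two keywords are those of the example above; the constant `VIOLATION_KEYWORDS`, the function name, and the case-insensitive comparison are assumptions of this sketch.

```python
from typing import List, Sequence

# Hypothetical keyword library; the two entries come from the example above.
VIOLATION_KEYWORDS = ("cashback", "absolute security")


def find_keywords(script: str,
                  keywords: Sequence[str] = VIOLATION_KEYWORDS) -> List[str]:
    """Return every preset keyword that occurs in the employee's script."""
    text = script.lower()
    return [k for k in keywords if k.lower() in text]
```

A non-empty result would correspond to a preliminary judgment that illegal speaking behavior exists, subject to later manual verification.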
By constructing a violation keyword library for bank employees that contains a certain number of keywords and using it to detect illegal behaviors, the operational compliance level of bank branches can be improved and the management capability of the bank can be enhanced.
Detecting the first voice based on a regular expression refers to searching the script of the bank employee in the first voice with a preset regular expression and judging whether the bank employee has illegal speaking behaviors. In the case that a part of the bank employee's script matching the regular expression is found, it can be preliminarily judged that the bank employee has illegal speaking behaviors; in the case that no part of the bank employee's script matching the regular expression is found, it can be determined that the bank employee has no illegal speaking behaviors.
It is understood that at least one regular expression may be preset in the step of detecting the voice data based on the regular expression. The embodiments of the present application are not limited to specific regular expressions.
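Regular-expression-based detection might look like the following. The application does not specify any expression, so both patterns below, the constant `VIOLATION_PATTERNS`, and the function name are invented for illustration only.

```python
import re
from typing import List

# Hypothetical patterns; the application does not specify any expression.
VIOLATION_PATTERNS = [
    # "guarantee"/"guaranteed", up to three intervening words, then "return(s)".
    re.compile(r"guarantee(d)?\s+(\w+\s+){0,3}returns?", re.IGNORECASE),
    re.compile(r"no\s+risk\s+at\s+all", re.IGNORECASE),
]


def matched_patterns(script: str) -> List[str]:
    """Return the source of every preset pattern found in the script."""
    return [p.pattern for p in VIOLATION_PATTERNS if p.search(script)]
```

Compared with plain keywords, a pattern can tolerate intervening words, which is why the first expression allows a short gap between "guarantee" and "return".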
Detecting the first voice based on semantics refers to judging whether the bank employee has illegal speaking behaviors by analyzing the semantics of the bank employee's script in the first voice. In the case that the semantics of the bank employee's script correspond to a certain illegal speaking behavior, it can be preliminarily judged that the bank employee has illegal speaking behaviors; in the case that the semantics of the bank employee's script do not correspond to any illegal speaking behavior, it may be determined that the bank employee has no illegal speaking behaviors.
It will be appreciated that the analysis of the semantics of the bank employee's speaking in the first voice may employ a commonly used semantic model. The embodiments of the present application are not limited to a particular semantic model.
Detecting the first voice based on emotion rules refers to obtaining the emotions of both parties of the conversation by analyzing the first voice, and then judging whether the bank employee has illegal speaking behaviors according to whether the emotion of the customer and/or the bank employee conforms to a preset first rule. In the case that the first rule is met, it can be preliminarily judged that the bank employee has illegal speaking behaviors; in the case that the first rule is not met, it may be determined that the bank employee has no illegal speaking behaviors.
It can be appreciated that a common dialogue emotion recognition model may be used to analyze the first voice and obtain the emotion of both parties of the dialogue. The embodiments of the present application are not limited to a specific dialog emotion recognition model.
It is understood that the first rule is a rule regarding emotion of a speaker. The first rule may be predetermined according to the actual situation. For the specific first rule, the embodiment of the present application is not limited.
Detecting the first voice based on speech rate rules refers to obtaining the speech rates of both parties of the conversation by analyzing the first voice, and then judging whether the bank employee has illegal speaking behaviors according to whether the speech rate of the customer and/or the bank employee conforms to a preset second rule. In the case that the second rule is met, it can be preliminarily judged that the bank employee has illegal speaking behaviors; in the case that the second rule is not met, it may be determined that the bank employee has no illegal speaking behaviors.
It can be understood that a conventional speech rate estimation method or a speech rate detection method may be used to analyze the first speech to obtain the speech rate of both parties of the conversation. The embodiment of the present application is not limited to a specific speech rate estimation method and speech rate detection method.
It is understood that the second rule is a rule regarding the speaking speed of the speaker. The second rule may be predetermined according to the actual situation. The embodiment of the present application is not limited to the specific second rule.
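As one possible form of the second rule, the speech rate can be estimated from the transcript and the segment duration and compared with a limit. Everything specific here is an assumption: the words-per-minute measure, the limit of 220, and the function names. For Chinese transcripts one would more likely count characters than whitespace-separated words.

```python
MAX_EMPLOYEE_WPM = 220.0  # hypothetical limit; the application leaves the rule open


def speech_rate_wpm(transcript: str, duration_seconds: float) -> float:
    """Estimate speech rate in words per minute from a transcribed segment."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return len(transcript.split()) * 60.0 / duration_seconds


def violates_rate_rule(transcript: str, duration_seconds: float,
                       limit: float = MAX_EMPLOYEE_WPM) -> bool:
    """An example second rule: the employee speaks faster than the limit."""
    return speech_rate_wpm(transcript, duration_seconds) > limit
```

A rule of this kind could, for example, flag risk disclosures that are read out too quickly for the customer to follow.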
In actual execution, the first voice may be detected based on at least two of keywords, regular expressions, semantics, emotion rules, and speech rate rules, so as to improve the accuracy of detecting illegal behaviors.
And outputting at least one of alarm information and prompt information under the condition that illegal operation behaviors exist.
In actual execution, when the bank employee has illegal behavior, alarm information can be output to warn the bank employee that the behavior is illegal and must be stopped immediately. In the case that the bank employee has illegal behavior, prompt information may also be output to inform the relevant manager (for example, the bank employee's supervisor or a staff member of the compliance department) that the bank employee has illegal behavior.
It is understood that, when it is determined that the bank employee has illegal behavior, the marketing behavior corresponding to the first voice is identified as non-compliant.
In some alternative embodiments, the first voice may also be marked in the presence of offensive behavior.
It will be appreciated that marking the offending portion may be implemented in a variety of ways, such as marking the start and stop points of the offending portion, or cutting the offending portion from voice data and tagging it as "offending" or the like.
A first voice that has been preliminarily marked as containing illegal behavior can be verified manually at a later stage.
According to the compliance recognition method provided by the embodiment of the application, whether the illegal behaviors exist in the marketing process of the bank staff is detected based on at least one of the keywords, the regular expressions, the semantics, the emotion rules and the speech speed rules, whether the marketing behaviors are in compliance is recognized, the illegal behaviors such as the illegal behaviors are found and corrected in an information output mode, the illegal behaviors in the marketing behaviors can be monitored and managed, the marketing behaviors of the bank staff are standardized, and the marketing behaviors of the bank can be managed more effectively and efficiently. Compared with the mode of manually detecting the illegal operation behaviors, the compliance identification method provided by the embodiment of the application can greatly improve the detection efficiency.
In some embodiments, in the case that the speaker corresponding to the first voice is a bank employee, detecting the first voice based on at least one of keywords, regular expressions, semantics, emotion rules, and speech rate rules and judging whether the bank employee has illegal speaking behaviors includes: performing text classification on the first voice based on the TextRCNN model, and determining whether the speaker corresponding to the first voice is a bank employee.
In actual execution, text classification can be performed on the first voice based on the trained TextRCNN model, different speakers are distinguished, and whether the speaker corresponding to the first voice is a bank employee is determined.
It should be noted that, in the case that the speaker in the first voice and the second voice includes m bank employees, m sets of first utterances may be obtained. Wherein m is a positive integer. Each bank employee speaking operation is a set of first speaking operation. Similarly, in the case where the speaker in the voice data includes n clients, n sets of second utterances can be obtained. Wherein n is a positive integer. Each customer session is a set of second sessions.
The TextRCNN model may be a text classification model built by combining a recurrent neural network (Recurrent Neural Network, RNN) with a convolutional neural network (Convolutional Neural Network, CNN). The model may first capture local features in a sentence using the CNN and then capture global features using the RNN, thereby improving the performance of the model.
The TextRCNN model described above can classify the scripts of bank employees and customers. By introducing a convolution operation on the basis of the RNN, the TextRCNN model can better capture local information in a text, and is therefore suitable for cases where multiple relations exist between entities.
The trained TextRCNN model may be obtained by modifying TextRCNN and training it on the basis of a pre-trained model for the banking field.
In the training process, after parameter setting and preprocessing of training data, model training can be performed.
The parameters to be set may include: the number of training, the frequency of evaluating the model, the frequency of saving the model, the feature of high-dimensional sparsity, the dimension of the embedded vector, the number of feature graphs, the size of the convolution kernel, the probability of retaining neurons, the weight of regularization term, the average value of all sequence lengths, the number of samples in the model, the classification setting (the value of the parameter can be set to 1 in two classifications, the value of the parameter can be set to the number of categories in multiple classifications), the proportion of the training set, and the like.
The embodiments of the present application are not limited to specific values of the above parameters.
Preprocessing the training data may include, but is not limited to, the following steps: loading data, dividing sentences into word representations, and removing low-frequency words and stop words; generating a training set and a testing set according to the training data; reading word vectors from the pre-trained word vector model, and inputting the word vectors into the model as initialization values; outputting a data set of each batch (batch); constructing a textRCNN model for text classification; and calculating the performance index of the model.
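The first two preprocessing steps (word segmentation with removal of low-frequency words and stop words, and generation of a training set and a test set) can be sketched as follows. The stop-word set, the whitespace tokenizer, the minimum count, and the function names are assumptions; Chinese text would require a proper word segmenter.

```python
import random
from collections import Counter
from typing import List, Tuple

STOP_WORDS = {"the", "a", "is", "of"}  # hypothetical stop-word list


def preprocess(sentences: List[str], min_count: int = 2) -> List[List[str]]:
    """Split sentences into words, then drop low-frequency words and stop words."""
    tokenized = [s.lower().split() for s in sentences]
    counts = Counter(w for sent in tokenized for w in sent)
    return [[w for w in sent if counts[w] >= min_count and w not in STOP_WORDS]
            for sent in tokenized]


def split(samples: List, train_ratio: float = 0.8,
          seed: int = 0) -> Tuple[List, List]:
    """Shuffle deterministically and split into training and test sets."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```

Here `train_ratio` corresponds to the "proportion of the training set" parameter mentioned among the settings above.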
Specific steps of training the model may include, but are not limited to, defining a computational graph.
In the case that the speaker corresponding to the first voice is a bank employee, the first voice is detected based on a rule template to judge whether the bank employee has illegal speaking behaviors; the rule template is used to represent at least one of keywords, regular expressions, semantics, emotion rules, and speech rate rules.
In actual execution, key information may be extracted from the first voice based on the rule template. The key information may include the result of detecting whether the bank employee has illegal speaking behaviors in the first voice.
Fig. 4 is a schematic diagram of the principle of detecting illegal behaviors in the compliance identification method provided in the embodiment of the present application. As shown in Fig. 4, text classification is performed on the sentences in the first voice based on the TextRCNN model to determine whether each sentence belongs to a first script (of a bank employee) or a second script (of a customer), and rule templates can be matched based on the text classification result. The rule templates may include at least one of a template of the first rule and a template of the second rule. Half-pointer relation extraction means extracting relations in a half-pointer manner. LSTM+CRF named entity recognition means performing named entity recognition with a model that combines long short-term memory (Long Short-Term Memory, LSTM) with conditional random fields (Conditional Random Fields, CRF). Semantic matching is performed based on the extracted relations and the recognized named entities, so as to detect illegal operation behaviors. Key information structuring means structuring the extracted key information.
It should be noted that, business rules in the banking and finance field are very different from those in other fields, so that model customization and training are required to be performed on data in the banking and finance field, so that a comprehensive detection model of illegal behaviors of banking staff can be obtained. Historical dialogue data for bank marketing can be used as a training sample for the bank employee offending behavior detection model.
First, the LSTM+CRF model may be employed in the sequence annotation task.
The detection model of the illegal behaviors of the bank staff is trained based on the LSTM+CRF model, and the illegal behaviors of the bank staff can be effectively identified through deep learning of training samples.
In addition, a large number of specialized terms exist in the banking and finance field, so in the embodiments of the present application the part of speech of such terms can be selected as an important feature for training, which reduces the number of model parameters and improves the training efficiency of the model.
In the training process of the bank employee illegal behavior detection model, in addition to basic feature engineering methods such as the bag-of-words model or the N-gram model, a named entity recognition technique can be introduced to effectively recognize entity information such as specialized terms in a dialogue, thereby improving the accuracy of the bank employee illegal behavior detection model.
Secondly, the large-scale unsupervised corpus can be utilized for pre-training to obtain parameters of the language model, and then the parameters are used as initial parameters and fine-tuning is performed through training. The training method can capture language rules and context information by utilizing the pre-training model, and can improve the training efficiency, accuracy and other effects of the model.
Third, an attention mechanism may be introduced after LSTM, using a Softmax function to translate the attention score into an attention weight for use in weighting elements in the input sequence. The context information is considered in a weighted mode, so that the accuracy of the model and other effects can be improved.
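The conversion of attention scores into weights and the subsequent weighting of sequence elements can be illustrated in plain Python. The function names are hypothetical, and the hidden states are represented as simple lists standing in for LSTM outputs; how the scores themselves are computed is not specified in the text and is left out.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Convert attention scores into weights that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(hidden_states: List[List[float]], scores: List[float]) -> List[float]:
    """Weighted sum of per-position hidden states using the attention weights."""
    weights = softmax(scores)
    dim = len(hidden_states[0])
    return [sum(w * h[d] for w, h in zip(weights, hidden_states))
            for d in range(dim)]
```

Positions with higher scores contribute more to the resulting context vector, which is the sense in which context information is "considered in a weighted mode".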
In actual execution, training may be performed for a specific task. Text classification tasks are based on a given sentence, which is classified into a specified category. The clustering task is to divide a given text set into a plurality of clusters, so that the text similarity in the same cluster is high, and the text similarity between different clusters is low. And recommending tasks, namely recommending texts related to the interests of the user according to the given user and the context information. By training separately for specific tasks, the method can better adapt to the text characteristics and distribution in the banking and finance field.
The key information extracted based on the TextRCNN model may further include the customer portrait tag. The TextRCNN model may also extract key information from the bank employee's script and the customer's utterances. By introducing a convolution operation on the basis of an RNN, the TextRCNN model can better capture local information in a text, so that cases in which multiple relations exist between entities can be handled.
In some alternative embodiments, the detection results of the bank employee violation detection model (including the LSTM+CRF model and the TextRCNN model) may be continuously evaluated and fed back, so that the model is continuously refined and optimized to improve its accuracy and reliability.
According to the compliance recognition method provided by the embodiments of the present application, text classification and key information extraction are performed based on the TextRCNN model, and whether a bank employee commits script violations during marketing is detected based on at least one of keywords, regular expressions, semantics, emotion rules, and speech rate rules, so as to recognize whether the marketing behavior is compliant. Non-compliant behaviors such as script violations are discovered and corrected by outputting information, so that violations in marketing can be supervised and managed and the marketing behavior of bank employees is standardized. The marketing behavior of the bank can thus be managed more effectively and efficiently, the operational compliance level of bank outlets can be improved, and the management capability of the bank is enhanced.
In some embodiments, comparing the semantics of the first voice with the semantics of the standard script of the target stage comprises: performing speech recognition processing on the first voice to obtain a first text.
In actual implementation, the first voice may be recognized by any speech recognition method (e.g., a method based on template matching or a method based on a hidden Markov model), and the voice is converted into text to obtain the first text. The first text is the text corresponding to the first voice. The embodiments of the present application do not limit the specific speech recognition method.
Correcting the first text based on a corpus in the banking and finance field to obtain a second text.
In actual implementation, a corpus of banking financial fields may store a large number of terms and vocabulary of the banking financial fields. The terms and words in the banking and finance field can be obtained through the internet or a professional dictionary.
In some alternative implementations, a corpus of banking financial fields may store terms and words commonly used in the banking financial fields.
The corpus of banking financial domain may be an unstructured database (e.g., HBase, etc.). The terms and words in the banking and finance field can be stored in a corpus in the banking and finance field in a text form, so that the processing such as searching is facilitated.
Based on the corpus in the banking and finance field, wrongly recognized words in the text converted from the voice data can be corrected automatically; that is, the first text is automatically corrected to obtain the second text.
For example, based on the corpus in the banking and finance field, "bandwidth" in the first text may be corrected to "loan", or "heterogeneous card" in the first text may be corrected to "class of card".
The corpus in the banking and finance field can be updated based on the Internet, the corpus in the banking and finance field can be enriched to the maximum limit, hot words can be updated in time, and the accuracy of error correction is improved.
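The corpus-based correction described above can be sketched as follows. The application's examples rely on similar-sounding words in the original language; this sketch substitutes plain string similarity from Python's standard `difflib` module as a stand-in, and the term list, the function name `correct_text`, and the cutoff value are all assumptions made for illustration.

```python
import difflib
from typing import List

# Hypothetical excerpt of a banking and finance term corpus.
DOMAIN_TERMS = ["loan", "deposit", "mortgage", "annuity", "overdraft"]


def correct_text(first_text: str,
                 terms: List[str] = DOMAIN_TERMS,
                 cutoff: float = 0.7) -> str:
    """Replace tokens that closely resemble a domain term with that term."""
    corrected = []
    for token in first_text.split():
        # get_close_matches returns the best candidates at or above the cutoff.
        match = difflib.get_close_matches(token.lower(), terms, n=1, cutoff=cutoff)
        corrected.append(match[0] if match else token)
    return " ".join(corrected)


# "first text" as it might come out of speech recognition, with two errors.
second_text = correct_text("I want a lown and a morgage")
```

A production system would more likely compare pronunciations than spellings, and would need a cutoff tuned so that ordinary words are not rewritten into domain terms.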
Inputting the second text and the text of the standard script into a Sentence-BERT model to obtain the similarity between the semantics of the first voice and the semantics of the standard script of the target stage.
In actual execution, a trained Sentence-BERT (also called SBERT) model may be used to compare the semantics of each sentence in the second text corresponding to the first voice with the semantics of each sentence in the standard script of the target stage to which the first voice belongs, thereby determining whether the semantics of the first voice match those of the standard script, that is, whether the semantic similarity is greater than or equal to a first threshold.
The trained Sentence-BERT model may be obtained by training on the basis of a pre-trained model for the banking and finance field that is improved from Sentence-BERT.
In the embodiments of the present application, a semantics-based SOP (standard operating procedure) script detection system is established, and the sentences in a marketing scene are vectorized. In actual implementation, similarity matching is performed between the bank employee's script and the standard script. Each sentence in the standard script can be encoded offline in advance through the Sentence-BERT model to obtain sentence vectors. During SOP script detection, only the utterances of the bank employee and of the customer need to be encoded, so the efficiency of the similarity comparison can be greatly improved.
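The offline pre-encoding idea can be sketched as below. The `encode` function is a toy bag-of-words stand-in for a Sentence-BERT encoder, and the script sentences, the cosine similarity measure, and the threshold value are hypothetical; only the structure (encode the standard script once, encode each live utterance on demand, compare against a first threshold) follows the text.

```python
import math
from collections import Counter
from typing import List


def encode(sentence: str) -> Counter:
    """Toy stand-in for a Sentence-BERT encoder: a bag-of-words count vector."""
    return Counter(sentence.lower().split())


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0


# Offline: every sentence of the standard script of one stage is encoded once.
STANDARD_SCRIPT: List[str] = [
    "this product carries investment risk",
    "the return is not guaranteed",
]
STANDARD_VECTORS = [encode(s) for s in STANDARD_SCRIPT]


def matches_script(utterance: str, first_threshold: float = 0.6) -> bool:
    """Online: encode only the live utterance and compare with cached vectors."""
    u = encode(utterance)
    best = max(cosine(u, v) for v in STANDARD_VECTORS)
    return best >= first_threshold
```

Because the standard-script vectors are cached, each incoming utterance costs one encoding plus a handful of vector comparisons.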
Fig. 5 is a schematic diagram of the principle of the Sentence-BERT model in the compliance recognition method provided in an embodiment of the present application. As shown in fig. 5, a sentence in a voice fragment may be input to the BERT of the left branch and a sentence in the standard script may be input to the BERT of the right branch; the representation vectors of the two sentences are calculated to obtain the corresponding CLS values, i.e., u and v in fig. 5. Then u, v and |u-v| are concatenated and classified through a Softmax layer to obtain the semantic similarity of the two sentences.
It should be noted that, in the pooling step in fig. 5, different pooling methods may be selected according to practical situations. For example, max pooling is employed in embodiments of the present application.
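The pooling and classification steps of fig. 5 can be illustrated with a small numeric sketch. The token vectors, the weight matrix, and the bias below are made-up values standing in for BERT outputs and trained parameters; the sketch only shows max pooling, the concatenation of u, v and |u-v|, and the Softmax layer.

```python
import math
from typing import List

Vector = List[float]


def max_pool(token_vectors: List[Vector]) -> Vector:
    """Max pooling: take the maximum of each dimension over all tokens."""
    return [max(column) for column in zip(*token_vectors)]


def softmax(logits: Vector) -> Vector:
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_pair(tokens_a: List[Vector], tokens_b: List[Vector],
                  weights: List[Vector], bias: Vector) -> Vector:
    """Concatenate (u, v, |u-v|) and apply a linear layer followed by Softmax."""
    u = max_pool(tokens_a)
    v = max_pool(tokens_b)
    features = u + v + [abs(a - b) for a, b in zip(u, v)]
    logits = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)


# Made-up token vectors standing in for the outputs of the two BERT branches.
tokens_a = [[0.2, 0.9], [0.7, 0.1]]
tokens_b = [[0.6, 0.8], [0.1, 0.9]]
# Hypothetical parameters: class 0 = "dissimilar", class 1 = "similar".
W = [[0.0, 0.0, 0.0, 0.0, 5.0, 5.0],
     [0.0, 0.0, 0.0, 0.0, -5.0, -5.0]]
b = [-1.0, 1.0]
probs = classify_pair(tokens_a, tokens_b, W, b)
```

With these illustrative parameters, a small |u-v| pushes probability toward the "similar" class, which is the intuition behind including the difference term in the concatenation.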
It is understood that this step includes a semantic similarity calculation task. The semantic similarity calculation task refers to a task of calculating semantic similarity between two sentences given.
In actual execution, training may be performed for a specific task. For the semantic similarity calculation task in the embodiments of the present application, a data augmentation technique is used in the training stage of the Sentence-BERT model: at least one of several processing modes, such as random insertion, deletion, replacement, synonym replacement, and sentence inversion and fusion, is applied to the original data collected and preprocessed earlier, so that new sentences are generated and the generalization and adaptability of the Sentence-BERT model are improved.
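Three of the augmentation modes listed above (random deletion, random insertion, and synonym replacement) can be sketched on whitespace-separated tokens. The synonym table, the insertion vocabulary, the deletion probability, and the function names are assumptions for illustration; the application does not specify how each mode is implemented.

```python
import random
from typing import Dict, List


def random_delete(tokens: List[str], p: float, rng: random.Random) -> List[str]:
    """Drop each token with probability p, keeping at least one token."""
    kept = [t for t in tokens if rng.random() >= p]
    return kept or [rng.choice(tokens)]


def random_insert(tokens: List[str], vocab: List[str],
                  rng: random.Random) -> List[str]:
    """Insert one word from vocab at a random position."""
    pos = rng.randint(0, len(tokens))
    return tokens[:pos] + [rng.choice(vocab)] + tokens[pos:]


def synonym_replace(tokens: List[str], synonyms: Dict[str, List[str]],
                    rng: random.Random) -> List[str]:
    """Replace every token that has an entry in the synonym table."""
    return [rng.choice(synonyms[t]) if t in synonyms else t for t in tokens]


def augment(sentence: str, synonyms: Dict[str, List[str]],
            vocab: List[str], seed: int = 0) -> List[str]:
    """Generate one new sentence per augmentation mode."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    tokens = sentence.split()
    variants = [
        random_delete(tokens, 0.2, rng),
        random_insert(tokens, vocab, rng),
        synonym_replace(tokens, synonyms, rng),
    ]
    return [" ".join(v) for v in variants]


variants = augment(
    "this fund has low risk",
    synonyms={"fund": ["product"], "low": ["limited"]},
    vocab=["wealth", "bank"],
)
```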
In some alternative embodiments, the detection result of the Sentence-BERT model may be continuously evaluated and fed back, so as to continuously perfect and optimize the Sentence-BERT model, so as to improve the accuracy and reliability thereof.
According to the compliance recognition method provided by the embodiments of the present application, the semantics of the first voice are compared with the semantics of the standard script of the target stage based on the Sentence-BERT model, so that whether the two are similar can be judged more accurately. In this way, whether the marketing behavior of a bank employee conforms to the standard operating procedure is detected and whether the marketing behavior is compliant is recognized; non-compliant behaviors that do not conform to the standard operating procedure are discovered and corrected by outputting information, so that the implementation of the marketing behavior can be supervised and managed, the marketing behavior of bank employees is standardized, and the marketing behavior of the bank can be managed more effectively and efficiently. Moreover, because terms in the banking and finance field differ greatly from terms in other fields, automatic error correction through the corpus in the banking and finance field can greatly improve the accuracy of speech recognition and of the semantic analysis of voice data, so that compliance recognition and management of bank marketing behavior can be performed more accurately and effectively.
In some embodiments, the standard operating procedure may be configured visually. The flow of the standard operating procedure of each marketing scene can be generated by operations such as dragging performed by a configurator.
Fig. 6 is a flow chart illustrating the sub-steps of the configuration step 120 and the step 130 in the compliance identification method according to the embodiment of the present application. As shown in fig. 6, the configuration process may include: step 610, step 620, step 630, step 640, step 650, step 660, and step 670.
Step 610, writing a BPMN file.
A Business Process Modeling Notation (BPMN) file may be written first, so that the nodes of the SOP script detection flow are generated based on the BPMN file. These nodes may include a start node, an end node, a judgment node, a detection node, and the like.
It will be appreciated that the SOP script detection flow includes steps 120 and 130.
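As an illustration of how nodes might be read from a BPMN file, the sketch below parses a small BPMN 2.0 document with Python's standard XML library. The process id, the node ids and names, and the mapping of "detection node" to `serviceTask` and "judgment node" to `exclusiveGateway` are assumptions; the application does not disclose the actual file contents.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Namespace of the BPMN 2.0 model schema.
BPMN_NS = "http://www.omg.org/spec/BPMN/20100524/MODEL"

# Hypothetical BPMN file for an SOP script detection flow.
BPMN_XML = f"""<definitions xmlns="{BPMN_NS}">
  <process id="sop_script_detection">
    <startEvent id="start"/>
    <serviceTask id="determine_stage" name="Determine target stage"/>
    <serviceTask id="compare_semantics" name="Compare with standard script"/>
    <exclusiveGateway id="matched" name="Semantics matched?"/>
    <endEvent id="end"/>
  </process>
</definitions>"""


def list_nodes(xml_text: str) -> List[Tuple[str, str]]:
    """Return (node type, node id) for each child element of the process."""
    root = ET.fromstring(xml_text)
    process = root.find(f"{{{BPMN_NS}}}process")
    nodes = []
    for element in process:
        kind = element.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
        nodes.append((kind, element.get("id")))
    return nodes


nodes = list_nodes(BPMN_XML)
```

A complete BPMN file would also contain `sequenceFlow` elements connecting the nodes; they are omitted here to keep the sketch short.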
Step 620, create a flow engine.
The process of creating a flow engine creates multiple core tables. The core tables may include sop_ge_x, sop_hi_x, sop_re_x, and the like.
The core table sop_ge_ can be used to store the basic information of the SOP script detection flow.
The core table sop_hi_ can be used to hold historical data and SOP script detection instances.
The core table sop_re_ can be used to hold detection rule information. A detection rule may consist of one or more detection conditions. For example, the "civilized reception" detection rule may be configured with two detection conditions, namely whether the bank employee uses uncivil language and whether the bank employee's speech rate is outside a set range, and the rule is triggered when either condition is satisfied.
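A detection rule composed of several conditions can be sketched as follows, using the two-condition example above. The class names, the word pattern, the speech-rate range, and the use of words per minute as the rate unit are hypothetical choices for illustration only.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Utterance:
    text: str
    duration_seconds: float

    @property
    def words_per_minute(self) -> float:
        return len(self.text.split()) / self.duration_seconds * 60.0


Condition = Callable[[Utterance], bool]


def uses_forbidden_words(pattern: str) -> Condition:
    """Condition that holds when the text matches a forbidden-word pattern."""
    compiled = re.compile(pattern, re.IGNORECASE)
    return lambda u: compiled.search(u.text) is not None


def speech_rate_outside(low: float, high: float) -> Condition:
    """Condition that holds when the speech rate is outside [low, high]."""
    return lambda u: not (low <= u.words_per_minute <= high)


@dataclass
class DetectionRule:
    name: str
    conditions: List[Condition]
    require_all: bool = False  # False: any single condition triggers the rule

    def is_triggered(self, utterance: Utterance) -> bool:
        results = (condition(utterance) for condition in self.conditions)
        return all(results) if self.require_all else any(results)


# Hypothetical configuration of the two-condition rule.
rule = DetectionRule(
    name="civilized_reception",
    conditions=[
        uses_forbidden_words(r"\b(stupid|shut up)\b"),
        speech_rate_outside(80.0, 200.0),
    ],
)
```

Keeping each condition as an independent callable lets the same condition be reused across rules stored in the rule table.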
Step 630, obtaining a core class.
The core class SOPRepositoryService can be used to deploy and manage flow resources, such as BPMN files and attachments.
Step 640, deployment flow.
Based on the core class SOPRepositoryService, the flow of the SOP (that is, the stages included in the SOP) is deployed.
Step 650, start the flow.
A class SOPRuntimeService is created to start the SOP script detection flow at runtime.
Step 660, creating a task.
A class SOPTaskService is created to store the information of the tasks in the SOP script detection flow.
Step 670, completing the task.
A class SOPHistoryService is created to store the history information of the SOP script detection flow.
A class SOPManagerService is created to store information about the flow engine.
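The lifecycle of steps 640 to 670 (deploy, start, create task, complete task) can be sketched with a single in-memory class. This is only an illustration of the sequence of operations: the class `FlowEngine`, its method names, and the stage names are hypothetical and do not reproduce the service classes or database tables named above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FlowEngine:
    """In-memory stand-in for a flow engine; all names are illustrative."""
    definitions: Dict[str, List[str]] = field(default_factory=dict)
    active_tasks: Dict[int, List[str]] = field(default_factory=dict)
    history: List[str] = field(default_factory=list)
    _next_id: int = 1

    def deploy(self, key: str, stages: List[str]) -> None:
        """Deploy a flow definition as an ordered list of stages."""
        self.definitions[key] = list(stages)

    def start(self, key: str) -> int:
        """Start one instance of a deployed flow and return its id."""
        instance_id = self._next_id
        self._next_id += 1
        self.active_tasks[instance_id] = list(self.definitions[key])
        self.history.append(f"started {key} as instance {instance_id}")
        return instance_id

    def current_task(self, instance_id: int) -> Optional[str]:
        """Return the pending task of an instance, or None when finished."""
        remaining = self.active_tasks.get(instance_id, [])
        return remaining[0] if remaining else None

    def complete_task(self, instance_id: int) -> None:
        """Complete the pending task and record it in the history."""
        done = self.active_tasks[instance_id].pop(0)
        self.history.append(f"instance {instance_id} completed {done}")


engine = FlowEngine()
engine.deploy("wealth_marketing", ["greeting", "risk_disclosure", "closing"])
instance = engine.start("wealth_marketing")
first_task = engine.current_task(instance)
engine.complete_task(instance)
second_task = engine.current_task(instance)
```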
According to the compliance recognition method provided by the embodiments of the present application, the SOP script detection flow is configured visually, so that the SOP and the SOP script detection flow can be configured more flexibly and conveniently, and the operational capability of bank outlets can be improved.
In the related art, the recording time is unreasonable. The sound pickup equipment installed at bank outlets records continuously from opening every morning until closing every evening. As a result, the recorded data contains a large amount of non-human sound and invalid recordings, which places a heavy burden on subsequent data processing and consumes a large amount of software and hardware resources.
In some embodiments, obtaining a first voice of a banking marketing process includes: under the condition that at least 2 persons exist in the target area or abnormal sound exists in the target area, starting a recording function to acquire voice data of a bank marketing process; the voice data includes a first voice.
In actual implementation, the voice data collection in step 110 may employ at least one of a human body induction start recording function scheme and an abnormal sound start recording function scheme. Through the recording function, all voice data of the bank marketing process including the first voice and the second voice can be acquired.
In the actual execution of the scheme in which human body sensing starts the recording function, a human body sensor is added to the conventional sound pickup equipment. The human body sensor can identify human bodies in the target area and count the number of people by analyzing the sensed temperature. When people are present and their number is greater than or equal to 2, the recording function can be started automatically and the sound pickup hardware is woken up.
In the actual execution of the scheme in which abnormal sound starts the recording function, an abnormal sound detection device is added to the conventional sound pickup equipment. When at least one abnormal sound, such as a shrill sound, crying, loud talking, continuous banging, or glass breaking, occurs in a bank outlet, the recording function can be started automatically and the sound pickup hardware is woken up.
In some alternative implementations, the recording function may also be turned off automatically after the abnormal sound has ended for a period of time.
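The two trigger conditions and the automatic switch-off can be combined in a small state machine, sketched below. The class name, the 30-second quiet timeout, and the decision to apply the same timeout after either trigger are assumptions; the text only states that recording may stop some time after the abnormal sound has ended.

```python
from dataclasses import dataclass


@dataclass
class RecordingController:
    """Decides whether the recording function should be on; values are illustrative."""
    min_people: int = 2
    quiet_timeout_s: float = 30.0  # assumed delay before recording is stopped
    recording: bool = False
    _last_trigger_time: float = 0.0

    def update(self, now_s: float, people_count: int, abnormal_sound: bool) -> bool:
        """Feed one sensor reading and return the resulting recording state."""
        if people_count >= self.min_people or abnormal_sound:
            # Either trigger starts (or keeps) recording and resets the timer.
            self.recording = True
            self._last_trigger_time = now_s
        elif self.recording and now_s - self._last_trigger_time >= self.quiet_timeout_s:
            # No trigger for long enough: switch recording off automatically.
            self.recording = False
        return self.recording


controller = RecordingController(quiet_timeout_s=30.0)
states = [
    controller.update(0.0, people_count=1, abnormal_sound=False),
    controller.update(5.0, people_count=2, abnormal_sound=False),
    controller.update(20.0, people_count=0, abnormal_sound=False),
    controller.update(40.0, people_count=0, abnormal_sound=False),
    controller.update(41.0, people_count=0, abnormal_sound=True),
]
```

In this run, recording starts when two people are detected, continues through the short quiet gap, stops once the timeout has elapsed, and restarts on the abnormal sound.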
It will be appreciated that the sound pickup equipment is electrically connected to the compliance recognition device. If the sound pickup equipment has a recording function, the sound pickup equipment is woken up and its recording function is started; after the sound pickup equipment has recorded, the compliance recognition device can acquire the recorded voice data of the bank marketing process from it. If the sound pickup equipment does not have a recording function, recording software installed on the compliance recognition device can be started to record the sound signals collected by the sound pickup equipment, thereby obtaining the voice data of the bank marketing process.
It will be appreciated that the sound pickup equipment described above may be disposed in a target area. The target area may be an area where marketing may occur, such as the business handling area of a bank outlet or a financial department.
In some optional implementations, after the voice data of the bank marketing process is acquired, the voice data can also be automatically uploaded to a server, and a data warehouse is built, so that later backtracking is facilitated.
In some optional implementation manners, the sound pickup equipment with different specifications can be used in different target areas in a matching way, and the human body induction starting recording function scheme or the abnormal sound starting recording function scheme is flexibly adopted.
According to the compliance identification method provided by the embodiment of the application, the voice data of the bank marketing process is acquired by adopting at least one of a human body induction starting recording function scheme and an abnormal sound starting recording function scheme, the problem of unreasonable recording time can be solved, the utilization rate of sound pickup equipment can be improved under the condition that voice data are acquired as much as possible, and software and hardware resources consumed by various algorithms in the later period are reduced.
The compliance recognition method provided by the embodiments of the present application may be executed by a compliance recognition device. In the embodiments of the present application, the compliance recognition device provided herein is described by taking as an example the case in which the compliance recognition device executes the compliance recognition method.
The embodiment of the application also provides a compliance recognition device.
As shown in fig. 7, the compliance recognition apparatus includes: an acquisition module 710, a determination module 720, and an output module 730.
An acquisition module 710, configured to acquire a first voice in a bank marketing process;
a determining module 720, configured to determine, in the case that the speaker corresponding to the first voice is a bank employee, a target stage to which the first voice belongs based on the second voice in the bank marketing process and the customer portrait corresponding to the bank marketing process; the target stage is a stage in a standard operating procedure of a target marketing scene; the target marketing scene is a marketing scene corresponding to the first voice; the second voice is the voice before the first voice in the bank marketing process;
and an output module 730, configured to compare the semantics of the first voice with the semantics of the standard speech operation of the target stage, and output at least one of the alarm information and the prompt information if the semantics of the first voice and the standard speech operation are not matched.
According to the compliance recognition device provided by the embodiment of the application, by combining the standard operation program of the marketing scene, whether the semantics of the voice fragment of the bank staff are matched with the semantics of the standard operation program of the marketing scene to which the voice fragment belongs is judged, whether the marketing behavior of the bank staff accords with the standard operation program is detected, whether the marketing behavior is compliant is recognized, the non-compliant behavior which does not accord with the standard operation program is found and corrected in the mode of outputting information, the implementation process of the marketing behavior can be supervised and managed, the marketing behavior of the bank staff is standardized, and the marketing behavior of the bank can be managed more effectively and efficiently.
In some embodiments, the obtaining module 710 may also be configured to obtain a second voice;
the device may further include:
the feature extraction module is used for performing text analysis on the text corresponding to the second voice and obtaining the feature information of the customer among the speakers corresponding to the second voice;
and the portrait module is used for acquiring the customer portrait based on the feature information.
In some embodiments, the apparatus may further comprise:
the detection module is used for detecting the first voice based on at least one of keywords, regular expressions, semantics, emotion rules and speech rate rules in the case that the speaker corresponding to the first voice is a bank employee, and judging whether the bank employee has committed a script violation; and outputting at least one of alarm information and prompt information in the case that a script violation exists.
In some embodiments, the detection module may be specifically configured to:
text classification is carried out on the first voice based on the textRCNN model, and whether a speaker corresponding to the first voice is a bank employee or not is determined;
under the condition that the speaker corresponding to the first voice is a banking employee, detecting the first voice based on a rule template, and judging whether the banking employee has illegal speaking behaviors; a rule template for representing at least one of keywords, regular expressions, semantics, emotional rules, and pace rules.
In some embodiments, the output module 730 may be specifically configured to:
performing voice recognition processing on the first voice to obtain a first text;
correcting the error of the first text based on a corpus in the banking and finance field to obtain a second text;
inputting the second text and the standard speaking text into a Sentence-BERT model, and acquiring the similarity between the semantics of the first voice and the semantics of the standard speaking of the target stage.
In some embodiments, the obtaining module 710 may be specifically configured to start a recording function to obtain voice data of a bank marketing process when at least 2 people are detected to exist in the target area, or when abnormal sounds are detected to exist in the target area; the voice data includes a first voice.
The compliance recognition device in the embodiments of the present application may be an electronic device, or may be a component in an electronic device, such as an integrated circuit or a chip. The electronic device may be a terminal, or may be a device other than a terminal. By way of example, the electronic device may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a vehicle-mounted electronic device, a mobile Internet device (MID), an augmented reality (AR)/virtual reality (VR) device, a robot, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA), and may also be a server, network attached storage (NAS), a personal computer (PC), a television (TV), a teller machine, or a self-service machine; the embodiments of the present application are not specifically limited in this respect.
The compliance identification device in the embodiments of the present application may be a device having an operating system. The operating system may be an Android operating system, an iOS operating system, or other possible operating systems, which are not specifically limited in the embodiments of the present application.
The compliance identification apparatus provided in the embodiment of the present application can implement each process implemented by the embodiments of the methods of fig. 1 to 6, and in order to avoid repetition, a description is omitted here.
In some embodiments, as shown in fig. 8, the embodiment of the present application further provides an electronic device 800, including a processor 801, a memory 802, and a computer program stored in the memory 802 and capable of running on the processor 801, where the program when executed by the processor 801 implements the respective processes of the above-mentioned embodiments of the compliance identification method, and the same technical effects are achieved, so that repetition is avoided, and no further description is given here.
The electronic device in the embodiment of the application includes the mobile electronic device and the non-mobile electronic device described above.
The embodiments of the present application further provide a non-transitory computer readable storage medium, on which a computer program is stored, where the computer program when executed by a processor implements each process of the above-mentioned embodiments of the compliance identification method, and the same technical effects can be achieved, so that repetition is avoided, and no further description is given here.
The processor is the processor in the electronic device described in the above embodiment. The readable storage medium includes a computer-readable storage medium such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
Embodiments of the present application also provide a computer program product comprising a computer program which, when executed by a processor, implements the above-described compliance identification method.
The embodiment of the application further provides a chip, the chip includes a processor and a communication interface, the communication interface is coupled with the processor, and the processor is used for running a program or an instruction, so as to implement each process of the above-mentioned compliance identification method embodiment, and achieve the same technical effect, so that repetition is avoided, and no further description is provided here.
It should be understood that the chip referred to in the embodiments of the present application may also be called a system-level chip, a system chip, a chip system, or a system-on-chip.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element. Furthermore, it should be noted that the scope of the methods and apparatus in the embodiments of the present application is not limited to performing the functions in the order shown or discussed, but may also include performing the functions in a substantially simultaneous manner or in the reverse order depending on the functions involved; for example, the described methods may be performed in an order different from that described, and various steps may also be added, omitted, or combined. Additionally, features described with reference to certain examples may be combined in other examples.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solutions of the present application may be embodied essentially or in a part contributing to the prior art in the form of a computer software product stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk), comprising several instructions for causing a terminal (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the methods described in the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the above-described embodiments, which are merely illustrative and not restrictive, and many forms may be made by those of ordinary skill in the art without departing from the spirit of the present application and the scope of the claims, which are also within the protection of the present application.
In the description of the present specification, reference to the terms "one embodiment," "some embodiments," "illustrative embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present application have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the principles and spirit of the application, the scope of which is defined by the claims and their equivalents.

Claims (10)

1. A method of compliance identification, comprising:
acquiring a first voice in a bank marketing process;
under the condition that the speaker corresponding to the first voice is a bank employee, determining a target stage to which the first voice belongs based on a second voice in the bank marketing process and a customer portrait corresponding to the bank marketing process; the target stage is a stage in a standard operation program of a target marketing scene; the target marketing scene is a marketing scene corresponding to the first voice; the second voice data is the voice before the first voice in the bank marketing process;
comparing the semantics of the first voice with the semantics of the standard speech operation of the target stage, and outputting at least one of alarm information and prompt information under the condition that the semantics of the first voice and the standard speech operation are not matched.
2. The method of compliance identification of claim 1, wherein prior to the capturing the first voice in the banking marketing process, the method further comprises:
acquiring the second voice;
text analysis is carried out on the text corresponding to the second voice, and characteristic information of clients in the speaker corresponding to the second voice is obtained;
And acquiring the customer portrait based on the characteristic information.
3. The method of compliance identification of claim 1, wherein after the first voice in the banking marketing process is obtained, the method further comprises:
under the condition that the speaker corresponding to the first voice is a bank employee, detecting the first voice based on at least one of keywords, regular expressions, semantics, emotion rules and speech speed rules, and judging whether the bank employee has illegal speaking behaviors;
and outputting at least one of alarm information and prompt information under the condition that illegal operation behaviors exist.
4. The compliance identification method of claim 3, wherein, in the case where the speaker corresponding to the first voice is a banking employee, detecting the first voice based on at least one of a keyword, a regular expression, a semantic meaning, an emotion rule, and a speed rule, and determining whether the banking employee has a violation act includes:
text classification is carried out on the first voice based on a textRCNN model, and whether a speaker corresponding to the first voice is a banking employee or not is determined;
Under the condition that the speaker corresponding to the first voice is a bank employee, detecting the first voice based on a rule template, and judging whether the bank employee has illegal speaking behaviors; the rule templates are used for representing at least one of keywords, regular expressions, semantics, emotion rules and speed of speech rules.
5. The compliance identification method of claim 1, wherein the comparing the semantics of the first speech with the semantics of standard speaking at the target stage comprises:
performing voice recognition processing on the first voice to obtain a first text;
correcting the first text based on a corpus in the banking and finance field to obtain a second text;
inputting the second text and the standard conversation text into a Sentence-BERT model, and acquiring the similarity between the semantics of the first voice and the semantics of the standard conversation in the target stage.
6. The compliance identification method of any one of claims 1 to 5, wherein obtaining the first voice in the bank marketing process comprises:
under the condition that at least two persons are present in a target area or an abnormal sound is present in the target area, starting a recording function to acquire voice data of the bank marketing process; the voice data includes the first voice.
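The recording trigger of claim 6 reduces to a simple logical condition, sketched below in Python. The `AreaObservation` and `Recorder` names are hypothetical, and how persons are counted or how an abnormal sound is detected is outside the scope of this sketch; the claim does not say when recording stops, so no stop condition is modeled.

```python
from dataclasses import dataclass


@dataclass
class AreaObservation:
    """Hypothetical snapshot of the target area from upstream detectors."""
    person_count: int
    abnormal_sound: bool


def should_start_recording(obs: AreaObservation, min_people: int = 2) -> bool:
    """True when at least `min_people` are present or an abnormal sound exists."""
    return obs.person_count >= min_people or obs.abnormal_sound


class Recorder:
    """Minimal state holder: recording starts once the trigger condition holds."""

    def __init__(self) -> None:
        self.recording = False

    def update(self, obs: AreaObservation) -> bool:
        # Start recording on the first observation that meets the condition.
        if not self.recording and should_start_recording(obs):
            self.recording = True
        return self.recording
```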
7. A compliance identification device, comprising:
an acquisition module, configured to acquire a first voice in the bank marketing process;
a determining module, configured to determine, under the condition that the speaker corresponding to the first voice is a bank employee, a target stage to which the first voice belongs based on a second voice in the bank marketing process and a customer portrait corresponding to the bank marketing process; the target stage is a stage in a standard operating procedure of a target marketing scene; the target marketing scene is the marketing scene corresponding to the first voice; the second voice is a voice preceding the first voice in the bank marketing process;
an output module, configured to compare the semantics of the first voice with the semantics of the standard script of the target stage, and to output at least one of alarm information and prompt information under the condition that the semantics of the first voice and the semantics of the standard script of the target stage do not match.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the compliance identification method of any one of claims 1-6.
9. A non-transitory computer readable storage medium, having stored thereon a computer program, which when executed by a processor implements the compliance identification method of any of claims 1-6.
10. A computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the compliance identification method of any one of claims 1-6.
CN202311518310.8A 2023-11-14 2023-11-14 Compliance recognition method and recognition device Pending CN117437936A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311518310.8A CN117437936A (en) 2023-11-14 2023-11-14 Compliance recognition method and recognition device

Publications (1)

Publication Number Publication Date
CN117437936A true CN117437936A (en) 2024-01-23

Family

ID=89553333

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311518310.8A Pending CN117437936A (en) 2023-11-14 2023-11-14 Compliance recognition method and recognition device

Country Status (1)

Country Link
CN (1) CN117437936A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020211354A1 (en) * 2019-04-16 2020-10-22 平安科技(深圳)有限公司 Speaker identity recognition method and device based on speech content, and storage medium
CN111897931A (en) * 2020-06-24 2020-11-06 深圳追一科技有限公司 Session setting method and apparatus, server, computer readable storage medium
CN113723767A (en) * 2021-08-10 2021-11-30 上海浦东发展银行股份有限公司 Business process quality inspection method and device based on voice interaction data
CN113808616A (en) * 2021-09-16 2021-12-17 平安银行股份有限公司 Voice compliance detection method, device, equipment and storage medium
CN113887239A (en) * 2021-09-29 2022-01-04 未鲲(上海)科技服务有限公司 Statement analysis method and device based on artificial intelligence, terminal equipment and medium
CN114510556A (en) * 2020-11-17 2022-05-17 北京有限元科技有限公司 Method, apparatus and storage medium for determining dialogs

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination