CN114610840A - Sensitive word-based accounting monitoring method, device, equipment and storage medium - Google Patents

Info

Publication number
CN114610840A
Authority
CN
China
Prior art keywords
voice
message
user
preset
overdue
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210289706.9A
Other languages
Chinese (zh)
Inventor
衷平平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Puhui Enterprise Management Co Ltd
Original Assignee
Ping An Puhui Enterprise Management Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Puhui Enterprise Management Co Ltd
Priority to CN202210289706.9A
Publication of CN114610840A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G06F16/322 Trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9027 Trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90344 Query processing by using string matching techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90348 Query processing by searching ordered data, e.g. alpha-numerically ordered data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/1822 Parsing for meaning understanding
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/21 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being power information
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/24 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72403 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M1/7243 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Signal Processing (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Finance (AREA)
  • Software Systems (AREA)
  • Accounting & Taxation (AREA)
  • General Business, Economics & Management (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • Artificial Intelligence (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention relates to the field of big data and discloses a sensitive word-based accounting monitoring method, device, equipment and storage medium. The method comprises the following steps: detecting overdue users and acquiring their communication information; receiving a text message input by a salesperson and performing sensitive word detection on it, and if no sensitive word is detected, sending a collection message to the overdue user according to the communication information; detecting the repayment state of the overdue user at a preset frequency within a preset time period after the text message is sent, and if the minimum repayment state is reached, terminating the account collection process, otherwise determining that the user is a dishonest user; sending a dishonest-user collection service instruction to the salesperson and receiving a voice message collected by the terminal; and recognizing the voice message based on a voice recognition algorithm, and if the voice message is malicious voice, sending a warning to the terminal. By recognizing and filtering the text and voice communication between the salesperson and the borrower, the invention improves the quality of the monitoring service.

Description

Sensitive word-based accounting monitoring method, device, equipment and storage medium
Technical Field
The invention relates to the field of big data, in particular to an accounting monitoring method, device, equipment and storage medium based on sensitive words.
Background
Reasonable risk control is an indispensable part of the financial loan business. It is generally divided into pre-loan risk control and post-loan risk control. Pre-loan risk control generally relies on lawful measures such as property pledges. Post-loan risk control is carried out by monitoring the borrower's financial flows in a timely manner and by having business personnel regularly interview the borrower about their current situation, which can effectively reduce the bad debt rate, delay the occurrence of accounting problems, and improve the service quality of the financial loan business.
Existing accounting monitoring methods generally hand a list to business personnel, who then negotiate directly with the borrower. The negotiation process is opaque, and in order to improve their personal performance, business personnel often address the borrower in wording that contains malicious information, which harms the service quality and image of the whole company. The monitoring quality of such methods is therefore low.
Disclosure of Invention
The main purpose of the invention is to solve the problem that existing accounting monitoring methods have low accuracy.
In a first aspect, the invention provides a sensitive word-based accounting monitoring method, which comprises the following steps:
detecting overdue users in a preset user database, and acquiring communication information of the overdue users;
receiving a text message input by a salesperson at a terminal, performing sensitive word detection on the text message, and if no sensitive word is detected, sending a collection message to the overdue user according to the communication information;
detecting the repayment state of the overdue user at a preset frequency within a preset time period after the text message is sent, and if the repayment state of the overdue user reaches a preset minimum repayment state, terminating the account collection process for the overdue user, otherwise determining that the overdue user is a dishonest user;
sending a dishonest-user collection service instruction to the salesperson, and receiving a voice message collected by the terminal, wherein the dishonest-user collection service instruction instructs the salesperson to use the terminal to establish voice communication with the dishonest user, and the voice message is a voice message from the communication between the salesperson and the dishonest user;
and recognizing the voice message based on a preset voice recognition algorithm to judge whether the voice message is malicious voice, and if the voice message is malicious voice, sending a warning to the terminal.
Optionally, in a first implementation manner of the first aspect of the present invention, the receiving a text message input by a salesperson at a terminal, performing sensitive word detection on the text message, and if no sensitive word is detected, sending a collection message to the overdue user according to the communication information includes:
receiving a text message input by a salesperson at a terminal, sending a data access request to a preset distributed cache server, and receiving a search tree returned by a cache node in the distributed cache server, wherein the search tree corresponds to a primary key contained in the data access request;
and determining a matching result of the text message in the search tree based on the AC automaton (Aho-Corasick) algorithm, determining the sensitive words in the text message according to the matching result, and if no sensitive word is detected, sending a collection message to the overdue user according to the communication information.
Optionally, in a second implementation manner of the first aspect of the present invention, before the receiving a text message input by a salesperson at a terminal, sending a data access request to a preset distributed cache server, and receiving a search tree returned by a cache node in the distributed cache server, the method further includes:
acquiring a sensitive word library constructed by a user, and constructing a search tree corresponding to the sensitive word library based on the AC automaton (Aho-Corasick) algorithm;
and synchronizing the search tree to the distributed cache server, so that the search tree is stored in a distributed manner across the cache nodes of the distributed cache server.
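The two implementations above amount to building an Aho-Corasick automaton from the word library once, caching it under a key, and scanning each text message against it. The following is a minimal sketch under stated assumptions, not the implementation described by the invention: the function names `build_automaton` and `find_sensitive_words`, the key `sensitive-word-tree`, the use of `pickle` for serialization, and the plain dictionary standing in for one cache node are all hypothetical.

```python
import pickle
from collections import deque


def build_automaton(words):
    """Build an Aho-Corasick automaton stored as three parallel lists."""
    goto = [{}]  # goto[node][char] -> child node index
    fail = [0]   # failure link of each node
    out = [[]]   # words recognized when a node is reached
    for word in words:
        node = 0
        for ch in word:
            if ch not in goto[node]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[node][ch] = len(goto) - 1
            node = goto[node][ch]
        out[node].append(word)
    # Breadth-first pass: depth-1 nodes keep failure link 0 (the root).
    queue = deque(goto[0].values())
    while queue:
        node = queue.popleft()
        for ch, child in goto[node].items():
            queue.append(child)
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            # Inherit words that end at the failure target (proper suffixes).
            out[child] = out[child] + out[fail[child]]
    return goto, fail, out


def find_sensitive_words(text, automaton):
    """Return (start_index, word) for every library word found in text."""
    goto, fail, out = automaton
    node, hits = 0, []
    for i, ch in enumerate(text):
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        for word in out[node]:
            hits.append((i - len(word) + 1, word))
    return hits


# Hypothetical stand-in for one cache node: primary key -> serialized tree.
cache_node = {}
cache_node["sensitive-word-tree"] = pickle.dumps(build_automaton(["he", "she", "hers"]))

# At message time the tree is fetched by key and the text is scanned once.
tree = pickle.loads(cache_node["sensitive-word-tree"])
hits = find_sensitive_words("ushers", tree)
```

In this sketch a message would be released for sending only when `find_sensitive_words` returns an empty list; the placeholder words are the usual textbook example rather than a real sensitive word library.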
Optionally, in a third implementation manner of the first aspect of the present invention, the recognizing the voice message based on a preset voice recognition algorithm to judge whether the voice message is malicious voice, and if the voice message is malicious voice, sending a warning to the terminal includes:
generating an audio fingerprint to be recognized from the voice message to be recognized;
matching the audio fingerprint against a preset audio hash table, and if the audio fingerprint is successfully matched in the audio hash table, identifying the voice message as malicious voice, otherwise extracting Mel-frequency cepstral coefficient (MFCC) features from the voice message;
performing keyword analysis on the MFCC features, and generating a retrieval score;
and if the retrieval score is greater than a preset threshold, identifying the voice message as malicious voice, and sending a warning to the terminal.
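The third implementation is a two-stage check: a cheap lookup of the fingerprint first, and the costlier keyword analysis only when the lookup misses. A minimal sketch of that control flow follows. The function name `is_malicious_voice`, the representation of the audio hash table as a set of fingerprint strings, and the `keyword_score` callback are assumptions made for illustration.

```python
from typing import Callable, Set


def is_malicious_voice(
    fingerprint: str,
    known_malicious: Set[str],
    keyword_score: Callable[[], float],
    threshold: float,
) -> bool:
    """Two-stage decision sketched from the steps above (names are hypothetical)."""
    # Stage 1: look the audio fingerprint up in the preset hash table.
    if fingerprint in known_malicious:
        return True
    # Stage 2: only on a miss, run keyword analysis and compare the
    # retrieval score with the preset threshold (strictly greater than).
    return keyword_score() > threshold
```

Passing the keyword analysis as a callback keeps the expensive stage from running when the fingerprint already matches.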
Optionally, in a fourth implementation manner of the first aspect of the present invention, the generating an audio fingerprint to be recognized from the voice message to be recognized includes:
extracting multi-frame filter-bank features from the voice message to be recognized;
concatenating the multi-frame filter-bank features to generate a voice spectrogram to be recognized;
dividing the voice spectrogram to be recognized into a plurality of spectrogram regions based on a preset dividing rule, and calculating a binary code value of each spectrogram region;
and splicing the binary code values of the spectrogram regions of the voice spectrogram to be recognized to obtain the audio fingerprint to be recognized.
Optionally, in a fifth implementation manner of the first aspect of the present invention, the dividing the voice spectrogram to be recognized into a plurality of spectrogram regions based on a preset dividing rule, and calculating a binary code value of each spectrogram region includes:
dividing the voice spectrogram to be recognized into a plurality of spectrogram regions in a designated distribution, wherein each spectrogram region has a horizontal axis direction and a vertical axis direction, the vertical axis direction comprises a plurality of sub-bands, and each sub-band has a sub-band energy;
calculating the average sub-band energy of each sub-band along the horizontal axis direction of the spectrogram region;
and binary coding the spectrogram region according to the average sub-band energies to obtain the binary code value of each spectrogram region.
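The fingerprinting steps of the fourth and fifth implementations can be sketched as follows. The text does not state the dividing rule or the coding rule, so two assumptions are made here and should be read as placeholders: regions are fixed-width blocks of consecutive frames, and each bit records whether a sub-band's average energy exceeds that of the next sub-band. The names `region_code` and `audio_fingerprint` are hypothetical.

```python
from typing import List


def region_code(region: List[List[float]]) -> str:
    """Binary code of one region; region[t][b] is the energy of sub-band b in frame t."""
    n_frames = len(region)
    n_bands = len(region[0])
    # Average each sub-band's energy along the horizontal (time) axis.
    avg = [sum(frame[b] for frame in region) / n_frames for b in range(n_bands)]
    # Assumed coding rule: one bit per pair of adjacent sub-bands.
    return "".join("1" if avg[b] > avg[b + 1] else "0" for b in range(n_bands - 1))


def audio_fingerprint(spectrogram: List[List[float]], frames_per_region: int) -> str:
    """Split the spectrogram into regions and splice the region codes together."""
    codes = []
    # Assumed dividing rule: fixed-width blocks; a trailing partial block is dropped.
    last_start = len(spectrogram) - frames_per_region
    for start in range(0, last_start + 1, frames_per_region):
        codes.append(region_code(spectrogram[start:start + frames_per_region]))
    return "".join(codes)


# Four frames with three sub-bands each, split into two regions of two frames.
spectrogram = [[3, 1, 2], [5, 1, 4], [0, 2, 1], [0, 4, 1]]
fingerprint = audio_fingerprint(spectrogram, frames_per_region=2)
```

Here the first region averages to 4, 1, 3 and the second to 0, 3, 1, so the spliced fingerprint is `1001`.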
Optionally, in a sixth implementation manner of the first aspect of the present invention, the performing keyword analysis on the Mel-frequency cepstral coefficient (MFCC) features and generating a retrieval score includes:
calling a preset acoustic model to process the MFCC features to obtain a plurality of candidate word lattices and an acoustic score corresponding to each candidate word lattice;
calling a preset language model to process the candidate word lattices to obtain a plurality of keywords and a language score corresponding to each keyword;
combining the acoustic score corresponding to each candidate word lattice and the language score corresponding to each keyword based on the Viterbi algorithm to obtain an optimal score;
and determining a target keyword corresponding to the optimal score, calling a preset dynamic programming algorithm model, and searching a preset malicious keyword library for the target keyword to obtain a retrieval score corresponding to the target keyword.
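The last step names only "a preset dynamic programming algorithm model" for the library lookup. One plausible reading, assumed here purely for illustration, is an edit-distance comparison between the target keyword and each library entry, normalized to a similarity between 0 and 1. The function names and the normalization are hypothetical, and the acoustic model, language model and Viterbi decoding are outside this sketch.

```python
from typing import Iterable


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def retrieval_score(keyword: str, library: Iterable[str]) -> float:
    """Best normalized similarity between keyword and any library entry (assumed metric)."""
    best = 0.0
    for entry in library:
        longest = max(len(keyword), len(entry))
        if longest == 0:
            continue
        best = max(best, 1.0 - edit_distance(keyword, entry) / longest)
    return best
```

Under this reading, an exact library hit scores 1.0 and an empty library scores 0.0, and the result is what would be compared with the preset threshold.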
In a second aspect, the present invention provides a sensitive word-based accounting monitoring apparatus, including:
a user detection module, configured to detect overdue users in a preset user database and acquire communication information of the overdue users;
a text detection module, configured to receive a text message input by a salesperson at a terminal, perform sensitive word detection on the text message, and if no sensitive word is detected, send a collection message to the overdue user according to the communication information;
a compensation detection module, configured to detect the repayment state of the overdue user at a preset frequency within a preset time period after the text message is sent, and if the repayment state of the overdue user reaches a preset minimum repayment state, terminate the account collection process for the overdue user, otherwise determine that the overdue user is a dishonest user;
a voice acquisition module, configured to send a dishonest-user collection service instruction to the salesperson and receive a voice message collected by the terminal, wherein the dishonest-user collection service instruction instructs the salesperson to use the terminal to establish voice communication with the dishonest user, and the voice message is a voice message from the communication between the salesperson and the dishonest user;
and a voice recognition module, configured to recognize the voice message based on a preset voice recognition algorithm to judge whether the voice message is malicious voice, and if the voice message is malicious voice, send a warning to the terminal.
Optionally, in a first implementation manner of the second aspect of the present invention, the text detection module specifically includes:
a tree acquisition unit, configured to receive a text message input by a salesperson at a terminal, send a data access request to a preset distributed cache server, and receive a search tree returned by a cache node in the distributed cache server, wherein the search tree corresponds to a primary key contained in the data access request;
and a tree matching unit, configured to determine a matching result of the text message in the search tree based on the AC automaton (Aho-Corasick) algorithm, determine the sensitive words in the text message according to the matching result, and if no sensitive word is detected, send a collection message to the overdue user according to the communication information.
Optionally, in a second implementation manner of the second aspect of the present invention, the text detection module specifically includes:
a tree construction unit, configured to acquire a sensitive word library constructed by a user and construct a search tree corresponding to the sensitive word library based on the AC automaton (Aho-Corasick) algorithm;
a tree cache unit, configured to synchronize the search tree to the distributed cache server, so that the search tree is stored in a distributed manner across the cache nodes of the distributed cache server;
a tree acquisition unit, configured to receive a text message input by a salesperson at a terminal, send a data access request to a preset distributed cache server, and receive a search tree returned by a cache node in the distributed cache server, wherein the search tree corresponds to a primary key contained in the data access request;
and a tree matching unit, configured to determine a matching result of the text message in the search tree based on the AC automaton (Aho-Corasick) algorithm, determine the sensitive words in the text message according to the matching result, and if no sensitive word is detected, send a collection message to the overdue user according to the communication information.
Optionally, in a third implementation manner of the second aspect of the present invention, the voice recognition module specifically includes:
a fingerprint generating unit, configured to generate an audio fingerprint to be recognized from the voice message to be recognized;
a hash matching unit, configured to match the audio fingerprint against a preset audio hash table, and if the audio fingerprint is successfully matched in the audio hash table, identify the voice message as malicious voice, otherwise extract Mel-frequency cepstral coefficient (MFCC) features from the voice message;
a score calculation unit, configured to perform keyword analysis on the MFCC features and generate a retrieval score;
and an early warning notification unit, configured to identify the voice message as malicious voice and send a warning to the terminal if the retrieval score is greater than a preset threshold.
Optionally, in a fourth implementation manner of the second aspect of the present invention, the fingerprint generating unit is specifically configured to:
extract multi-frame filter-bank features from the voice message to be recognized;
concatenate the multi-frame filter-bank features to generate a voice spectrogram to be recognized;
divide the voice spectrogram to be recognized into a plurality of spectrogram regions based on a preset dividing rule, and calculate a binary code value of each spectrogram region;
and splice the binary code values of the spectrogram regions of the voice spectrogram to be recognized to obtain the audio fingerprint to be recognized.
Optionally, in a fifth implementation manner of the second aspect of the present invention, the fingerprint generating unit is specifically configured to:
extract multi-frame filter-bank features from the voice message to be recognized;
concatenate the multi-frame filter-bank features to generate a voice spectrogram to be recognized;
divide the voice spectrogram to be recognized into a plurality of spectrogram regions in a designated distribution, wherein each spectrogram region has a horizontal axis direction and a vertical axis direction, the vertical axis direction comprises a plurality of sub-bands, and each sub-band has a sub-band energy;
calculate the average sub-band energy of each sub-band along the horizontal axis direction of the spectrogram region;
binary code the spectrogram region according to the average sub-band energies to obtain the binary code value of each spectrogram region;
and splice the binary code values of the spectrogram regions of the voice spectrogram to be recognized to obtain the audio fingerprint to be recognized.
Optionally, in a sixth implementation manner of the second aspect of the present invention, the score calculation unit is specifically configured to:
call a preset acoustic model to process the Mel-frequency cepstral coefficient (MFCC) features to obtain a plurality of candidate word lattices and an acoustic score corresponding to each candidate word lattice;
call a preset language model to process the candidate word lattices to obtain a plurality of keywords and a language score corresponding to each keyword;
combine the acoustic score corresponding to each candidate word lattice and the language score corresponding to each keyword based on the Viterbi algorithm to obtain an optimal score;
and determine a target keyword corresponding to the optimal score, call a preset dynamic programming algorithm model, and search a preset malicious keyword library for the target keyword to obtain a retrieval score corresponding to the target keyword.
A third aspect of the present invention provides a computer device comprising: a memory and at least one processor, the memory having instructions stored therein; the at least one processor invokes the instructions in the memory to cause the computer device to perform the sensitive word-based accounting monitoring method described above.
A fourth aspect of the present invention provides a computer-readable storage medium having stored therein instructions, which, when run on a computer, cause the computer to execute the sensitive word-based accounting monitoring method described above.
In the technical scheme provided by the invention, the repayment status of the overdue user is checked repeatedly for a period of time after the salesperson sends the text message to the overdue user, so as to determine whether the user is dishonest; if so, the salesperson communicates with the user directly by voice, thereby improving monitoring accuracy. Sensitive words are detected in the text messages that the salesperson sends to the borrower, and malicious speech is detected in the voice communication with the borrower, so that the whole business process is strictly controlled and the quality of the monitoring service is improved.
Drawings
Fig. 1 is a schematic diagram of a first embodiment of the sensitive word-based accounting monitoring method in the embodiment of the present invention;
Fig. 2 is a schematic diagram of a second embodiment of the sensitive word-based accounting monitoring method in the embodiment of the present invention;
Fig. 3 is a schematic diagram of a third embodiment of the sensitive word-based accounting monitoring method in the embodiment of the present invention;
Fig. 4 is a schematic diagram of an embodiment of the sensitive word-based accounting monitoring apparatus in the embodiment of the present invention;
Fig. 5 is a schematic diagram of another embodiment of the sensitive word-based accounting monitoring apparatus in the embodiment of the present invention;
Fig. 6 is a schematic diagram of an embodiment of the computer device in the embodiment of the present invention.
Detailed Description
The embodiment of the invention provides a sensitive word-based accounting monitoring method, device, equipment and storage medium that achieve higher monitoring quality.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," or "having," and any variations thereof, are intended to cover non-exclusive inclusions, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
For convenience of understanding, a specific flow of the embodiment of the present invention is described below, and referring to fig. 1, an embodiment of the sensitive word-based accounting monitoring method in the embodiment of the present invention includes:
101. Detecting overdue users in a preset user database, and acquiring communication information of the overdue users;
It can be understood that an overdue user in this embodiment is a borrower whose current bill has not been cleared by the specified repayment date and for whom the time elapsed since the specified repayment date does not exceed a preset overdue buffer period. The overdue buffer period is the period after the specified repayment date during which the borrower can still repay in order to make up for the overdue behavior. For example, an overdue buffer period of 5 days means that the borrower can repay within 5 calendar days after the specified repayment date to make up for the overdue behavior and thereby reduce the penalty. In a specific implementation, the penalty may be a loan interest penalty, a downgrade of the borrower's credit, or the like, which is not limited by this embodiment.
The communication information includes, but is not limited to, a mobile phone number, an e-mail address, a fax number, and the like, which is not limited in this embodiment. The communication information is obtained by the financial loan company from user data, and it should be noted that its acquisition should comply with the relevant provisions of personal information security protection law.
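The definition of an overdue user in step 101 reduces to a date comparison, which the following minimal sketch illustrates. The function name `is_overdue_user`, its parameters, and the dates used in the usage lines are hypothetical; the default of 5 days is the example value given in the text.

```python
from datetime import date, timedelta


def is_overdue_user(due_date: date, today: date, bill_cleared: bool,
                    buffer_days: int = 5) -> bool:
    """True while an uncleared bill is past due but still inside the buffer period."""
    if bill_cleared:
        return False
    # Past the repayment date, and no more than buffer_days calendar days past it.
    return due_date < today <= due_date + timedelta(days=buffer_days)


due = date(2022, 3, 1)  # hypothetical repayment date
inside_buffer = is_overdue_user(due, date(2022, 3, 4), bill_cleared=False)
after_buffer = is_overdue_user(due, date(2022, 3, 7), bill_cleared=False)
```

With these dates, the borrower counts as an overdue user on 4 March but no longer on 7 March, which is the sixth day after the repayment date.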
102. Receiving a text message input by a salesperson at a terminal, performing sensitive word detection on the text message, and if no sensitive word is detected, sending a collection message to the overdue user according to the communication information;
It can be understood that the text message input by the salesperson at the terminal is a text message that sends an overdue reminder to the user. Its content is not limited to the script specified by the financial loan company, and may also include subjectively colored, emotional words and related expressions. The server therefore performs sensitive word detection on the text input by the salesperson, so as to standardize the text. A sensitive word may be, for example, an abusive, threatening or harassing word, which is not limited by this embodiment.
Further, the server may perform sensitive word detection by building and training a sensitive word recognition model; the specific manner of sensitive word detection is not limited in this embodiment. When the server detects that the text message edited by the salesperson contains sensitive words, it refuses to send the text message to the overdue user and generates a corresponding prompt asking the salesperson to correct the wording of the text message. Only when the server no longer detects any sensitive word in the text message does it allow the text message containing the overdue reminder to be sent to the overdue user. It should be noted that, in this embodiment, the form of the text message sent to the overdue user should match the communication information: for example, if the communication information is an e-mail address, the text message should be sent as an e-mail; if the communication information is a mobile phone number, the text message should be sent as a short message (SMS).
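The gating behavior described for step 102 can be sketched as follows. This is an illustration under assumptions rather than the described implementation: a plain substring scan stands in for whatever detector the server uses, the function names and the returned dictionary are hypothetical, and only the two channel examples given in the text (e-mail address and mobile phone number) are distinguished.

```python
from typing import Dict, List


def choose_channel(contact: str) -> str:
    """Pick the message form from the contact detail (assumed heuristic)."""
    return "email" if "@" in contact else "sms"


def try_send(text: str, contact: str, sensitive_words: List[str]) -> Dict[str, object]:
    """Refuse the message while it contains sensitive words; otherwise release it."""
    # Substring scan used only as a stand-in for the real detector.
    found = [word for word in sensitive_words if word in text]
    if found:
        return {"sent": False, "prompt": "Please revise: " + ", ".join(found)}
    return {"sent": True, "channel": choose_channel(contact)}
```

The salesperson would edit the text and call `try_send` again until it reports `"sent": True`.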
103. Detecting the repayment state of the overdue user at a preset frequency within a preset time period after the text message is sent; if the repayment state of the overdue user reaches a preset minimum repayment state, terminating the debt collection process for the overdue user, and otherwise determining that the overdue user is a credit-loss user;
It can be understood that the preset time period is an overdue buffer period, within which the borrower is allowed to make up for the earlier overdue behavior. If the overdue user makes no repayment within the overdue buffer period, or the repayment made does not meet the preset minimum repayment condition, the server re-evaluates the credit type of the overdue user. Specifically, the server sets up a scheduled task resident in the background and checks the repayment state of the overdue user at a certain detection frequency (for example, every 12 hours). In a specific business system, the repayment state is usually represented by a data table field that can take a plurality of values, for example 0 to 3, where the values 0 and 1 each represent an unpaid state, the value 2 represents the minimum repayment state, and the value 3 represents the settled state. If the server detects that the repayment state of the overdue user changes from the unpaid state 1 to the minimum repayment state 2 or the settled state 3 within the overdue buffer period, the debt collection process for the overdue user is terminated; otherwise, the server lowers the credit assessment of the overdue user and determines that the overdue user is a credit-loss user.
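The periodic check in step 103 can be sketched as below. The function and enum names are hypothetical, the check times are passed in explicitly rather than driven by a real scheduler, and `read_state` stands in for reading the repayment-state field from the business database.

```python
from enum import IntEnum
from typing import Callable, Iterable


class RepaymentState(IntEnum):
    # The text also lists value 0 as an unpaid state; anything below
    # MINIMUM_REPAID is treated as unpaid in this sketch.
    UNPAID = 1
    MINIMUM_REPAID = 2
    SETTLED = 3


def monitor_repayment(read_state: Callable[[], int],
                      check_times: Iterable[float],
                      deadline: float) -> str:
    """Poll the repayment state at each check time up to the deadline.

    Returns "collection_terminated" once the state reaches at least the
    minimum repayment state, otherwise "credit_loss_user".
    """
    for t in check_times:
        if t > deadline:
            break  # the overdue buffer period has ended
        if read_state() >= RepaymentState.MINIMUM_REPAID:
            return "collection_terminated"
    return "credit_loss_user"
```

With a 12-hour detection frequency, `check_times` would be 12, 24, 36, and so on (in hours), and `deadline` would be the length of the overdue buffer period in the same unit.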
104. Sending a credit-loss collection service instruction to the salesperson, and receiving a voice message collected by the terminal, wherein the credit-loss collection service instruction instructs the salesperson to establish voice communication with the credit-loss user by using the terminal, and the voice message is recorded during the communication between the salesperson and the credit-loss user;
It can be understood that, after the server determines that the overdue user is a credit-loss user, a manual collection process is executed: the server sends a credit-loss collection service instruction to the salesperson. The instruction may be a corresponding work order task whose requirement is to carry out further voice communication with the credit-loss user. The terminal is an electronic device with a voice communication function. Preferably, the terminal is a mobile phone, which can collect the voice message during the telephone call between the salesperson and the credit-loss user by means of call recording; this is not limited in this embodiment.
105. Recognizing the voice message based on a preset voice recognition algorithm to determine whether the voice message is malicious voice, and if the voice message is malicious voice, sending a warning to the terminal.
It can be understood that the voice recognition algorithm is used to determine whether the voice message is malicious voice, that is, voice containing content such as harassment, abuse, or fraud that threatens the personal safety of the credit-loss user. As for the specific voice recognition algorithm, the server may convert the voice message into a corresponding text representation for recognition, or convert the voice into a signal waveform for recognition, which is not limited in this embodiment. When the server determines that the voice message is malicious voice, it sends a warning to the terminal to remind the salesperson to keep the collection service behavior within the rules. Optionally, when the number of warnings the server has sent to the terminal reaches a threshold, this indicates that the salesperson has been warned repeatedly but still has not corrected their speech while carrying out the collection service instruction, and a corresponding penalty measure can be executed. Optionally, the server may further identify, by means of the voice recognition algorithm, the number of malicious words or the duration of the malicious sentences in the malicious voice, calculate a corresponding service score from that number or duration according to a preset rule, and use the service score to regulate and constrain the service behavior of the salesperson, thereby improving monitoring quality.
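The optional warning counter and service score can be illustrated with a small sketch. The embodiment only says that a preset rule is applied, so every constant and the linear penalty rule below are assumptions chosen for illustration.

```python
from collections import Counter

# All constants are assumed values, not taken from the embodiment.
WARNING_THRESHOLD = 3
BASE_SCORE = 100.0
PENALTY_PER_WORD = 5.0
PENALTY_PER_SECOND = 0.5

warnings_by_agent = Counter()


def record_warning(agent_id: str) -> bool:
    """Count one warning for the salesperson; return True once the number of
    warnings reaches the threshold, i.e. a penalty measure may be executed."""
    warnings_by_agent[agent_id] += 1
    return warnings_by_agent[agent_id] >= WARNING_THRESHOLD


def service_score(malicious_word_count: int, malicious_seconds: float) -> float:
    """Assumed rule: subtract a fixed penalty per malicious word and per
    second of malicious speech, never going below zero."""
    penalty = (PENALTY_PER_WORD * malicious_word_count
               + PENALTY_PER_SECOND * malicious_seconds)
    return max(0.0, BASE_SCORE - penalty)
```

Under these assumed constants, a call with 2 malicious words and 10 seconds of malicious speech scores 85.0.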
In this embodiment, sensitive word recognition is performed on the text message sent by the salesperson to the borrower, and malicious voice recognition is performed on the voice communication with the borrower, so that the whole business process is strictly controlled and the quality of the monitoring service is improved.
Referring to fig. 2, a second embodiment of the sensitive word-based accounting monitoring method according to the embodiment of the present invention includes:
201. Detecting overdue users in a preset user database, and acquiring communication information of the overdue users;
Step 201 is similar to step 101 and is not described here again.
202. Receiving a text message input by a salesperson at a terminal, sending a data access request to a preset distributed cache server, and receiving a search tree returned by a cache node in the distributed cache server, wherein the search tree corresponds to a primary key contained in the data access request;
It can be understood that, in this embodiment, a lexicon retrieval server retrieves the sensitive words in the text message, and a preset database and a distributed cache server provide data support for the lexicon retrieval server. For example, the preset database is a MySQL database and the distributed cache server is a Redis cache server cluster that includes at least one cache node; the distributed cache server is memory-based.
The preset database stores a sensitive word lexicon containing a plurality of sensitive words; service personnel can manage the lexicon according to specific requirements, deleting or modifying the sensitive words in it.
Optionally, before step 202, the method may further include:
acquiring a sensitive word lexicon constructed by a user, and constructing a search tree corresponding to the sensitive word lexicon based on the Aho-Corasick (AC) automaton algorithm;
and synchronizing the search tree to the distributed cache server, so that the search tree is stored in a distributed manner across the cache nodes of the distributed cache server.
It can be understood that the sensitive word lexicon can be constructed from sensitive words selected from a standard Chinese lexicon. Optionally, the server may classify the selected sensitive words and select one core word for each class as the reference word of that class. The reference word of each class is then taken as a central node, and the vector distance from each similar word to the central node is set, so that a sensitive word lexicon in word-vector form is generated. The server then constructs the corresponding search tree from the constructed sensitive word lexicon.
In this embodiment, the server builds the search tree corresponding to the sensitive word lexicon based on the AC automaton algorithm (Aho-Corasick automaton). Specifically, the server acquires each sensitive word stored in the sensitive word lexicon, constructs a corresponding dictionary tree (trie) from the sensitive words, and then constructs a failure pointer for each node in the trie. When matching fails at the current node, the failure pointer indicates the node from which matching continues. The search tree is then completed from the trie and the failure pointers.
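The construction described above (trie first, then failure pointers) and the matching that uses it can be sketched as a compact Aho-Corasick automaton. The class name and the list-based node layout are choices made for this sketch, not details from the embodiment.

```python
from collections import deque


class AhoCorasick:
    """Trie plus failure pointers; node 0 is the root."""

    def __init__(self, words):
        self.children = [{}]   # per node: character -> child node id
        self.fail = [0]        # per node: failure pointer
        self.output = [[]]     # per node: words that end at this node
        for word in words:
            self._insert(word)
        self._build_failure_pointers()

    def _insert(self, word):
        node = 0
        for ch in word:
            if ch not in self.children[node]:
                self.children.append({})
                self.fail.append(0)
                self.output.append([])
                self.children[node][ch] = len(self.children) - 1
            node = self.children[node][ch]
        self.output[node].append(word)

    def _build_failure_pointers(self):
        # Breadth-first traversal; depth-1 nodes keep failure pointer 0 (root).
        queue = deque(self.children[0].values())
        while queue:
            node = queue.popleft()
            for ch, child in self.children[node].items():
                f = self.fail[node]
                while f and ch not in self.children[f]:
                    f = self.fail[f]
                self.fail[child] = self.children[f].get(ch, 0)
                # Inherit words that end at the failure target (proper suffixes).
                self.output[child] = self.output[child] + self.output[self.fail[child]]
                queue.append(child)

    def search(self, text):
        """Return (start_index, word) for every sensitive word found in text."""
        node, hits = 0, []
        for i, ch in enumerate(text):
            while node and ch not in self.children[node]:
                node = self.fail[node]   # follow failure pointers on mismatch
            node = self.children[node].get(ch, 0)
            for word in self.output[node]:
                hits.append((i - len(word) + 1, word))
        return hits
```

An empty result from `search` corresponds to the case in which no sensitive word is detected and the text message may be sent.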
Further, after the search tree is constructed, the server needs to cache the search tree in the corresponding cache nodes of the distributed cache server. The distributed cache server is built on a Redis cluster that includes at least one cache node. Specifically, the step of synchronizing the search tree to the distributed cache server includes: acquiring the relationship information of the nodes of the search tree; and synchronizing each node and the relationship information to the distributed cache server in the form of key-value pairs, so that the distributed cache server generates and stores the search tree from the nodes and the relationship information.
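One way to picture the key-value synchronization is to store each node, together with its child relationships, as a JSON string under its own key. In the sketch below a plain Python dict stands in for the Redis cluster, the key naming scheme (`sensitive:tree:node:<id>`) is hypothetical, and failure pointers are left out to keep the example short.

```python
import json


def build_trie(words):
    """Plain trie as a list of nodes; each node records its children
    (the relationship information) and the word ending there, if any."""
    nodes = [{"children": {}, "word": None}]
    for word in words:
        cur = 0
        for ch in word:
            nxt = nodes[cur]["children"].get(ch)
            if nxt is None:
                nodes.append({"children": {}, "word": None})
                nxt = len(nodes) - 1
                nodes[cur]["children"][ch] = nxt
            cur = nxt
        nodes[cur]["word"] = word
    return nodes


def sync_to_cache(nodes, cache, prefix="sensitive:tree"):
    """Write one key-value pair per node, plus a size entry.
    `cache` is any dict-like store (a stand-in for the cache server)."""
    for node_id, node in enumerate(nodes):
        cache[f"{prefix}:node:{node_id}"] = json.dumps(node, ensure_ascii=False)
    cache[f"{prefix}:size"] = str(len(nodes))


def load_from_cache(cache, prefix="sensitive:tree"):
    """Rebuild the node list from the key-value pairs."""
    size = int(cache[f"{prefix}:size"])
    return [json.loads(cache[f"{prefix}:node:{i}"]) for i in range(size)]
```

The `prefix` plays the role of the primary key carried in the data access request: a reader that knows it can fetch every node and reassemble the tree.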
203. Determining a matching result of the text message in the search tree based on the AC automaton algorithm, determining the sensitive words in the text message according to the matching result, and if no sensitive word is detected, sending a collection message to the overdue user according to the communication information;
It will be appreciated that the AC automaton algorithm is a multi-pattern matching algorithm. It typically first constructs a dictionary tree (equivalent to the search tree here), then constructs the failure pointers, and finally runs the pattern matching procedure. When the server matches the text on the dictionary tree and the current node cannot be extended with the next character, matching continues from the node pointed to by the failure pointer of the current node.
The server matches each character of the text message against the nodes of the obtained search tree to find the sensitive words in the text message that completely match a path in the search tree, and takes these sensitive words as the matching result. When the server finds no sensitive word in the text message that completely matches the search tree, the text message input by the salesperson at the terminal complies with the service standard and contains no objectionable content (e.g., abuse, intimidation, etc.), and the text message is allowed to be sent to the overdue user.
204. Detecting the repayment state of the overdue user at a preset frequency within a preset time period after the text message is sent; if the repayment state of the overdue user reaches a preset minimum repayment state, terminating the debt collection process for the overdue user, and otherwise determining that the overdue user is a credit-loss user;
205. Sending a credit-loss collection service instruction to the salesperson, and receiving a voice message collected by the terminal, wherein the credit-loss collection service instruction instructs the salesperson to establish voice communication with the credit-loss user by using the terminal, and the voice message is recorded during the communication between the salesperson and the credit-loss user;
206. Recognizing the voice message based on a preset voice recognition algorithm to determine whether the voice message is malicious voice, and if the voice message is malicious voice, sending a warning to the terminal.
Steps 204 to 206 are similar to steps 103 to 105 described above and are not described in detail here.
This embodiment describes in detail the process of performing sensitive word recognition on the text message edited by the salesperson. The sensitive words are cached in the form of a dictionary tree, so that they can be matched quickly against the text message, which improves the efficiency of sensitive word recognition.
Referring to fig. 3, a third embodiment of the sensitive word-based accounting monitoring method according to the embodiment of the present invention includes:
301. Detecting overdue users in a preset user database, and acquiring communication information of the overdue users;
302. Receiving a text message input by a salesperson at a terminal, performing sensitive word detection on the text message, and if no sensitive word is detected, sending a collection message to the overdue user according to the communication information;
303. Detecting the repayment state of the overdue user at a preset frequency within a preset time period after the text message is sent; if the repayment state of the overdue user reaches a preset minimum repayment state, terminating the debt collection process for the overdue user, and otherwise determining that the overdue user is a credit-loss user;
304. Sending a credit-loss collection service instruction to the salesperson, and receiving a voice message collected by the terminal, wherein the credit-loss collection service instruction instructs the salesperson to establish voice communication with the credit-loss user by using the terminal, and the voice message is recorded during the communication between the salesperson and the credit-loss user;
Steps 301 to 304 are similar to steps 101 to 104 described above and are not described in detail here.
305. Generating an audio fingerprint to be identified according to the voice message to be identified;
It should be understood that an audio fingerprint is a digital feature extracted from the voice message by a specific algorithm and expressed in the form of an identifier. Audio fingerprints are often used to identify a sound sample among a massive number of samples, or to track and locate a sample in a database; the identification process is not affected by the storage format, encoding, bit rate, or compression of the audio itself. The server may extract the audio fingerprint of the voice message using an algorithm such as Shazam. First, a sliding window is applied to the voice message to divide it into data blocks. Second, a Fourier transform is performed on each data block. Then the spectrum is divided into bands, and the peak signal in each band is taken as the signature of that band, so as to construct the fingerprint of each frame. Finally, to facilitate searching, the fingerprint is usually used as the key of a hash table: the audio fingerprint is stored as the key of the audio search hash table, and the value it points to includes the time at which the fingerprint appears in the voice message and the ID of the audio segment, so that the corresponding information in the database can be matched with high accuracy.
Optionally, in an embodiment, the server generates the audio fingerprint by:
3051. Extracting multi-frame filter bank features from the voice message to be recognized;
It is understood that filter bank (FBANK) features are common features in speech recognition, and the steps to obtain them are typically pre-emphasis, framing, windowing, short-time Fourier transform, Mel filtering, mean removal, and the like. Specifically, the server frames and windows the voice message to generate a multi-frame time domain signal, and then converts each frame of the time domain signal into a corresponding frequency domain signal through a fast Fourier transform. The frequency domain signal is then input into a Mel filter bank, which outputs the sub-band energies of the frequency domain signal. Finally, the logarithm of each sub-band energy is taken to generate the FBANK features.
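The framing, windowing, transform, Mel filtering, and logarithm steps can be sketched in plain Python as follows. This is an illustrative sketch under several assumptions: a Hamming window, a direct (slow) discrete Fourier transform in place of an FFT library, triangular Mel filters, and small default sizes chosen only to keep the example fast. Pre-emphasis and mean removal are omitted.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def frame_and_window(samples, frame_len, hop):
    """Split the signal into overlapping frames and apply a Hamming window."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    return [[s * w for s, w in zip(samples[start:start + frame_len], window)]
            for start in range(0, len(samples) - frame_len + 1, hop)]


def power_spectrum(frame):
    """Power spectrum of one frame via a direct DFT (bins 0 .. n/2)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))) ** 2 / n
            for k in range(n // 2 + 1)]


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the Mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mels = [top * i / (num_filters + 1) for i in range(num_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mels]
    bank = []
    for i in range(1, num_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):
            row[k] = (k - left) / (center - left)
        for k in range(center, right):
            row[k] = (right - k) / (right - center)
        bank.append(row)
    return bank


def fbank(samples, sample_rate, frame_len=64, hop=32, num_filters=4):
    """Return one list of log sub-band energies per frame."""
    bank = mel_filterbank(num_filters, frame_len, sample_rate)
    features = []
    for frame in frame_and_window(samples, frame_len, hop):
        power = power_spectrum(frame)
        energies = [sum(w * p for w, p in zip(row, power)) for row in bank]
        features.append([math.log(max(e, 1e-10)) for e in energies])
    return features
```

Stacking the per-frame feature lists side by side gives the spectrogram used in the following steps, with frames along the horizontal axis and sub-bands along the vertical axis.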
3052. Concatenating the multi-frame filter bank features to generate a voice spectrogram to be recognized;
3053. Dividing the voice spectrogram to be recognized into a plurality of spectrogram regions in a designated distribution, wherein each spectrogram region has a horizontal axis direction and a vertical axis direction, the vertical axis direction includes a plurality of sub-bands, and each sub-band has a sub-band energy;
It is to be understood that the horizontal axis direction of each spectrogram region includes a plurality of frames of the speech signal, and the vertical axis direction includes a plurality of sub-bands. Optionally, each spectrogram region comprises 4 sub-bands over 4 frames of the speech signal. Each sub-band has a sub-band number; for example, the 4 sub-bands are numbered sub-band 0, sub-band 1, sub-band 2, and sub-band 3 in order.
The designated distribution includes a specified overlap ratio between spectrogram regions in the horizontal axis direction. Optionally, the specified overlap ratio is 50%.
3054. Calculating the average sub-band energy of each sub-band based on the horizontal axis direction of the spectrogram region;
it is understood that the server calculates an average value of the sub-band energy of each sub-band of each spectrogram region in the horizontal axis direction, which is an average sub-band energy of each sub-band.
3055. Binary-coding the spectrogram region according to the average sub-band energies to generate a binary code value;
It can be understood that the server first finds the maximum of the average sub-band energies, then looks up the number of the sub-band where the maximum is located, and finally binary-codes the spectrogram region according to that sub-band number to generate a binary code value. For example, if the maximum is located in sub-band 0, the binary code value is 00; if it is located in sub-band 1, the binary code value is 01; if it is located in sub-band 2, the binary code value is 10; and if it is located in sub-band 3, the binary code value is 11.
3056. Concatenating the binary code values of the spectrogram regions of the voice spectrogram to be recognized to obtain the audio fingerprint to be recognized.
It will be appreciated that the server concatenates the binary code values into a long string of binary data, which is the audio fingerprint to be identified. For example, the binary code values are 11, 10, 01, and 00 respectively, and the obtained audio fingerprint to be recognized is 11100100 by concatenating the binary code values.
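Steps 3053 to 3056 can be sketched as a single function. The sketch assumes the optional values given above (4 frames by 4 sub-bands per region, 50% horizontal overlap) and additionally assumes that regions tile the vertical axis without overlap, which the text does not specify; the function name is hypothetical.

```python
def audio_fingerprint(spectrogram, frames_per_region=4, bands_per_region=4,
                      overlap=0.5):
    """spectrogram: list of frames, each a list of sub-band energies.

    Each region contributes a 2-bit code naming the sub-band (0..3) with the
    largest average energy; the codes are concatenated into one bit string.
    """
    step = max(1, int(frames_per_region * (1 - overlap)))
    num_bands = len(spectrogram[0])
    codes = []
    for t in range(0, len(spectrogram) - frames_per_region + 1, step):
        for b in range(0, num_bands - bands_per_region + 1, bands_per_region):
            region = [frame[b:b + bands_per_region]
                      for frame in spectrogram[t:t + frames_per_region]]
            # Average each sub-band along the horizontal (time) axis.
            averages = [sum(col) / frames_per_region for col in zip(*region)]
            strongest = averages.index(max(averages))
            codes.append(format(strongest, "02b"))
    return "".join(codes)
```

For a 10-frame, 4-band spectrogram the regions start at frames 0, 2, 4, and 6; if their strongest sub-bands are 3, 2, 1, and 0 respectively, the function returns 11100100, the same value as in the example above.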
306. Matching the audio fingerprint against a preset audio hash table; if the audio fingerprint is successfully matched in the audio hash table, identifying the voice message as malicious voice, and otherwise extracting Mel-frequency cepstral coefficient features from the voice message;
It should be understood that the preset audio hash table includes a plurality of audio hash values, and the audio hash values include audio fingerprints of malicious voice. Malicious voice refers to voice containing content such as harassment, abuse, or fraud that poses a personal threat to the credit-loss user. Mel-frequency cepstral coefficients (MFCC) and filter bank (FBANK) features are both common features in speech recognition. The steps for obtaining FBANK features are usually pre-emphasis, framing, windowing, short-time Fourier transform (STFT), Mel filtering, mean removal, and the like; performing a discrete cosine transform (DCT) on the FBANK features then yields the MFCC features.
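Step 306 can be sketched as a hash-table lookup with a fallback to MFCC extraction. The table contents and function names are hypothetical, the lookup is an exact match (a practical system might tolerate some differing bits), and the MFCC step is reduced to an unnormalized DCT-II over each frame of log filter bank energies.

```python
import math

# Hypothetical table mapping fingerprints of known malicious clips to a label.
MALICIOUS_FINGERPRINTS = {"11100100": "known-abusive-clip"}


def dct_ii(values, num_coeffs):
    """Unnormalized DCT-II of one frame of log filter bank energies."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                for i, v in enumerate(values))
            for k in range(num_coeffs)]


def classify_or_extract(fingerprint, fbank_frames, num_coeffs=3):
    """Return a malicious verdict on a fingerprint hit; otherwise return the
    MFCC features for the later keyword analysis."""
    if fingerprint in MALICIOUS_FINGERPRINTS:
        return {"malicious": True, "matched": MALICIOUS_FINGERPRINTS[fingerprint]}
    mfcc = [dct_ii(frame, num_coeffs) for frame in fbank_frames]
    return {"malicious": None, "mfcc": mfcc}
```

A `malicious` value of `None` means the fingerprint stage was inconclusive and the decision is deferred to the keyword analysis of step 307.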
307. Performing keyword analysis on the Mel-frequency cepstral coefficient features and generating a retrieval score; if the retrieval score is greater than a preset threshold, identifying the voice message as malicious voice and sending a warning to the terminal.
It should be understood that, in this embodiment, the server first calls a preset acoustic model to process the MFCC features, obtaining a plurality of candidate word lattices and an acoustic score for each candidate word lattice. Second, it calls a preset language model to process the candidate word lattices, obtaining a plurality of keywords and a language score for each keyword. Then, based on the Viterbi algorithm, it combines the acoustic score of each candidate word lattice with the language score of each keyword to obtain an optimal score. Finally, it determines the target keyword corresponding to the optimal score, calls a preset dynamic programming algorithm model, and searches a preset malicious keyword library for the target keyword to obtain the retrieval score of the target keyword; if the retrieval score is greater than a preset threshold, the voice message is identified as malicious voice.
Specifically, the server inputs the MFCC features into the acoustic model, which outputs a plurality of candidate word lattices and an acoustic score for each candidate word lattice. Optionally, a specified number of candidate word lattices with the smallest acoustic scores are selected from the candidate word lattices, for example the 20 candidate word lattices with the smallest acoustic scores out of 100. Second, the server inputs the candidate word lattices, or the selected candidate word lattices, into the language model, which outputs a plurality of keywords and a language score for each keyword. Then the server inputs the acoustic score of each candidate word lattice and the language score of each keyword into the Viterbi algorithm, which outputs the optimal score. The optimal score corresponds to a language score, which in turn corresponds to a keyword. In this embodiment, the server adds the acoustic score and the language score to obtain a summed score, and the maximum summed score is taken as the optimal score.
Furthermore, the server looks up the language score corresponding to the optimal score, and then the keyword corresponding to that language score. The server searches the preset malicious keyword library for the keyword; this search is executed in the dynamic programming algorithm model. The malicious keyword library includes a plurality of sensitive words. Optionally, the preset threshold is 30%: if the retrieval score is greater than the preset threshold, the voice message is malicious voice; if the retrieval score is less than or equal to the preset threshold, the voice message is normal voice.
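The score combination and the threshold test can be sketched as follows. The acoustic and language models are not modeled; their outputs are passed in as plain tuples. The embodiment does not specify the dynamic programming retrieval, so this sketch substitutes a normalized edit-distance similarity (itself a dynamic programming algorithm) as an assumed stand-in. All function names are hypothetical.

```python
def best_keyword(candidates):
    """candidates: list of (keyword, acoustic_score, language_score).
    The optimal score is the largest sum of the two scores."""
    return max(candidates, key=lambda c: c[1] + c[2])[0]


def edit_distance(a, b):
    """Levenshtein distance computed row by row (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def retrieval_score(keyword, malicious_keywords):
    """Assumed score: best similarity in [0, 1] against the keyword library."""
    best = 0.0
    for entry in malicious_keywords:
        longest = max(len(keyword), len(entry)) or 1
        best = max(best, 1.0 - edit_distance(keyword, entry) / longest)
    return best


def is_malicious(candidates, malicious_keywords, threshold=0.30):
    """Malicious only if the retrieval score is strictly above the threshold."""
    target = best_keyword(candidates)
    return retrieval_score(target, malicious_keywords) > threshold
```

The default threshold of 0.30 mirrors the optional 30% value above; the comparison is strict, so a score equal to the threshold is treated as normal voice.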
This embodiment describes in detail the process of identifying whether the voice exchanged between the salesperson and the borrower is malicious voice. An audio fingerprint is first generated from the original voice and matched against the audio hash table. If the match succeeds, the voice is malicious voice. If the match fails, MFCC features are extracted from the original voice and analyzed to obtain a retrieval score, and the score is compared with a preset threshold to determine whether the voice is malicious voice.
The sensitive word-based accounting monitoring method in the embodiment of the present invention has been described above. The sensitive word-based accounting monitoring apparatus in the embodiment of the present invention is described below with reference to fig. 4. An embodiment of the apparatus includes:
a user detection module 401, configured to detect overdue users in a preset user database and acquire communication information of the overdue users;
a text detection module 402, configured to receive a text message input by a salesperson at a terminal, perform sensitive word detection on the text message, and send a collection message to the overdue user according to the communication information if no sensitive word is detected;
a compensation detection module 403, configured to detect the repayment state of the overdue user at a preset frequency within a preset time period after the text message is sent, terminate the debt collection process for the overdue user if the repayment state of the overdue user reaches a preset minimum repayment state, and otherwise determine that the overdue user is a credit-loss user;
a voice collecting module 404, configured to send a credit-loss collection service instruction to the salesperson and receive a voice message collected by the terminal, where the credit-loss collection service instruction instructs the salesperson to establish voice communication with the credit-loss user by using the terminal, and the voice message is recorded during the communication between the salesperson and the credit-loss user;
a voice recognition module 405, configured to recognize the voice message based on a preset voice recognition algorithm to determine whether the voice message is malicious voice, and send a warning to the terminal if the voice message is malicious voice.
In this embodiment, sensitive word recognition is performed on the text message sent by the salesperson to the borrower, and malicious voice recognition is performed on the voice communication with the borrower, so that the whole business process is strictly controlled and the quality of the monitoring service is improved.
Referring to fig. 5, another embodiment of the sensitive word-based accounting monitoring apparatus according to the embodiment of the present invention includes:
a user detection module 401, configured to detect overdue users in a preset user database and acquire communication information of the overdue users;
a text detection module 402, configured to receive a text message input by a salesperson at a terminal, perform sensitive word detection on the text message, and send a collection message to the overdue user according to the communication information if no sensitive word is detected;
a compensation detection module 403, configured to detect the repayment state of the overdue user at a preset frequency within a preset time period after the text message is sent, terminate the debt collection process for the overdue user if the repayment state of the overdue user reaches a preset minimum repayment state, and otherwise determine that the overdue user is a credit-loss user;
a voice collecting module 404, configured to send a credit-loss collection service instruction to the salesperson and receive a voice message collected by the terminal, where the credit-loss collection service instruction instructs the salesperson to establish voice communication with the credit-loss user by using the terminal, and the voice message is recorded during the communication between the salesperson and the credit-loss user;
a voice recognition module 405, configured to recognize the voice message based on a preset voice recognition algorithm to determine whether the voice message is malicious voice, and send a warning to the terminal if the voice message is malicious voice.
The text detection module 402 specifically includes:
a tree construction unit 4021, configured to acquire a sensitive word lexicon constructed by a user, and construct a search tree corresponding to the sensitive word lexicon based on the AC automaton algorithm;
a tree cache unit 4022, configured to synchronize the search tree to the distributed cache server, so that the search tree is stored in a distributed manner across the cache nodes of the distributed cache server;
a tree acquisition unit 4023, configured to receive a text message input by a salesperson at a terminal, send a data access request to a preset distributed cache server, and receive a search tree returned by a cache node in the distributed cache server, where the search tree corresponds to a primary key included in the data access request;
a tree matching unit 4024, configured to determine a matching result of the text message in the search tree based on the AC automaton algorithm, determine the sensitive words in the text message according to the matching result, and send a collection message to the overdue user according to the communication information if no sensitive word is detected.
The voice recognition module 405 specifically includes:
the fingerprint generating unit 4051 is configured to generate an audio fingerprint to be recognized according to the voice message to be recognized;
a hash matching unit 4052, configured to match the audio fingerprint with a preset audio hash table, identify the voice message as malicious voice if the audio fingerprint is successfully matched with the audio hash table, and otherwise extract mel-frequency cepstrum coefficient features from the voice message;
a score calculating unit 4053, configured to perform keyword analysis on the mel-frequency cepstrum coefficient features and generate a retrieval score;
an early warning notification unit 4054, configured to identify the voice message as malicious voice if the retrieval score is greater than a preset threshold, and send a warning to the terminal.
Wherein, the fingerprint generating unit 4051 is specifically configured to:
extracting multi-frame filter bank features from the voice message to be recognized;
concatenating the multi-frame filter bank features to generate a voice spectrogram to be recognized;
dividing the voice spectrogram to be recognized into a plurality of spectrogram regions in a designated distribution, wherein each spectrogram region has a horizontal axis direction and a vertical axis direction, the vertical axis direction includes a plurality of sub-bands, and each sub-band has a sub-band energy;
calculating the average sub-band energy of each sub-band based on the horizontal axis direction of the spectrogram region;
according to the average sub-band energy, binary coding is carried out on the spectrogram region to obtain a binary coding value of each spectrogram region;
and splicing the binary code values of each spectrogram region of the voice spectrogram to be identified to obtain the audio fingerprint to be identified.
Wherein, the score calculating unit 4053 is specifically configured to:
calling a preset acoustic model to process the Mel cepstrum coefficient characteristics to obtain a plurality of candidate word lattices and acoustic scores corresponding to the candidate word lattices;
calling a preset language model to process the candidate word lattices to obtain a plurality of keywords and a language score corresponding to each keyword;
calculating the acoustic score corresponding to each candidate word lattice and the language score corresponding to each keyword based on a Viterbi algorithm to obtain an optimal score;
and determining a target keyword corresponding to the optimal score, calling a preset dynamic programming algorithm model, and retrieving in a preset malicious keyword library according to the target keyword to obtain a retrieval score corresponding to the target keyword.
In this embodiment of the invention, the modular design lets the hardware of each part of the sensitive word-based accounting monitoring device concentrate on a single function, so that hardware performance is used to the fullest; the modular design also reduces the coupling between the modules of the device, which makes maintenance easier.
Figs. 4 and 5 above describe the sensitive word-based accounting monitoring device in the embodiment of the present invention in detail from the perspective of modular functional entities; the computer device in the embodiment of the present invention is described in detail below from the perspective of hardware processing.
Fig. 6 is a schematic structural diagram of a computer device 600 according to an embodiment of the present invention. The computer device 600 may vary considerably depending on its configuration or performance, and may include one or more central processing units (CPUs) 610 (e.g., one or more processors), a memory 620, and one or more storage media 630 (e.g., one or more mass storage devices) that store applications 633 or data 632. The memory 620 and the storage medium 630 may be transient or persistent storage. The program stored in the storage medium 630 may include one or more modules (not shown), each of which may include a sequence of instructions for operating on the computer device 600. Further, the processor 610 may be configured to communicate with the storage medium 630 and to execute, on the computer device 600, the series of instruction operations stored in the storage medium 630.
The computer device 600 may also include one or more power supplies 640, one or more wired or wireless network interfaces 650, one or more input-output interfaces 660, and/or one or more operating systems 631, such as Windows Server, Mac OS X, Unix, Linux, or FreeBSD. Those skilled in the art will appreciate that the configuration illustrated in Fig. 6 does not limit the computer device, which may include more or fewer components than those illustrated, combine some components, or arrange the components differently.
The present invention also provides a computer device, which includes a memory and a processor, where the memory stores computer readable instructions, and the computer readable instructions, when executed by the processor, cause the processor to execute the steps of the sensitive word-based accounting monitoring method in the foregoing embodiments.
The present invention also provides a computer-readable storage medium, which may be a non-volatile or a volatile computer-readable storage medium, having instructions stored therein which, when run on a computer, cause the computer to perform the steps of the sensitive word-based accounting monitoring method.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A sensitive word-based accounting monitoring method, characterized in that the method comprises the following steps:
detecting overdue users in a preset user database, and acquiring communication information of the overdue users;
receiving a text message input by a salesman at a terminal, performing sensitive word detection on the text message, and if no sensitive word is detected, sending a collection message to the overdue user according to the communication information;
detecting the repayment state of the overdue user at a preset frequency within a preset time period after the text message is sent; if the repayment state of the overdue user reaches a preset minimum repayment state, terminating the collection process for the overdue user, and otherwise determining that the overdue user is a lost user;
sending a lost-user collection service instruction to the salesman, and receiving a voice message collected by the terminal, wherein the lost-user collection service instruction instructs the salesman to establish voice communication with the lost user by using the terminal, and the voice message is recorded during the communication between the salesman and the lost user;
and recognizing the voice message based on a preset voice recognition algorithm to judge whether the voice message is malicious voice, and if the voice message is malicious voice, sending a warning to the terminal.
2. The sensitive word-based accounting monitoring method of claim 1, wherein the receiving a text message input by a salesman at a terminal, performing sensitive word detection on the text message, and if no sensitive word is detected, sending a collection message to the overdue user according to the communication information comprises:
receiving a text message input by the salesman at the terminal, sending a data access request to a preset distributed cache server, and receiving a search tree returned by a cache node in the distributed cache server, wherein the search tree corresponds to a primary key contained in the data access request;
and determining a matching result of the text message in the search tree based on an Aho-Corasick (AC) automaton algorithm, determining sensitive words in the text message according to the matching result, and sending a collection message to the overdue user according to the communication information if no sensitive word is detected.
3. The sensitive word-based accounting monitoring method as claimed in claim 2, wherein, before the receiving the text message input by the salesman at the terminal, sending the data access request to the preset distributed cache server, and receiving the search tree returned by the cache node in the distributed cache server, the method further comprises:
acquiring a sensitive word library constructed by a user, and constructing a search tree corresponding to the sensitive word library based on an Aho-Corasick (AC) automaton algorithm;
synchronizing the search tree into the distributed cache server to store the search tree in a distributed manner in each cache node of the distributed cache server.
4. The sensitive word-based accounting monitoring method according to any one of claims 1-3, wherein the recognizing the voice message based on a preset voice recognition algorithm to determine whether the voice message is malicious voice, and if the voice message is malicious voice, sending an alert to the terminal comprises:
generating an audio fingerprint to be identified according to the voice message to be identified;
matching the audio fingerprint with a preset audio hash table; if the audio fingerprint is successfully matched with the audio hash table, identifying the voice message as malicious voice, and otherwise extracting Mel cepstral coefficient features from the voice message;
performing keyword analysis on the Mel cepstral coefficient features, and generating a retrieval score;
and if the retrieval score is larger than a preset threshold value, identifying the voice message as malicious voice, and sending a warning to the terminal.
5. The sensitive word-based accounting monitoring method as claimed in claim 4, wherein the generating an audio fingerprint to be recognized according to the voice message to be recognized comprises:
extracting multi-frame filter bank features from the voice message to be recognized;
concatenating the multi-frame filter bank features to generate a voice spectrogram to be recognized;
dividing the voice spectrogram to be recognized into a plurality of spectrogram regions based on a preset dividing rule, and calculating a binary code value of each spectrogram region;
and splicing the binary code values of the spectrogram regions of the voice spectrogram to be recognized to obtain the audio fingerprint to be recognized.
6. The sensitive word-based accounting monitoring method as claimed in claim 5, wherein the dividing the voice spectrogram to be recognized into a plurality of spectrogram regions based on a preset dividing rule, and the calculating the binary code value of each spectrogram region comprises:
dividing the voice spectrogram to be recognized into a plurality of spectrogram regions in designated distribution, wherein each spectrogram region comprises a horizontal axis direction and a vertical axis direction, each vertical axis direction comprises a plurality of sub-bands, and each sub-band has sub-band energy;
calculating the average sub-band energy of each sub-band based on the horizontal axis direction of the spectrogram region;
and carrying out binary coding on the spectrogram region according to the average sub-band energy to obtain a binary coding value of each spectrogram region.
7. The sensitive word-based accounting monitoring method of claim 4, wherein the performing keyword analysis on the Mel cepstral coefficient features and generating retrieval scores comprises:
calling a preset acoustic model to process the Mel cepstral coefficient features to obtain a plurality of candidate word lattices and an acoustic score corresponding to each candidate word lattice;
calling a preset language model to process the candidate word lattices to obtain a plurality of keywords and a language score corresponding to each keyword;
combining the acoustic score corresponding to each candidate word lattice with the language score corresponding to each keyword based on a Viterbi algorithm to obtain an optimal score;
and determining a target keyword corresponding to the optimal score, calling a preset dynamic programming algorithm model, and retrieving in a preset malicious keyword library according to the target keyword to obtain a retrieval score corresponding to the target keyword.
8. A sensitive word-based accounting monitoring device, characterized in that the device comprises:
a user detection module, configured to detect overdue users in a preset user database and acquire communication information of the overdue users;
a text detection module, configured to receive a text message input by a salesman at a terminal, perform sensitive word detection on the text message, and send a collection message to the overdue user according to the communication information if no sensitive word is detected;
a compensation detection module, configured to detect the repayment state of the overdue user at a preset frequency within a preset time period after the text message is sent, to terminate the collection process for the overdue user if the repayment state of the overdue user reaches a preset minimum repayment state, and otherwise to determine that the overdue user is a lost user;
a voice acquisition module, configured to send a lost-user collection service instruction to the salesman and receive a voice message collected by the terminal, wherein the lost-user collection service instruction instructs the salesman to establish voice communication with the lost user by using the terminal, and the voice message is recorded during the communication between the salesman and the lost user;
and a voice recognition module, configured to recognize the voice message based on a preset voice recognition algorithm to judge whether the voice message is malicious voice, and to send a warning to the terminal if the voice message is malicious voice.
9. A computer device, characterized in that the computer device comprises: a memory and at least one processor, the memory having instructions stored therein;
the at least one processor invokes the instructions in the memory to cause the computer device to perform the sensitive word-based accounting monitoring method of any one of claims 1-7.
10. A computer-readable storage medium having instructions stored thereon, wherein the instructions, when executed by a processor, implement the sensitive word-based accounting monitoring method as claimed in any one of claims 1-7.
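For illustration only, the sensitive-word matching recited in claims 2 and 3 can be sketched as a small Aho-Corasick automaton in Python. The class name, method names, and example words are hypothetical and do not come from the patent; the distributed cache that stores the search tree is omitted, and the automaton is simply built in memory.

```python
from collections import deque
from typing import Dict, Iterable, List


class AhoCorasick:
    """Search tree (trie) with failure links for multi-pattern matching."""

    def __init__(self, words: Iterable[str]) -> None:
        self.goto: List[Dict[str, int]] = [{}]  # state -> {char: next state}
        self.fail: List[int] = [0]              # state -> fallback state
        self.out: List[List[str]] = [[]]        # state -> words ending here
        for word in words:
            self._insert(word)
        self._build_failure_links()

    def _insert(self, word: str) -> None:
        state = 0
        for ch in word:
            if ch not in self.goto[state]:
                self.goto.append({})
                self.fail.append(0)
                self.out.append([])
                self.goto[state][ch] = len(self.goto) - 1
            state = self.goto[state][ch]
        self.out[state].append(word)

    def _build_failure_links(self) -> None:
        # Breadth-first traversal; depth-1 states keep the root as fallback.
        queue = deque(self.goto[0].values())
        while queue:
            state = queue.popleft()
            for ch, nxt in self.goto[state].items():
                queue.append(nxt)
                f = self.fail[state]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[nxt] = self.goto[f].get(ch, 0)
                # Inherit the words that end at the fallback state.
                self.out[nxt] = self.out[nxt] + self.out[self.fail[nxt]]

    def find(self, text: str) -> List[str]:
        """Return every library word found in text, in order of match end."""
        state, hits = 0, []
        for ch in text:
            while state and ch not in self.goto[state]:
                state = self.fail[state]
            state = self.goto[state].get(ch, 0)
            hits.extend(self.out[state])
        return hits
```

In this sketch, a text message would be forwarded as a collection message only when `find` returns an empty list. Because the automaton scans the text once regardless of the size of the word library, it suits a library that grows over time.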

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210289706.9A CN114610840A (en) 2022-03-23 2022-03-23 Sensitive word-based accounting monitoring method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114610840A (en) 2022-06-10

Family

ID=81864534

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210289706.9A Pending CN114610840A (en) 2022-03-23 2022-03-23 Sensitive word-based accounting monitoring method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114610840A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108965613A (en) * 2018-10-23 2018-12-07 长沙裕邦软件开发有限公司 Method, storage medium and application server based on the monitoring of collection system quality
CN110782335A (en) * 2019-09-19 2020-02-11 平安科技(深圳)有限公司 Method, device and storage medium for processing credit data based on artificial intelligence
CN113112992A (en) * 2019-12-24 2021-07-13 ***通信集团有限公司 Voice recognition method and device, storage medium and server
CN113407662A (en) * 2021-08-19 2021-09-17 深圳市明源云客电子商务有限公司 Sensitive word recognition method, system and computer readable storage medium
CN113903363A (en) * 2021-09-29 2022-01-07 平安银行股份有限公司 Violation detection method, device, equipment and medium based on artificial intelligence

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115208942A (en) * 2022-07-19 2022-10-18 中国银行股份有限公司 Bank business information processing method, device and business system
CN116822496A (en) * 2023-06-02 2023-09-29 厦门她趣信息技术有限公司 Social information violation detection method, system and storage medium
CN116822496B (en) * 2023-06-02 2024-04-19 厦门她趣信息技术有限公司 Social information violation detection method, system and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination