CN115239215A - Enterprise risk identification method and system based on deep anomaly detection

Info

Publication number
CN115239215A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211161439.3A
Other languages
Chinese (zh)
Other versions
CN115239215B (en)
Inventor
李倩
江鑫雨
郭红钰
刘玉龙
郑扬飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETC 15 Research Institute
Original Assignee
CETC 15 Research Institute
Application filed by CETC 15 Research Institute
Priority to CN202211161439.3A
Publication of CN115239215A
Application granted
Publication of CN115239215B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis

Abstract

The application provides an enterprise risk identification method and system based on deep anomaly detection. The method comprises: collecting basic information of an enterprise, processing the basic information into index items, and integrating a plurality of index items to construct a structured enterprise information table; learning the internal distribution of each index item in the table and constructing an anomaly detection index for abnormality within index items; learning the correlations among the index items in the table and constructing an anomaly detection index for abnormality among index items; learning the structural semantic information of the enterprise as a whole, which all of the enterprise's index items jointly form and reflect, and constructing an anomaly detection index for abnormality of structural semantic information; fusing the indexes of the three levels to construct a comprehensive anomaly detection index; and obtaining an enterprise anomaly score from the comprehensive anomaly detection index. The advantage of the invention is that it constructs a hierarchical anomaly detection index system and thereby achieves comprehensive, multi-level enterprise anomaly detection.

Description

Enterprise risk identification method and system based on deep anomaly detection
Technical Field
The application belongs to the field of enterprise risk identification research, and particularly relates to an enterprise risk identification method and system based on deep anomaly detection.
Background
At present, enterprise risk identification is applied mainly in business areas of financial institutions such as lending and credit granting. Generally speaking, there are two approaches. The first identifies risk from enterprise transaction information; it focuses mainly on detecting improper operations according to fixed rules, so the risks it identifies are of a single type and its limitations are significant. The second identifies risk from the various operating information of the enterprise; however, such models are mostly applied to a specific business field, take a single application form, and have difficulty distinguishing and identifying risks by index, so they do not provide comprehensive, multi-level enterprise anomaly detection.
Anomaly detection algorithms aim to find patterns in data that do not conform to expected behavior. In bank risk control, for example, behaviors such as money laundering, credit card fraud, and fraudulent enterprise loans are considered anomalies; in medical anomaly detection, rare diseases, fraudulent medical visits, medical accidents, and the like are considered anomalies. Anomaly detection is also widely applied to network intrusion detection, fault detection, and video surveillance. In enterprise risk detection, anomaly detection can provide awareness and early warning of abnormal enterprise operations. However, applying anomaly detection algorithms to enterprise risk detection still faces the following technical problems. On the one hand, converting complex tabular enterprise data into information usable by an anomaly detection algorithm is difficult. On the other hand, most anomaly detection methods focus on structured information (such as anomaly detection in images) and ignore the influence of unstructured information on semantics, whereas in enterprise data a single index (such as the asset-liability ratio) often plays a decisive role in the overall semantics (such as the enterprise's business condition).
To solve the above problems, the application provides an enterprise risk identification method and system based on deep anomaly detection.
Disclosure of Invention
To overcome the defects of the prior art, the application provides an enterprise risk identification method and system based on deep anomaly detection.
A first aspect of the invention discloses an enterprise risk identification method based on deep anomaly detection. The method comprises the following steps:
S1, collecting basic information of an enterprise, processing the basic information into index items, and integrating a plurality of index items to construct a structured enterprise information table;
S2, learning the internal distribution of each index item in the structured enterprise information table, and constructing an anomaly detection index for abnormality within index items;
S3, learning the correlations among the index items in the structured enterprise information table, and constructing an anomaly detection index for abnormality among index items;
S4, learning the structural semantic information of the enterprise as a whole, which all index items of the enterprise jointly form and reflect, and constructing an anomaly detection index for abnormality of structural semantic information;
S5, fusing the anomaly detection index for abnormality within index items, the anomaly detection index for abnormality among index items, and the anomaly detection index for abnormality of structural semantic information to construct a comprehensive anomaly detection index;
S6, obtaining an enterprise anomaly score from the comprehensive anomaly detection index.
According to the method of the first aspect of the present invention, step S6 further comprises: determining the source of enterprise risk according to the anomaly detection index for abnormality within index items, the anomaly detection index for abnormality among index items, and the anomaly detection index for abnormality of structural semantic information, wherein the sources of enterprise risk include abnormality of a single index item, abnormality of the association among index items, and abnormality in the overall presentation of enterprise data.
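As a rough illustration of how the three indexes might be used to attribute a risk source, the sketch below flags every level whose index exceeds a threshold. The function name `risk_sources`, the dictionary keys, the threshold values, and the thresholding rule itself are assumptions made for illustration; the text does not specify how the attribution is computed.

```python
from typing import Dict, List

# Labels for the three risk sources named in the text (the keys are hypothetical).
SOURCE_LABELS = {
    "intra": "single index item abnormality",
    "inter": "abnormal association among index items",
    "semantic": "abnormal overall presentation of enterprise data",
}


def risk_sources(indexes: Dict[str, float], thresholds: Dict[str, float]) -> List[str]:
    """Return the risk sources whose anomaly detection index exceeds its threshold.

    Assumption: a simple per-level threshold decides whether a level is a source.
    """
    return [
        SOURCE_LABELS[level]
        for level in ("intra", "inter", "semantic")
        if indexes[level] > thresholds[level]
    ]


# Illustrative values only.
thresholds = {"intra": 0.5, "inter": 0.5, "semantic": 0.5}
example = risk_sources({"intra": 0.9, "inter": 0.2, "semantic": 0.7}, thresholds)
print(example)
```

In practice the thresholds would have to be calibrated, for example on enterprises known to be normal.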
According to the method of the first aspect of the present invention, in step S1, the method for processing the basic information into index items and integrating a plurality of index items to construct the structured enterprise information table includes:
arranging the collected basic information into a formal representation as a triple $\langle E, A, F \rangle$, wherein $E$ denotes the set of $n$ enterprises and $e_i$ denotes the $i$-th enterprise; $A$ denotes the set of $m$ index items; $F$ is a set of functions mapping an enterprise to the index value corresponding to each index item, the function $f_j : E \to V_j$ assigning to an enterprise its specific index value, where $V_j$ denotes the value range of the $j$-th index item;
on the basis of the triple representation, constructing the enterprise information table as follows: the enterprises are arranged vertically and the index items horizontally to form a table; the position in the table corresponding to enterprise $e_i$ and index item $a_j$ is assigned the value $v_{ij}$; in the enterprise information table, the index values of the $i$-th enterprise are expressed in vector form:
$\mathbf{x}_i = (v_{i1}, v_{i2}, \ldots, v_{im})$
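The table construction described above can be sketched as follows: one row per enterprise, one column per index item, and each cell filled by the function that maps the enterprise to that item's value. The record fields, the index item names, and the helper `build_information_table` are hypothetical and chosen only for illustration.

```python
from typing import Any, Callable, Dict, List, Sequence

Enterprise = Dict[str, Any]                    # hypothetical raw record of one enterprise
ValueFunction = Callable[[Enterprise], Any]    # maps an enterprise to one index value


def build_information_table(
    enterprises: Sequence[Enterprise],
    index_items: Sequence[str],
    functions: Dict[str, ValueFunction],
) -> List[List[Any]]:
    """Rows are enterprises, columns are index items, cells are index values."""
    return [[functions[item](e) for item in index_items] for e in enterprises]


# Illustrative data only.
enterprises = [
    {"name": "Enterprise A", "assets": 200.0, "liabilities": 50.0, "lawsuits": 0},
    {"name": "Enterprise B", "assets": 100.0, "liabilities": 80.0, "lawsuits": 3},
]
index_items = ["asset_liability_ratio", "has_lawsuits"]
functions = {
    "asset_liability_ratio": lambda e: e["liabilities"] / e["assets"],
    "has_lawsuits": lambda e: e["lawsuits"] > 0,
}

table = build_information_table(enterprises, index_items, functions)
for row in table:
    print(row)  # each row is the index-value vector of one enterprise
```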
According to the method of the first aspect of the present invention, in step S2, the method for learning the internal distribution of each index item in the structured enterprise information table and constructing the anomaly detection index for abnormality within index items includes:
using frequency statistics to reflect the distribution of each index value, so that learning the distribution within an index item is converted into counting the frequencies of its index values;
for each index item $a_j$, learning the distribution function $p_j$ of its index values, which maps the index value $v_{ij}$ of index item $a_j$ to its frequency of occurrence $p_j(v_{ij})$;
constructing the anomaly detection index for abnormality within index items as:
[Formula for $s_i^{\mathrm{intra}}$: given as an image in the source; not reproduced]
wherein $s_i^{\mathrm{intra}}$ denotes the anomaly detection index for abnormality within the index items of the $i$-th enterprise, and $m$ denotes the total number of index items.
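A minimal sketch of the frequency-statistics step, assuming discrete index values (continuous values would first need binning). Because the formula for the index is not available in the text, the score below, the average of one minus the relative frequency over all index items, is an assumption and not the patented formula; the function names are likewise hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def fit_frequency_functions(table: Sequence[Sequence[Hashable]]) -> List[Dict[Hashable, float]]:
    """For each index item (column), map every observed value to its relative frequency."""
    n = len(table)
    m = len(table[0])
    return [
        {value: count / n for value, count in Counter(row[j] for row in table).items()}
        for j in range(m)
    ]


def intra_item_index(row: Sequence[Hashable], freqs: List[Dict[Hashable, float]]) -> float:
    """Assumed score: mean of (1 - frequency) over the index items.

    Rare values push the score toward 1; unseen values count as frequency 0.
    """
    return sum(1.0 - f.get(value, 0.0) for value, f in zip(row, freqs)) / len(row)


# Illustrative table: three similar enterprises and one unusual enterprise.
table = [("A", "low"), ("A", "low"), ("A", "low"), ("B", "high")]
freqs = fit_frequency_functions(table)
scores = [intra_item_index(row, freqs) for row in table]
print(scores)
```

The enterprise whose values are rare in every column receives the highest score, which matches the intent of reflecting each value's distribution through its frequency.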
According to the method of the first aspect of the present invention, in step S3, the method for learning the correlations among the index items in the structured enterprise information table and constructing the anomaly detection index for abnormality among index items includes:
using the mutual information between a single index item and the other index items to reflect the co-occurrence pattern among index items, so that learning the correlations among index items is converted into learning a mutual-information measurement function among index items;
for each index item $a_j$, learning the mutual-information measurement function $g_j$ between that index item and the other index items, which maps the index value $v_{ij}$ of index item $a_j$, together with the vector $\mathbf{v}_{i,-j}$ formed by the corresponding index values of the other index items, to the magnitude of the mutual information $g_j(v_{ij}, \mathbf{v}_{i,-j})$ between $v_{ij}$ and those corresponding index values; wherein $V_{-j}$ denotes the space of the vectors formed by the index items other than $a_j$;
constructing the anomaly detection index for abnormality among index items as:
[Formula for $s_i^{\mathrm{inter}}$: given as an image in the source; not reproduced]
wherein $s_i^{\mathrm{inter}}$ denotes the anomaly detection index for abnormality among the index items of the $i$-th enterprise, and $m$ denotes the total number of index items.
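To make the co-occurrence idea concrete, the sketch below swaps in a simple counting estimate: pairwise pointwise mutual information (PMI) between the values an enterprise takes in two index items. This is a stand-in for the learned mutual-information measurement function, not the method of the text; the scoring rule (negative mean PMI over all ordered pairs of index items) and the name `inter_item_scores` are assumptions.

```python
import math
from collections import Counter
from typing import Hashable, List, Sequence


def inter_item_scores(table: Sequence[Sequence[Hashable]]) -> List[float]:
    """Score each row by the negative mean pairwise PMI of its index values.

    Assumes discrete values, at least two index items, and that only rows of the
    fitted table are scored (so every joint count is at least 1).
    """
    n = len(table)
    m = len(table[0])
    marginal = [Counter(row[j] for row in table) for j in range(m)]
    joint = {
        (j, k): Counter((row[j], row[k]) for row in table)
        for j in range(m)
        for k in range(m)
        if j != k
    }
    scores = []
    for row in table:
        total = 0.0
        for (j, k), counts in joint.items():
            p_joint = counts[(row[j], row[k])] / n
            p_j = marginal[j][row[j]] / n
            p_k = marginal[k][row[k]] / n
            total += math.log(p_joint / (p_j * p_k))
        # Values that rarely occur together have low PMI, hence a higher score.
        scores.append(-total / len(joint))
    return scores


# Illustrative table: "B" usually goes with "y", so ("B", "x") is the odd combination.
table = [("A", "x"), ("A", "x"), ("B", "y"), ("B", "x")]
scores = inter_item_scores(table)
print(scores)
```

Note that each value in the last row is common on its own; only the combination is unusual, which is exactly what an among-items index is meant to catch.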
According to the method of the first aspect of the present invention, in step S4, the method for learning the structural semantic information of the enterprise as a whole, which all index items of the enterprise jointly form and reflect, and constructing the anomaly detection index for abnormality of structural semantic information includes:
first, constructing a number of fourth deep neural networks, which transform the normalized result $\tilde{\mathbf{x}}_i$ of the index-value vector $\mathbf{x}_i$ into the corresponding hidden spaces, wherein normalization scales the index values to between -1 and 1; then, constructing a third deep neural network, which maps $\tilde{\mathbf{x}}_i$ and its hidden-space transformation to $d$-dimensional vectors in a depth space, denoted $\mathbf{z}_i$ and $\mathbf{z}'_i$ for brevity; in the depth space, defining the similarity measurement function by an exponential cosine similarity and computing the similarity of $\mathbf{z}_i$ and $\mathbf{z}'_i$;
constructing the anomaly detection index for abnormality of structural semantic information as:
[Formula for $s_i^{\mathrm{sem}}$: given as an image in the source; not reproduced]
wherein $s_i^{\mathrm{sem}}$ denotes the anomaly detection index for abnormality of the structural semantic information of the $i$-th enterprise.
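The similarity measure used in the depth space can be illustrated in isolation. The sketch below takes "exponential cosine similarity" to mean the exponential of the cosine similarity between two vectors, which is one common reading; the `temperature` parameter and the function name are assumptions, and the neural networks that would produce the vectors are not reproduced here.

```python
import math
from typing import Sequence


def exp_cosine_similarity(
    u: Sequence[float], v: Sequence[float], temperature: float = 1.0
) -> float:
    """exp(cos(u, v) / temperature) for two non-zero vectors of equal length.

    The temperature is a hypothetical scaling knob, not taken from the text.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return math.exp(dot / (norm_u * norm_v) / temperature)


# Illustrative vectors standing in for two embeddings in the depth space.
same_direction = exp_cosine_similarity([1.0, 2.0], [2.0, 4.0])
orthogonal = exp_cosine_similarity([1.0, 0.0], [0.0, 1.0])
opposite = exp_cosine_similarity([1.0, 2.0], [-1.0, -2.0])
print(same_direction, orthogonal, opposite)
```

With a temperature of 1 the value ranges from about 0.37 (opposite directions) to about 2.72 (same direction), so it is always positive and increases with the cosine similarity.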
According to the method of the first aspect of the present invention, in step S5, the anomaly detection index for abnormality within index items, the anomaly detection index for abnormality among index items, and the anomaly detection index for abnormality of structural semantic information are fused into the comprehensive anomaly detection index as follows:
$s_i = \lambda_1 s_i^{\mathrm{intra}} + \lambda_2 s_i^{\mathrm{inter}} + \lambda_3 s_i^{\mathrm{sem}}$
wherein $s_i$ denotes the comprehensive anomaly detection index of the $i$-th enterprise; $s_i^{\mathrm{intra}}$ denotes the anomaly detection index for abnormality within the index items of the $i$-th enterprise; $s_i^{\mathrm{inter}}$ denotes the anomaly detection index for abnormality among the index items of the $i$-th enterprise; $s_i^{\mathrm{sem}}$ denotes the anomaly detection index for abnormality of the structural semantic information of the $i$-th enterprise;
$\lambda_1$ denotes the weighting coefficient of $s_i^{\mathrm{intra}}$ and is a manually set hyperparameter;
$\lambda_2$ denotes the weighting coefficient of $s_i^{\mathrm{inter}}$ and is a manually set hyperparameter;
$\lambda_3$ denotes the weighting coefficient of $s_i^{\mathrm{sem}}$ and is a manually set hyperparameter.
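The fusion step can be sketched under the assumption that it is a weighted sum of the three indexes, as the description of one weighting coefficient per index suggests. The default weights, the function names, and the ranking helper are illustrative assumptions only.

```python
from typing import List, Sequence, Tuple


def comprehensive_index(
    intra: float,
    inter: float,
    semantic: float,
    weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3),
) -> float:
    """Weighted sum of the three per-level anomaly detection indexes.

    The weights are manually set hyperparameters; equal weights are an assumption.
    """
    w_intra, w_inter, w_semantic = weights
    return w_intra * intra + w_inter * inter + w_semantic * semantic


def rank_enterprises(scores: Sequence[float]) -> List[int]:
    """Return enterprise positions ordered from most to least anomalous."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Illustrative values only.
fused = comprehensive_index(0.2, 0.4, 0.6, weights=(0.5, 0.3, 0.2))
ranking = rank_enterprises([0.1, 0.9, 0.5])
print(fused, ranking)
```

Keeping the three indexes separate until this final step is what allows the risk source to be traced back to a particular level.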
A second aspect of the invention discloses an enterprise risk identification system based on deep anomaly detection. The system comprises:
a first processing module, configured to collect basic information of an enterprise, process the basic information into index items, and integrate a plurality of index items to construct a structured enterprise information table;
a second processing module, configured to learn the internal distribution of each index item in the structured enterprise information table and construct an anomaly detection index for abnormality within index items;
a third processing module, configured to learn the correlations among the index items in the structured enterprise information table and construct an anomaly detection index for abnormality among index items;
a fourth processing module, configured to learn the structural semantic information of the enterprise as a whole, which all index items of the enterprise jointly form and reflect, and construct an anomaly detection index for abnormality of structural semantic information;
a fifth processing module, configured to fuse the anomaly detection index for abnormality within index items, the anomaly detection index for abnormality among index items, and the anomaly detection index for abnormality of structural semantic information to construct a comprehensive anomaly detection index;
a sixth processing module, configured to obtain an enterprise anomaly score from the comprehensive anomaly detection index.
According to the system of the second aspect of the present invention, the sixth processing module is further configured to determine the source of enterprise risk according to the anomaly detection index for abnormality within index items, the anomaly detection index for abnormality among index items, and the anomaly detection index for abnormality of structural semantic information, wherein the sources of enterprise risk include abnormality of a single index item, abnormality of the association among index items, and abnormality in the overall presentation of enterprise data.
According to the system of the second aspect of the present invention, the first processing module being configured to process the basic information into index items and integrate a plurality of index items to construct the structured enterprise information table includes:
arranging the collected basic information into a formal representation as a triple $\langle E, A, F \rangle$, wherein $E$ denotes the set of $n$ enterprises and $e_i$ denotes the $i$-th enterprise; $A$ denotes the set of $m$ index items; $F$ is a set of functions mapping an enterprise to the index value corresponding to each index item, the function $f_j : E \to V_j$ assigning to an enterprise its specific index value, where $V_j$ denotes the value range of the $j$-th index item;
on the basis of the triple representation, constructing the enterprise information table as follows: the enterprises are arranged vertically and the index items horizontally to form a table; the position in the table corresponding to enterprise $e_i$ and index item $a_j$ is assigned the value $v_{ij}$; in the enterprise information table, the index values of the $i$-th enterprise are expressed in vector form:
$\mathbf{x}_i = (v_{i1}, v_{i2}, \ldots, v_{im})$
According to the system of the second aspect of the present invention, the second processing module being configured to learn the internal distribution of each index item in the structured enterprise information table and construct the anomaly detection index for abnormality within index items includes:
using frequency statistics to reflect the distribution of each index value, so that learning the distribution within an index item is converted into counting the frequencies of its index values;
for each index item $a_j$, learning the distribution function $p_j$ of its index values, which maps the index value $v_{ij}$ of index item $a_j$ to its frequency of occurrence $p_j(v_{ij})$;
constructing the anomaly detection index for abnormality within index items as:
[Formula for $s_i^{\mathrm{intra}}$: given as an image in the source; not reproduced]
wherein $s_i^{\mathrm{intra}}$ denotes the anomaly detection index for abnormality within the index items of the $i$-th enterprise, and $m$ denotes the total number of index items.
According to the system of the second aspect of the present invention, the third processing module being configured to learn the correlations among the index items in the structured enterprise information table and construct the anomaly detection index for abnormality among index items includes:
using the mutual information between a single index item and the other index items to reflect the co-occurrence pattern among index items, so that learning the correlations among index items is converted into learning a mutual-information measurement function among index items;
for each index item $a_j$, learning the mutual-information measurement function $g_j$ between that index item and the other index items, which maps the index value $v_{ij}$ of index item $a_j$, together with the vector $\mathbf{v}_{i,-j}$ formed by the corresponding index values of the other index items, to the magnitude of the mutual information $g_j(v_{ij}, \mathbf{v}_{i,-j})$ between $v_{ij}$ and those corresponding index values; wherein $V_{-j}$ denotes the space of the vectors formed by the index items other than $a_j$;
constructing the anomaly detection index for abnormality among index items as:
[Formula for $s_i^{\mathrm{inter}}$: given as an image in the source; not reproduced]
wherein $s_i^{\mathrm{inter}}$ denotes the anomaly detection index for abnormality among the index items of the $i$-th enterprise, and $m$ denotes the total number of index items.
According to the system of the second aspect of the present invention, the fourth processing module being configured to learn the structural semantic information of the enterprise as a whole, which all index items of the enterprise jointly form and reflect, and construct the anomaly detection index for abnormality of structural semantic information includes:
first, constructing a number of fourth deep neural networks, which transform the normalized result $\tilde{\mathbf{x}}_i$ of the index-value vector $\mathbf{x}_i$ into the corresponding hidden spaces, wherein normalization scales the index values to between -1 and 1; then, constructing a third deep neural network, which maps $\tilde{\mathbf{x}}_i$ and its hidden-space transformation to $d$-dimensional vectors in a depth space, denoted $\mathbf{z}_i$ and $\mathbf{z}'_i$ for brevity; in the depth space, defining the similarity measurement function by an exponential cosine similarity and computing the similarity of $\mathbf{z}_i$ and $\mathbf{z}'_i$;
constructing the anomaly detection index for abnormality of structural semantic information as:
[Formula for $s_i^{\mathrm{sem}}$: given as an image in the source; not reproduced]
wherein $s_i^{\mathrm{sem}}$ denotes the anomaly detection index for abnormality of the structural semantic information of the $i$-th enterprise.
According to the system of the second aspect of the present invention, the fifth processing module being configured to fuse the anomaly detection index for abnormality within index items, the anomaly detection index for abnormality among index items, and the anomaly detection index for abnormality of structural semantic information to construct the comprehensive anomaly detection index includes:
$s_i = \lambda_1 s_i^{\mathrm{intra}} + \lambda_2 s_i^{\mathrm{inter}} + \lambda_3 s_i^{\mathrm{sem}}$
wherein $s_i$ denotes the comprehensive anomaly detection index of the $i$-th enterprise; $s_i^{\mathrm{intra}}$ denotes the anomaly detection index for abnormality within the index items of the $i$-th enterprise; $s_i^{\mathrm{inter}}$ denotes the anomaly detection index for abnormality among the index items of the $i$-th enterprise; $s_i^{\mathrm{sem}}$ denotes the anomaly detection index for abnormality of the structural semantic information of the $i$-th enterprise;
$\lambda_1$ denotes the weighting coefficient of $s_i^{\mathrm{intra}}$ and is a manually set hyperparameter;
$\lambda_2$ denotes the weighting coefficient of $s_i^{\mathrm{inter}}$ and is a manually set hyperparameter;
$\lambda_3$ denotes the weighting coefficient of $s_i^{\mathrm{sem}}$ and is a manually set hyperparameter.
A third aspect of the invention discloses an electronic device. The electronic device comprises a memory and a processor; the memory stores a computer program, and the processor, when executing the computer program, implements the steps of the enterprise risk identification method based on deep anomaly detection according to any one of the first aspect of the invention.
A fourth aspect of the invention discloses a computer-readable storage medium. The computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps of the enterprise risk identification method based on deep anomaly detection according to any one of the first aspect of the invention.
The technical effect of this application is achieved as follows: after three levels of deep learning, from the index items up to the structural semantic information, the anomaly clues of the three levels are fused to construct a hierarchical anomaly detection index system, thereby achieving comprehensive, multi-level enterprise anomaly detection.
Drawings
To illustrate the embodiments of the present application or the prior-art solutions more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a flowchart of an enterprise risk identification method based on deep anomaly detection according to an embodiment of the present invention;
Fig. 2 is a block diagram of an enterprise risk identification system based on deep anomaly detection according to an embodiment of the present invention;
Fig. 3 is a block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following embodiments and accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making any creative effort belong to the protection scope of the present application.
Various non-limiting embodiments of the present application are described in detail below with reference to the accompanying drawings.
The invention discloses an enterprise risk identification method based on deep anomaly detection. Fig. 1 is a flowchart of the method according to an embodiment of the present invention. As shown in Fig. 1, the method includes:
S1, collecting basic information of an enterprise, processing the basic information into index items, and integrating a plurality of index items to construct a structured enterprise information table;
S2, learning the internal distribution relation of each index item in the structured enterprise information table, and constructing an anomaly detection index for anomalies within index items;
S3, learning the correlations among the index items in the structured enterprise information table, and constructing an anomaly detection index for anomalies between index items;
S4, learning the overall structured semantic information of the enterprise, which is jointly formed and reflected by all of the enterprise's index items, and constructing an anomaly detection index for structured semantic information anomalies;
S5, fusing the anomaly detection index for anomalies within index items, the anomaly detection index for anomalies between index items, and the anomaly detection index for structured semantic information anomalies to construct a comprehensive anomaly detection index;
S6, obtaining an enterprise anomaly score from the comprehensive anomaly detection index.
In step S1, basic information of an enterprise is collected, the basic information is processed into index items, a plurality of index items are integrated to construct a structured enterprise information table, and a data basis is provided for deep detection of abnormal information of the enterprise.
In some embodiments, in step S1, the sources of the basic information include, but are not limited to, financial data disclosed by the enterprise, basic business data, and capital market data.
The method for processing the basic information into index items and integrating a plurality of index items to construct the structured enterprise information table comprises the following steps:
arranging the collected basic information into a formal representation as a triple $\langle E, A, F \rangle$, where $E = \{e_1, \dots, e_n\}$ denotes the set of $n$ enterprise entities, $e_i$ denotes the $i$-th enterprise, $A = \{a_1, \dots, a_m\}$ denotes the set of $m$ index items, $F = \{f_1, \dots, f_m\}$ is the set of functions that map an enterprise to its index value for each index item, the function $f_j : E \to V_j$ assigns each enterprise its specific index value, and $V_j$ denotes the value range of the $j$-th index item;
on the basis of the triple representation, constructing the enterprise information table as follows: the enterprises $e_1, \dots, e_n$ are arranged vertically and the index items $a_1, \dots, a_m$ are arranged horizontally to form a table, and the cell at the position corresponding to $e_i$ and $a_j$ is assigned the value $f_j(e_i)$. In the enterprise information table, the index values of the $i$-th enterprise are expressed in vector form as $x_i = (f_1(e_i), \dots, f_m(e_i))$. For convenience of presentation, the following description uses $x_{ij}$ to denote the $j$-th item of the vector $x_i$, i.e. $x_{ij} = f_j(e_i)$.
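The table construction in step S1 can be illustrated with a minimal Python sketch. The enterprise identifiers, index-item names, and values below are hypothetical and are not taken from the application; the sketch only shows the arrangement in which enterprises form the rows, index items form the columns, and each cell holds the value that the index item's function assigns to the enterprise.

```python
from typing import Callable, Dict, List

# Hypothetical raw data: two index items for three enterprises.
RAW: Dict[str, Dict[str, object]] = {
    "e1": {"credit_rating": "AA", "return_on_assets": 0.08},
    "e2": {"credit_rating": "A", "return_on_assets": 0.03},
    "e3": {"credit_rating": "AA", "return_on_assets": -0.01},
}

ENTERPRISES: List[str] = ["e1", "e2", "e3"]                      # enterprise set
INDEX_ITEMS: List[str] = ["credit_rating", "return_on_assets"]   # index-item set

# Function set: one function per index item, mapping an enterprise to its value.
FUNCTIONS: Dict[str, Callable[[str], object]] = {
    item: (lambda enterprise, item=item: RAW[enterprise][item])
    for item in INDEX_ITEMS
}


def build_table(enterprises, index_items, functions):
    """Rows follow the enterprises, columns follow the index items."""
    return [[functions[item](e) for item in index_items] for e in enterprises]


table = build_table(ENTERPRISES, INDEX_ITEMS, FUNCTIONS)
second_enterprise_vector = table[1]  # index values of the second enterprise
```

Each row of `table` plays the role of one enterprise's index-value vector, which is the input to the later steps.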
In step S2, the internal distribution relationship of each index item in the structured enterprise information table is learned, and an abnormal detection index of abnormality in the index item is constructed, so as to provide a basis for effective detection of an enterprise when a single index item is abnormal.
In some embodiments, in step S2, the method for learning the internal distribution relation of each index item in the structured enterprise information table and constructing the anomaly detection index for anomalies within index items includes:
using frequency statistics to reflect the distribution pattern of each index value, so that learning the distribution relation within an index item is converted into counting the frequencies of its index values;
for each index item $a_j$, learning a distribution function $p_j$ of its index values, which maps the index value $x_{ij}$ of index item $a_j$ to its frequency of occurrence $p_j(x_{ij})$. The anomaly detection index for anomalies within index items is constructed as:
[formula image: anomaly detection index for anomalies within index items]
where $s_i^{\mathrm{intra}}$ denotes the anomaly detection index for anomalies within the index items of the $i$-th enterprise, and $m$ denotes the total number of index items.
Specifically, for categorical index items, such as enterprise credit rating data, this embodiment directly counts the frequency of occurrence of each categorical value. If $a_j$ is a categorical index item whose index values form the set $V_j$, i.e. $x_{ij} \in V_j$, then $p_j(x_{ij})$ can be calculated as
$$p_j(x_{ij}) = \frac{|g_j(x_{ij})|}{n}, \quad x_{ij} \in V_j$$
where $|\cdot|$ denotes the number of elements in a set and $g_j$ is the function that maps an index value of the $j$-th index item to the set of enterprises assigned that value, defined as formula (2):
$$g_j(v) = \{\, e \in E \mid f_j(e) = v \,\} \tag{2}$$
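The categorical case can be sketched in a few lines of Python. This is an illustration under the assumption that "frequency of occurrence" means the number of enterprises sharing a value divided by the total number of enterprises; the function name and sample ratings are hypothetical.

```python
from collections import Counter
from typing import Hashable, List


def categorical_frequencies(column: List[Hashable]) -> List[float]:
    """Return, for each enterprise, the relative frequency of its value.

    `column` holds one categorical index item, one entry per enterprise.
    Assumption: frequency = (enterprises with the same value) / (all enterprises).
    """
    counts = Counter(column)  # value -> number of enterprises with that value
    total = len(column)
    return [counts[value] / total for value in column]


ratings = ["AA", "AA", "A", "BBB", "AA"]  # hypothetical credit ratings
freqs = categorical_frequencies(ratings)
```

Rare values such as `"A"` and `"BBB"` receive low frequencies, which is what makes them candidates for an anomaly within the index item.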
For numerical index items, such as an enterprise's return on total assets, this embodiment first groups the index values using dynamic bandwidth statistics, where each group of index item $a_j$ is a set of index values in that index item and corresponds to a group of enterprises. The embodiment then calculates the density of each group as the reciprocal of the difference between the maximum and minimum values in the group. Finally, for a given value in the index item, the embodiment calculates its frequency of occurrence from the density of the group containing that value, as shown in formula (3):
[formula image (3): frequency of a numerical index value computed from the density of its group]
The dynamic bandwidth statistics method is as follows: the numerical data are sorted in descending order and, starting from the maximum, each run of $k$ consecutive values is assigned to the same group, the group size $k$ being a rounded-up quantity ($\lceil \cdot \rceil$ denotes rounding up); if more than $k$ index values are identical, all of the identical values are placed in the same group; if, after a group is constructed, all values in it are found to be equal, the next value after them is also assigned to that group. For a group $G$, the density is calculated as shown in formula (4):
$$\mathrm{density}(G) = \frac{1}{\max(G) - \min(G)} \tag{4}$$
where $\max(\cdot)$ and $\min(\cdot)$ are functions that return the maximum and minimum values of a set, respectively.
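The grouping and density steps can be sketched as follows. This is one possible reading of the dynamic bandwidth procedure, not the application's definitive implementation: the group size `k` is passed in as a parameter because its exact rounded-up expression is not reproduced in the text, ties that cross a group boundary are kept together (which covers the case of more than `k` identical values), and a constant group with no following value is given infinite density as an assumption.

```python
import math
from typing import List


def dynamic_bandwidth_groups(values: List[float], k: int) -> List[List[float]]:
    """Group values sorted in descending order into runs of about k values."""
    if k < 1:
        raise ValueError("k must be at least 1")
    ordered = sorted(values, reverse=True)
    groups: List[List[float]] = []
    start, n = 0, len(ordered)
    while start < n:
        end = min(start + k, n)
        # Keep identical values together when a tie crosses the group boundary.
        while end < n and ordered[end] == ordered[end - 1]:
            end += 1
        # If every value in the group is equal, also take the next value.
        if end < n and ordered[start] == ordered[end - 1]:
            end += 1
        groups.append(ordered[start:end])
        start = end
    return groups


def group_density(group: List[float]) -> float:
    """Reciprocal of (max - min); infinite for a constant group (assumption)."""
    spread = max(group) - min(group)
    return math.inf if spread == 0 else 1.0 / spread


groups = dynamic_bandwidth_groups([10, 9, 7, 7, 7, 4, 3, 1], k=2)
densities = [group_density(g) for g in groups]
```

In this example the three identical values 7 end up in one group together with the next value 4, so that the group has a non-zero spread and a finite density.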
In step S3, the correlation between the index items in the structured enterprise information table is learned, an abnormal detection index for abnormality between the index items is constructed, and an abnormal situation that may occur in the enterprise is prompted by finding an abnormal change sign of the combination relationship between the index items.
In some embodiments, in step S3, the method for learning the correlations among the index items in the structured enterprise information table and constructing the anomaly detection index for anomalies between index items includes:
using the mutual information between a single index item and the other index items to reflect the co-occurrence pattern among index items, so that learning the correlations among index items is converted into learning a mutual information metric function among index items;
for each index item $a_j$, learning a mutual information metric function $I_j$ between that index item and the other index items, which maps the index value of $a_j$, i.e. $x_{ij}$, together with the vector $x_{i,-j}$ formed by the corresponding index values of the other index items, to the magnitude of the mutual information $I_j(x_{ij}, x_{i,-j})$ between $x_{ij}$ and those other index values, where $x_{i,-j}$ lies in the space of vectors formed by the index items other than $a_j$;
the anomaly detection index for anomalies between index items is constructed as:
[formula image: anomaly detection index for anomalies between index items]
where $s_i^{\mathrm{inter}}$ denotes the anomaly detection index for anomalies between the index items of the $i$-th enterprise, and $m$ denotes the total number of index items.
Specifically, this embodiment constructs the mutual information metric function $I_j$ from the similarity of two samples in a space mapped by deep neural networks. A first deep neural network $\phi_1$ and a second deep neural network $\phi_2$ are constructed. $\phi_1$ maps the normalized result of $x_{ij}$ and the one-hot encoding of $j$ (a vector whose $k$-th entry is 1 if $k$ equals $j$ and 0 if $k$ is not equal to $j$) to a $d$-dimensional vector in the deep space; $\phi_2$ maps the normalized result of $x_{i,-j}$ and the one-hot encoding of $j$ to a $d$-dimensional vector in the deep space. Here, normalization scales the values to between -1 and 1. For $x_{ij}$ and $x_{i,-j}$, the mutual information metric function is defined as shown in formula (5):
[formula image (5): mutual information metric function defined on the outputs of the first and second deep neural networks]
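The input preparation for the two networks can be sketched in Python. The text states only that values are normalized to between -1 and 1 and that the index-item position is one-hot encoded; the min-max scaling used here, the zero-based position, and the function names are assumptions made for illustration.

```python
from typing import List


def normalize_to_unit_range(column: List[float]) -> List[float]:
    """Min-max scale one index item to [-1, 1] (scaling method is an assumption)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant column: map everything to 0
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in column]


def one_hot(j: int, m: int) -> List[float]:
    """Vector of length m with 1 at (zero-based) position j and 0 elsewhere."""
    return [1.0 if k == j else 0.0 for k in range(m)]


# Hypothetical index item at position 1 of 3, observed for three enterprises.
column = [0.0, 5.0, 10.0]
normalized = normalize_to_unit_range(column)
# Input for the first network: normalized value of enterprise 1 plus the one-hot code.
network_input = [normalized[1]] + one_hot(1, 3)
```

The one-hot code tells the shared network which index item the scalar value belongs to, so a single network can serve all index items.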
This embodiment learns the mutual information metric function $I_j$ by contrastive learning. First, the contrastive learning samples are constructed: for the $j$-th index item $a_j$ of the $i$-th enterprise, the index value $x_{ij}$ is taken as the positive sample and other values of the index item $a_j$ are taken as negative samples, so that one positive sample and a set of negative samples are constructed. The contrastive learning process maximizes the mutual information between the vector $x_{i,-j}$ of the remaining index values and the positive sample, and minimizes the mutual information between $x_{i,-j}$ and the negative samples. The loss function for training the deep neural networks is shown in formula (6):
[formula image (6): contrastive loss function for training the first and second deep neural networks]
In step S4, the overall structured semantic information of the enterprise, which is jointly formed and reflected by all of the enterprise's index items, is learned, and an anomaly detection index for structured semantic information anomalies is constructed.
In some embodiments, in step S4, the method for learning the overall structured semantic information of the enterprise, which is jointly formed and reflected by all of the enterprise's index items, and constructing the anomaly detection index for structured semantic information anomalies includes:
The study of the overall structured semantic information of the data focuses on the structured semantic information that is formed jointly by all index items and reflects the overall condition of the enterprise. In this embodiment, a deep learning method is used to learn mutually independent hidden spaces in which the various kinds of structured semantic information of the data under normal conditions are embedded. First, $K$ fourth deep neural networks $\phi_4^{(1)}, \dots, \phi_4^{(K)}$ are constructed, which transform the normalized result $\tilde{x}_i$ of $x_i$ into $K$ hidden spaces, where normalization scales the index values to between -1 and 1. Then, a third deep neural network $\phi_3$ is constructed, which maps $\tilde{x}_i$ and each of its hidden-space representations to $d$-dimensional vectors in the deep space. For simplicity of description, the mapped $\tilde{x}_i$ is denoted $h_i$ and the mapped $k$-th hidden-space representation is denoted $h_i^{(k)}$. A similarity measure function is defined in the deep space using exponential cosine similarity, and the similarity between $h_i$ and $h_i^{(k)}$ is calculated. The anomaly detection index for structured semantic information anomalies is constructed as formula (7):
[formula image (7): anomaly detection index for structured semantic information anomalies]
where $s_i^{\mathrm{sem}}$ denotes the anomaly detection index for structured semantic information anomalies of the $i$-th enterprise.
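The similarity measure named in step S4 can be sketched in Python. The text names "exponential cosine similarity" but its formula is not reproduced, so the form used here, the exponential of the cosine of the angle between two vectors with no temperature or scaling factor, is an assumption; the function name is hypothetical.

```python
import math
from typing import Sequence


def exp_cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """exp(cos(u, v)); assumed form of the exponential cosine similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return math.exp(dot / (norm_u * norm_v))


same_direction = exp_cosine_similarity([1.0, 0.0], [2.0, 0.0])
orthogonal = exp_cosine_similarity([1.0, 0.0], [0.0, 3.0])
opposite = exp_cosine_similarity([1.0, 0.0], [-1.0, 0.0])
```

Under this assumed form the similarity ranges from $e^{-1}$ for opposite vectors to $e$ for vectors pointing in the same direction.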
Specifically, the method trains the constructed deep neural networks so that, under the similarity measure in the deep space, the original data space and each hidden space have high similarity (that is, the structured semantic information of the data is embedded in the hidden spaces), while the hidden spaces have low similarity to one another (that is, the hidden spaces are mutually independent). The loss function for training the deep neural networks is shown in formula (8):
[formula image (8): loss function for training the third and fourth deep neural networks]
In step S5, the anomaly detection index for anomalies within index items, the anomaly detection index for anomalies between index items, and the anomaly detection index for structured semantic information anomalies are fused to construct a comprehensive anomaly detection index.
In some embodiments, in step S5, the anomaly detection index for anomalies within index items, the anomaly detection index for anomalies between index items, and the anomaly detection index for structured semantic information anomalies are fused into the comprehensive anomaly detection index as follows:
$$s_i = \alpha\, s_i^{\mathrm{intra}} + \beta\, s_i^{\mathrm{inter}} + \gamma\, s_i^{\mathrm{sem}}$$
where $s_i$ denotes the comprehensive anomaly detection index of the $i$-th enterprise, $s_i^{\mathrm{intra}}$ denotes the anomaly detection index for anomalies within the index items of the $i$-th enterprise, $s_i^{\mathrm{inter}}$ denotes the anomaly detection index for anomalies between the index items of the $i$-th enterprise, and $s_i^{\mathrm{sem}}$ denotes the anomaly detection index for structured semantic information anomalies of the $i$-th enterprise; $\alpha$ is the weighting coefficient of $s_i^{\mathrm{intra}}$, $\beta$ is the weighting coefficient of $s_i^{\mathrm{inter}}$, and $\gamma$ is the weighting coefficient of $s_i^{\mathrm{sem}}$, each being a manually set hyperparameter.
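The fusion step can be sketched in Python. The text says each of the three indexes has a manually set weighting coefficient; combining them as a weighted sum is an assumption, since the fusion formula itself is not reproduced, and the function and parameter names are hypothetical.

```python
def fuse_anomaly_indexes(
    intra: float,
    inter: float,
    semantic: float,
    alpha: float = 1.0,
    beta: float = 1.0,
    gamma: float = 1.0,
) -> float:
    """Weighted sum of the three anomaly detection indexes (assumed form).

    alpha, beta and gamma are the manually set weighting hyperparameters.
    """
    return alpha * intra + beta * inter + gamma * semantic


# Hypothetical index values and weights for one enterprise.
combined = fuse_anomaly_indexes(0.5, 0.25, 0.125, alpha=1.0, beta=2.0, gamma=4.0)
```

Keeping the three inputs separate until this point is what later allows the risk to be traced back to a particular level.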
And S6, obtaining enterprise abnormal scores according to the comprehensive abnormal detection indexes.
In some embodiments, in step S6, it is determined whether the enterprise risk is due to a single index item abnormality, an association relation abnormality between index items, or an enterprise data overall presentation abnormality according to the abnormality detection index for an intra-index item abnormality, the abnormality detection index for an inter-index item abnormality, and the abnormality detection index for a structured semantic information abnormality.
Specifically, this embodiment needs only a single round of training on the collected enterprise data before anomaly detection can be performed on that data. An enterprise anomaly score is obtained from the comprehensive anomaly detection index $s_i$; the larger the score, the more likely the enterprise is anomalous, so that enterprise risk can be flagged. In addition, the values of $s_i$ for a single enterprise can be compared year over year: if $s_i$ increases, an increase in the enterprise's risk is flagged. Furthermore, because the method uses a hierarchical anomaly detection index system, the values of $s_i^{\mathrm{intra}}$, $s_i^{\mathrm{inter}}$ and $s_i^{\mathrm{sem}}$ can be used to determine whether the enterprise risk comes from an anomaly in a single index item, an anomaly in the correlations between index items, or an anomaly in the overall presentation of the enterprise data.
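The two uses of the indexes described in step S6 can be sketched in Python. The text does not say how the three component values are compared, so picking the largest weighted component as the risk source is an assumption, as are the function names and the sample numbers.

```python
from typing import Dict, List


def dominant_risk_source(components: Dict[str, float]) -> str:
    """Name of the component with the largest value (assumed attribution rule)."""
    return max(components, key=components.get)


def risk_increased(yearly_scores: List[float]) -> bool:
    """True if the latest yearly score is higher than the previous one."""
    return len(yearly_scores) >= 2 and yearly_scores[-1] > yearly_scores[-2]


# Hypothetical weighted component values for one enterprise.
components = {
    "single index item": 0.1,
    "correlation between index items": 0.7,
    "overall data presentation": 0.3,
}
source = dominant_risk_source(components)
flag = risk_increased([0.4, 0.6])  # comprehensive index in two consecutive years
```

A production system would likely compare each component against a baseline rather than against the other components; the sketch only shows where that decision would sit.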
In summary, after three levels of deep learning, from individual index items up to structured semantic information, the scheme provided by the invention fuses the anomaly clues from the three levels and constructs a hierarchical anomaly detection index system, thereby enabling comprehensive, multi-level enterprise anomaly detection.
A second aspect of the invention discloses an enterprise risk identification system based on deep anomaly detection. Fig. 2 is a block diagram of the system according to an embodiment of the present invention. As shown in Fig. 2, the system 100 includes:
a first processing module 101, configured to collect basic information of an enterprise, process the basic information into index items, and integrate a plurality of index items to construct a structured enterprise information table;
a second processing module 102, configured to learn the internal distribution relation of each index item in the structured enterprise information table, and construct an anomaly detection index for anomalies within index items;
a third processing module 103, configured to learn the correlations among the index items in the structured enterprise information table, and construct an anomaly detection index for anomalies between index items;
a fourth processing module 104, configured to learn the overall structured semantic information of the enterprise, which is jointly formed and reflected by all of the enterprise's index items, and construct an anomaly detection index for structured semantic information anomalies;
a fifth processing module 105, configured to fuse the anomaly detection index for anomalies within index items, the anomaly detection index for anomalies between index items, and the anomaly detection index for structured semantic information anomalies to construct a comprehensive anomaly detection index;
a sixth processing module 106, configured to obtain an enterprise anomaly score from the comprehensive anomaly detection index.
In the system according to the second aspect of the present invention, the sixth processing module 106 is further configured to determine the source of the enterprise risk from the anomaly detection index for anomalies within index items, the anomaly detection index for anomalies between index items, and the anomaly detection index for structured semantic information anomalies, where the source of the enterprise risk is an anomaly in a single index item, an anomaly in the correlations between index items, or an anomaly in the overall presentation of the enterprise data.
In the system according to the second aspect of the present invention, the first processing module 101 is configured to process the basic information into index items and integrate a plurality of index items to construct the structured enterprise information table as follows:
arranging the collected basic information into a formal representation as a triple $\langle E, A, F \rangle$, where $E = \{e_1, \dots, e_n\}$ denotes the set of $n$ enterprise entities, $e_i$ denotes the $i$-th enterprise, $A = \{a_1, \dots, a_m\}$ denotes the set of $m$ index items, $F = \{f_1, \dots, f_m\}$ is the set of functions that map an enterprise to its index value for each index item, the function $f_j : E \to V_j$ assigns each enterprise its specific index value, and $V_j$ denotes the value range of the $j$-th index item;
on the basis of the triple representation, constructing the enterprise information table as follows: the enterprises $e_1, \dots, e_n$ are arranged vertically and the index items $a_1, \dots, a_m$ are arranged horizontally to form a table, and the cell at the position corresponding to $e_i$ and $a_j$ is assigned the value $f_j(e_i)$. In the enterprise information table, the index values of the $i$-th enterprise are expressed in vector form as $x_i = (f_1(e_i), \dots, f_m(e_i))$, whose $j$-th item is written $x_{ij} = f_j(e_i)$.
In the system according to the second aspect of the present invention, the second processing module 102 is configured to learn the internal distribution relation of each index item in the structured enterprise information table and construct the anomaly detection index for anomalies within index items as follows:
using frequency statistics to reflect the distribution pattern of each index value, so that learning the distribution relation within an index item is converted into counting the frequencies of its index values;
for each index item $a_j$, learning a distribution function $p_j$ of its index values, which maps the index value $x_{ij}$ of index item $a_j$ to its frequency of occurrence $p_j(x_{ij})$. The anomaly detection index for anomalies within index items is constructed as:
[formula image: anomaly detection index for anomalies within index items]
where $s_i^{\mathrm{intra}}$ denotes the anomaly detection index for anomalies within the index items of the $i$-th enterprise, and $m$ denotes the total number of index items.
In the system according to the second aspect of the present invention, the third processing module 103 is configured to learn the correlations among the index items in the structured enterprise information table and construct the anomaly detection index for anomalies between index items as follows:
using the mutual information between a single index item and the other index items to reflect the co-occurrence pattern among index items, so that learning the correlations among index items is converted into learning a mutual information metric function among index items;
for each index item $a_j$, learning a mutual information metric function $I_j$ between that index item and the other index items, which maps the index value of $a_j$, i.e. $x_{ij}$, together with the vector $x_{i,-j}$ formed by the corresponding index values of the other index items, to the magnitude of the mutual information $I_j(x_{ij}, x_{i,-j})$ between $x_{ij}$ and those other index values, where $x_{i,-j}$ lies in the space of vectors formed by the index items other than $a_j$;
the anomaly detection index for anomalies between index items is constructed as:
[formula image: anomaly detection index for anomalies between index items]
where $s_i^{\mathrm{inter}}$ denotes the anomaly detection index for anomalies between the index items of the $i$-th enterprise, and $m$ denotes the total number of index items.
In the system according to the second aspect of the present invention, the fourth processing module 104 is configured to learn the overall structured semantic information of the enterprise, which is jointly formed and reflected by all of the enterprise's index items, and construct the anomaly detection index for structured semantic information anomalies as follows:
First, $K$ fourth deep neural networks $\phi_4^{(1)}, \dots, \phi_4^{(K)}$ are constructed, which transform the normalized result $\tilde{x}_i$ of $x_i$ into $K$ hidden spaces, where normalization scales the index values to between -1 and 1. Then, a third deep neural network $\phi_3$ is constructed, which maps $\tilde{x}_i$ and each of its hidden-space representations to $d$-dimensional vectors in the deep space. For simplicity of description, the mapped $\tilde{x}_i$ is denoted $h_i$ and the mapped $k$-th hidden-space representation is denoted $h_i^{(k)}$. A similarity measure function is defined in the deep space using exponential cosine similarity, and the similarity between $h_i$ and $h_i^{(k)}$ is calculated. The anomaly detection index for structured semantic information anomalies is constructed as:
[formula image: anomaly detection index for structured semantic information anomalies]
where $s_i^{\mathrm{sem}}$ denotes the anomaly detection index for structured semantic information anomalies of the $i$-th enterprise.
In the system according to the second aspect of the present invention, the fifth processing module 105 is configured to fuse the anomaly detection index for anomalies within index items, the anomaly detection index for anomalies between index items, and the anomaly detection index for structured semantic information anomalies into the comprehensive anomaly detection index as follows:
$$s_i = \alpha\, s_i^{\mathrm{intra}} + \beta\, s_i^{\mathrm{inter}} + \gamma\, s_i^{\mathrm{sem}}$$
where $s_i$ denotes the comprehensive anomaly detection index of the $i$-th enterprise, $s_i^{\mathrm{intra}}$ denotes the anomaly detection index for anomalies within the index items of the $i$-th enterprise, $s_i^{\mathrm{inter}}$ denotes the anomaly detection index for anomalies between the index items of the $i$-th enterprise, and $s_i^{\mathrm{sem}}$ denotes the anomaly detection index for structured semantic information anomalies of the $i$-th enterprise; $\alpha$ is the weighting coefficient of $s_i^{\mathrm{intra}}$, $\beta$ is the weighting coefficient of $s_i^{\mathrm{inter}}$, and $\gamma$ is the weighting coefficient of $s_i^{\mathrm{sem}}$, each being a manually set hyperparameter.
A third aspect of the invention discloses an electronic device. The electronic device comprises a memory and a processor, the memory stores a computer program, and the processor executes the computer program to realize the steps of the enterprise risk identification method based on deep anomaly detection in any one of the first aspect of the invention.
Fig. 3 is a block diagram of an electronic device according to an embodiment of the present invention. As shown in Fig. 3, the electronic device includes a processor, a memory, a network interface, a display screen, and an input device, which are connected by a system bus. The processor of the electronic device provides computing and control capabilities. The memory of the electronic device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the electronic device is used for wired or wireless communication with an external terminal; the wireless communication can be realized through Wi-Fi, a carrier network, Near Field Communication (NFC), or other technologies. The display screen of the electronic device may be a liquid crystal display or an electronic ink display, and the input device may be a touch layer covering the display screen, a key, trackball, or touchpad arranged on the housing of the electronic device, or an external keyboard, touchpad, or mouse.
It will be understood by those skilled in the art that the structure shown in fig. 3 is only a partial block diagram related to the technical solution of the present invention, and does not constitute a limitation of the electronic device to which the solution of the present application is applied, and a specific electronic device may include more or less components than those shown in the drawings, or combine some components, or have a different arrangement of components.
A fourth aspect of the invention discloses a computer-readable storage medium. The computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the method for enterprise risk identification based on deep anomaly detection according to any one of the first aspect of the present disclosure.
Note that, the technical features of the above embodiments may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, however, as long as there is no contradiction between the combinations of the technical features, they should be considered as the scope of the description in the present specification. The above examples only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present application shall be subject to the appended claims.

Claims (10)

1. An enterprise risk identification method based on deep anomaly detection, characterized by comprising the following steps:
S1, collecting basic information of an enterprise, processing the basic information into index items, and integrating a plurality of index items to construct a structured enterprise information table;
S2, learning the internal distribution of each index item in the structured enterprise information table, and constructing an anomaly detection index for anomalies within index items;
S3, learning the correlations among the index items in the structured enterprise information table, and constructing an anomaly detection index for anomalies between index items;
S4, learning the structured semantic information of the enterprise as a whole, which is jointly formed and reflected by all index items of the enterprise, and constructing an anomaly detection index for structured semantic information anomalies;
S5, fusing the anomaly detection index for anomalies within index items, the anomaly detection index for anomalies between index items, and the anomaly detection index for structured semantic information anomalies to construct a comprehensive anomaly detection index;
S6, obtaining an enterprise anomaly score according to the comprehensive anomaly detection index.
2. The enterprise risk identification method based on deep anomaly detection according to claim 1, wherein the step S6 further comprises: determining the source of enterprise risk according to the anomaly detection index for anomalies within index items, the anomaly detection index for anomalies between index items, and the anomaly detection index for structured semantic information anomalies, wherein the source of enterprise risk comprises an anomaly in a single index item, an anomaly in the association between index items, and an anomaly in the overall presentation of the enterprise data.
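The attribution step in claim 2 can be illustrated with a minimal Python sketch. The claim does not state how the three indices are turned into a risk source, so the per-index thresholds used here are an assumption, and the names `SOURCES`, `risk_sources`, `indices`, and `thresholds` are hypothetical.

```python
from typing import Dict, List

# Labels follow the three risk sources named in claim 2.
SOURCES: Dict[str, str] = {
    "intra": "anomaly in a single index item",
    "inter": "anomaly in the association between index items",
    "semantic": "anomaly in the overall presentation of the enterprise data",
}


def risk_sources(indices: Dict[str, float], thresholds: Dict[str, float]) -> List[str]:
    """Return the risk sources whose anomaly detection index exceeds its threshold.

    Thresholding is an illustrative assumption; the claim only says the source
    is determined "according to" the three indices.
    """
    return [SOURCES[name] for name in SOURCES if indices[name] > thresholds[name]]


example = risk_sources(
    {"intra": 0.9, "inter": 0.1, "semantic": 0.7},
    {"intra": 0.5, "inter": 0.5, "semantic": 0.5},
)
print(example)
```

Keeping the three indices separate until this point is what makes the attribution possible; once they are fused into a single score, the contribution of each source can no longer be read off directly.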
3. The enterprise risk identification method based on deep anomaly detection according to claim 1, wherein in the step S1, processing the basic information into index items and integrating a plurality of index items to construct a structured enterprise information table comprises:
arranging the collected basic information into a formal triple representation:
[formula image 001]
wherein [formula image 002] denotes the set of N enterprises, and o_i denotes the i-th enterprise; [formula image 003] denotes the set of D index items; [formula image 004] is a set of functions that map an enterprise to the index value corresponding to each index item, and the function [formula image 005] assigns to an enterprise its specific index value; [formula image 006] denotes the value range of the j-th index item;
on the basis of the triple representation, the enterprise information table is constructed as follows: O is arranged vertically and A is arranged horizontally to construct a table; the table position corresponding to [formula image 007] and [formula image 008] is assigned the value [formula image 009];
in the enterprise information table, the index values of the [formula image 010]-th enterprise are expressed in vector form:
[formula image 011]
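The table construction described in claim 3 can be sketched in Python as follows. This is an illustration under assumptions, not the patented implementation: the enterprise identifiers, index item names, and values in `raw_info` are invented for the example, and `make_mapping` and `build_table` are hypothetical helper names.

```python
from typing import Callable, Dict, List

# Hypothetical example data; names and values are illustrative only.
enterprises: List[str] = ["o_1", "o_2", "o_3"]        # the set O (N = 3)
index_items: List[str] = ["industry", "staff_band"]   # the set A (D = 2)

raw_info: Dict[str, Dict[str, str]] = {
    "o_1": {"industry": "retail", "staff_band": "small"},
    "o_2": {"industry": "retail", "staff_band": "large"},
    "o_3": {"industry": "mining", "staff_band": "small"},
}


def make_mapping(item: str) -> Callable[[str], str]:
    """Build the mapping function for one index item: enterprise -> index value."""
    return lambda enterprise: raw_info[enterprise][item]


# One mapping function per index item, mirroring the function set of the triple.
mappings: Dict[str, Callable[[str], str]] = {a: make_mapping(a) for a in index_items}


def build_table(
    rows: List[str], cols: List[str], funcs: Dict[str, Callable[[str], str]]
) -> List[List[str]]:
    """Rows follow O, columns follow A; each cell holds the mapped index value."""
    return [[funcs[a](o) for a in cols] for o in rows]


table = build_table(enterprises, index_items, mappings)
# Row i of the table is the index-value vector of the i-th enterprise.
print(table[0])
```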
4. The enterprise risk identification method based on deep anomaly detection according to claim 1, wherein in the step S2, learning the internal distribution of each index item in the structured enterprise information table and constructing the anomaly detection index for anomalies within index items comprises:
using frequency statistics to reflect the distribution of each index value, so that learning the distribution within an index item is converted into counting the frequencies of its index values;
for each index item [formula image 008], learning the distribution function [formula image 012] of its index values, which maps an index value [formula image 009] of the index item [formula image 008] to its frequency of occurrence [formula image 013];
constructing the anomaly detection index for anomalies within index items as:
[formula image 014]
wherein [formula image 015] denotes the anomaly detection index for anomalies within index items of the [formula image 010]-th enterprise, and [formula image 016] denotes the total number of index items.
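The frequency-statistics idea in claim 4 can be sketched as below. The claim's exact formula is contained in an image that is not available in this text, so the aggregation used here (one minus the relative frequency of each value, averaged over the index items) is an assumption chosen only to illustrate that rarer values yield higher scores. The function name `intra_item_scores` and the sample table are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def intra_item_scores(table: Sequence[Sequence[str]]) -> List[float]:
    """Score each enterprise (row) by how rare its index values are.

    Assumption: score = mean over index items of (1 - relative frequency).
    The patent's own formula may differ.
    """
    n_rows = len(table)
    n_items = len(table[0])
    # Frequency statistics: one value counter per index item (column).
    freq = [Counter(row[j] for row in table) for j in range(n_items)]
    scores = []
    for row in table:
        rarity = [1.0 - freq[j][row[j]] / n_rows for j in range(n_items)]
        scores.append(sum(rarity) / n_items)
    return scores


sample = [
    ["retail", "small"],
    ["retail", "small"],
    ["retail", "large"],
    ["mining", "small"],
]
intra = intra_item_scores(sample)
print(intra)
```

In this sample the last two enterprises each carry one value that occurs only once in its column, so they score higher than the first two.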
5. The enterprise risk identification method based on deep anomaly detection according to claim 1, wherein in the step S3, learning the correlations among the index items in the structured enterprise information table and constructing the anomaly detection index for anomalies between index items comprises:
using the mutual information between a single index item and the other index items to reflect the co-occurrence patterns among index items, so that learning the correlations among index items is converted into learning a mutual information measurement function between index items;
for each index item [formula image 008], learning the mutual information measurement function [formula image 017] between that index item and the other index items, which maps the index value [formula image 018] (that is, [formula image 009]) of the index item [formula image 008], together with the vector [formula image 019] formed by the index values of the other index items corresponding to [formula image 018], to the magnitude [formula image 020] of the mutual information between the index value [formula image 018] and the corresponding index values of the other index items; wherein [formula image 021] denotes the space of vectors formed by the index items other than [formula image 008];
constructing the anomaly detection index for anomalies between index items as:
[formula image 022]
wherein [formula image 023] denotes the anomaly detection index for anomalies between index items of the [formula image 010]-th enterprise, and [formula image 016] denotes the total number of index items.
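The co-occurrence idea in claim 5 can be illustrated with a small Python sketch. It swaps in a simpler technique than the claim describes: instead of a learned mutual information measurement function over the full vector of other index values, it uses empirical pairwise pointwise mutual information (PMI) averaged over all pairs of index items, and negates the result so that unusual combinations score higher. Both choices are assumptions, and `inter_item_scores` is a hypothetical name.

```python
import math
from collections import Counter
from itertools import combinations
from typing import List, Sequence


def inter_item_scores(table: Sequence[Sequence[str]]) -> List[float]:
    """Score each row by the negated average pairwise PMI of its index values.

    Assumption: pairwise empirical PMI stands in for the learned mutual
    information measurement function. Requires at least two index items.
    """
    n = len(table)
    d = len(table[0])
    single = [Counter(row[j] for row in table) for j in range(d)]
    pairs = {
        (j, k): Counter((row[j], row[k]) for row in table)
        for j, k in combinations(range(d), 2)
    }
    scores = []
    for row in table:
        pmis = []
        for (j, k), joint in pairs.items():
            p_joint = joint[(row[j], row[k])] / n  # never zero for observed rows
            p_j = single[j][row[j]] / n
            p_k = single[k][row[k]] / n
            pmis.append(math.log(p_joint / (p_j * p_k)))
        scores.append(-sum(pmis) / len(pmis))
    return scores


sample = [["a", "x"], ["a", "x"], ["a", "x"], ["b", "y"], ["a", "y"]]
inter = inter_item_scores(sample)
print(inter)
```

In this sample the pair ("a", "y") combines two values that usually appear with other partners, so its row receives the highest score, whereas ("b", "y") is rare but internally consistent and scores lowest.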
6. The enterprise risk identification method based on deep anomaly detection according to claim 3, wherein in the step S4, learning the structured semantic information of the enterprise as a whole, which is jointly formed and reflected by all index items of the enterprise, and constructing the anomaly detection index for structured semantic information anomalies comprises:
first, constructing [formula image 024] a fourth deep neural network [formula image 025]; [formula image 026] converts the normalized result [formula image 028] of [formula image 027] into the [formula image 024] hidden space; wherein [formula image 028] is obtained by normalizing the index values to between -1 and 1;
then, constructing a third deep neural network [formula image 029], which maps [formula image 028] and [formula image 030] to [formula image 031]-dimensional vectors in the depth space, namely [formula image 032] and [formula image 033]; for simplicity of description, [formula image 032] is denoted as [formula image 034] and [formula image 035] is denoted as [formula image 036];
in the depth space, defining a similarity measurement function using exponential cosine similarity, and computing the similarity [formula image 037] between [formula image 034] and [formula image 036];
constructing the anomaly detection index for structured semantic information anomalies as:
[formula image 038]
wherein [formula image 039] denotes the anomaly detection index for structured semantic information anomalies of the [formula image 010]-th enterprise.
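Two ingredients of claim 6 that can be stated without the missing formula images are the normalization of index values to the range -1 to 1 and the exponential cosine similarity in the depth space. The sketch below shows one plausible reading of each: min-max scaling for the normalization, and the exponential of the cosine similarity divided by a `temperature` parameter for the similarity. Min-max scaling, the `temperature` parameter, and both function names are assumptions; the deep neural networks themselves are not reproduced here.

```python
import math
from typing import List, Sequence


def scale_to_unit_range(values: Sequence[float]) -> List[float]:
    """Min-max scale numeric index values into [-1, 1] (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either vector is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def exp_cosine_similarity(
    u: Sequence[float], v: Sequence[float], temperature: float = 1.0
) -> float:
    """Exponential cosine similarity: exp(cos(u, v) / temperature)."""
    return math.exp(cosine(u, v) / temperature)


print(scale_to_unit_range([0.0, 5.0, 10.0]))
print(exp_cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

With `temperature = 1.0` the similarity ranges from exp(-1) for opposite vectors to exp(1) for aligned ones, so it is always positive, which is convenient if it is later used inside a ratio or a logarithm.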
7. The enterprise risk identification method based on deep anomaly detection according to claim 1, wherein in the step S5, fusing the anomaly detection index for anomalies within index items, the anomaly detection index for anomalies between index items, and the anomaly detection index for structured semantic information anomalies to construct the comprehensive anomaly detection index comprises:
[formula image 040]
wherein [formula image 041] denotes the comprehensive anomaly detection index of the [formula image 010]-th enterprise; [formula image 015] denotes the anomaly detection index for anomalies within index items of the [formula image 010]-th enterprise; [formula image 023] denotes the anomaly detection index for anomalies between index items of the [formula image 010]-th enterprise; [formula image 039] denotes the anomaly detection index for structured semantic information anomalies of the [formula image 010]-th enterprise;
[formula image 042] denotes the weighting coefficient of [formula image 015], which is a manually set hyperparameter; [formula image 043] denotes the weighting coefficient of [formula image 023], which is a manually set hyperparameter; [formula image 044] denotes the weighting coefficient of [formula image 039], which is a manually set hyperparameter.
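The fusion in claim 7 assigns one manually set weighting coefficient to each of the three indices. The fusion formula itself is an image that is not available in this text, so the weighted sum below is an assumption, and the function name `fuse` and its parameter names are hypothetical.

```python
def fuse(
    intra: float,
    inter: float,
    semantic: float,
    w_intra: float = 1.0,
    w_inter: float = 1.0,
    w_semantic: float = 1.0,
) -> float:
    """Combine the three anomaly detection indices into one comprehensive index.

    Assumption: a weighted sum. The weights are hyperparameters set by hand,
    as the claim states; the patent's exact fusion formula may differ.
    """
    return w_intra * intra + w_inter * inter + w_semantic * semantic


# Example: emphasize anomalies within index items twice as much as the others.
combined = fuse(0.5, 0.25, 0.25, w_intra=2.0, w_inter=1.0, w_semantic=1.0)
print(combined)
```

Because the three indices may live on different scales, the weights in practice also act as rescaling factors, which is one reason to tune them by hand on known cases.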
8. An enterprise risk identification system based on deep anomaly detection, the system comprising:
a first processing module configured to collect basic information of an enterprise, process the basic information into index items, and integrate a plurality of index items to construct a structured enterprise information table;
a second processing module configured to learn the internal distribution of each index item in the structured enterprise information table and construct an anomaly detection index for anomalies within index items;
a third processing module configured to learn the correlations among the index items in the structured enterprise information table and construct an anomaly detection index for anomalies between index items;
a fourth processing module configured to learn the structured semantic information of the enterprise as a whole, which is jointly formed and reflected by all index items of the enterprise, and construct an anomaly detection index for structured semantic information anomalies;
a fifth processing module configured to fuse the anomaly detection index for anomalies within index items, the anomaly detection index for anomalies between index items, and the anomaly detection index for structured semantic information anomalies to construct a comprehensive anomaly detection index;
and a sixth processing module configured to obtain an enterprise anomaly score according to the comprehensive anomaly detection index.
9. An electronic device, characterized in that the electronic device includes a memory and a processor, the memory stores a computer program, and the processor, when executing the computer program, implements the steps of the enterprise risk identification method based on deep anomaly detection according to any one of claims 1 to 7.
10. A computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and when being executed by a processor, the computer program implements the steps of the enterprise risk identification method based on deep anomaly detection according to any one of claims 1 to 7.
CN202211161439.3A 2022-09-23 2022-09-23 Enterprise risk identification method and system based on deep anomaly detection Active CN115239215B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211161439.3A CN115239215B (en) 2022-09-23 2022-09-23 Enterprise risk identification method and system based on deep anomaly detection

Publications (2)

Publication Number Publication Date
CN115239215A true CN115239215A (en) 2022-10-25
CN115239215B CN115239215B (en) 2022-12-20

Family

ID=83667033

Country Status (1)

Country Link
CN (1) CN115239215B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115795056A (en) * 2023-01-04 2023-03-14 中国电子科技集团公司第十五研究所 Method, server and storage medium for constructing knowledge graph by unstructured information

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190132343A1 (en) * 2016-09-07 2019-05-02 Patternex, Inc. Method and system for generating synthetic feature vectors from real, labelled feature vectors in artificial intelligence training of a big data machine to defend
CN110020048A (en) * 2017-10-27 2019-07-16 北京宸信征信有限公司 A kind of business risk evaluation system and method based on open source data
CN111178704A (en) * 2019-12-17 2020-05-19 东方微银科技(北京)有限公司 Risk target identification method and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant