CN117874541A

CN117874541A - Method and device for generating resource flow data, computer equipment and storage medium

Info

Publication number: CN117874541A
Application number: CN202410080252.3A
Authority: CN
Inventors: 赵华煜; 刘晓辉; 王志远; 尹科峰
Original assignee: Jindiyun Technology Co ltd
Current assignee: Jindiyun Technology Co ltd
Priority date: 2024-01-19
Filing date: 2024-01-19
Publication date: 2024-04-12

Abstract

The present application relates to a method, an apparatus, a computer device, a storage medium and a computer program product for generating resource flow data. The method comprises the following steps: acquiring credential data; processing the credential data to obtain to-be-processed influence factor data; calculating the matching degree between the to-be-processed influence factor data and reference influence factor data; and determining the resource flow data corresponding to the credential data according to the matching degree. The method can improve the efficiency of generating resource flow data. The application can be applied to various business systems, such as enterprise resource planning (ERP) systems, enterprise management systems, financial systems, human resources systems and supply chain systems.

Description

Method and device for generating resource flow data, computer equipment and storage medium

Technical Field

The present application relates to the field of computer technologies, and in particular to a method, an apparatus, a computer device, a storage medium, and a computer program product for generating resource flow data.

Background

With the development of science and technology, the volume of generated data keeps growing. This is particularly evident for financial data, which every business produces. Cash flow is the collective term for the cash inflows, cash outflows and their total amount that an enterprise produces on a cash receipt and payment basis during a given accounting period through its economic activities, including operating activities, investment activities, financing activities and non-recurring items; that is, the amount of cash and cash equivalents flowing into and out of the enterprise during that period. Cash flow is classified into three categories according to its source: cash flow from operating activities, cash flow from investment activities, and cash flow from financing activities. Analyzing cash flow makes it possible to evaluate an enterprise's ability to obtain cash, its solvency, the quality of its revenue, and its investment and financing activities.

In the related art, when an enterprise analyzes the resource flow data in a voucher, the resource flow items in the credential data (such as cash flow items, accounting subjects and accounting dimensions) need to be preset and specified so that the data of the required resource flow items can be obtained directly. However, an enterprise's credential data items are numerous and complex and must accommodate a variety of business analysis requirements, so presetting the items is time-consuming and difficult to do accurately, and the efficiency of generating resource flow data is therefore low.

Disclosure of Invention

In view of the foregoing, it is desirable to provide a method, an apparatus, a computer device, a computer-readable storage medium, and a computer program product for generating resource flow data that can improve the efficiency of generating resource flow data.

In a first aspect, the present application provides a method for generating resource flow data. The method comprises the following steps:

acquiring credential data;

processing the credential data to obtain to-be-processed influence factor data;

calculating the matching degree between the to-be-processed influence factor data and the reference influence factor data;

and determining the resource flow data corresponding to the credential data according to the matching degree.

In one embodiment, the processing the credential data to obtain the to-be-processed influence factor data includes:

if the credential data has a negative credential entry with a resource value, converting the negative credential entry into a positive credential entry to obtain first credential data;

carrying out resource combination on the first credential data according to the attribute of the credential entry to obtain second credential data;

performing pairing treatment on the two sides of the resource interaction on the second credential data to obtain credential data in a preset format;

and carrying out normalization processing on the initial influence factor data in the credential data in the preset format to obtain the influence factor data to be processed.

In one embodiment, the calculating the matching degree between the to-be-processed influence factor data and the reference influence factor data includes:

acquiring initial influence factor weights;

performing weighted calculation on the to-be-processed influence factor data according to the initial influence factor weight to obtain weighted influence factor data;

calculating a first distance between the weighted influence factor data and a reference origin;

and calculating the matching degree between the to-be-processed influence factor data and the reference influence factor data based on the first distance.

In one embodiment, the calculating, based on the first distance, a matching degree between the to-be-processed influence factor data and the reference influence factor data includes:

acquiring a second distance corresponding to the reference influence factor data based on a resource flow model;

and calculating the matching degree between the first distance and the second distance, and taking the matching degree between the first distance and the second distance as the matching degree between the to-be-processed influence factor data and the reference influence factor data.

In one embodiment, the determining, according to the matching degree, the resource flow data corresponding to the credential data includes:

if at least two second distances are equal and have the highest matching degree with the first distance, determining, according to the initial influence factor weight and from the reference influence factor data corresponding to the at least two second distances, the reference influence factor data that best matches the to-be-processed influence factor data corresponding to the first distance, and taking that reference influence factor data as target influence factor data;

and taking the resource flow data corresponding to the target influence factor data as the resource flow data of the influence factor data to be processed corresponding to the first distance.
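By way of a non-limiting illustration, the tie-breaking step above might be sketched as follows. The text does not specify how the initial influence factor weight is applied when several second distances are equal, so the weighted per-factor gap used here, and the names `break_tie`, `pending`, and `candidates`, are assumptions made only for this example.

```python
def break_tie(pending, candidates, weights):
    """Pick one reference among candidates whose second distances are equal.

    pending    -- normalized to-be-processed influence factor values
    candidates -- list of (reference_factor_values, flow_item) pairs
    weights    -- initial influence factor weights, one per factor

    Assumption: the best candidate is the one with the smallest
    weighted sum of per-factor absolute differences.
    """
    def weighted_gap(reference):
        return sum(w * abs(p - r) for p, r, w in zip(pending, reference, weights))

    return min(candidates, key=lambda c: weighted_gap(c[0]))[1]
```

Comparing factor by factor separates references that happen to lie at the same distance from the origin but differ in their individual influence factors.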

In one embodiment, the method for building the resource flow model includes:

acquiring a first training data set; the first training data set includes first influence factor data and first resource flow data;

normalizing the first influence factor data to obtain normalized first influence factor data;

weighting calculation is carried out on the normalized first influence factor data according to preset influence factor weights, so that first weighted influence factor data are obtained;

calculating a reference distance between first weighted influence factor data of each piece of data in the first training data set and a reference origin;

and generating the resource flow model according to the reference distance, the preset influence factor weight and the first resource flow data.
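By way of a non-limiting illustration, the model-building steps above might be sketched as follows. The dictionary layout of the model, the function name `build_model`, and the use of the Euclidean distance alone are assumptions made for this example; the input factor values are taken to be already normalized.

```python
import math

def build_model(training_rows, preset_weights):
    """Build a simple lookup model from normalized training data.

    training_rows  -- list of (normalized_factor_values, flow_item) pairs
    preset_weights -- preset influence factor weights, one per factor
    """
    references = []
    for factors, flow_item in training_rows:
        # Weight each normalized factor, then measure the distance to the origin.
        weighted = [f * w for f, w in zip(factors, preset_weights)]
        distance = math.sqrt(sum(x * x for x in weighted))
        references.append(
            {"distance": distance, "factors": list(factors), "flow_item": flow_item}
        )
    return {"weights": list(preset_weights), "references": references}
```

Storing the reference distance alongside each flow item means that matching later only needs to compare distances rather than recompute them.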

In one embodiment, the method further comprises:

acquiring a second training data set; the second training data set includes second influence factor data;

normalizing the second influence factor data to obtain normalized second influence factor data;

performing iterative computation a preset number of times to obtain a preset number of candidate distances, wherein each iterative computation comprises: randomly offsetting the preset influence factor weight to obtain an offset influence factor weight; performing weighted calculation on the normalized second influence factor data based on the offset influence factor weight to obtain second weighted influence factor data; and calculating a candidate distance between the second weighted influence factor data of each piece of data in the second training data set and a reference origin;

and determining a target influence factor weight from the offset influence factor weights according to the matching condition between each candidate distance and the reference distance, and optimizing the resource flow model according to the target influence factor weight.
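By way of a non-limiting illustration, the iterative weight optimization above might be sketched as a random search. The offset range `scale`, the summed absolute distance error used as the matching condition, and keeping the preset weight when no offset improves on it are all assumptions made for this example.

```python
import math
import random

def distance(factors, weights):
    """Euclidean distance from the origin of the weighted factor values."""
    return math.sqrt(sum((f * w) ** 2 for f, w in zip(factors, weights)))

def optimize_weights(preset, rows, reference_distances, rounds=100, scale=0.05, seed=0):
    """Randomly offset the preset weights and keep the best-matching set.

    rows                -- normalized second influence factor data
    reference_distances -- reference distance for each row, in the same order
    """
    rng = random.Random(seed)

    def mismatch(weights):
        return sum(abs(distance(r, weights) - d)
                   for r, d in zip(rows, reference_distances))

    best, best_err = list(preset), mismatch(preset)
    for _ in range(rounds):
        # Each iteration offsets the preset weights, not the previous candidate.
        shifted = [max(0.0, w + rng.uniform(-scale, scale)) for w in preset]
        err = mismatch(shifted)
        if err < best_err:
            best, best_err = shifted, err
    return best
```

Fixing the random seed keeps the search reproducible; in practice the number of rounds corresponds to the preset number of iterations.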

In a second aspect, the present application further provides a device for generating resource flow data. The device comprises:

the credential data acquisition module is used for acquiring credential data;

the data preprocessing module is used for processing the credential data to obtain to-be-processed influence factor data;

the matching degree calculation module is used for calculating the matching degree between the to-be-processed influence factor data and the reference influence factor data;

and the resource flow data determining module is used for determining the resource flow data corresponding to the credential data according to the matching degree.

In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor which, when executing the computer program, performs the following steps:

acquiring credential data;

processing the credential data to obtain to-be-processed influence factor data;

In a fourth aspect, the present application also provides a computer-readable storage medium. The computer-readable storage medium has stored thereon a computer program which, when executed by a processor, performs the following steps:

acquiring credential data;

processing the credential data to obtain to-be-processed influence factor data;

In a fifth aspect, the present application also provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, implements the steps of:

acquiring credential data;

processing the credential data to obtain to-be-processed influence factor data;

According to the above method, apparatus, computer device, storage medium and computer program product for generating resource flow data, credential data is acquired and processed to obtain to-be-processed influence factor data, the matching degree between the to-be-processed influence factor data and reference influence factor data is calculated, and the resource flow data corresponding to the credential data is determined according to the matching degree. The resource flow data corresponding to the credential data can thus be obtained automatically, without pre-configuring and specifying resource flow items, which improves the efficiency of generating resource flow data. At the same time, the uncontrollable factors introduced by manual pre-configuration are avoided, which can improve the accuracy of the generated resource flow data.

Drawings

FIG. 1 is an application environment diagram of a method for generating resource flow data in one embodiment;

FIG. 2 is a flow diagram of a method for generating resource flow data in one embodiment;

FIG. 3 is a flow chart of step 206 in one embodiment;

FIG. 4 is a schematic diagram of an optimization flow of a resource flow model in one embodiment;

FIG. 5 is a flow diagram of a method for generating resource flow data in one embodiment;

FIG. 6 is a block diagram of an apparatus for generating resource flow data in one embodiment;

FIG. 7 is an internal structural diagram of a computer device in one embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.

The method for generating resource flow data provided by the embodiments of the application can be applied to the application environment shown in FIG. 1, in which the terminal 102 communicates with the server 104 via a network. A data storage system may store the data that the server 104 needs to process; it may be integrated on the server 104 or located on a cloud or other network server. The terminal 102 sends credential data to the server 104. The server 104 obtains the credential data, processes it to obtain to-be-processed influence factor data, calculates the matching degree between the to-be-processed influence factor data and reference influence factor data, and determines the resource flow data corresponding to the credential data according to the matching degree. The resource flow data can be used to generate a resource flow statement, for example a cash flow statement. The terminal 102 may be, but is not limited to, a personal computer, notebook computer, smart phone, tablet computer, Internet of Things device or portable wearable device. The Internet of Things device may be a smart speaker, smart television, smart air conditioner, smart vehicle-mounted device or the like. The portable wearable device may be a smart watch, smart bracelet, headset or the like. The server 104 may be implemented as a stand-alone server or as a server cluster composed of multiple servers.

It can be understood that the embodiments of the application can be used not only in scenarios where the server and the terminal interact, but also in scenarios involving the terminal alone or the server alone.

In one embodiment, as shown in FIG. 2, a method for generating resource flow data is provided. The method is described as applied to the server in FIG. 1 by way of example, and includes the following steps 202 to 208.

Step 202, obtaining credential data.

The server may obtain the credential data by receiving it from a terminal or another server, or through the server's own human-machine interaction interface.

The credential data refers to accounting voucher data that has not undergone any data processing, i.e., the data on the accounting vouchers as originally recorded by an enterprise. An accounting voucher is a written proof, in a prescribed format, that records an economic transaction, establishes economic responsibility, and serves as the basis for entries in the account books. According to how they are prepared and used, accounting vouchers are divided into original vouchers and bookkeeping vouchers. Original vouchers, also called source documents, are the original written vouchers filled in when an economic transaction first occurs, such as sales invoices and payment receipts. Bookkeeping vouchers are prepared on the basis of original vouchers that have been verified as correct: the transactions are classified by content, the accounting entries are determined, and the voucher is filled in accordingly. Bookkeeping vouchers are the direct basis for entries in the account books; common bookkeeping vouchers include receipt vouchers, payment vouchers and transfer vouchers. The credential data in this embodiment includes data of original vouchers or data of bookkeeping vouchers.

Step 204, processing the credential data to obtain to-be-processed influence factor data.

The server can process the credential data to obtain credential data in a preset format, which contains the to-be-processed influence factor data. The to-be-processed influence factor data refers to the factors that influence the resource flow data corresponding to the credential data, and includes, but is not limited to, account books, accounting subjects, accounting dimensions, lending (debit/credit) directions or other preset fields. An accounting dimension defines the dimension and scope in which an accounting subject requires auxiliary accounting. It will be appreciated that any business data, such as a service provider, vendor or engineering project, may be defined as an accounting dimension, and there may be multiple accounting dimensions; no limitation is imposed here.

In an alternative embodiment, the format of the credential data may be converted to obtain credential data in a preset format. For example, the resource values in the credential data may be converted so that they share the same lending direction; credential entries with the same attributes may be merged to reduce the amount of data in subsequent calculations; or the two parties to a resource interaction may be paired, that is, the transaction data of the two parties, which appear in pairs, are placed in the same entry record. Initial influence factor data is then extracted from the credential data in the preset format and normalized to obtain the to-be-processed influence factor data.

In this embodiment, the credential data may be the data of one or more vouchers of the same account book.

Step 206, calculating the matching degree between the to-be-processed influence factor data and the reference influence factor data.

The reference influence factor data is preset influence factor data; it can be obtained by training a machine learning model on training data, or it can be influence factor data predefined by a user. The reference influence factor data is the standard factor that characterizes the influence on the resource flow data.

Optionally, when the server calculates the matching degree between the to-be-processed influence factor data and the reference influence factor data, the matching degree can be characterized by the distance between them. For example, the distance may be calculated with KNN (K-Nearest Neighbor) or SVM (Support Vector Machine). The matching degree can also be calculated directly, for example with a naive Bayes algorithm.

Optionally, the to-be-processed influence factor data has a corresponding influence factor weight. The to-be-processed influence factor data can be weighted by the influence factor weight to obtain weighted influence factor data, and the matching degree between the weighted influence factor data and the reference influence factor data is then calculated.

Step 208, determining the resource flow data corresponding to the credential data according to the matching degree.

In this embodiment, the resource flow data corresponding to the credential data may be determined according to the matching degree between the to-be-processed influence factor data and the reference influence factor data. For example, the reference influence factor data with the highest matching degree to the to-be-processed influence factor data may be selected, and the resource flow data corresponding to that reference influence factor data may be taken as the resource flow data corresponding to the to-be-processed influence factor data, that is, the resource flow data corresponding to the credential data.

According to the above method for generating resource flow data, credential data is acquired and processed to obtain to-be-processed influence factor data, the matching degree between the to-be-processed influence factor data and the reference influence factor data is calculated, and the resource flow data corresponding to the credential data is determined according to the matching degree. The resource flow data corresponding to the credential data can thus be obtained automatically, which improves generation efficiency compared with configuring and specifying resource flow items in advance. At the same time, the uncontrollable factors introduced by manual pre-configuration are avoided, which can improve the accuracy of the generated resource flow data. In addition, users can define any to-be-processed influence factor as needed, so the method suits a wider range of application scenarios. The method can be applied to various business systems, such as enterprise resource planning (ERP) systems, enterprise management systems, financial systems, human resources systems and supply chain systems.

In some embodiments, the step 204 of processing the credential data to obtain the to-be-processed influence factor data includes:

if the credential data contains a credential entry with a negative resource value, converting the negative credential entry into a positive credential entry to obtain first credential data; carrying out resource combination on the first credential data according to the attributes of the credential entries to obtain second credential data; performing pairing processing of the two parties to the resource interaction on the second credential data to obtain credential data in a preset format; and normalizing the initial influence factor data in the credential data in the preset format to obtain the to-be-processed influence factor data.

In this embodiment, if the credential data contains a credential entry whose resource value is negative, that entry is a negative credential entry, and it is converted into a positive credential entry to obtain the first credential data. Positive and negative here characterize the lending direction. For example, if the credential data is the data of a receipt voucher and a credential entry in it relates to a payment, that entry is a negative credential entry. In an alternative example, a credential entry with a negative resource value is converted into the opposite lending direction to become a positive credential entry; that is, if the resource value of a payment-type credential entry is negative, the entry is restated as a collection-type entry. It will be appreciated that each record in the credential data is one credential entry. This embodiment can alternatively convert uniformly in the other direction, that is, convert positive credential entries into negative credential entries; the choice can be made according to the actual application scenario.
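By way of a non-limiting illustration, the direction conversion might be sketched as follows. The `Entry` record and its field names, and the labels "debit" and "credit" for the two lending directions, are assumptions made for this example and are not prescribed by the embodiment.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Entry:
    account: str     # accounting subject (hypothetical field name)
    dimension: str   # accounting dimension
    direction: str   # lending direction: "debit" or "credit"
    amount: float    # resource value

def to_positive(entry: Entry) -> Entry:
    """Restate a negative-valued entry in the opposite direction with a positive value."""
    if entry.amount >= 0:
        return entry
    flipped = "credit" if entry.direction == "debit" else "debit"
    return replace(entry, direction=flipped, amount=-entry.amount)
```

For example, a credit entry of -50 becomes a debit entry of 50, so that all resource values downstream are non-negative.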

The server can merge resources in the first credential data according to the attributes of the credential entries to obtain the second credential data. The attributes of a credential entry can include subject, accounting dimension, resource type, lending direction and the like; subjects distinguish different accounting subjects, and resource types distinguish different types of resources. Optionally, the second credential data may be obtained by merging credential entries that share at least one of the same subject, the same accounting dimension, the same resource type, or the same lending direction.
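By way of a non-limiting illustration, merging entries that share all four attributes might be sketched as follows. The dictionary keys and the choice to require all four attributes to match (rather than any one of them) are assumptions made for this example.

```python
from collections import defaultdict

def merge_entries(entries):
    """Sum the amounts of entries sharing subject, dimension, resource type and direction."""
    totals = defaultdict(float)
    for e in entries:
        key = (e["account"], e["dimension"], e["resource_type"], e["direction"])
        totals[key] += e["amount"]
    # dict preserves insertion order, so merged entries keep first-seen order.
    return [
        {"account": a, "dimension": d, "resource_type": r, "direction": s, "amount": amt}
        for (a, d, r, s), amt in totals.items()
    ]
```

Merging in this way reduces the number of entries that later steps need to pair and match.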

The server can perform pairing processing of the two parties to the resource interaction on the second credential data to obtain credential data in a preset format. Pairing the two parties to a resource interaction means that the two parties interacting over the same resource object are paired and placed in the same piece of credential data. For example, if A pays 2 resources to B, there is correspondingly a record of B receiving 2 resources from A, so A and B form one item of pairing data in one interaction dimension. That is, each piece of credential data in the preset format includes at least one item of pairing data formed by the two parties to a resource interaction. It can be understood that each piece of credential data in the preset format may include pairing data formed by multiple resource interaction parties, where the subjects or dimensions corresponding to each item of pairing data are the factors that influence the resource flow data, that is, the initial influence factor data.
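By way of a non-limiting illustration, pairing might be sketched as a greedy allocation between the two sides of a voucher. The embodiment does not specify a pairing algorithm, so the greedy strategy, the `(account, amount)` tuple layout, and the function name `pair_entries` are assumptions made for this example.

```python
def pair_entries(debits, credits, eps=1e-9):
    """Greedily allocate amounts between debit-side and credit-side entries.

    debits, credits -- lists of (account, amount) tuples with positive amounts
    Returns (debit_account, credit_account, amount) rows, one per pairing.
    """
    pairs = []
    remaining_credits = [[acct, amt] for acct, amt in credits]  # mutable copies
    j = 0
    for d_acct, d_amt in debits:
        remaining = d_amt
        while remaining > eps and j < len(remaining_credits):
            c_acct, c_amt = remaining_credits[j]
            used = min(remaining, c_amt)
            pairs.append((d_acct, c_acct, used))
            remaining -= used
            remaining_credits[j][1] -= used
            if remaining_credits[j][1] <= eps:
                j += 1  # this credit entry is fully allocated
    return pairs
```

Each resulting row holds both parties to one interaction, which is the shape that the subsequent extraction of initial influence factor data relies on.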

The server normalizes the initial influence factor data in the credential data in the preset format to obtain the to-be-processed influence factor data. Normalization converts the initial influence factor data into values between 0 and 1. It may be performed, for example, by at least one of a feature scaling method, an internal code method, or a serial number method. The feature scaling method concatenates the codes of the initial influence factor data that have a larger influence on the resource flow data and uses the result as the basis for normalization; the internal code method uses the numeric internal code of the initial influence factor data, which can be understood as a numeric identifier, as the basis for normalization; the serial number method sorts the character identifiers of the initial influence factor data and uses the resulting serial numbers as the basis for normalization.
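By way of a non-limiting illustration, the serial number method might be sketched as follows. Dividing the rank by the number of distinct identifiers plus one, so that every value lies strictly between 0 and 1, is an assumption made for this example, as is the function name `serial_number_normalize`.

```python
def serial_number_normalize(identifiers):
    """Map each distinct identifier to rank / (count + 1) after sorting."""
    ordered = sorted(set(identifiers))
    n = len(ordered)
    return {ident: (i + 1) / (n + 1) for i, ident in enumerate(ordered)}
```

With three subject codes, for instance, the sorted codes map to 0.25, 0.5 and 0.75. Keeping the values strictly above 0 means that no entry coincides with the coordinate origin when distances are later computed.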

It can be understood that, in this embodiment, the steps of processing the credential data to obtain the to-be-processed influence factor data need not be executed in the above order; the order may be adjusted according to the actual application scenario. For example, the credential data may first be merged according to the attributes of the credential entries, and the direction of the credential entries converted afterwards.

In this embodiment, if a negative credential entry exists in the credential data, it is converted into a positive credential entry to obtain first credential data; the first credential data is merged according to the attributes of the credential entries to obtain second credential data; the two parties to each resource interaction are paired in the second credential data to obtain credential data in a preset format; and the initial influence factor data in the credential data in the preset format is normalized to obtain the to-be-processed influence factor data. Credential data in the preset format can thus be obtained quickly before the initial influence factor data is normalized, which improves the efficiency of producing the to-be-processed influence factor data and, in turn, the efficiency of generating resource flow data.

In some embodiments, the step 206 of calculating the matching degree between the to-be-processed influence factor data and the reference influence factor data includes the following steps 302 to 308.

Step 302, obtaining an initial influence factor weight.

The initial influence factor weight refers to the weight corresponding to the influence factor data to be processed. The initial influence factor weight can be a preset weight or a weight obtained through model training. The initial influence factor weight is used for representing the influence degree of the influence factor data to be processed on the resource flow data, and the larger the initial influence factor weight is, the larger the influence degree of the influence factor data to be processed on the resource flow data is.

Step 304, carrying out weighted calculation on the to-be-processed influence factor data according to the initial influence factor weight to obtain weighted influence factor data.

The server can perform weighted calculation on the to-be-processed influence factor data according to the initial influence factor weight to obtain weighted influence factor data. Optionally, each initial influence factor weight may be multiplied by the corresponding to-be-processed influence factor data to obtain the weighted influence factor data.

Step 306, calculating a first distance between the weighted influence factor data and a reference origin.

In this embodiment, the reference origin may be understood as the coordinate origin. Each credential entry in the credential data in the preset format corresponds to a point in the coordinate system other than the origin. Each credential entry has multiple items of weighted influence factor data, and each item is the coordinate value of that point along one dimension; that is, the number of items of weighted influence factor data in a credential entry is the number of dimensions of the coordinate system. The first distance between the weighted influence factor data and the reference origin may be calculated using at least one of the Euclidean distance and the Manhattan distance; for example, both may be calculated and their average taken as the first distance.
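By way of a non-limiting illustration, the weighting of step 304 and the distance of step 306 might be sketched as follows, using the averaging of the Euclidean and Manhattan distances given as an example in the text. The function names `weight_factors` and `first_distance` are assumptions made for this example.

```python
import math

def weight_factors(factors, weights):
    """Multiply each normalized influence factor by its weight."""
    return [f * w for f, w in zip(factors, weights)]

def first_distance(weighted_factors):
    """Average of the Euclidean and Manhattan distances to the origin."""
    euclidean = math.sqrt(sum(x * x for x in weighted_factors))
    manhattan = sum(abs(x) for x in weighted_factors)
    return (euclidean + manhattan) / 2
```

For a weighted point (3, 4), the Euclidean distance is 5 and the Manhattan distance is 7, so the first distance under this scheme is 6.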

Step 308, calculating the matching degree between the to-be-processed influence factor data and the reference influence factor data based on the first distance.

In this embodiment, the matching degree between the to-be-processed influence factor data and the reference influence factor data may be represented by distance: the closer the distances, the larger the matching degree, and the larger the difference between the distances, the smaller the matching degree. Optionally, the distance between the weighted influence factor data and the reference origin is a first distance, the distance between the reference influence factor data and the reference origin is a second distance, and the matching degree between the to-be-processed influence factor data and the reference influence factor data can be determined according to the first distance and the second distance.

In this embodiment, an initial influence factor weight is acquired, the to-be-processed influence factor data is weighted according to it to obtain weighted influence factor data, a first distance between the weighted influence factor data and the reference origin is calculated, and the matching degree between the to-be-processed influence factor data and the reference influence factor data is calculated based on the first distance. Calculating the matching degree by distance in this way can improve the efficiency of the matching degree calculation and, in turn, the efficiency of generating resource flow data.

In some embodiments, calculating the matching degree between the to-be-processed influence factor data and the reference influence factor data based on the first distance includes:

acquiring a second distance corresponding to the reference influence factor data based on the resource flow model; and calculating the matching degree between the first distance and the second distance, and taking the matching degree between the first distance and the second distance as the matching degree between the to-be-processed influence factor data and the reference influence factor data.

The resource flow model refers to a model that can generate resource flow data based on credential data. The second distance corresponding to the reference influence factor data is stored in the resource flow model; that is, the second distance is the distance between the reference influence factor data and the reference origin. The server calculates the matching degree between the first distance and each second distance, and takes the matching degree between the first distance and the second distance as the matching degree between the to-be-processed influence factor data and the reference influence factor data. The closer the first distance and the second distance are, that is, the smaller the difference between them, the greater their matching degree; the larger the difference between the first distance and the second distance, the smaller their matching degree.
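The application does not give a formula for the matching degree between two distances; it only requires that a smaller difference yield a larger matching degree. The sketch below assumes one possible monotone mapping, 1 / (1 + |d1 - d2|), purely for illustration. The function names are hypothetical.

```python
def matching_degree(first_distance, second_distance):
    """Assumed score in (0, 1]: equals 1.0 when the two distances coincide
    and decreases as the absolute difference between them grows."""
    return 1.0 / (1.0 + abs(first_distance - second_distance))


def rank_second_distances(first_distance, second_distances):
    """Indices of the stored second distances, best match first."""
    return sorted(
        range(len(second_distances)),
        key=lambda i: matching_degree(first_distance, second_distances[i]),
        reverse=True,
    )
```

Any other strictly decreasing function of the difference would preserve the same ranking, so the choice of mapping only matters when a preset matching degree is used as a threshold.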

In the embodiment, a second distance corresponding to the reference influence factor data is obtained based on the resource flow model; the matching degree between the first distance and the second distance is calculated, and the matching degree between the first distance and the second distance is used as the matching degree between the to-be-processed influence factor data and the reference influence factor data, so that the efficiency of obtaining the matching degree between the to-be-processed influence factor data and the reference influence factor data can be improved.

In some embodiments, determining the resource traffic data corresponding to the credential data according to the degree of matching includes:

if at least two second distances are the same and the matching degree between the second distances and the first distances is the highest, determining reference influence factor data with the highest matching degree of the to-be-processed influence factor data corresponding to the first distances from the reference influence factor data corresponding to the at least two second distances according to the initial influence factor weight, and taking the reference influence factor data as target influence factor data; and taking the resource flow data corresponding to the target influence factor data as the resource flow data of the influence factor data to be processed corresponding to the first distance.

In this embodiment, the second distance is the distance between reference influence factor data and the reference origin, and there are a plurality of second distances. The second distances in the resource flow model are calculated based on the reference influence factor data in the model training data set: each piece of reference influence factor data in the model training data set corresponds to one second distance, and the second distances corresponding to different reference influence factor data may be the same. If at least two second distances are the same and have the highest matching degree with the first distance, one piece of target influence factor data that best matches the to-be-processed influence factor data corresponding to the first distance needs to be determined from the reference influence factor data corresponding to those second distances. Specifically, according to the initial influence factor weight, the reference influence factor data with the highest matching degree with the to-be-processed influence factor data corresponding to the first distance is determined from the reference influence factor data corresponding to the at least two second distances and taken as the target influence factor data. The resource flow data corresponding to the target influence factor data is taken as the resource flow data of the to-be-processed influence factor data corresponding to the first distance, and the resource flow data of all the to-be-processed influence factor data is taken as the resource flow data corresponding to the credential data.

In an alternative embodiment, the reference influence factor data corresponding to the at least two second distances may be compared dimension by dimension in descending order of the initial influence factor weights, until the values of the reference influence factor data corresponding to the same initial influence factor weight are inconsistent; among those values, the reference influence factor data closest to the to-be-processed influence factor data is then used as the target influence factor data. For example, suppose two second distances are the same and have the highest matching degree with the first distance, the reference influence factor data corresponding to the two second distances are A = {A1, A2, A3, A4, A5} and B = {B1, B2, B3, B4, B5} respectively, the initial influence factor weights are {P1, P2, P3, P4, P5}, and the descending order of the initial influence factor weights is P2 > P1 > P3 > P5 > P4. Following this order, A2 and B2, which correspond to P2, are compared first. If A2 and B2 are different, the matching degrees of A2 and of B2 with the to-be-processed influence factor data corresponding to the first distance are compared; if the matching degree of A2 is greater than that of B2, the reference influence factor data A is determined as the target influence factor data. If A2 and B2 are the same, A1 and B1, which correspond to P1, are compared next, and so on, until the reference influence factor data with the highest matching degree with the to-be-processed influence factor data corresponding to the first distance is determined and used as the target influence factor data.
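The tie-breaking order in the example above can be sketched as follows. This is a minimal illustration under stated assumptions: influence factor values are numeric, "closest" means the smallest absolute difference in the compared dimension, and the function name `break_tie` is hypothetical. If every dimension is identical across the candidates, the sketch simply returns the first candidate.

```python
def break_tie(pending, candidates, weights):
    """Choose among reference data sharing the same second distance.

    pending:    to-be-processed influence factor values (list of numbers)
    candidates: dict mapping a label to its reference influence factor values
    weights:    initial influence factor weights, one per dimension
    """
    # Visit dimensions from the largest weight to the smallest.
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    for i in order:
        values = {label: data[i] for label, data in candidates.items()}
        if len(set(values.values())) == 1:
            continue  # all candidates agree here, move to the next weight
        # First dimension where the candidates differ: keep the closest one.
        return min(values, key=lambda label: abs(values[label] - pending[i]))
    return next(iter(candidates))


# Weights ordered P2 > P1 > P3 > P5 > P4, as in the example in the text.
weights = [0.25, 0.4, 0.2, 0.05, 0.1]
candidates = {
    "A": [0.3, 0.5, 0.1, 0.9, 0.2],
    "B": [0.6, 0.5, 0.1, 0.2, 0.7],
}
pending = [0.5, 0.5, 0.1, 0.4, 0.3]
chosen = break_tie(pending, candidates, weights)
```

Here A2 and B2 are equal, so the comparison falls through to the P1 dimension, where B1 is closer to the pending value and B is selected.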

In this embodiment, if at least two second distances are the same and the matching degree between the second distances and the first distances is the highest, determining, according to the initial influence factor weight, reference influence factor data with the highest matching degree between the to-be-processed influence factor data corresponding to the first distances from the reference influence factor data corresponding to the at least two second distances, as target influence factor data; the resource flow data corresponding to the target influence factor data is used as the resource flow data of the to-be-processed influence factor data corresponding to the first distance, so that the accuracy of matching between the to-be-processed influence factor data and the reference influence factor data can be improved, and the accuracy of the obtained resource flow data is improved.

In one embodiment, the resource traffic model construction method includes:

acquiring a first training data set; wherein the first training data set comprises first impact factor data and first resource traffic data; normalizing the first influence factor data to obtain normalized first influence factor data; weighting calculation is carried out on the normalized first influence factor data according to preset influence factor weights, so that first weighted influence factor data are obtained; calculating a reference distance between the first weighted influence factor data of each piece of data in the first training data set and the reference origin; and generating a resource flow model according to the reference distance, the preset influence factor weight and the first resource flow data.

In this embodiment, a resource flow model is generated based on a first training data set, where the first training data set includes first influence factor data and first resource flow data. The first resource flow data may be, for example, resource flow main table items and resource flow auxiliary table items, which represent the specific flow directions of various resources; the resource flow auxiliary table item further supplements and refines the resource flow main table item. In one example, the structure of the first training data set is shown in Table 1 below, and the first training data set includes the credential internal code, the first influence factor data and the first resource flow data. The credential internal code is a code that uniquely identifies a credential, and the credential internal code in the first training data set is derived from the credential data corresponding to the first influence factor data. The own party and the counterparty are the two interaction parties; the first influence factor data includes factors such as subjects and accounting dimensions, and consists of flow interaction data formed by pairing the two interaction parties; the first resource flow data includes a resource flow main table item and a resource flow auxiliary table item.

TABLE 1

The first influence factor data is normalized to obtain normalized first influence factor data, the normalized first influence factor data is weighted according to the preset influence factor weights to obtain first weighted influence factor data, and a reference distance between the first weighted influence factor data of each piece of data in the first training data set and the reference origin is calculated, where each piece of data in the first training data set can be understood as the data corresponding to one record in the table of the first training data set. A resource flow model is generated according to the reference distance, the preset influence factor weight and the first resource flow data. Alternatively, the resource flow model may be generated from the account book internal code, the preset influence factor weight, the reference distance and the first resource flow data. The account book internal code refers to the code of the account book to which the first resource flow data belongs, and may be preset. It should be noted that the resource flow model may be a report storing the reference influence factor data, the initial influence factor weight, the second distance corresponding to the reference influence factor data, and the corresponding resource flow data.

For the normalization, weighted calculation and distance calculation in this embodiment, reference may be made to the description of the related content in the above embodiments.

In this embodiment, a first training data set including first influence factor data and first resource flow data is acquired; the first influence factor data is normalized to obtain normalized first influence factor data; the normalized first influence factor data is weighted according to preset influence factor weights to obtain first weighted influence factor data; a reference distance between the first weighted influence factor data of each piece of data in the first training data set and the reference origin is calculated; and a resource flow model is generated according to the reference distance, the preset influence factor weight and the first resource flow data. The resource flow model thus enables rapid generation of resource flow data without resource flow items having to be specified manually, which improves the generation efficiency of the resource flow data.
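The construction steps above can be sketched as follows. This is a minimal illustration with several assumptions that are not taken from the application: the influence factors are already encoded as numbers, normalization is column-wise min-max scaling, the reference distance is Euclidean, and the model is a plain dictionary. All names are hypothetical.

```python
def min_max_normalize(rows):
    """Scale each column to [0, 1]; constant columns become 0.0 (assumed rule)."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def build_model(factor_rows, flow_items, preset_weights):
    """Store one reference distance and one flow item per training record."""
    records = []
    for factors, item in zip(min_max_normalize(factor_rows), flow_items):
        weighted = [w * x for w, x in zip(preset_weights, factors)]
        distance = sum(v * v for v in weighted) ** 0.5  # Euclidean, assumed
        records.append({"reference_distance": distance, "flow_item": item})
    return {"weights": list(preset_weights), "records": records}


model = build_model(
    factor_rows=[[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]],
    flow_items=["item-1", "item-2", "item-3"],  # placeholder labels
    preset_weights=[1.0, 1.0],
)
```

Storing the weights alongside the per-record distances mirrors the description of the model as a stored report rather than a trained parametric model.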

The above embodiments describe the generation process of the resource flow model, and the following mainly describes the optimization process of the resource flow model, and in particular mainly optimizes the data stored in the resource flow model.

In some embodiments, as shown in fig. 4, the method for generating resource traffic data further includes an optimization procedure of the resource traffic model, including the following steps 402 to 408.

Step 402, obtaining a second training data set; wherein the second training data set comprises second impact factor data.

And step 404, normalizing the second influence factor data to obtain normalized second influence factor data.

A second training data set is acquired, the second training data set being different from the first training data set and including second influence factor data, and the second influence factor data is normalized to obtain normalized second influence factor data. Optionally, the second training data set may be obtained as follows. The credential data is processed to obtain to-be-processed influence factor data, and the matching degree between the to-be-processed influence factor data and the reference influence factor data in the resource flow model is calculated. A balance check is performed on the reference influence factor data whose matching degree is higher than a preset matching degree. The balance check verifies the pairing accuracy of the two resource interaction parties: for example, the balance check fails if the reference influence factor data contains data for only one party of the resource interaction, or contains both parties but with different amounts. Reference influence factor data that fails the balance check may be removed. Reference influence factor data whose matching degree is not higher than the preset matching degree is added to a to-be-confirmed list, and the reference influence factor data in the list is corrected to obtain corrected reference influence factor data. It can be understood that the corrected reference influence factor data at least passes the balance check and has a matching degree with the to-be-processed influence factor data higher than the preset matching degree; the correction may be performed by a correction algorithm or manually. The reference influence factor data that passes the balance check and the corrected reference influence factor data are then used as the second training data set.
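The screening of candidate training data described above can be sketched as follows. The record layout (the keys `own_amount`, `counterparty_amount` and `matching_degree`), the tolerance and the function names are assumptions made for illustration; the correction step itself is not modeled.

```python
def passes_balance_check(record, tolerance=1e-9):
    """Fail if one interaction party is missing or the two amounts differ."""
    own = record.get("own_amount")
    counterparty = record.get("counterparty_amount")
    if own is None or counterparty is None:
        return False
    return abs(own - counterparty) <= tolerance


def triage(records, preset_matching_degree):
    """Split records into (passed, removed, to_confirm).

    Records above the preset matching degree are balance-checked; the rest
    go to the to-be-confirmed list for later correction.
    """
    passed, removed, to_confirm = [], [], []
    for record in records:
        if record["matching_degree"] > preset_matching_degree:
            if passes_balance_check(record):
                passed.append(record)
            else:
                removed.append(record)
        else:
            to_confirm.append(record)
    return passed, removed, to_confirm
```

Under this sketch, the second training data set would be the `passed` records together with the `to_confirm` records once they have been corrected.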

Step 406, performing iterative computation for a preset number of times to obtain a preset number of candidate distances, where each iterative computation includes: randomly shifting the preset influence factor weight to obtain the shifted influence factor weight; weighting calculation is carried out on the normalized second influence factor data based on the offset influence factor weight, so that second weighted influence factor data are obtained; a candidate distance between the second weighted influence factor data and the reference origin in each piece of data in the second training set is calculated.

Iterative computation is performed a preset number of times to obtain a preset number of candidate distances, that is, each iteration yields one candidate distance. Each iteration includes: randomly offsetting the preset influence factor weight to obtain an offset influence factor weight, weighting the normalized second influence factor data based on the offset influence factor weight to obtain second weighted influence factor data, and calculating a candidate distance between the second weighted influence factor data of each piece of data in the second training data set and the reference origin. The random offset of the preset influence factor weight may be implemented by a random algorithm or a correction algorithm; for example, the random offset may be obtained through training with a machine learning algorithm such as an RNN. In this embodiment, the second weighted influence factor data of each piece of data in the second training data set corresponds to one candidate distance to the reference origin, so a plurality of pieces of data correspond to a plurality of candidate distances.

Each iteration is performed on the basis of the previous one. That is, the first iteration randomly offsets the preset influence factor weights to obtain first offset influence factor weights, the second iteration randomly offsets the first offset influence factor weights obtained in the previous iteration to obtain second offset influence factor weights, and so on. After the preset number of iterations, a preset number of offset influence factor weights are obtained, and correspondingly a preset number of candidate distances are obtained.
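The chained random offset can be sketched as follows. This is only one possible reading: the offset is drawn uniformly from a small interval, weights are clamped at zero, the candidate distance is Euclidean, and each iteration produces one candidate distance per record. The `scale` and `seed` parameters and the function names are hypothetical.

```python
import random


def random_offset(weights, scale, rng):
    """Perturb every weight by a uniform draw in [-scale, scale], floor at 0."""
    return [max(0.0, w + rng.uniform(-scale, scale)) for w in weights]


def candidate_distances(normalized_rows, preset_weights, iterations,
                        scale=0.05, seed=0):
    """Return one (offset_weights, distances) pair per iteration.

    Each iteration offsets the weights produced by the previous iteration,
    not the original preset weights.
    """
    rng = random.Random(seed)
    weights = list(preset_weights)
    results = []
    for _ in range(iterations):
        weights = random_offset(weights, scale, rng)
        distances = [
            sum((w * x) ** 2 for w, x in zip(weights, row)) ** 0.5
            for row in normalized_rows
        ]
        results.append((list(weights), distances))
    return results
```

A fixed seed is used here only so that repeated runs of the sketch are reproducible; a learned offset, as the text mentions, would replace `random_offset`.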

And step 408, determining a target influence factor weight from the offset influence factor weights according to the matching condition between each candidate distance and the reference distance, and optimizing the resource flow model according to the target influence factor weight.

In this embodiment, for each reference distance, each candidate distance is matched with the reference distance to obtain a matching condition between the candidate distance and the reference distance, a candidate distance with the highest matching degree with each reference distance is selected as a target candidate distance, an offset influence factor weight corresponding to the target candidate distance is used as a target influence factor weight, and the resource flow model is optimized through the target influence factor weight. Optionally, the reference distance and the preset influence factor weight in the resource flow model may be optimized, the reference distance in the resource flow model is replaced by the target candidate distance, and the preset influence factor weight is replaced by the target influence factor weight.

In an alternative embodiment, the reference distance between the first weighted influence factor data of each piece of data and the reference origin is matched against the preset number of candidate distances obtained from the preset number of iterations, and the candidate distance with the highest matching degree with that reference distance is selected as a sub-target candidate distance, so that a sub-target candidate distance is obtained for each reference distance. Frequency statistics are then performed on the sub-target candidate distances, the sub-target candidate distance with the highest frequency is taken as the target candidate distance, the offset influence factor weight corresponding to the target candidate distance is taken as the target influence factor weight, and the resource flow model is optimized with the target influence factor weight. Optionally, the preset influence factor weight stored in the resource flow model may be replaced by the target influence factor weight to obtain an optimized resource flow model. The optimized resource flow model can generate more accurate resource flow data.
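The frequency-based selection can be sketched as follows. The text leaves open how candidate distances are grouped; this sketch assumes that each iteration produced one candidate distance per record, that the best match for a record is the iteration with the smallest absolute difference from that record's reference distance, and that the frequency statistic counts how often each iteration wins. The function name is hypothetical.

```python
from collections import Counter


def select_target_weights(reference_distances, candidates):
    """Pick the offset weights whose candidate distances win most often.

    reference_distances: one reference distance per record
    candidates: list of (offset_weights, distances) pairs, one per iteration,
                where distances[i] belongs to record i
    """
    wins = Counter()
    for row, reference in enumerate(reference_distances):
        best_iteration = min(
            range(len(candidates)),
            key=lambda k: abs(candidates[k][1][row] - reference),
        )
        wins[best_iteration] += 1
    target_iteration, _ = wins.most_common(1)[0]
    return candidates[target_iteration][0]


reference = [1.0, 2.0, 3.0]
candidates = [
    ([0.5, 0.5], [1.1, 2.5, 3.4]),
    ([0.6, 0.4], [1.3, 2.1, 3.05]),
    ([0.4, 0.6], [0.8, 2.6, 2.0]),
]
target_weights = select_target_weights(reference, candidates)
```

In this example the second iteration is the closest match for two of the three records, so its weights are selected.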

It can be understood that, after the resource flow model is generated, resource flow data can be generated through the resource flow model, and after the generated resource flow data is checked and corrected, it can be used as new training data to optimize the resource flow model. The model application and model optimization processes can thus be repeated continuously, so that the optimized resource flow model suits a wider range of application scenarios, the generated resource flow data becomes more accurate, the increasingly fine-grained resource flow statistics requirements of enterprises can be met, and the business development needs of enterprises can be satisfied at lower cost.

In one embodiment, as shown in fig. 5, the method for generating resource flow data is described taking the generation of cash flow data as an example. Historical cash flow data is acquired as the first training data set, and the first influence factor data in the historical cash flow data is normalized to obtain normalized first influence factor data. The normalized first influence factor data is weighted according to preset influence factor weights to obtain first weighted influence factor data, and a reference distance between the first weighted influence factor data corresponding to each piece of data in the historical cash flow data and the reference origin is calculated through a KNN algorithm. The account book internal code, the reference distance, the preset influence factor weights, and the cash flow main table items and cash flow auxiliary table items in the historical cash flow data are stored as model data to obtain the resource flow model.

Credential data is acquired, and the credential data is split and combined to obtain initial influence factor data. The initial influence factor data is normalized to obtain to-be-processed influence factor data. Initial influence factor weights are obtained based on the resource flow model, and the to-be-processed influence factor data is weighted according to the initial influence factor weights to obtain weighted influence factor data. A first distance between the weighted influence factor data and the reference origin is calculated, and the matching degree between the first distance and each reference distance is calculated. The cash flow main table items and cash flow auxiliary table items in the historical cash flow data corresponding to the K reference distances with the highest matching degree with the first distance are combined as the cash flow data corresponding to the credential data. A balance check is performed on the first influence factor data in the historical cash flow data corresponding to the K reference distances, and the historical cash flow data that fails the balance check (denoted as N) is eliminated. The historical cash flow data corresponding to the reference distances whose matching degree with the first distance is not higher than the preset matching degree (denoted as Q) is corrected to obtain corrected historical cash flow data. The historical cash flow data that passes the balance check (denoted as M) and the corrected historical cash flow data are used as the second training data set, and the initial influence factor weight and the reference distance of the resource flow model are optimized based on the second training data set. The number of pieces of historical cash flow data is K+Q, where K=M+N, K and Q are positive integers, and M and N are non-negative integers.
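The selection of the K best-matching reference distances can be sketched as follows. The record layout (keys `reference_distance`, `main_item`, `auxiliary_item`) and the function name are assumptions made for illustration, and the matching degree is taken to be higher when the absolute difference between distances is smaller.

```python
import heapq


def top_k_cash_flow_items(first_distance, model_records, k):
    """Return (main item, auxiliary item) pairs for the K stored records whose
    reference distance is closest to the first distance, best match first."""
    nearest = heapq.nsmallest(
        k,
        model_records,
        key=lambda r: abs(r["reference_distance"] - first_distance),
    )
    return [(r["main_item"], r["auxiliary_item"]) for r in nearest]
```

With K = 1 this reduces to picking the single closest record; a larger K returns several candidate cash flow items that can then be balance-checked as described above.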

The second influence factor data in the second training data set is normalized to obtain normalized second influence factor data. The preset influence factor weight is randomly offset to obtain an offset influence factor weight, the normalized second influence factor data is weighted based on the offset influence factor weight to obtain second weighted influence factor data, and a candidate distance between the second weighted influence factor data of each piece of data in the second training data set and the reference origin is calculated. This calculation is iterated a preset number of times to obtain a preset number of candidate distances. A target influence factor weight is determined from the offset influence factor weights according to the matching condition between each candidate distance and the reference distance. The reference distance in the resource flow model is then replaced with the target candidate distance and the preset influence factor weight is replaced with the target influence factor weight, thereby optimizing the resource flow model.

Optionally, the resource flow model may be continuously optimized, or the optimization of the resource flow model may be stopped when the matching degree between the K reference distances and the first distance is higher than the preset matching degree. The preset matching degree can be different according to different application scenes or different training data sets.

In the above embodiment, a resource flow model is constructed, the resource flow data corresponding to the credential data is generated through the resource flow model, and the corrected resource flow data is used as new training data to optimize the resource flow model. This avoids the problems of having to manually specify and configure a large number of resource flow items and of being unable to cover all situations exhaustively, realizes automatic generation of resource flow data, and improves the generation efficiency of the resource flow data. The accuracy of the resource flow model is also continuously improved, so that more accurate resource flow data is generated and the business development needs of the enterprise are met.

It should be understood that, although the steps in the flowcharts of the above embodiments are shown sequentially as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and the steps may be executed in other orders. Moreover, at least some of the steps in the flowcharts of the above embodiments may include a plurality of sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; these sub-steps or stages are also not necessarily performed sequentially, but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

Based on the same inventive concept, the embodiments of the present application also provide a device for generating resource flow data, which is used to implement the above method for generating resource flow data. The solution provided by the device is implemented in a manner similar to that described for the above method, so for the specific limitations in the one or more embodiments of the device for generating resource flow data provided below, reference may be made to the limitations of the method for generating resource flow data above, which are not repeated here.

In one embodiment, as shown in fig. 6, there is provided a generating apparatus of resource traffic data, including: a credential data acquisition module 602, a data preprocessing module 604, a matching degree calculation module 606, and a resource traffic data determination module 608, wherein:

a credential data acquisition module 602, configured to acquire credential data;

the data preprocessing module 604 is configured to process the credential data to obtain to-be-processed influence factor data;

the matching degree calculating module 606 is configured to calculate a matching degree between the to-be-processed influence factor data and the reference influence factor data;

the resource traffic data determining module 608 is configured to determine resource traffic data corresponding to the credential data according to the matching degree.

In one embodiment, the data preprocessing module 604 is further configured to, if there is a credential entry with a negative resource value in the credential data, convert the negative credential entry into a positive credential entry to obtain the first credential data; carrying out resource combination on the first credential data according to the attribute of the credential entry to obtain second credential data; performing pairing treatment on the two sides of the resource interaction on the second credential data to obtain credential data in a preset format; and carrying out normalization processing on the initial influence factor data in the credential data in the preset format to obtain influence factor data to be processed.

In one embodiment, the matching degree calculation module 606 is further configured to obtain an initial impact factor weight; weighting calculation is carried out on the influence factor data to be processed according to the initial influence factor weight, so that weighted influence factor data are obtained; calculating a first distance between the weighted influence factor data and the reference origin; and calculating the matching degree between the to-be-processed influence factor data and the reference influence factor data based on the first distance.

In one embodiment, the matching degree calculating module 606 is further configured to obtain a second distance corresponding to the reference impact factor data based on the resource traffic model; and calculating the matching degree between the first distance and the second distance, and taking the matching degree between the first distance and the second distance as the matching degree between the to-be-processed influence factor data and the reference influence factor data.

In one embodiment, the resource traffic data determining module 608 is further configured to determine, as the target influence factor data, reference influence factor data with the highest matching degree with the to-be-processed influence factor data corresponding to the first distance from the reference influence factor data corresponding to the at least two second distances according to the initial influence factor weight if the at least two second distances are the same and the matching degree between the second distance and the first distance is the highest; and taking the resource flow data corresponding to the target influence factor data as the resource flow data of the influence factor data to be processed corresponding to the first distance.

In one embodiment, the generating device of the resource traffic data further includes a resource traffic model building module, configured to implement building of a resource traffic model, including:

acquiring a first training data set; the first training data set includes first impact factor data and first resource traffic data; normalizing the first influence factor data to obtain normalized first influence factor data; weighting calculation is carried out on the normalized first influence factor data according to preset influence factor weights, so that first weighted influence factor data are obtained; calculating a reference distance between the first weighted influence factor data of each piece of data in the first training data set and the reference origin; and generating a resource flow model according to the reference distance, the preset influence factor weight and the first resource flow data.

In one embodiment, the generating device of the resource traffic data further includes a resource traffic model optimization module, configured to obtain a second training data set; the second training data set includes second impact factor data; normalizing the second influence factor data to obtain normalized second influence factor data; performing iterative computation for a preset number of times to obtain a preset number of candidate distances, wherein each iterative computation comprises: randomly shifting the preset influence factor weight to obtain the shifted influence factor weight; weighting calculation is carried out on the normalized second influence factor data based on the offset influence factor weight, so that second weighted influence factor data are obtained; calculating a candidate distance between the second weighted influence factor data in each piece of data in the second training set and the reference origin; and determining a target influence factor weight from the offset influence factor weights according to the matching condition between each candidate distance and the reference distance, and optimizing the resource flow model according to the target influence factor weight.

The modules in the above apparatus for generating resource flow data may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can invoke them and perform the operations corresponding to each module.

In one embodiment, a computer device is provided, which may be a server, and whose internal structure may be as shown in fig. 7. The computer device includes a processor, a memory, an input/output (I/O) interface and a communication interface. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface is connected to the system bus through the input/output interface. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device stores resource flow data. The input/output interface of the computer device is used to exchange information between the processor and external devices. The communication interface of the computer device is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements a method for generating resource flow data.

It will be appreciated by those skilled in the art that the structure shown in fig. 7 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.

In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method of generating resource flow data described above when the computer program is executed.

In one embodiment, a computer readable storage medium is provided, on which a computer program is stored, which when executed by a processor implements the steps of the method of generating resource traffic data described above.

In one embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the steps of the method of generating resource traffic data described above.

It should be noted that the user information (including, but not limited to, user equipment information, user personal information, etc.) and the data (including, but not limited to, data for analysis, stored data, presented data, etc.) referred to in this application are information and data authorized by the user or fully authorized by all parties, and the collection, use, and processing of the relevant data must comply with the relevant regulations and standards.

Those skilled in the art will appreciate that all or part of the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware. The computer program may be stored on a non-volatile computer readable storage medium and, when executed, may include the steps of the method embodiments described above. Any reference to memory, database, or other medium used in the various embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, and the like. The volatile memory may include random access memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM is available in a variety of forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM). The databases referred to in the various embodiments provided herein may include at least one of relational databases and non-relational databases. The non-relational database may include, but is not limited to, a blockchain-based distributed database. The processors referred to in the embodiments provided herein may be, without limitation, general purpose processors, central processing units, graphics processors, digital signal processors, programmable logic units, data processing logic units based on quantum computing, and the like.

The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this description.

The above examples represent only a few embodiments of the present application. They are described in specific detail, but should not therefore be construed as limiting the scope of the present application. It should be noted that those skilled in the art could make various modifications and improvements without departing from the spirit of the present application, all of which fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application shall be subject to the appended claims.

Claims

1. A method for generating resource traffic data, the method comprising:

acquiring credential data;

processing the credential data to obtain to-be-processed influence factor data;

2. The method of claim 1, wherein the processing the credential data to obtain to-be-processed influence factor data comprises:

3. The method according to claim 1, wherein said calculating a degree of matching between the to-be-processed influence factor data and reference influence factor data comprises:

acquiring initial influence factor weights;

4. The method according to claim 3, wherein said calculating a degree of matching between the to-be-processed influence factor data and reference influence factor data based on the first distance comprises:

5. The method of claim 4, wherein the determining the resource traffic data corresponding to the credential data according to the degree of matching comprises:

6. The method according to claim 4, wherein the resource traffic model construction mode comprises:

7. The method of claim 6, wherein the method further comprises:

8. A device for generating resource traffic data, the device comprising:

the credential data acquisition module is used for acquiring credential data;

9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 7 when the computer program is executed.

10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 7.