CN118194018A - Transaction numerical class feature extraction method, device, equipment and storage medium - Google Patents

Transaction numerical class feature extraction method, device, equipment and storage medium Download PDF

Info

Publication number
CN118194018A
CN118194018A
Authority
CN
China
Prior art keywords
transaction
class
feature
transaction numerical
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410379307.0A
Other languages
Chinese (zh)
Inventor
梁志生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
CCB Finetech Co Ltd
Original Assignee
China Construction Bank Corp
CCB Finetech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp, CCB Finetech Co Ltd filed Critical China Construction Bank Corp
Priority to CN202410379307.0A priority Critical patent/CN118194018A/en
Publication of CN118194018A publication Critical patent/CN118194018A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a transaction numerical class feature extraction method, device, equipment and storage medium, relating to the technical field of big data processing. The transaction numerical class feature extraction method comprises the following steps: acquiring transaction numerical class feature data; discretizing the transaction numerical class feature data; performing statistical processing on the discretized transaction numerical class feature data through a bag-of-words model to obtain feature vectors of the transaction numerical class features of each discretization interval of each text word within a preset time interval; and performing data compression and dimension reduction on the feature vectors to obtain the transaction numerical class feature information. The method provided by the application can improve the accuracy of the extracted data features.

Description

Transaction numerical class feature extraction method, device, equipment and storage medium
Technical Field
The present application relates to the field of big data processing technologies, and in particular, to a method, an apparatus, a device, and a storage medium for extracting transaction numerical class features.
Background
In the field of anti-fraud, transaction numerical class features are generally extracted by calculating statistics such as the mean, median, mode, maximum, minimum and standard deviation of the transaction numerical class features within a transaction time interval. However, this approach is sensitive to outliers: extreme transaction numerical class features may significantly skew the statistics, resulting in lower accuracy of the extracted data features.
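The outlier sensitivity described in the background can be illustrated with a minimal sketch (the amounts below are hypothetical; numpy only): a single extreme transaction dominates the mean, while a robust statistic such as the median simply discards it.

```python
import numpy as np

# Hypothetical amount history: four ordinary transactions and one extreme one
amounts = np.array([20.0, 35.0, 18.0, 27.0, 50_000.0])

mean = np.mean(amounts)      # dominated by the single extreme value
median = np.median(amounts)  # robust, but loses the extreme value entirely

print(mean)    # 10020.0
print(median)  # 27.0
```

Neither statistic both resists the outlier and retains it, which is the gap the binning-plus-bag-of-words approach below aims to close.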
Disclosure of Invention
The application discloses a transaction numerical class feature extraction method, a device, equipment and a storage medium. The technical scheme of the application is as follows:
in a first aspect, the application discloses a transaction numerical class feature extraction method, which comprises the following steps:
Acquiring transaction numerical value class feature data;
discretizing the transaction numerical value class characteristic data;
performing statistical processing on the discretized transaction numerical class feature data through a bag-of-words model to obtain feature vectors of the transaction numerical class features of each discretization interval of each text word within a preset time interval;
and carrying out data compression and dimension reduction processing on the feature vector to obtain the feature information of the transaction numerical value class.
In one possible implementation manner, the discretizing the transaction numerical class feature data includes:
discretizing the transaction numerical class feature data in a preset discretization mode; the preset discretization mode comprises at least one of equidistant discretization, equal-frequency discretization or custom-distance discretization.
In a possible implementation manner, the performing statistical processing on the discretized transaction numerical class feature data through a bag-of-words model to obtain feature vectors of the transaction numerical class features of each discretization interval of each text word within a preset time interval includes:
Numbering text words in the discretized transaction numerical class characteristics;
Counting word frequency of each text word in each discretization interval in a preset time interval;
And obtaining a feature vector based on word frequency of each text word in each discretization interval in a preset time interval.
In one possible implementation manner, the performing statistical processing on the discretized transaction numerical class feature data through a bag-of-words model includes:
performing statistical processing on the discretized transaction numerical class feature data through the counting vectorizer CountVectorizer.
In one possible implementation manner, the performing data compression and dimension reduction on the feature vector to obtain feature information of a transaction numerical class includes:
performing data compression and dimension reduction on the feature vectors through Truncated SVD to obtain the transaction numerical class feature information.
In a second aspect, the application discloses a transaction numerical class feature extraction device, comprising:
The data acquisition module is used for acquiring transaction numerical value class characteristic data;
the discretization module is used for discretizing the transaction numerical value class characteristic data;
the statistics module is used for performing statistical processing on the discretized transaction numerical class feature data through a bag-of-words model to obtain feature vectors of the transaction numerical class features of each discretization interval of each text word within a preset time interval;
and the dimension reduction module is used for performing data compression and dimension reduction processing on the feature vectors to obtain the transaction numerical class feature information.
In one possible embodiment, the discretization module is configured to:
discretizing the transaction numerical class feature data in a preset discretization mode; the preset discretization mode comprises at least one of equidistant discretization, equal-frequency discretization or custom-distance discretization.
In one possible implementation, the statistics module includes:
The numbering unit is used for numbering text words in the discretized transaction numerical value class characteristics;
The statistics unit is used for counting word frequency of each text word in each discretization interval in a preset time interval;
and the determining unit is used for obtaining the feature vectors based on the word frequency of each text word in each discretization interval within a preset time interval.
In one possible implementation manner, the performing statistical processing on the discretized transaction numerical class feature data through a bag-of-words model includes:
performing statistical processing on the discretized transaction numerical class feature data through the counting vectorizer CountVectorizer.
In one possible implementation manner, the dimension reduction module is used for:
performing data compression and dimension reduction on the feature vectors through Truncated SVD to obtain the transaction numerical class feature information.
In a third aspect, the present application discloses an electronic device, comprising:
A processor;
A memory for storing the processor-executable instructions;
Wherein the processor is configured to execute the instructions to implement the method of the first aspect.
In a fourth aspect, the present application discloses a computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the method according to the first aspect.
In a fifth aspect, the present application discloses a computer program product comprising a computer program/instruction, characterized in that the computer program/instruction, when executed by a processor, implements the method according to the first aspect.
The technical scheme disclosed by the application has at least the following beneficial effects:
In the technical scheme disclosed by the application, transaction numerical class feature data are acquired; the transaction numerical class feature data are discretized; the discretized transaction numerical class feature data are statistically processed through a bag-of-words model to obtain feature vectors of the transaction numerical class features of each discretization interval of each text word within a preset time interval; and the feature vectors are compressed and reduced in dimension to obtain the transaction numerical class feature information. In this way, applying the bag-of-words model, a natural language text feature extraction method, to numerical feature extraction effectively reduces the influence of extreme historical transaction features on the processed normal transaction features, while the extreme transaction numerical class feature data are still well retained, improving the accuracy of the extracted data features.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and, together with the description, serve to explain the principles of the application; they do not constitute an undue limitation on the application.
FIG. 1 is a flow chart of a transaction numerical class feature extraction method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a discretization method according to an embodiment of the present application;
FIG. 3 is a flow chart of a transaction numerical class feature extraction method according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a transaction numerical class feature extraction device according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to enable a person skilled in the art to better understand the technical solutions of the present application, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented in sequences other than those illustrated or otherwise described herein. The implementations described in the following exemplary examples do not represent all implementations consistent with the application. Rather, they are merely examples of apparatus and methods consistent with aspects of the application as detailed in the accompanying claims.
It should be noted that, the information (including but not limited to user equipment information, user personal information, etc.), data (including but not limited to data for analysis, stored data, presented data, etc.), and signals related to the present application are all authorized by the user or are fully authorized by the parties, and the collection, use, and processing of the related data is required to comply with the relevant laws and regulations and standards of the relevant countries and regions.
The technical scheme of the application obtains, stores, uses, processes and the like the data, which all meet the relevant regulations of national laws and regulations.
It should be noted that some software, components, models, etc. mentioned in the embodiments of the present application may already exist in the industry; they should be regarded as exemplary, intended only to illustrate the feasibility of implementing the technical solution of the present application, and do not imply that the applicant does or must use that solution.
The following describes in detail the technical solutions provided by the embodiments of the present application with reference to the accompanying drawings.
Fig. 1 is a flowchart of a transaction numerical class feature extraction method according to an embodiment of the present application, which can be applied to a server, for example, a server cluster or a single server. As shown in fig. 1, the transaction value class feature extraction method may include the following steps:
s101, acquiring transaction numerical value class feature data.
In the embodiment of the application, transaction numerical class feature data may first be collected when performing transaction numerical class feature extraction. For example, the user's historical transaction data may be obtained; transaction numerical class feature data for different transaction time intervals may be obtained according to different requirements, such as the last 12 months, the last 6 months, the last 3 months, the last 1 month, and the like.
S102, discretizing the transaction numerical value class characteristic data.
In the embodiment of the application, after the transaction numerical class feature data are acquired, the transaction numerical class feature data may be discretized, that is, binned. For example, a specific implementation of discretizing the transaction numerical class feature data may be: discretizing the transaction numerical class feature data in a preset discretization mode, where the preset discretization mode comprises at least one of equidistant discretization, equal-frequency discretization or custom-distance discretization.
In the embodiment of the present application, as shown in fig. 2, the preset discretization mode (that is, the transaction numerical class feature discretization mode) comprises at least one of equidistant discretization, equal-frequency discretization or custom-distance discretization; that is, the discretization of the transaction numerical class feature data may select equidistant discretization, equal-frequency discretization, custom-distance discretization, etc. As a specific example, in equidistant discretization the range between the maximum and minimum values may be equally divided, for example into 1000 or 100 bins, or into a directly specified number of bins. In equal-frequency discretization, the data may likewise be divided into bins of equal frequency, or the frequency value may be directly specified. In custom-distance discretization, the discretization distance of each discrete interval may be determined from an analysis of the historical transaction numerical class feature data; for example, when the transaction numerical class feature data is an amount-type numerical feature, the custom distances may be [0, 1e-2, 1e-1, 10, 1e2, 1e3, 1e4, 1e5, 1e6, 1e7, 1e8, 1e9, 1e10, 1e11, 1e13] and the like, and when the transaction numerical class feature data is a time-type numerical feature, the custom distances may be [1, 3, 5, 7, 10, 15, 30, 60, 90] and the like.
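The three discretization modes can be sketched as follows (a non-authoritative illustration: the amounts are hypothetical, and `numpy.digitize` is merely one common way to implement binning, not necessarily the application's implementation):

```python
import numpy as np

# Hypothetical transaction amounts for one user
amounts = np.array([3.5, 12.0, 880.0, 45.0, 7.2, 1500.0, 62.0, 9.9])

# Equidistant: split [min, max] into a fixed number of equal-width bins
eq_width_edges = np.linspace(amounts.min(), amounts.max(), num=5)
eq_width_bins = np.digitize(amounts, eq_width_edges[1:-1])

# Equal-frequency: bin edges at quantiles, so each bin holds a similar count
eq_freq_edges = np.quantile(amounts, [0.25, 0.5, 0.75])
eq_freq_bins = np.digitize(amounts, eq_freq_edges)

# Custom distance: hand-picked edges, e.g. powers of ten as in the text
custom_edges = [0, 1e-2, 1e-1, 10, 1e2, 1e3, 1e4]
custom_bins = np.digitize(amounts, custom_edges)

print(custom_bins)  # discretization-interval index for each amount
```

Each amount is thereby replaced by the index of its discretization interval, which is what the bag-of-words step treats as a "text word".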
S103, performing statistical processing on the discretized transaction numerical class feature data through a bag-of-words model to obtain feature vectors of the transaction numerical class features of each discretization interval of each text word within a preset time interval.
In the embodiment of the application, after discretizing the transaction numerical class feature data, the word frequency of each text word in each discretization interval within a preset time interval may be counted to obtain the feature vectors of the transaction numerical class features of each discretization interval of each text word within the preset time interval. For example, a bag-of-words model may be used to perform the statistical processing on the discretized transaction numerical class feature data. The bag-of-words model is a text feature extraction method commonly used in natural language processing and information retrieval: it ignores grammar and word order, cuts the text into individual text words, counts the frequency of occurrence of each text word, and finally obtains a word frequency matrix over the set of text words, that is, the feature vectors of the transaction numerical class features of each discretization interval of each text word within the preset time interval. It will be appreciated that here the text words refer to the discretized transaction numerical class feature data, and the document refers to all discretized transaction numerical class feature data of a user within a specified transaction time interval. After bag-of-words processing, the frequency of the user's transaction numerical class features in each discretization interval within the specified transaction time interval is obtained, yielding the word frequency matrix, that is, the feature vectors.
And S104, performing data compression and dimension reduction processing on the feature vector to obtain feature information of the transaction numerical value class.
In the embodiment of the application, after the feature vectors of the transaction numerical class features of each discretization interval of each text word within the preset time interval are obtained, the transaction numerical class feature information may be determined based on the feature vectors. The transaction numerical class feature data generally differ from user to user, and the gap between the maximum and minimum transaction numerical class features may be very large, so the feature vectors obtained by the above processing may be very sparse. Therefore, the feature vectors may be compressed and reduced in dimension, and the compressed, dimension-reduced feature vectors are used as the finally required transaction numerical class feature information. In this way, important transaction numerical class feature information can be retained while the data dimension is significantly reduced.
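The compression step can be sketched with scikit-learn's `TruncatedSVD`, which the later "Truncated SVD" passages appear to describe; the sparse word-frequency matrix below is synthetic:

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.decomposition import TruncatedSVD

# Synthetic sparse word-frequency matrix: 20 users x 200 discretization bins,
# mostly zeros because most bins are never hit by a given user
rng = np.random.default_rng(0)
word_freq = csr_matrix(rng.poisson(0.05, size=(20, 200)).astype(float))

# Keep only the dominant structure of the bin-frequency distribution
svd = TruncatedSVD(n_components=5, random_state=0)
features = svd.fit_transform(word_freq)

print(features.shape)  # compressed feature information per user
```

Unlike full SVD, TruncatedSVD operates directly on sparse matrices without densifying them, which matters here precisely because the bin-frequency vectors are expected to be sparse.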
In the technical scheme disclosed by the application, transaction numerical class feature data are acquired; the transaction numerical class feature data are discretized; the discretized transaction numerical class feature data are statistically processed through a bag-of-words model to obtain feature vectors of the transaction numerical class features of each discretization interval of each text word within a preset time interval; and the feature vectors are compressed and reduced in dimension to obtain the transaction numerical class feature information. In this way, applying the bag-of-words model, a natural language text feature extraction method, to numerical feature extraction effectively reduces the influence of extreme historical transaction features on the processed normal transaction features, while the extreme transaction numerical class feature data are still well retained, improving the accuracy of the extracted data features. Moreover, the data dimension can be significantly reduced while important transaction numerical class feature information is retained, and all the user's transaction numerical class feature data within the specified transaction time interval are preserved, so that the data distribution of the user's transaction numerical class features is better reflected, effectively improving data reliability and generalization capability.
In some possible embodiments, the step of performing statistical processing on the discretized transaction numerical class feature data through a bag-of-words model to obtain feature vectors of the transaction numerical class features of each discretization interval of each text word within a preset time interval includes:
numbering text words in the discretized transaction numerical value class characteristics;
Counting word frequency of each text word in each discretization interval in a preset time interval;
And obtaining the feature vector based on the word frequency of each text word in each discretization interval in the preset time interval.
In the embodiment of the application, when the bag-of-words model performs statistical processing on the discretized transaction numerical class feature data to obtain the feature vectors of the transaction numerical class features of each discretization interval of each text word within the preset time interval, the text words in the discretized transaction numerical class features may first be acquired and numbered. The frequency of occurrence, that is, the word frequency, of each text word in each discretization interval within the preset time interval may then be counted. A word frequency matrix may then be generated based on the word frequency of each text word in each discretization interval within the preset time interval, and this word frequency matrix is used as the feature vectors, that is, the feature vectors of the transaction numerical class features of each discretization interval of each text word within the preset time interval. In this way, the intrinsic relations of the user's transaction numerical class features are expressed in the form of transaction frequencies.
In a further possible embodiment, the bag-of-words model may be implemented based on the counting vectorizer CountVectorizer. CountVectorizer is a basic implementation of the bag-of-words model: it numbers all text words, counts the word frequency of each word in the text as vector features, and finally obtains a word frequency matrix.
In a further possible implementation manner, the step of performing data compression and dimension reduction on the feature vector to obtain feature information of the transaction numerical class includes:
performing data compression and dimension reduction on the feature vectors through Truncated SVD to obtain the transaction numerical class feature information.
In the embodiment of the application, the feature vectors may be compressed and reduced in dimension through Truncated SVD to obtain the transaction numerical class feature information. Truncated SVD is truncated singular value decomposition. The feature data of each user produced by step S103 may be very sparse, so data compression and dimension reduction through Truncated SVD are required, significantly reducing the data dimension while retaining the important transaction numerical class feature information.
In order to make the transaction value class feature extraction method provided by the embodiment of the present application clearer, the following description will be made with reference to fig. 3, and as shown in fig. 3, the transaction value class feature extraction method may include the following processes:
(1) Collecting transaction value class feature data
The user historical transaction data can be used, for example, transaction numerical value type characteristic data of different transaction time intervals can be obtained according to different requirements, and the transaction time intervals can be 12 months, 6 months, 3 months, 1 month and the like.
(2) Discretization of transaction numerical class features
Discretizing the transaction numerical class features, that is, binning the transaction numerical class features. Equidistant discretization, equal-frequency discretization or custom-distance discretization may be selected. In equidistant discretization, the range between the maximum and minimum values may be equally divided, for example into 1000 or 100 bins, or into a directly specified number of bins. In equal-frequency discretization, the data may be divided into bins of equal frequency, or the frequency value may be directly specified. Custom-distance discretization, as the name implies, determines the discretization distance of each discrete interval from an analysis of the historical transaction numerical features; for example, amount-type features may use [0, 1e-2, 1e-1, 10, 1e2, 1e3, 1e4, 1e5, 1e6, 1e7, 1e8, 1e9, 1e10, 1e11, 1e13] and the like, and time-type features may use [1, 3, 5, 7, 10, 15, 30, 60, 90] and the like.
(3) CountVectorizer processing
The bag-of-words model is the most basic way of processing text features in natural language processing: all text words are numbered, and the word frequencies of the corresponding words in a document are counted as feature vectors. In embodiments of the application, the bag-of-words model may be implemented using CountVectorizer. Here, the text words refer to the discretized transaction numerical class features, and the document refers to all the discretized transaction numerical class features of the customer within the specified time interval. After CountVectorizer processing, the frequency of the customer's transaction numerical class features in each discretization interval within the specified time interval is obtained, yielding the feature vectors of the transaction numerical class features of each discretization interval of each text word within the preset time interval.
(4) Truncated SVD processing
Truncated SVD, that is, truncated singular value decomposition. Since the transaction numerical class features of each user differ, the gap between the maximum and minimum transaction numerical class features may be very large, so the transaction numerical class data features processed in step (3) may be very sparse. Therefore, data compression and dimension reduction through Truncated SVD are required, retaining the important transaction numerical class feature information while significantly reducing the data dimension.
As a specific example, to make the transaction value class feature extraction method provided by the embodiment of the present application clearer, the transaction value class feature extraction method may include the following processes:
First, the customer's historical transaction data within 6 months of the current time may be obtained. The transaction numerical class features may then be binned. For example, the customer's historical transaction data within 6 months may be binned in an equidistant discretization manner, e.g., the range between the maximum and minimum of the transaction numerical class features may be divided into 1000 equal-width bins.
Then, CountVectorizer may be adopted to perform statistical processing on the discretized transaction numerical class feature data to obtain the feature vectors of the transaction numerical class features of each discretization interval of each text word within a preset time interval. Specifically, grammar and word order in the text are ignored, the text is segmented into individual words, all text words are numbered, the frequency of occurrence of each word is counted, and finally a word frequency matrix over the word set is obtained, that is, the frequency of the customer's transaction numerical class features in each discretization interval within the specified time interval; the word frequencies of the corresponding words in the document serve as the feature vectors. Finally, Truncated SVD is adopted to perform data compression and dimension reduction on the feature vectors to obtain the transaction numerical class feature information.
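The flow just described — discretize, count with CountVectorizer, compress with TruncatedSVD — can be sketched end to end. All data, bin edges, helper names and component counts below are illustrative assumptions, not the patent's own implementation:

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import TruncatedSVD


def extract_features(user_amounts, edges, n_components):
    """Discretize -> bag-of-words -> TruncatedSVD, per the described flow."""
    # One document per user: each amount replaced by its bin index as a word
    docs = [
        " ".join(f"bin{b}" for b in np.digitize(amounts, edges))
        for amounts in user_amounts
    ]
    vec = CountVectorizer(token_pattern=r"\S+")
    word_freq = vec.fit_transform(docs)  # sparse bin-frequency matrix
    svd = TruncatedSVD(n_components=n_components, random_state=0)
    return svd.fit_transform(word_freq)  # compressed feature information


# Hypothetical 6-month amount histories for three customers
users = [
    np.array([3.5, 12.0, 880.0, 45.0]),
    np.array([0.5, 0.8, 2.0]),
    np.array([1500.0, 2300.0, 9.9, 10000.0, 120.0]),
]
edges = [0, 1e-2, 1e-1, 10, 1e2, 1e3, 1e4]
features = extract_features(users, edges, n_components=2)
print(features.shape)  # one compact feature vector per customer
```

An extreme amount such as 10000.0 simply lands in the top bin and contributes one count there; it no longer distorts the whole representation the way it would distort a mean or standard deviation.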
The specific implementation manner and technical effects of each step in the embodiment provided in the embodiment of the present application are similar to those of the above method embodiment, and are not repeated here.
Based on the same inventive concept, the embodiment of the application also provides a transaction numerical class feature extraction device. As shown in fig. 4, the transaction value class feature extraction device 400 includes:
a data acquisition module 410, configured to acquire transaction value class feature data;
A discretizing module 420, configured to discretize the transaction numerical class feature data;
The statistics module 430 is configured to perform statistical processing on the discretized transaction numerical class feature data through a bag-of-words model, so as to obtain feature vectors of the transaction numerical class features of each discretization interval of each text word within a preset time interval;
and the dimension reduction module 440 is configured to perform data compression and dimension reduction processing on the feature vectors to obtain the transaction numerical class feature information.
In one possible implementation, the discretization module 420 is configured to:
discretizing the transaction numerical class feature data in a preset discretization mode; the preset discretization mode comprises at least one of equidistant discretization, equal-frequency discretization or custom-distance discretization.
In one possible implementation, the statistics module 430 includes:
The numbering unit is used for numbering text words in the discretized transaction numerical value class characteristics;
The statistics unit is used for counting word frequency of each text word in each discretization interval in a preset time interval;
and the determining unit is used for obtaining the feature vector based on the word frequency of each text word in each discretization interval in a preset time interval.
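The three units above can be sketched in plain Python; the interval labels and the customer's word sequence are invented for illustration.

```python
from collections import Counter

# One customer's discretized transactions in the preset time window,
# already rendered as text words (interval labels are invented).
words_in_window = ["bin2", "bin0", "bin2", "bin3", "bin2", "bin0"]

# Numbering unit: give each distinct text word a stable number.
vocab = {word: i for i, word in enumerate(sorted(set(words_in_window)))}

# Statistics unit: count each word's frequency within the window.
freq = Counter(words_in_window)

# Determining unit: lay the frequencies out in vocabulary order.
feature_vector = [freq[word] for word in sorted(vocab, key=vocab.get)]
print(feature_vector)  # counts for bin0, bin2, bin3
```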
In a possible implementation manner, the statistics module 430 is configured to:
performing statistical processing on the discretized transaction numerical class feature data through a count vectorizer, CountVectorizer.
In one possible implementation manner, the dimension reduction module 440 is configured to:
performing data compression and dimension reduction processing on the feature vector through truncated SVD to obtain the feature information of the transaction numerical class.
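The dimension-reduction step can be sketched with scikit-learn's TruncatedSVD, assumed here to be the truncated-SVD implementation intended; the word-frequency matrix values and the target dimension are invented for illustration.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

# Toy word-frequency matrix: 4 customers x 6 interval words (values invented).
word_freq = np.array([
    [3, 0, 1, 0, 2, 0],
    [0, 2, 0, 1, 0, 3],
    [2, 0, 2, 0, 1, 0],
    [0, 3, 0, 2, 0, 2],
], dtype=float)

# Truncated SVD keeps only the top singular vectors, compressing each
# sparse frequency row into a short dense feature vector.
svd = TruncatedSVD(n_components=2, random_state=0)
reduced = svd.fit_transform(word_freq)
print(reduced.shape)  # compressed transaction numerical class features
```

Unlike full PCA, truncated SVD works directly on sparse count matrices without centering them, which suits the word-frequency output of the statistics step.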
The specific implementation manner and technical effects of the device provided by the embodiment of the present application are similar to those of the above method embodiment, and are not described herein again.
According to embodiments of the present application, an electronic device, a computer-readable storage medium and a computer program product are also disclosed.
Fig. 5 shows a schematic block diagram of an example electronic device 500 that may be used to implement embodiments of the application. Electronic device 500 is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only, and are not meant to limit implementations of the application described and/or claimed herein.
As shown in fig. 5, the electronic device 500 includes a computing unit 501 that can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 502 or a computer program loaded from a storage unit 508 into a random access memory (RAM) 503. In the RAM 503, various programs and data required for the operation of the device 500 can also be stored. The computing unit 501, the ROM 502, and the RAM 503 are connected to each other by a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
A number of components in electronic device 500 are connected to I/O interface 505, including: an input unit 506 such as a keyboard, a mouse, etc.; an output unit 507 such as various types of displays, speakers, and the like; a storage unit 508 such as a magnetic disk, an optical disk, or the like; and a communication unit 509 such as a network card, modem, wireless communication transceiver, etc. The communication unit 509 allows the electronic device 500 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 501 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 501 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 501 performs the various methods and processes described above, such as the transaction value class feature extraction method. For example, in some embodiments, the transaction value class feature extraction method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 508. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 500 via the ROM 502 and/or the communication unit 509. When the computer program is loaded into RAM 503 and executed by the computing unit 501, one or more steps of the transaction value class feature extraction method described above may be performed. Alternatively, in other embodiments, the computing unit 501 may be configured to perform the transaction value class feature extraction method in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: being implemented in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out the methods of the present application may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of the present application, a computer-readable storage medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may be a machine readable signal medium or a machine readable storage medium. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a computer-readable storage medium would include one or more wire-based electrical connections, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks (LANs), wide area networks (WANs), the internet, and blockchain networks.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system that overcomes the defects of high management difficulty and weak service expansibility of traditional physical hosts and Virtual Private Server (VPS) services. The server may also be a server of a distributed system or a server that incorporates a blockchain.
It should be appreciated that steps may be reordered, added, or deleted in the various forms of flows shown above. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, so long as the desired result of the technical solution of the present application is achieved; this is not limited herein.
The above embodiments do not limit the scope of the present application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application should be included in the scope of the present application.

Claims (10)

1. A transaction numerical class feature extraction method, comprising:
Acquiring transaction numerical value class feature data;
discretizing the transaction numerical value class characteristic data;
performing statistical processing on the discretized transaction numerical class feature data through a bag-of-words model to obtain feature vectors of the transaction numerical class features of each discretization interval of each text word in a preset time interval;
and carrying out data compression and dimension reduction processing on the feature vector to obtain the feature information of the transaction numerical value class.
2. The method of claim 1, wherein discretizing the transaction value class feature data comprises:
discretizing the transaction numerical value class characteristic data in a preset discretization mode; the preset discretization mode comprises at least one of equidistant discretization, equal-frequency discretization or user-defined distance discretization.
3. The transaction numerical class feature extraction method according to claim 1, wherein performing statistical processing on the discretized transaction numerical class feature data through a bag-of-words model to obtain feature vectors of the transaction numerical class features of each discretization interval of each text word in a preset time interval comprises:
Numbering text words in the discretized transaction numerical class characteristics;
Counting word frequency of each text word in each discretization interval in a preset time interval;
And obtaining a feature vector based on word frequency of each text word in each discretization interval in a preset time interval.
4. The transaction numerical class feature extraction method according to claim 1, wherein the statistical processing of the discretized transaction numerical class feature data through a bag-of-words model comprises:
performing statistical processing on the discretized transaction numerical class feature data through a count vectorizer, CountVectorizer.
5. The method for extracting characteristics of a transaction numerical class according to claim 1, wherein the performing data compression and dimension reduction on the feature vector to obtain the characteristic information of the transaction numerical class includes:
performing data compression and dimension reduction processing on the feature vector through truncated SVD to obtain the feature information of the transaction numerical class.
6. A transaction numerical class feature extraction device, characterized by comprising:
The data acquisition module is used for acquiring transaction numerical value class characteristic data;
the discretization module is used for discretizing the transaction numerical value class characteristic data;
the statistics module is used for performing statistical processing on the discretized transaction numerical class feature data through a bag-of-words model to obtain feature vectors of the transaction numerical class features of each discretization interval of each text word in a preset time interval;
and the dimension reduction module is used for performing data compression and dimension reduction processing on the feature vector to obtain the feature information of the transaction numerical class.
7. The transaction value class feature extraction device of claim 6, wherein the dimension reduction module is configured to:
performing data compression and dimension reduction processing on the feature vector through truncated SVD to obtain the feature information of the transaction numerical class.
8. An electronic device, comprising:
A processor;
A memory for storing the processor-executable instructions;
Wherein the processor is configured to execute the instructions to implement the transaction value class feature extraction method of any one of claims 1-5.
9. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the transaction value class feature extraction method of any one of claims 1-5.
10. A computer program product comprising computer programs/instructions which, when executed by a processor, implement the transaction value class feature extraction method of any one of claims 1 to 5.
CN202410379307.0A 2024-03-29 2024-03-29 Transaction numerical class feature extraction method, device, equipment and storage medium Pending CN118194018A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410379307.0A CN118194018A (en) 2024-03-29 2024-03-29 Transaction numerical class feature extraction method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410379307.0A CN118194018A (en) 2024-03-29 2024-03-29 Transaction numerical class feature extraction method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN118194018A true CN118194018A (en) 2024-06-14

Family

ID=91394472

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410379307.0A Pending CN118194018A (en) 2024-03-29 2024-03-29 Transaction numerical class feature extraction method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN118194018A (en)

Similar Documents

Publication Publication Date Title
EP4191478A1 (en) Method and apparatus for compressing neural network model
CN112818013A (en) Time sequence database query optimization method, device, equipment and storage medium
CN115222444A (en) Method, apparatus, device, medium and product for outputting model information
CN114861059A (en) Resource recommendation method and device, electronic equipment and storage medium
CN118194018A (en) Transaction numerical class feature extraction method, device, equipment and storage medium
CN115794744A (en) Log display method, device, equipment and storage medium
CN115309730A (en) Data auditing method and device, electronic equipment and storage medium
CN115407150A (en) System, method, meter and medium for determining use condition of protective pressing plate
CN114661562A (en) Data warning method, device, equipment and medium
CN114021642A (en) Data processing method and device, electronic equipment and storage medium
CN116049555A (en) Information recommendation method, device, equipment and storage medium
CN117150215B (en) Assessment result determining method and device, electronic equipment and storage medium
CN114490347A (en) Research and development efficiency index calculation method and device, electronic equipment and storage medium
CN114330300A (en) Penetration test document analysis method, device, equipment and storage medium
CN117331924A (en) Data model matching degree checking method, device, equipment and storage medium
CN116166506A (en) System operation data processing method, device, equipment and storage medium
CN115131148A (en) Transaction data processing method, device, equipment and storage medium
CN117009356A (en) Method, device and equipment for determining application success of public data
CN114548077A (en) Word stock construction method and device
CN114840798A (en) Information generation method, device, equipment and storage medium
CN114782383A (en) Webpage quality monitoring method, device, equipment and storage medium
CN114820193A (en) Chip change curve generation method, device, equipment and storage medium
CN114637787A (en) Data statistical method and device, electronic equipment and storage medium
CN117033148A (en) Alarm method, device, electronic equipment and medium of risk service interface
CN114565402A (en) Information recommendation method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination