CN111984898A - Label pushing method and device based on big data, electronic equipment and storage medium - Google Patents

Label pushing method and device based on big data, electronic equipment and storage medium

Info

Publication number
CN111984898A
Authority
CN
China
Prior art keywords
data
node
label
preset
initial
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010610771.8A
Other languages
Chinese (zh)
Inventor
张永强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An International Smart City Technology Co Ltd
Original Assignee
Ping An International Smart City Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An International Smart City Technology Co Ltd
Priority to CN202010610771.8A
Publication of CN111984898A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9562 Bookmark management
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/55 Push-based network services

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of big data and provides a big-data-based label pushing method and device, an electronic device, and a storage medium. The method comprises the following steps: collecting original data from a plurality of preset data sources, and cleaning the original data according to a preset data cleaning strategy to obtain sample data; extracting multi-dimensional target features from the sample data of each node and classifying them according to a preset classification model to obtain an initial label of each node; performing cluster analysis on the initial label of each node to form label systems of different objects; and, when it is monitored that a target node is triggered, pushing an initial label in the label system of the target node. By cleaning the original data and extracting the multi-dimensional target features, the invention obtains the initial label of each node and clusters the initial labels into label systems of different objects, thereby improving the accuracy of label recommendation. In addition, the invention also relates to the technical field of blockchain: the initial label is stored in a blockchain node.

Description

Label pushing method and device based on big data, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of big data, in particular to a label pushing method and device based on big data, electronic equipment and a storage medium.
Background
Traditional labeling technology is based on data source extraction: it acquires user behavior data by means such as event tracking (embedded tracking points) and assigns various labels according to the behavior habits and basic information of users. Traditional labels are defined from a business or product perspective, with the relevant dimensions combined and the thresholds set on the basis of experience, and most of the resulting labels go unattended.
The label library of an existing business execution system is built around the business execution flow. The labels in the library involve a large amount of manual work, and most of them come from simple collection and arrangement of data: the data of each business node is not cleaned, and valuable information is not extracted to create the label library. As a result, a user cannot quickly obtain the required data and reference materials from the recommended labels and has to consult a large amount of data and reference materials before deciding on each operation, and the accuracy of the recommended labels is low.
Disclosure of Invention
In view of the above, it is necessary to provide a label pushing method, a device, an electronic device and a storage medium based on big data, where the original data is cleaned and multidimensional target features are extracted to obtain an initial label of each node, and the initial label is clustered into label systems of different objects, so as to improve the accuracy of label recommendation.
The first aspect of the present invention provides a big data based tag pushing method, where the big data based tag pushing method includes:
collecting a plurality of original data from a plurality of preset data sources, wherein each original data corresponds to a node identifier;
performing data cleaning on each original data according to a preset data cleaning strategy to obtain sample data;
extracting multidimensional target characteristics from sample data corresponding to each node, and classifying the multidimensional target characteristics according to a preset classification model to obtain an initial label of each node;
performing clustering analysis on the initial label of each node to form label systems of different objects;
and when it is monitored that a target node in the plurality of nodes is triggered, pushing an initial label in a label system corresponding to the target node.
Preferably, the clustering the initial labels of each node to form a label system of different objects includes:
clustering the initial label of each node according to a k-means clustering algorithm to obtain a plurality of objects;
and setting the target object and the initial label corresponding to the target object as a label system corresponding to the target object by taking any one of the plurality of objects as the target object.
Preferably, after performing cluster analysis on the initial label of each node to form a label system of different objects, the method further includes:
monitoring the click rate and the conversion rate of each initial label in a preset period in real time;
judging whether the click rate of each initial label is larger than a corresponding click rate threshold value or not, and judging whether the conversion rate of each initial label is larger than a corresponding conversion rate threshold value or not;
when the click rate of each initial label is greater than or equal to the corresponding click rate threshold value and the conversion rate of each initial label is greater than or equal to the corresponding conversion rate threshold value, dividing the initial labels into hot labels;
and when the click rate of each initial label is smaller than the corresponding click rate threshold value, or the conversion rate of each initial label is smaller than the corresponding conversion rate threshold value, dividing the initial labels into useless labels.
Preferably, the data cleaning the original data according to a preset data cleaning policy, and obtaining sample data includes:
identifying a node identification for each raw data;
acquiring a preset data cleaning strategy corresponding to the node identification;
cleaning the original data corresponding to the node identification according to the preset data cleaning strategy;
converting the cleaned original data into structured data of a preset type;
classifying the structured data according to the node identification to obtain sample data, and storing the sample data into a preset database.
Preferably, the extracting multidimensional target features from the sample data corresponding to each node, and classifying the multidimensional target features according to a preset classification model to obtain the initial label of each node includes:
reading sample data of each node from a preset database according to the node identification of each node, using a query that follows the grammar rules of the query language HQL;
extracting multi-dimensional target features from the read sample data of each node according to a preset algorithm;
and inputting the multi-dimensional target characteristics into the preset classification model for classification to obtain an initial label of each node, wherein the initial label is stored in a block chain node.
Preferably, the extracting the multidimensional target features from the read sample data of each node according to a preset algorithm includes:
extracting a first feature from the read sample data of each node according to a preset feature dimension;
processing the read sample data of each node through the trained model to obtain a second characteristic;
and combining the first characteristic and the second characteristic to obtain a multi-dimensional target characteristic.
Preferably, after the pushing of the initial tag in the tag system corresponding to the target node, the method further includes:
when a reprocessing instruction of a user to the pushed initial label is monitored, analyzing the reprocessing instruction to obtain a reprocessing condition of the user;
inputting the reprocessing condition into the preset classification model to obtain a new label, and performing combined operation on the new label and the pushed initial label to obtain a high-grade label;
and pushing the advanced label.
A second aspect of the present invention provides a big-data-based tag pushing apparatus, including:
the acquisition module is used for collecting a plurality of original data from a plurality of preset data sources, wherein each original data corresponds to a node identifier;
the cleaning module is used for cleaning data of each original data according to a preset data cleaning strategy to obtain sample data;
the classification module is used for extracting multi-dimensional target characteristics from sample data corresponding to each node, classifying the multi-dimensional target characteristics according to a preset classification model, and obtaining an initial label of each node;
the analysis module is used for carrying out clustering analysis on the initial label of each node to form a label system of different objects;
and the pushing module is used for pushing the initial label in the label system corresponding to the target node when the target node is monitored to be triggered.
A third aspect of the present invention provides an electronic device comprising a processor for implementing the big-data based tag pushing method when executing a computer program stored in a memory.
A fourth aspect of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the big-data based tag pushing method.
In summary, the big-data-based label pushing method, device, electronic device and storage medium of the present invention have the following effects. On one hand, the original data collected from different data sources is cleaned according to the preset data cleaning strategy to obtain the sample data of each node, and the problem data is deleted, which ensures the consistency and integrity of the obtained sample data and improves its quality. On the other hand, the multidimensional target features in the sample data of each node are extracted and input into the preset classification model for classification to obtain the initial label of each node, which improves the efficiency of computing the initial labels. Meanwhile, clustering the initial labels of the nodes into label systems of different objects improves the accuracy of label recommendation.
In addition, the click rate and the conversion rate of each initial label are monitored in real time in a preset period, useless labels are deleted, the whole label system is optimized through training and learning continuously, the timeliness of the initial labels in the label system is guaranteed, and meanwhile the accuracy of recommending the labels is improved.
Drawings
Fig. 1 is a flowchart of a big data based tag pushing method according to a first embodiment of the present invention.
Fig. 2 is a structural diagram of a tag pushing apparatus based on big data according to a second embodiment of the present invention.
Fig. 3 is a schematic structural diagram of an electronic device according to a third embodiment of the present invention.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a detailed description of the present invention will be given below with reference to the accompanying drawings and specific embodiments. It should be noted that the embodiments of the present invention and features of the embodiments may be combined with each other without conflict.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.
Embodiment one
Fig. 1 is a flowchart of a big data based tag pushing method according to a first embodiment of the present invention.
In this embodiment, the big data based tag pushing method may be applied to an electronic device. For an electronic device that needs to perform big data based tag pushing, the tag pushing function provided by the method of the present invention may be integrated directly on the electronic device, or may run on the electronic device in the form of a Software Development Kit (SDK).
As shown in fig. 1, the big data based tag pushing method specifically includes the following steps, and the order of the steps in the flowchart may be changed and some may be omitted according to different requirements.
S11: the method comprises the steps of collecting a plurality of original data from a plurality of preset data sources, wherein each original data corresponds to a node identifier.
In this embodiment, the original data includes basic information of the executed object, basic information of the case, execution subject information, property information, and the like. The executed person information mainly includes: name, identification card number, age, sex, occupation, employer, and the like. The execution subject information mainly includes: case number, executed person identity information, user information, involved links, return state, operation time, and the like. Property information refers to all properties under the name of the executed person, such as bank deposits, real estate, and vehicles. Taking real estate as an example, the information includes the province where the property is located, its floor, orientation, area, and the like. The data source may be an execution service system, and the raw data is collected from each process node of the execution service system.
S12: and performing data cleaning on each original data according to a preset data cleaning strategy to obtain sample data.
In this embodiment, a data cleaning policy may be set in advance according to a cleaning condition of a tag corresponding to each node, where the preset data cleaning policy may be missing value cleaning, format content cleaning, logic error cleaning, and non-demand data cleaning, and after the raw data is collected, the raw data is cleaned according to the preset data cleaning policy to obtain sample data.
In this embodiment, the preset data cleaning strategy corresponding to missing value cleaning is to directly delete data records with missing values or to complement data records with missing values.
Illustratively, when the target labels of the data records with missing values are concentrated mainly in one or a few classes, deleting those records would lose a large amount of feature information for the corresponding classes, causing the model to over-fit or to classify inaccurately. In that case, the preset data cleaning strategy is to complement the data records with missing values.
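The completion of missing values described above can be sketched as follows. This is a minimal illustration only: the patent does not say how a missing value is complemented, so filling with the most common value observed in the same class is an assumption, and the names `impute_by_class`, `field` and `label_key` are hypothetical.

```python
from collections import Counter, defaultdict


def impute_by_class(records, field, label_key="label"):
    """Fill missing `field` values with the most common value of the same class.

    Assumption: "complementing" a record means class-wise mode imputation;
    the patent leaves the concrete rule open.
    """
    seen = defaultdict(Counter)
    for record in records:
        if record.get(field) is not None:
            seen[record[label_key]][record[field]] += 1

    filled = []
    for record in records:
        record = dict(record)  # copy so the input is not mutated
        candidates = seen[record[label_key]]
        if record.get(field) is None and candidates:
            record[field] = candidates.most_common(1)[0][0]
        filled.append(record)
    return filled


records = [
    {"label": "A", "occupation": "driver"},
    {"label": "A", "occupation": None},
    {"label": "B", "occupation": None},
]
result = impute_by_class(records, "occupation")
```

A record whose class has no observed value for the field (class "B" here) is left unchanged rather than guessed.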
In this embodiment, the preset data cleaning policy corresponding to format content cleaning is to clean data whose display formats (such as time, date, numerical value, and full-width/half-width characters) are inconsistent, whose content contains characters that should not exist, or whose content is inconsistent with what the field should contain.
Illustratively, when display formats such as time, date, numerical value, and full-width/half-width characters are inconsistent, the preset data cleaning strategy is to convert them into a consistent format. When the content contains characters that should not exist, the preset data cleaning strategy is to find possible problems through a combination of semi-automatic verification and manual checking and to remove the unnecessary characters, for example Chinese characters appearing in an identification number.
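A minimal sketch of format content cleaning, assuming Unicode NFKC normalization is an acceptable way to fold full-width characters to half-width and that dates should end up as `YYYY-MM-DD`. The list of accepted input formats and all function names are assumptions made for illustration, not part of the patent.

```python
import unicodedata
from datetime import datetime

# Assumed set of date layouts seen in the raw data.
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%Y.%m.%d", "%Y%m%d")


def to_half_width(text):
    """Fold full-width letters, digits and punctuation to their half-width forms."""
    return unicodedata.normalize("NFKC", text)


def normalize_date(text):
    """Return the date as YYYY-MM-DD, or None if no known layout matches."""
    cleaned = to_half_width(text).strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None


def has_unexpected_chars(id_number):
    """Flag identification numbers containing anything but ASCII letters and digits."""
    folded = to_half_width(id_number)
    return not (folded.isascii() and folded.isalnum())
```

Values for which `normalize_date` returns `None` or `has_unexpected_chars` returns `True` are the ones that would be handed to the manual check mentioned above.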
In this embodiment, the preset data cleansing strategy corresponding to the logical error cleansing is to remove duplicate, remove unreasonable values, and correct contradictory contents.
Illustratively, the preset data cleaning strategy for deduplication is to delete repeated fields and retain only one of them. The strategy for removing unreasonable values is to delete the unreasonable value, for example an age of 200 years. The strategy for correcting contradictory content is to judge, according to the data source of each field, which field provides the more reliable information, and to remove or reconstruct the unreliable field. For example, if the identification number is 1101031980XXXXXXX but the age is 18, it is necessary to judge which of the identification number and the age is more reliable, and then reconstruct or delete the contradictory content.
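The three logic-error checks can be sketched as below. The upper age bound of 120, the one-year tolerance, the reference year, and the assumption that characters 7 to 10 of the identification number hold the birth year are all illustrative choices; the function names are hypothetical.

```python
def drop_duplicates(records, key_fields):
    """Keep only the first record for each combination of key fields."""
    seen, unique = set(), []
    for record in records:
        key = tuple(record.get(f) for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def age_is_reasonable(age, max_age=120):
    """Reject ages outside an assumed plausible range (e.g. 200 years)."""
    return isinstance(age, int) and 0 <= age <= max_age


def age_conflicts_with_id(id_number, age, reference_year, tolerance=1):
    """Compare the stated age with the birth year embedded in the ID number.

    Assumption: characters 7-10 of the identification number are the birth year.
    """
    birth_year = int(id_number[6:10])
    return abs((reference_year - birth_year) - age) > tolerance


# The example from the text: birth year 1980 but a stated age of 18.
conflict = age_conflicts_with_id("1101031980XXXXXXX", 18, reference_year=2020)
```

A detected conflict only flags the record; deciding whether the identification number or the age is more reliable still depends on the data source of each field.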
In this embodiment, the preset data cleansing policy corresponding to the non-required data cleansing means that unnecessary fields are deleted.
Preferably, the data cleaning the original data according to a preset data cleaning policy, and obtaining sample data includes:
identifying a node identification for each raw data;
acquiring a preset data cleaning strategy corresponding to the node identification;
cleaning the original data corresponding to the node identification according to the preset data cleaning strategy;
converting the cleaned original data into structured data of a preset type;
classifying the structured data according to the node identification to obtain sample data, and storing the sample data into a preset database.
In this embodiment, the preset database may be a Hive database. Hive is a data warehouse tool based on Hadoop that can store structured data, provides an SQL query function, and can convert SQL statements into MapReduce tasks to run. After the original data is cleaned according to the preset data cleaning policy, it is converted into structured data of a preset type; the structured data is then classified according to the node identifier to obtain sample data, and the sample data is stored in the preset database.
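The per-node flow above (identify the node, look up its cleaning strategy, clean, then group by node identifier) might look like the following sketch. The node names, the two cleaning steps, and the in-memory dictionary standing in for the preset database are assumptions made purely for illustration.

```python
from collections import defaultdict


def strip_whitespace(record):
    """Trim surrounding whitespace from every string field."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}


def drop_empty_fields(record):
    """Remove fields that carry no value."""
    return {k: v for k, v in record.items() if v not in ("", None)}


# Hypothetical mapping from node identifier to its ordered cleaning steps.
STRATEGIES = {
    "case_filing": [strip_whitespace],
    "property_check": [strip_whitespace, drop_empty_fields],
}


def clean_and_group(raw_records, strategies=STRATEGIES):
    """Clean each record with its node's strategy and group the results by node."""
    samples = defaultdict(list)
    for raw in raw_records:
        node_id = raw["node_id"]
        record = dict(raw)
        for step in strategies.get(node_id, []):
            record = step(record)
        samples[node_id].append(record)
    return dict(samples)


raw = [
    {"node_id": "property_check", "city": " Beijing ", "floor": ""},
    {"node_id": "case_filing", "case_no": " A1 "},
]
samples = clean_and_group(raw)
```

In a real deployment the grouped records would be written to the preset database rather than returned as a dictionary.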
Further, the method further comprises:
placing problem data which does not accord with the preset data cleaning strategy in the original data into a problem database;
when a re-cleaning instruction is not received within a preset time period, finishing the processing of the problem data;
and deleting the problem data.
In this embodiment, in the process of cleaning data through a preset data cleaning strategy, if problem data occurs, the problem data may be stored in a problem database, and if a re-cleaning instruction is not received within a preset time period, it is determined that the problem data may be deleted.
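One way to picture the problem database is a holding area with a retention period: records wait there for a re-cleaning instruction and are deleted once the preset time has passed. The class below is a hypothetical in-memory sketch; timestamps are passed in explicitly so the behavior is easy to follow.

```python
class ProblemStore:
    """Holds problem data until it is re-cleaned or its retention period expires."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._items = {}  # record_id -> (record, stored_at)

    def __len__(self):
        return len(self._items)

    def add(self, record_id, record, now):
        """Park a record that failed the cleaning strategy."""
        self._items[record_id] = (record, now)

    def reclean(self, record_id):
        """Hand a record back for cleaning when a re-cleaning instruction arrives."""
        record, _ = self._items.pop(record_id)
        return record

    def purge_expired(self, now):
        """Delete records that received no re-cleaning instruction in time."""
        expired = [
            rid for rid, (_, stored_at) in self._items.items()
            if now - stored_at >= self.retention_seconds
        ]
        for rid in expired:
            del self._items[rid]
        return expired


store = ProblemStore(retention_seconds=60)
store.add("a", {"age": 200}, now=0)
store.add("b", {"age": -1}, now=100)
purged = store.purge_expired(now=70)
```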
In the embodiment, the original data is cleaned through a preset data cleaning strategy, the problem data is deleted, the consistency and the integrity of the obtained sample data are ensured, and the quality of the sample data is improved.
S13: extracting multidimensional target characteristics from sample data corresponding to each node, and classifying the multidimensional target characteristics according to a preset classification model to obtain an initial label of each node.
In this embodiment, because the data corresponding to different nodes is different, different nodes correspond to different sample data. Multidimensional target features are extracted from the sample data corresponding to each node and input into a pre-trained classification model to obtain an initial label of each node, and the initial labels of the nodes are sorted to form a label library.
Preferably, the extracting multidimensional target features from the sample data corresponding to each node, and classifying the multidimensional target features according to a preset classification model to obtain the initial label of each node includes:
reading sample data of each node from a preset database according to the node identification of each node, using a query that follows the grammar rules of the query language HQL;
extracting multi-dimensional target features from the read sample data of each node according to a preset algorithm;
and inputting the multi-dimensional target characteristics into the preset classification model for classification to obtain an initial label of each node, wherein the initial label is stored in a block chain node.
In this embodiment, the sample data of different nodes is different. The sample data of the corresponding node is read from the preset database by using the grammar rules of the query language HQL, and the multi-dimensional target features are extracted from the read sample data of each node by using a preset algorithm. The preset algorithm is prior art and is not described in detail herein.
It is emphasized that the initial label may also be stored in a node of a blockchain in order to further ensure the privacy and security of the initial label.
In the embodiment, the initial label of each node is obtained by inputting the multidimensional target characteristics into a preset classification model for classification, so that the efficiency of obtaining the initial label by calculation is improved.
Further, the extracting the multidimensional target features from the read sample data of each node according to a preset algorithm includes:
extracting a first feature from the read sample data of each node according to a preset feature dimension;
processing the read sample data of each node through the trained model to obtain a second characteristic;
and combining the first characteristic and the second characteristic to obtain a multi-dimensional target characteristic.
In this embodiment, the multi-dimensional target features include basic features and behavior features. The basic features are natural attribute descriptions of the executed object, such as the sex and age of the executed object; the behavior features are features generated by the behavior of the executed object, for example, having no property, no house, no car, and the like.
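The combination of a first feature and a second feature can be sketched as a simple concatenation. The patent does not specify the trained model, so `model` below is a stand-in callable, and the feature dimensions and field names are hypothetical.

```python
def extract_first_features(sample, dimensions):
    """Read the preset feature dimensions straight from the sample (basic features)."""
    return [float(sample.get(d, 0)) for d in dimensions]


def extract_second_features(sample, model):
    """Derive further features by running the sample through a trained model."""
    return list(model(sample))


def build_target_features(sample, dimensions, model):
    """Concatenate both feature groups into one multi-dimensional feature vector."""
    return extract_first_features(sample, dimensions) + extract_second_features(sample, model)


# Stand-in for the trained model: flags "no property" and "no car" behavior features.
def toy_model(sample):
    return [float(not sample["properties"]), float(not sample["vehicles"])]


sample = {"age": 40, "sex": 1, "properties": [], "vehicles": ["car"]}
features = build_target_features(sample, ["age", "sex"], toy_model)
```

The resulting vector is what would be passed to the preset classification model to obtain the initial label.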
In this embodiment, different from a conventional label system, the label system obtains sample data of each node by analyzing original data of each node in a business process, and extracts a multidimensional target feature in the sample data of each node to obtain an initial label of each node, thereby ensuring accuracy of the initial label.
S14: and performing cluster analysis on the initial label of each node to form a label system of different objects.
In this embodiment, the objects may be classified according to the main dimension to which the initial tag belongs, for example, may be classified into case tags, executed person tags, property information tags, and the like; the objects may also be classified according to a policy model of the application of the initial tags, e.g., may be classified as property control model tags, etc.
Preferably, the clustering the initial labels of each node to form a label system of different objects includes:
clustering the initial label of each node according to a k-means clustering algorithm to obtain a plurality of objects;
and taking any one of the plurality of objects as a target object, and setting the target object together with the initial labels corresponding to it as the label system corresponding to the target object.
In this embodiment, the k-means clustering algorithm is an iterative clustering analysis algorithm. Its steps are as follows: K objects are randomly selected as the initial cluster centers; the distance between each object and each cluster center is calculated, and each object is assigned to the closest cluster center, so that a cluster center together with the objects assigned to it represents a cluster; once all samples have been assigned, the center of each cluster is recalculated from the objects currently in the cluster. This process is repeated until a predetermined termination condition is met. The termination condition may be that no object is reassigned to a different cluster, that no cluster center changes, or that the sum of squared errors reaches a local minimum.
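The iteration described above can be written out as a small pure-Python sketch. It assumes the initial labels have already been encoded as numeric vectors (the patent does not describe the encoding) and stops when no object changes cluster; `max_iter` and `seed` are illustrative parameters.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def k_means(points, k, max_iter=100, seed=0):
    """Cluster points into k groups; returns (assignment, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # K randomly chosen initial centers
    assignment = None
    for _ in range(max_iter):
        # Assign every point to its closest center.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_assignment == assignment:  # no object was reassigned: stop
            break
        assignment = new_assignment
        # Recompute each center from the points currently in its cluster.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = mean(members)
    return assignment, centers


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
assignment, centers = k_means(points, k=2)
```

Each resulting cluster corresponds to one "object" in the sense of the text, and the labels assigned to it form that object's label system.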
In the embodiment, the initial labels are clustered by adopting a k-means clustering algorithm to obtain a plurality of objects, so that the accuracy of obtaining label systems of different objects is improved.
Further, after performing cluster analysis on the initial label of each node to form a label system of different objects, the method further includes:
monitoring the click rate and the conversion rate of each initial label in a preset period in real time;
judging whether the click rate of each initial label is larger than a corresponding click rate threshold value or not, and judging whether the conversion rate of each initial label is larger than a corresponding conversion rate threshold value or not;
when the click rate of each initial label is greater than or equal to the corresponding click rate threshold value and the conversion rate of each initial label is greater than or equal to the corresponding conversion rate threshold value, dividing the initial labels into hot labels;
and when the click rate of each initial label is smaller than the corresponding click rate threshold value, or the conversion rate of each initial label is smaller than the corresponding conversion rate threshold value, dividing the initial labels into useless labels.
Further, the method further comprises:
when the initial label is a hot label, retaining the initial label;
and when the initial label is a useless label, deleting the initial label.
In this embodiment, the conversion rate refers to the proportion of initial labels that are converted into advanced labels.
In this embodiment, the click rate and the conversion rate of each initial label are monitored in real time within a preset period and useless labels are deleted, so that the whole label system is continuously optimized through training and learning. This ensures the timeliness of the initial labels in the label system and improves the accuracy of the recommended labels.
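The threshold rule for hot and useless labels can be sketched as follows. The label names, rates and thresholds are invented for illustration; only the rule itself (both rates at or above their thresholds means hot, otherwise useless) comes from the text.

```python
def classify_label(click_rate, conversion_rate, click_threshold, conversion_threshold):
    """Return "hot" when both rates reach their thresholds, otherwise "useless"."""
    if click_rate >= click_threshold and conversion_rate >= conversion_threshold:
        return "hot"
    return "useless"


def prune_label_system(stats, thresholds):
    """Keep hot labels and drop useless ones.

    stats:      label -> (click_rate, conversion_rate) observed in the period
    thresholds: label -> (click_threshold, conversion_threshold)
    """
    kept = {}
    for label, (click_rate, conversion_rate) in stats.items():
        click_threshold, conversion_threshold = thresholds[label]
        if classify_label(click_rate, conversion_rate,
                          click_threshold, conversion_threshold) == "hot":
            kept[label] = (click_rate, conversion_rate)
    return kept


stats = {"no property": (0.30, 0.10), "has vehicle": (0.02, 0.20)}
thresholds = {"no property": (0.10, 0.05), "has vehicle": (0.10, 0.05)}
kept = prune_label_system(stats, thresholds)
```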
S15: and when the target node is monitored to be triggered, pushing an initial label in a label system corresponding to the target node.
Illustratively, when it is monitored that a property control node is triggered, the property control model labels in the label system corresponding to the property control model are pushed to help the user make a quick decision and determine the priority of property control measures, for example freezing the bank deposits of the executed person or sealing up the local real estate of the executed person, thereby improving the efficiency with which the user handles cases.
In the embodiment, different application models are directly given through the labels, and the model strategy operation process is reduced.
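A minimal sketch of the push step, assuming a lookup from node identifier to object and from object to its label system. The mappings, label texts and the `push` callback are hypothetical; the patent does not describe the delivery channel.

```python
def on_node_triggered(node_id, node_to_object, label_systems, push):
    """Push every initial label in the label system tied to the triggered node."""
    target_object = node_to_object.get(node_id)
    labels = label_systems.get(target_object, [])
    for label in labels:
        push(node_id, label)
    return labels


node_to_object = {"property_control": "property control model"}
label_systems = {
    "property control model": ["bank deposit found", "local real estate found"],
}

pushed = []
on_node_triggered(
    "property_control", node_to_object, label_systems,
    push=lambda node, label: pushed.append(label),
)
```

A node with no associated label system simply results in nothing being pushed.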
Further, after the pushing of the initial tag in the tag system corresponding to the target node, the method further includes:
when a reprocessing instruction of a user to the pushed initial label is monitored, analyzing the reprocessing instruction to obtain a reprocessing condition of the user;
inputting the reprocessing condition into the preset classification model to obtain a new label, and performing combined operation on the new label and the pushed initial label to obtain a high-grade label;
and pushing the advanced label.
In this embodiment, the reprocessing instruction fed back by the user is analyzed, the resulting reprocessing condition is input into the preset classification model to obtain a new label, and the new label is combined with the pushed initial label to obtain an advanced label. The user is thus responded to in time, which improves the timeliness of the recommended labels and the case handling efficiency.
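The reprocessing flow can be illustrated with a short sketch. The text does not specify the combined operation, so the example assumes it is an ordered union of the pushed labels and the new label; `build_advanced_label` and `toy_classifier` are hypothetical names, and the rule inside `toy_classifier` merely stands in for the preset classification model.

```python
def build_advanced_label(pushed_labels, reprocessing_condition, classify):
    """Combine the pushed initial labels with a newly classified label.

    classify is any callable that maps a reprocessing condition to a label;
    here it stands in for the preset classification model.
    """
    new_label = classify(reprocessing_condition)
    combined = list(pushed_labels)
    if new_label not in combined:
        combined.append(new_label)
    # Assumption: the advanced label is represented as the ordered combination.
    return tuple(combined)


def toy_classifier(condition):
    """Hypothetical rule used only to make the sketch runnable."""
    if condition.get("min_area", 0) >= 100:
        return "large real estate"
    return "ordinary real estate"
```

For instance, a user who refines a pushed "has real estate" label with a minimum-area condition would receive a combined label made of the original label and the newly classified one.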
In summary, in the tag pushing method based on big data according to this embodiment, original data is collected from a plurality of preset data sources, where the original data corresponds to a node identifier; performing data cleaning on the original data according to a preset data cleaning strategy to obtain sample data of a plurality of nodes; extracting multidimensional target characteristics from sample data of each node, and classifying the multidimensional target characteristics according to a preset classification model to obtain an initial label of each node; performing clustering analysis on the initial label of each node to form label systems of different objects; and when it is monitored that a target node in the plurality of nodes is triggered, pushing an initial label in a label system corresponding to the target node.
According to the big data-based label pushing method of this embodiment, on the one hand, the original data collected from different data sources is cleaned according to a preset data cleaning strategy to obtain the sample data of each node, and problem data is deleted, which ensures the consistency and integrity of the sample data and improves its quality. On the other hand, multidimensional target features are extracted from the sample data of each node and input into a preset classification model for classification to obtain the initial label of each node, which improves the efficiency of computing the initial labels; at the same time, the initial labels of the nodes are clustered into label systems of different objects, which improves the accuracy of label recommendation.
In addition, the click rate and the conversion rate of each initial label are monitored in real time in a preset period, useless labels are deleted, the whole label system is optimized through training and learning continuously, the timeliness of the initial labels in the label system is guaranteed, and meanwhile the accuracy of recommending the labels is improved.
Example two
Fig. 2 is a structural diagram of a tag pushing apparatus based on big data according to a second embodiment of the present invention.
In some embodiments, the big data-based tag pushing apparatus 20 may include a plurality of functional modules composed of program code segments. The program code of each program segment in the big data-based tag pushing apparatus 20 may be stored in a memory of the electronic device and executed by the at least one processor to perform big data-based tag pushing (see the description of fig. 1 for details).
In this embodiment, the big data-based tag pushing apparatus 20 may be divided into a plurality of functional modules according to the functions it performs. The functional modules may include: an acquisition module 201, a cleaning module 202, a classification module 203, an analysis module 204, a monitoring module 205, a judging module 206 and a pushing module 207. A module as referred to herein is a series of computer program segments that are stored in the memory, can be executed by at least one processor, and perform a fixed function. The functions of the modules are described in detail below.
The acquisition module 201 is used for collecting a plurality of original data from a plurality of preset data sources, wherein each original data corresponds to a node identifier.
In this embodiment, the original data includes: basic information of the executed object, basic information of the case, execution subject information, property information, and the like. The information of the executed person mainly includes: name, identification card number, age, sex, occupation, work unit, etc.; the execution subject information mainly includes: case number, identity information of the executed person, user information, involved links, return state, operation time, etc.; the property information refers to all property under the name of the executed person, such as bank deposits, real estate and vehicles. Taking real estate as an example, the property information includes the province where the real estate is located, the floor, the orientation, the area, etc. The data source may be an execution service system, and the original data is collected from each process node of the execution service system.
The cleaning module 202 is used for performing data cleaning on each original data according to a preset data cleaning strategy to obtain sample data.
In this embodiment, a data cleaning strategy may be set in advance according to the cleaning conditions of the labels corresponding to each node. The preset data cleaning strategy may include missing value cleaning, format content cleaning, logic error cleaning and non-required data cleaning. After the original data is collected, it is cleaned according to the preset data cleaning strategy to obtain sample data.
In this embodiment, the preset data cleaning strategy corresponding to missing value cleaning is to directly delete data records with missing values or to complement data records with missing values.
Illustratively, when the target labels of the data records with missing values are concentrated in one or a few classes, deleting those records would lose a large amount of feature information for the corresponding classes and cause the model to over-fit or classify inaccurately; in this case, the preset data cleaning strategy is to complement the data records with missing values.
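Complementing missing values can be sketched as below. The text does not say how the values are filled in, so this example assumes a simple rule: the median for numeric fields and the most frequent value for categorical fields. The function name `impute_missing` and the field names used with it are hypothetical.

```python
from collections import Counter
from statistics import median


def impute_missing(records, numeric_fields, categorical_fields):
    """Return copies of the records with missing (None) values filled in.

    Numeric fields are filled with the median of the known values and
    categorical fields with the most frequent known value.
    """
    filled = [dict(record) for record in records]
    for field in numeric_fields:
        known = [r[field] for r in records if r.get(field) is not None]
        if not known:
            continue  # nothing to learn a fill value from
        fill_value = median(known)
        for record in filled:
            if record.get(field) is None:
                record[field] = fill_value
    for field in categorical_fields:
        known = [r[field] for r in records if r.get(field) is not None]
        if not known:
            continue
        fill_value = Counter(known).most_common(1)[0][0]
        for record in filled:
            if record.get(field) is None:
                record[field] = fill_value
    return filled
```

The input records are left untouched, so the deletion strategy can still be applied to the same data if completion turns out to be unsuitable.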
In this embodiment, the preset data cleaning strategy corresponding to format content cleaning is to clean data whose display format is inconsistent (for example, time, date, numerical values, and full-width or half-width characters), data whose content contains characters that should not exist, and data whose content does not match the content expected for the field.
Illustratively, when the display formats of time, date, numerical values, full-width and half-width characters and the like are inconsistent, the preset data cleaning strategy is to convert them into a consistent format; when the content contains characters that should not exist, the preset data cleaning strategy is to find possible problems by combining semi-automatic verification with manual review and to remove the unnecessary characters, for example, Chinese characters appearing in an identification number.
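A possible form of the format content cleaning is sketched below. It assumes that full-width characters are folded to half-width with Unicode NFKC normalization (which also folds other compatibility characters), that dates are unified to `YYYY-MM-DD`, and that stray Chinese characters are detected with a basic CJK range check. The helper names and the list of accepted date formats are assumptions for illustration.

```python
import re
import unicodedata
from datetime import datetime

# Assumed set of input date formats; extend as the data sources require.
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%Y.%m.%d", "%Y%m%d")


def to_half_width(text):
    """Fold full-width letters, digits and punctuation to half-width."""
    return unicodedata.normalize("NFKC", text)


def normalize_date(text):
    """Return the date as YYYY-MM-DD, or None when no known format matches."""
    cleaned = to_half_width(text).strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None


def has_chinese_characters(text):
    """Flag values such as identification numbers that contain Chinese characters."""
    return re.search(r"[\u4e00-\u9fff]", text) is not None
```

Values for which `normalize_date` returns `None` or `has_chinese_characters` returns `True` would be candidates for the manual review step mentioned above.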
In this embodiment, the preset data cleaning strategy corresponding to logic error cleaning is to remove duplicates, remove unreasonable values, and correct contradictory contents.
Illustratively, the preset data cleaning strategy set for deduplication is to delete repeated fields and retain only one; the preset data cleaning strategy set for removing unreasonable values is to delete the unreasonable values, for example, an age of 200 years; the preset data cleaning strategy set for correcting contradictory contents is to judge, according to the data source of each field, which field provides more reliable information, and to remove or reconstruct the unreliable field. For example, if the identification number is 1101031980XXXXXXX but the age is 18 years, it is necessary to judge whether the identification number or the age is more reliable, and then reconstruct or delete the contradictory content.
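The three logic error checks can be sketched as follows. The upper age bound of 120, the one-year tolerance, and all function names are assumptions; the birth year is read from characters 7 to 10 of the identification number, as in the example above.

```python
def drop_duplicates(records, key_fields):
    """Keep only the first record for each combination of key field values."""
    seen = set()
    unique = []
    for record in records:
        key = tuple(record.get(field) for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def is_reasonable_age(age, max_age=120):
    """Reject missing ages and ages outside an assumed plausible range."""
    return age is not None and 0 <= age <= max_age


def birth_year_from_id(id_number):
    """Read the four-digit birth year that follows the six-digit area code."""
    year = id_number[6:10]
    return int(year) if year.isdigit() else None


def age_contradicts_id(id_number, age, reference_year, tolerance=1):
    """Return True when the stated age disagrees with the birth year in the ID."""
    birth_year = birth_year_from_id(id_number)
    if birth_year is None:
        return False  # cannot judge without a readable birth year
    return abs((reference_year - birth_year) - age) > tolerance
```

The sketch only detects the contradiction; deciding which field is more reliable still depends on the data source of each field, as described above.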
In this embodiment, the preset data cleansing policy corresponding to the non-required data cleansing means that unnecessary fields are deleted.
Preferably, the cleaning module 202 performing data cleaning on the original data according to a preset data cleaning strategy to obtain sample data includes:
identifying a node identification for each raw data;
acquiring a preset data cleaning strategy corresponding to the node identification;
cleaning the original data corresponding to the node identification according to the preset data cleaning strategy;
converting the cleaned original data into structured data of a preset type;
classifying the structured data according to the node identification to obtain sample data, and storing the sample data into a preset database.
In this embodiment, the preset database may be a Hive database. Hive is a Hadoop-based data warehouse tool that can store structured data, provides an SQL-like query function, and can convert SQL statements into MapReduce tasks for execution. After the original data is cleaned according to the preset data cleaning strategy, it is converted into structured data of a preset type, the structured data is classified according to the node identification to obtain sample data, and the sample data is stored in the preset database.
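The five steps above (identify the node, look up its strategy, clean, convert to structured data, classify and store by node) can be sketched as follows. A plain dictionary stands in for the preset database, and the node names, schema columns and cleaning functions are hypothetical placeholders.

```python
def strip_text(record):
    """Trim surrounding whitespace from every string value."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}


def drop_empty(record):
    """Remove fields whose value is empty or missing."""
    return {k: v for k, v in record.items() if v not in ("", None)}


# Assumed mapping from node identification to its preset cleaning strategy.
STRATEGIES = {
    "filing": [strip_text],
    "property_control": [strip_text, drop_empty],
}

# Assumed preset structured type: a fixed tuple of columns.
SCHEMA = ("node_id", "case_number", "amount")


def clean_and_store(raw_records, strategies, schema, database):
    """Clean each raw record with its node's strategy and store it by node."""
    for raw in raw_records:
        node_id = raw["node_id"]                       # identify the node
        cleaned = raw
        for step in strategies.get(node_id, []):       # fetch and apply strategy
            cleaned = step(cleaned)
        row = tuple(cleaned.get(col) for col in schema)  # structured row
        database.setdefault(node_id, []).append(row)   # classify by node, store
    return database
```

In a deployment that uses Hive, the final step would write the rows to a table instead of a dictionary; that part is deliberately left out of the sketch.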
Further, in the process of data cleaning, problem data in the original data that does not conform to the preset data cleaning strategy is placed in a problem database; when no re-cleaning instruction is received within a preset time period, the processing of the problem data is ended and the problem data is deleted.
In this embodiment, in the process of cleaning data through a preset data cleaning strategy, if problem data occurs, the problem data may be stored in a problem database, and if a re-cleaning instruction is not received within a preset time period, it is determined that the problem data may be deleted.
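The handling of problem data can be sketched with a small in-memory store. The class name `ProblemStore`, its method names and the use of plain numeric timestamps are assumptions; a real problem database would persist the records.

```python
class ProblemStore:
    """Holds problem data until it is re-cleaned or its retention time expires."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._items = {}  # record id -> (record, time it was stored)

    def quarantine(self, record_id, record, now):
        """Place a problem record in the store at time `now`."""
        self._items[record_id] = (record, now)

    def reclean(self, record_id):
        """Hand a record back for re-cleaning and remove it from the store."""
        record, _ = self._items.pop(record_id)
        return record

    def purge_expired(self, now):
        """Delete records for which no re-cleaning instruction arrived in time."""
        expired = [
            record_id
            for record_id, (_, stored_at) in self._items.items()
            if now - stored_at >= self.retention_seconds
        ]
        for record_id in expired:
            del self._items[record_id]
        return expired

    def __contains__(self, record_id):
        return record_id in self._items
```

Passing the current time explicitly keeps the sketch deterministic; a scheduler could call `purge_expired` once per preset period.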
In the embodiment, the original data is cleaned through a preset data cleaning strategy, the problem data is deleted, the consistency and the integrity of the obtained sample data are ensured, and the quality of the sample data is improved.
The classification module 203 is used for extracting multidimensional target features from the sample data corresponding to each node, and classifying the multidimensional target features according to a preset classification model to obtain an initial label of each node.
In this embodiment, because different nodes correspond to different data, different nodes correspond to different sample data. Multidimensional target features are extracted from the sample data corresponding to each node and input into a pre-trained classification model to obtain an initial label of each node, and the initial labels of the nodes are collated to form a label library.
Preferably, the classification module 203 extracting multidimensional target features from the sample data corresponding to each node, and classifying the multidimensional target features according to a preset classification model to obtain an initial label of each node includes:
reading the sample data of each node from a preset database according to the node identification of each node, using the grammar rules of the Hive query language (HQL);
extracting multi-dimensional target features from the read sample data of each node according to a preset algorithm;
and inputting the multi-dimensional target characteristics into the preset classification model for classification to obtain an initial label of each node, wherein the initial label is stored in a block chain node.
In this embodiment, the sample data of different nodes is different. The sample data of the corresponding node is read from the preset database using HQL grammar rules, and multidimensional target features are extracted from the read sample data of each node using a preset algorithm. The preset algorithm is prior art and is not described in detail herein.
It is emphasized that the initial label may also be stored in a node of a blockchain in order to further ensure the privacy and security of the initial label.
In the embodiment, the initial label of each node is obtained by inputting the multidimensional target characteristics into a preset classification model for classification, so that the efficiency of obtaining the initial label by calculation is improved.
Further, the extracting the multidimensional target features from the read sample data of each node according to a preset algorithm includes:
extracting a first feature from the read sample data of each node according to a preset feature dimension;
processing the read sample data of each node through the trained model to obtain a second characteristic;
and combining the first characteristic and the second characteristic to obtain a multi-dimensional target characteristic.
In this embodiment, the multidimensional target features include basic features and behavior features. The basic features are descriptions of the natural attributes of the executed object, such as sex and age; the behavior features are features generated by the behavior of the executed object, for example, no property, no house, no car, and the like.
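The combination of the first and second features can be sketched as a simple concatenation. The text does not state how the two features are combined or what the trained model is, so both are assumptions here: `toy_behavior_model` is a hypothetical stand-in for the trained model, and the field names are illustrative.

```python
def extract_first_feature(sample, dimensions):
    """Read the preset feature dimensions directly from the sample data."""
    return [float(sample.get(dimension, 0.0)) for dimension in dimensions]


def toy_behavior_model(sample):
    """Hypothetical stand-in for the trained model: 1.0 marks 'none found'."""
    return [
        0.0 if sample.get("has_real_estate") else 1.0,
        0.0 if sample.get("has_vehicle") else 1.0,
    ]


def combine_features(sample, dimensions, model):
    """Concatenate the first (basic) and second (model-derived) features."""
    first = extract_first_feature(sample, dimensions)
    second = list(model(sample))
    return first + second
```

With this layout the basic features always occupy the leading positions of the vector, so a downstream classifier sees a fixed feature order.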
In this embodiment, different from a conventional label system, the label system obtains sample data of each node by analyzing original data of each node in a business process, and extracts a multidimensional target feature in the sample data of each node to obtain an initial label of each node, thereby ensuring accuracy of the initial label.
The analysis module 204 is used for performing cluster analysis on the initial label of each node to form label systems of different objects.
In this embodiment, the objects may be classified according to the main dimension to which the initial tag belongs, for example, may be classified into case tags, executed person tags, property information tags, and the like; the objects may also be classified according to a policy model of the application of the initial tags, e.g., may be classified as property control model tags, etc.
Preferably, the analysis module 204 performing cluster analysis on the initial label of each node to form label systems of different objects includes:
clustering the initial label of each node according to a k-means clustering algorithm to obtain a plurality of objects;
and setting the target object and the initial label corresponding to the target object as a label system corresponding to the target object by taking any one of the plurality of objects as the target object.
In this embodiment, the k-means clustering algorithm is an iterative clustering analysis algorithm with the following steps: K objects are randomly selected as the initial cluster centers; the distance between each object and each cluster center is calculated, and each object is assigned to the closest cluster center; a cluster center together with the objects assigned to it represents a cluster; once all objects have been assigned, the center of each cluster is recalculated from the objects currently in the cluster; this process is repeated until a preset termination condition is met, where the preset termination condition may be that no object is reassigned to a different cluster, that no cluster center changes, or that the sum of squared errors reaches a local minimum.
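The steps above can be sketched in plain Python. The example assumes that each initial label has already been encoded as a numeric vector (a tuple of floats); how labels are encoded is not specified in the text. It stops when no object is reassigned, which is one of the termination conditions listed above.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    count = len(points)
    return tuple(sum(coords) / count for coords in zip(*points))


def k_means(points, k, seed=0, max_iter=100):
    """Cluster points into k groups; returns (centers, assignment)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initial cluster centers
    assignment = None
    for _ in range(max_iter):
        # Assign every point to its closest cluster center.
        new_assignment = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no object was reassigned
        assignment = new_assignment
        # Recalculate each center from the points currently in its cluster.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centers[j] = mean_point(members)
    return centers, assignment
```

Each resulting cluster would then correspond to one object, and the labels assigned to it would form that object's label system.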
In the embodiment, the initial labels are clustered by adopting a k-means clustering algorithm to obtain a plurality of objects, so that the accuracy of obtaining label systems of different objects is improved.
Further, after the analysis module 204 performs cluster analysis on the initial label of each node to form label systems of different objects, the monitoring module 205 is used for monitoring the click rate and the conversion rate of each initial label in real time within a preset period.
The judging module 206 is used for judging whether the click rate of each initial label is greater than the corresponding click rate threshold, and judging whether the conversion rate of each initial label is greater than the corresponding conversion rate threshold.
In this embodiment, when the click rate of each initial tag is greater than or equal to the corresponding click rate threshold and the conversion rate of each initial tag is greater than or equal to the corresponding conversion rate threshold, the initial tags are divided into hot tags.
In this embodiment, when the click rate of each initial tag is smaller than the corresponding click rate threshold, or the conversion rate of each initial tag is smaller than the corresponding conversion rate threshold, the initial tags are divided into useless tags.
Further, after the initial label is divided into a useless label and a hot label, the type of the initial label is judged, and when the initial label is the hot label, the initial label is reserved; and when the initial label is a useless label, deleting the initial label.
In this embodiment, the conversion rate refers to the ratio of initial labels that are converted into advanced labels.
In this embodiment, the click rate and the conversion rate of each initial label are monitored in real time within a preset period and useless labels are deleted, so that the whole label system is continuously optimized through training and learning. This keeps the initial labels in the label system up to date and improves the accuracy of label recommendation.
The pushing module 207 is used for pushing the initial label in the label system corresponding to the target node when it is monitored that the target node is triggered.
Illustratively, when it is monitored that the property control node is triggered, the property control model labels in the label system corresponding to the property control model are pushed to help the user make a quick decision and determine the priority of property control measures, for example, freezing the bank deposits of the executed person or sealing up the local real estate of the executed person, thereby improving the user's case handling efficiency.
In this embodiment, the applicable application models are given directly through the labels, which reduces the model strategy computation process.
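The trigger-and-push behavior can be sketched as a lookup from node to label system. The class `LabelPusher`, the node identifier `property_control` and the example labels are hypothetical, and the `pushed` list merely stands in for whatever delivery channel is actually used.

```python
class LabelPusher:
    """Pushes the labels of the label system that corresponds to a triggered node."""

    def __init__(self, label_systems):
        # Assumed mapping: node identification -> labels of its label system.
        self.label_systems = label_systems
        self.pushed = []  # record of pushes, standing in for real delivery

    def on_node_triggered(self, node_id):
        """Called when the monitor reports that a node was triggered."""
        labels = list(self.label_systems.get(node_id, []))
        if labels:
            self.pushed.append((node_id, labels))
        return labels
```

A node without a label system simply produces no push, so the monitor can call `on_node_triggered` for every node without extra checks.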
Further, after the pushing module 207 pushes the initial label in the label system corresponding to the target node, when a reprocessing instruction of the user for the pushed initial label is monitored, the reprocessing instruction is analyzed to obtain the reprocessing condition of the user; the reprocessing condition is input into the preset classification model to obtain a new label, and a combined operation is performed on the new label and the pushed initial label to obtain an advanced label; and the advanced label is pushed.
In this embodiment, the reprocessing instruction fed back by the user is analyzed, the resulting reprocessing condition is input into the preset classification model to obtain a new label, and the new label is combined with the pushed initial label to obtain an advanced label. The user is thus responded to in time, which improves the timeliness of the recommended labels and the case handling efficiency.
In summary, in the tag pushing apparatus based on big data according to this embodiment, original data is collected from a plurality of preset data sources, where the original data corresponds to a node identifier; performing data cleaning on the original data according to a preset data cleaning strategy to obtain sample data of a plurality of nodes; extracting multidimensional target characteristics from sample data of each node, and classifying the multidimensional target characteristics according to a preset classification model to obtain an initial label of each node; performing clustering analysis on the initial label of each node to form label systems of different objects; and when it is monitored that a target node in the plurality of nodes is triggered, pushing an initial label in a label system corresponding to the target node.
According to the big data-based label pushing apparatus of this embodiment, on the one hand, the original data collected from different data sources is cleaned according to a preset data cleaning strategy to obtain the sample data of each node, and problem data is deleted, which ensures the consistency and integrity of the sample data and improves its quality. On the other hand, multidimensional target features are extracted from the sample data of each node and input into a preset classification model for classification to obtain the initial label of each node, which improves the efficiency of computing the initial labels; at the same time, the initial labels of the nodes are clustered into label systems of different objects, which improves the accuracy of label recommendation.
In addition, the click rate and the conversion rate of each initial label are monitored in real time in a preset period, useless labels are deleted, the whole label system is optimized through training and learning continuously, the timeliness of the initial labels in the label system is guaranteed, and meanwhile the accuracy of recommending the labels is improved.
Example three
Fig. 3 is a schematic structural diagram of an electronic device according to a third embodiment of the present invention. In the preferred embodiment of the present invention, the electronic device 3 comprises a memory 31, at least one processor 32, at least one communication bus 33 and a transceiver 34.
It will be appreciated by those skilled in the art that the configuration of the electronic device shown in fig. 3 does not constitute a limitation of the embodiment of the present invention; it may be a bus-type configuration or a star-type configuration, and the electronic device 3 may include more or fewer hardware or software components than those shown, or a different arrangement of components.
In some embodiments, the electronic device 3 is an electronic device capable of automatically performing numerical calculation and/or information processing according to instructions set or stored in advance, and the hardware thereof includes but is not limited to a microprocessor, an application specific integrated circuit, a programmable gate array, a digital processor, an embedded device, and the like. The electronic device 3 may also include a client device, which includes, but is not limited to, any electronic product that can interact with a client through a keyboard, a mouse, a remote controller, a touch pad, or a voice control device, for example, a personal computer, a tablet computer, a smart phone, a digital camera, and the like.
It should be noted that the electronic device 3 is only an example; other existing or future electronic products that can be adapted to the present invention should also be included in the scope of protection of the present invention and are incorporated herein by reference.
In some embodiments, the memory 31 is used for storing program code and various data, such as the big data based tag pushing apparatus 20 installed in the electronic device 3, and provides high-speed, automatic access to programs or data during the operation of the electronic device 3. The memory 31 includes a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), a One-Time Programmable Read-Only Memory (OTPROM), an Electrically-Erasable Programmable Read-Only Memory (EEPROM), a Compact Disc Read-Only Memory (CD-ROM) or other optical disk memory, a magnetic disk memory, a tape memory, or any other computer-readable medium capable of carrying or storing data.
In some embodiments, the at least one processor 32 may be composed of an integrated circuit, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital Processing chips, graphics processors, and combinations of various control chips. The at least one processor 32 is a Control Unit (Control Unit) of the electronic device 3, connects various components of the electronic device 3 by various interfaces and lines, and executes various functions of the electronic device 3 and processes data, for example, a function of tag pushing based on big data, by running or executing programs or modules stored in the memory 31 and calling data stored in the memory 31.
In some embodiments, the at least one communication bus 33 is arranged to enable connection communication between the memory 31 and the at least one processor 32 or the like.
Although not shown, the electronic device 3 may further include a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the at least one processor 32 through a power management device, so as to implement functions of managing charging, discharging, and power consumption through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The electronic device 3 may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.
It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.
The integrated unit implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, an electronic device, or a network device) or a processor (processor) to execute parts of the methods according to the embodiments of the present invention.
In a further embodiment, in conjunction with fig. 2, the at least one processor 32 may execute the operating system of the electronic device 3 and various installed application programs (such as the big data based tag pushing apparatus 20), program code, and the like, for example, the above modules.
The memory 31 has program code stored therein, and the at least one processor 32 can call the program code stored in the memory 31 to perform related functions. For example, the modules illustrated in fig. 2 are program code stored in the memory 31 and executed by the at least one processor 32, so as to implement the functions of the modules for the purpose of big data based tag pushing.
In one embodiment of the present invention, the memory 31 stores a plurality of instructions that are executed by the at least one processor 32 for the purpose of big data based tag push.
Specifically, for the manner in which the at least one processor 32 implements the above instructions, reference may be made to the description of the relevant steps in the embodiment corresponding to fig. 1, and details are not repeated here.
The block chain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, it is obvious that the word "comprising" does not exclude other elements or that the singular does not exclude the plural. A plurality of units or means recited in the apparatus claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. A big data based label pushing method, characterized in that the big data based label pushing method comprises the following steps:
collecting a plurality of original data from a plurality of preset data sources, wherein each original data corresponds to a node identifier;
performing data cleaning on each original data according to a preset data cleaning strategy to obtain sample data;
extracting multidimensional target characteristics from sample data corresponding to each node, and classifying the multidimensional target characteristics according to a preset classification model to obtain an initial label of each node;
performing clustering analysis on the initial label of each node to form label systems of different objects;
and when the target node is monitored to be triggered, pushing an initial label in a label system corresponding to the target node.
2. The big-data-based label pushing method according to claim 1, wherein the clustering the initial label of each node to form a label system of different objects comprises:
clustering the initial label of each node according to a k-means clustering algorithm to obtain a plurality of objects;
and setting the target object and the initial label corresponding to the target object as a label system corresponding to the target object by taking any one of the plurality of objects as the target object.
3. The big-data-based label pushing method according to claim 1, wherein after performing cluster analysis on the initial label of each node to form label systems for different objects, the method further comprises:
monitoring, in real time, the click rate and the conversion rate of each initial label within a preset period;
judging whether the click rate of each initial label is greater than or equal to a corresponding click rate threshold, and whether the conversion rate of each initial label is greater than or equal to a corresponding conversion rate threshold;
when the click rate of an initial label is greater than or equal to the corresponding click rate threshold and its conversion rate is greater than or equal to the corresponding conversion rate threshold, classifying the initial label as a hot label;
and when the click rate of an initial label is less than the corresponding click rate threshold, or its conversion rate is less than the corresponding conversion rate threshold, classifying the initial label as a useless label.
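The threshold test of claim 3 can be sketched as follows. The claim does not define how the two rates are computed, so the sketch assumes click rate = clicks / impressions and conversion rate = conversions / clicks; the `LabelStats` type, the function names, and the counters are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelStats:
    """Counters collected for one initial label during the preset period (hypothetical)."""
    label: str
    impressions: int
    clicks: int
    conversions: int


def click_rate(stats):
    # Assumed definition: clicks per impression.
    return stats.clicks / stats.impressions if stats.impressions else 0.0


def conversion_rate(stats):
    # Assumed definition: conversions per click.
    return stats.conversions / stats.clicks if stats.clicks else 0.0


def classify_label(stats, click_threshold, conversion_threshold):
    """Return "hot" when both rates reach their thresholds, otherwise "useless"."""
    if (click_rate(stats) >= click_threshold
            and conversion_rate(stats) >= conversion_threshold):
        return "hot"
    return "useless"
```

Because a label is hot only when both conditions hold and useless when either fails, the two outcomes cover every label exactly once.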
4. The big-data-based label pushing method according to claim 1, wherein performing data cleaning on each piece of original data according to a preset data cleaning strategy to obtain sample data comprises:
identifying the node identifier of each piece of original data;
acquiring the preset data cleaning strategy corresponding to the node identifier;
cleaning the original data corresponding to the node identifier according to the preset data cleaning strategy;
converting the cleaned original data into structured data of a preset type;
and classifying the structured data by node identifier to obtain the sample data, and storing the sample data in a preset database.
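The cleaning flow of claim 4 can be sketched as a lookup of a per-node strategy followed by conversion and grouping. Everything concrete here is an assumption made for illustration: the raw record layout (`node_id`, `payload`), the `key=value;` payload format, the strategy table, and the in-memory dictionary standing in for the preset database.

```python
from collections import defaultdict

# Hypothetical strategy table: node identifier -> ordered cleaning steps on raw text.
CLEANING_STRATEGIES = {
    "web": [str.strip, str.lower],
    "app": [str.strip],
}


def to_structured(cleaned_text):
    """Convert cleaned "key=value;key=value" text into a dict (the assumed preset type)."""
    fields = {}
    for part in cleaned_text.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            fields[key.strip()] = value.strip()
    return fields


def clean_and_group(raw_records):
    """Clean each record with its node's strategy and group the results by node identifier."""
    database = defaultdict(list)  # stand-in for the preset database
    for record in raw_records:
        node_id = record["node_id"]
        strategy = CLEANING_STRATEGIES.get(node_id)
        if strategy is None:
            continue  # assumption: records without a configured strategy are skipped
        text = record["payload"]
        for step in strategy:
            text = step(text)
        structured = to_structured(text)
        if structured:
            database[node_id].append(structured)
    return dict(database)
```

Keeping the strategies in a table keyed by node identifier means a new data source only needs a new table entry, which matches the claim's "preset data cleaning strategy corresponding to the node identifier".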
5. The big-data-based label pushing method according to claim 1, wherein extracting multi-dimensional target features from the sample data corresponding to each node, and classifying the multi-dimensional target features with a preset classification model to obtain an initial label for each node comprises:
reading the sample data of each node from a preset database according to the node identifier of each node and the syntax rules of the HQL query language oriented to the node identifier;
extracting multi-dimensional target features from the read sample data of each node according to a preset algorithm;
and inputting the multi-dimensional target features into the preset classification model for classification to obtain the initial label of each node, wherein the initial label is stored in a blockchain node.
6. The big-data-based label pushing method according to claim 5, wherein extracting multi-dimensional target features from the read sample data of each node according to a preset algorithm comprises:
extracting a first feature from the read sample data of each node according to preset feature dimensions;
processing the read sample data of each node with a trained model to obtain a second feature;
and combining the first feature and the second feature to obtain the multi-dimensional target features.
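The two-part feature extraction of claim 6 can be sketched as below. The sketch assumes that "combining" means concatenating the two feature vectors and uses a simple function as a stand-in for the trained model; the field names and `toy_model` are hypothetical.

```python
def extract_first_feature(sample, dimensions):
    """Read the preset feature dimensions directly from the sample data."""
    return [float(sample.get(name, 0.0)) for name in dimensions]


def extract_second_feature(sample, model):
    """Derive features by passing the sample through a trained model (any callable here)."""
    return [float(value) for value in model(sample)]


def build_target_feature(sample, dimensions, model):
    """Combine both parts into one vector (assumption: combination is concatenation)."""
    return extract_first_feature(sample, dimensions) + extract_second_feature(sample, model)


def toy_model(sample):
    """Stand-in for a trained model: returns the purchase-per-visit ratio."""
    visits = sample.get("visits", 0)
    return [sample.get("purchases", 0) / visits if visits else 0.0]
```

For example, with the sample `{"age": 35, "visits": 4, "purchases": 1}` and the dimensions `["age", "visits"]`, the sketch produces the vector `[35.0, 4.0, 0.25]`.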
7. The big-data-based label pushing method according to claim 1, wherein after pushing the initial label in the label system corresponding to the target node, the method further comprises:
when a reprocessing instruction issued by a user for the pushed initial label is detected, parsing the reprocessing instruction to obtain the user's reprocessing condition;
inputting the reprocessing condition into the preset classification model to obtain a new label, and performing a combination operation on the new label and the pushed initial label to obtain an advanced label;
and pushing the advanced label.
8. A big-data-based label pushing apparatus, comprising:
an acquisition module, configured to collect a plurality of pieces of original data from a plurality of preset data sources, wherein each piece of original data corresponds to a node identifier;
a cleaning module, configured to perform data cleaning on each piece of original data according to a preset data cleaning strategy to obtain sample data;
a classification module, configured to extract multi-dimensional target features from the sample data corresponding to each node, and to classify the multi-dimensional target features with a preset classification model to obtain an initial label for each node;
an analysis module, configured to perform cluster analysis on the initial label of each node to form label systems for different objects;
and a pushing module, configured to push the initial label in the label system corresponding to a target node when it is detected that the target node is triggered.
9. An electronic device, comprising a processor configured to implement the big-data-based label pushing method according to any one of claims 1 to 7 when executing a computer program stored in a memory.
10. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the big-data-based label pushing method according to any one of claims 1 to 7.
CN202010610771.8A 2020-06-29 2020-06-29 Label pushing method and device based on big data, electronic equipment and storage medium Pending CN111984898A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010610771.8A CN111984898A (en) 2020-06-29 2020-06-29 Label pushing method and device based on big data, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111984898A true CN111984898A (en) 2020-11-24

Family

ID=73437640

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010610771.8A Pending CN111984898A (en) 2020-06-29 2020-06-29 Label pushing method and device based on big data, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111984898A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180129710A1 (en) * 2016-11-10 2018-05-10 Yahoo Japan Corporation Information processing apparatus, information processing method, and non-transitory computer readable recording medium
CN111062750A (en) * 2019-12-13 2020-04-24 中国平安财产保险股份有限公司 User portrait label modeling and analyzing method, device, equipment and storage medium
CN111177129A (en) * 2019-12-16 2020-05-19 中国平安财产保险股份有限公司 Label system construction method, device, equipment and storage medium

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112532750A (en) * 2021-01-18 2021-03-19 深圳博士创新技术转移有限公司 Big data push processing method and system and cloud platform
CN112860675A (en) * 2021-02-06 2021-05-28 高云 Big data processing method under online cloud service environment and cloud computing server
CN114756149A (en) * 2022-05-12 2022-07-15 北京达佳互联信息技术有限公司 Method and device for presenting data label, electronic equipment and storage medium
CN114756149B (en) * 2022-05-12 2023-12-29 北京达佳互联信息技术有限公司 Method, device, electronic equipment and storage medium for presenting data tag
CN114791915A (en) * 2022-06-22 2022-07-26 深圳高灯计算机科技有限公司 Data aggregation method and device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
CN111984898A (en) Label pushing method and device based on big data, electronic equipment and storage medium
CN112541745B (en) User behavior data analysis method and device, electronic equipment and readable storage medium
CN112445854B (en) Multi-source service data real-time processing method, device, terminal and storage medium
CN111950738A (en) Machine learning model optimization effect evaluation method and device, terminal and storage medium
CN112016905B (en) Information display method and device based on approval process, electronic equipment and medium
CN115146865A (en) Task optimization method based on artificial intelligence and related equipment
CN113592019A (en) Fault detection method, device, equipment and medium based on multi-model fusion
CN111950625A (en) Risk identification method and device based on artificial intelligence, computer equipment and medium
CN112560465A (en) Method and device for monitoring batch abnormal events, electronic equipment and storage medium
CN113919336A (en) Article generation method and device based on deep learning and related equipment
CN114237829B (en) Data acquisition and processing method for power equipment
CN114862520A (en) Product recommendation method and device, computer equipment and storage medium
CN111651452B (en) Data storage method, device, computer equipment and storage medium
CN112199417B (en) Data processing method, device, terminal and storage medium based on artificial intelligence
CN111950707A (en) Behavior prediction method, apparatus, device and medium based on behavior co-occurrence network
CN116562894A (en) Vehicle insurance claim fraud risk identification method, device, electronic equipment and storage medium
CN112328752B (en) Course recommendation method and device based on search content, computer equipment and medium
CN114996386A (en) Business role identification method, device, equipment and storage medium
CN114881313A (en) Behavior prediction method and device based on artificial intelligence and related equipment
CN114140241A (en) Abnormity identification method and device for transaction monitoring index
CN114399318A (en) Link processing method and device, computer equipment and storage medium
CN113312409B (en) Task monitoring method and device, electronic equipment and computer readable storage medium
CN112699285A (en) Data classification method and device, computer equipment and storage medium
CN115146078A (en) Index data processing method and device, computer equipment and storage medium
CN113688924A (en) Abnormal order detection method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination