CN111984898A

CN111984898A - Label pushing method and device based on big data, electronic equipment and storage medium

Info

Publication number: CN111984898A
Application number: CN202010610771.8A
Authority: CN
Inventors: 张永强
Original assignee: Ping An International Smart City Technology Co Ltd
Current assignee: Ping An International Smart City Technology Co Ltd
Priority date: 2020-06-29
Filing date: 2020-06-29
Publication date: 2020-11-24

Abstract

The invention relates to the technical field of big data and provides a big-data-based label pushing method and device, an electronic device, and a storage medium. The method comprises the following steps: collecting original data from a plurality of preset data sources, and cleaning the original data according to a preset data cleaning strategy to obtain sample data; extracting multi-dimensional target features from the sample data of each node and classifying them according to a preset classification model to obtain an initial label of each node; performing cluster analysis on the initial label of each node to form label systems of different objects; and, when it is monitored that a target node is triggered, pushing the initial label in the label system of the target node. By cleaning the original data and extracting the multi-dimensional target features, the invention obtains the initial label of each node and clusters the initial labels into label systems of different objects, thereby improving the accuracy of label recommendation. In addition, the invention also relates to the technical field of blockchains: the initial label is stored in a blockchain node.

Description

Label pushing method and device based on big data, electronic equipment and storage medium

Technical Field

The invention relates to the technical field of big data, in particular to a label pushing method and device based on big data, electronic equipment and a storage medium.

Background

Traditional label technology is based on data source extraction: user behavior data is acquired by means such as embedded tracking points, and various labels are assigned according to the behavior habits and basic information of users. Traditional labels are defined from a business or product perspective, with dimension combinations and thresholds set by experience, and most of the resulting labels receive no attention.

The label library of an existing business execution system is built around the business execution flow. Creating the labels involves a large amount of manual work, and most labels come from simple collection and arrangement of data; the data of each business node is not cleaned, and valuable information is not extracted, when the label library is created. As a result, a user cannot quickly obtain the required data and materials from the recommended labels and must consult a large amount of data and materials before deciding on each operation, and the accuracy of the recommended labels is low.

Disclosure of Invention

In view of the above, it is necessary to provide a label pushing method, a device, an electronic device and a storage medium based on big data, where the original data is cleaned and multidimensional target features are extracted to obtain an initial label of each node, and the initial label is clustered into label systems of different objects, so as to improve the accuracy of label recommendation.

The first aspect of the present invention provides a big data based tag pushing method, where the big data based tag pushing method includes:

collecting a plurality of original data from a plurality of preset data sources, wherein each original data corresponds to a node identifier;

performing data cleaning on each original data according to a preset data cleaning strategy to obtain sample data;

extracting multidimensional target characteristics from sample data corresponding to each node, and classifying the multidimensional target characteristics according to a preset classification model to obtain an initial label of each node;

performing clustering analysis on the initial label of each node to form label systems of different objects;

and when it is monitored that a target node in the plurality of nodes is triggered, pushing an initial label in a label system corresponding to the target node.

Preferably, performing cluster analysis on the initial label of each node to form label systems of different objects includes:

clustering the initial label of each node according to a k-means clustering algorithm to obtain a plurality of objects;

and, taking any one of the plurality of objects as a target object, setting the target object together with its corresponding initial labels as the label system of that target object.

Preferably, after performing cluster analysis on the initial label of each node to form a label system of different objects, the method further includes:

monitoring the click rate and the conversion rate of each initial label in a preset period in real time;

judging whether the click rate of each initial label is greater than or equal to a corresponding click rate threshold, and judging whether the conversion rate of each initial label is greater than or equal to a corresponding conversion rate threshold;

when the click rate of an initial label is greater than or equal to the corresponding click rate threshold and the conversion rate of the initial label is greater than or equal to the corresponding conversion rate threshold, classifying the initial label as a hot label;

and when the click rate of an initial label is smaller than the corresponding click rate threshold, or the conversion rate of the initial label is smaller than the corresponding conversion rate threshold, classifying the initial label as a useless label.
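
The threshold test above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `LabelStats`, the function names, and the sample threshold values are hypothetical and do not come from the invention, and a single pair of thresholds is passed in where the method allows a separate threshold per label.

```python
from dataclasses import dataclass


@dataclass
class LabelStats:
    """Metrics observed for one initial label during the preset period."""
    name: str
    click_rate: float
    conversion_rate: float


def classify_label(stats, click_threshold, conversion_threshold):
    """Return "hot" only when both metrics reach their thresholds."""
    if (stats.click_rate >= click_threshold
            and stats.conversion_rate >= conversion_threshold):
        return "hot"
    # Either metric falling below its threshold makes the label useless.
    return "useless"


def prune_label_system(labels, click_threshold, conversion_threshold):
    """Keep the hot labels and drop the useless ones."""
    return [s for s in labels
            if classify_label(s, click_threshold, conversion_threshold) == "hot"]
```

A label whose metrics sit exactly on both thresholds counts as hot, which matches the "greater than or equal to" wording of the hot-label condition.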

Preferably, performing data cleaning on the original data according to the preset data cleaning strategy to obtain sample data includes:

identifying a node identification for each raw data;

acquiring a preset data cleaning strategy corresponding to the node identification;

cleaning the original data corresponding to the node identification according to the preset data cleaning strategy;

converting the cleaned original data into structured data of a preset type;

classifying the structured data according to the node identification to obtain sample data, and storing the sample data into a preset database.
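
The five steps above can be read as a small dispatch pipeline. The following Python sketch is a minimal illustration, not the claimed implementation: the node identifiers `filing` and `property_check`, the two strategy functions, the `unknown` placeholder, and the choice of a string-valued dictionary as the "structured data of a preset type" are all assumptions, and an in-memory dictionary stands in for the preset database.

```python
from collections import defaultdict


def drop_missing(record):
    """Strategy: discard a record if any field is missing (None or empty)."""
    if any(v is None or v == "" for v in record.values()):
        return None
    return record


def fill_missing(record):
    """Strategy: complete missing fields with a placeholder value."""
    return {k: ("unknown" if v is None or v == "" else v)
            for k, v in record.items()}


# Hypothetical mapping from node identifier to its preset cleaning strategy.
STRATEGIES = {"filing": drop_missing, "property_check": fill_missing}


def clean_raw_data(raw_records):
    """Clean each record with its node's strategy and group results by node."""
    samples = defaultdict(list)
    for raw in raw_records:
        node_id = raw["node_id"]            # identify the node identifier
        strategy = STRATEGIES[node_id]      # acquire the matching strategy
        fields = {k: v for k, v in raw.items() if k != "node_id"}
        cleaned = strategy(fields)          # clean the record
        if cleaned is None:
            continue                        # record was rejected by the strategy
        # Convert to one fixed structure: string values, keys in sorted order.
        structured = {k: str(cleaned[k]) for k in sorted(cleaned)}
        samples[node_id].append(structured)  # classify by node identifier
    return dict(samples)
```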

Preferably, the extracting multidimensional target features from the sample data corresponding to each node, and classifying the multidimensional target features according to a preset classification model to obtain the initial label of each node includes:

reading the sample data of each node from a preset database according to the node identification of each node and the syntax rules of the query language HQL;

extracting multi-dimensional target features from the read sample data of each node according to a preset algorithm;

and inputting the multi-dimensional target characteristics into the preset classification model for classification to obtain an initial label of each node, wherein the initial label is stored in a block chain node.

Preferably, the extracting the multidimensional target features from the read sample data of each node according to a preset algorithm includes:

extracting a first feature from the read sample data of each node according to a preset feature dimension;

processing the read sample data of each node through the trained model to obtain a second characteristic;

and combining the first characteristic and the second characteristic to obtain a multi-dimensional target characteristic.
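
As a rough illustration of the three steps above, the sketch below treats "combining" as simple concatenation of two feature lists, which is an assumption; the method does not specify the combination operation. The field names and `toy_model` are hypothetical stand-ins for the preset feature dimensions and the trained model.

```python
def extract_first_feature(sample, dimensions):
    """Read the preset feature dimensions directly from the sample."""
    return [float(sample.get(dim, 0.0)) for dim in dimensions]


def extract_second_feature(sample, model):
    """Derive further features by passing the sample through a trained model."""
    return list(model(sample))


def build_target_feature(sample, dimensions, model):
    """Concatenate both parts into one multi-dimensional feature vector."""
    first = extract_first_feature(sample, dimensions)
    second = extract_second_feature(sample, model)
    return first + second


def toy_model(sample):
    """Stand-in for a trained model: flags absence of real estate and vehicles."""
    return [1.0 if sample.get("real_estate_count", 0) == 0 else 0.0,
            1.0 if sample.get("vehicle_count", 0) == 0 else 0.0]
```

Any callable that maps a sample to a sequence of numbers can be passed in place of `toy_model`.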

Preferably, after the pushing of the initial tag in the tag system corresponding to the target node, the method further includes:

when a reprocessing instruction of a user to the pushed initial label is monitored, analyzing the reprocessing instruction to obtain a reprocessing condition of the user;

inputting the reprocessing condition into the preset classification model to obtain a new label, and combining the new label with the pushed initial label to obtain an advanced label;

and pushing the advanced label.

A second aspect of the present invention provides a big-data-based tag pushing apparatus, including:

the acquisition module is used for collecting a plurality of original data from a plurality of preset data sources, wherein each original data corresponds to a node identifier;

the cleaning module is used for cleaning data of each original data according to a preset data cleaning strategy to obtain sample data;

the classification module is used for extracting multi-dimensional target characteristics from sample data corresponding to each node, classifying the multi-dimensional target characteristics according to a preset classification model, and obtaining an initial label of each node;

the analysis module is used for carrying out clustering analysis on the initial label of each node to form a label system of different objects;

and the pushing module is used for pushing the initial label in the label system corresponding to the target node when the target node is monitored to be triggered.

A third aspect of the present invention provides an electronic device comprising a processor for implementing the big-data based tag pushing method when executing a computer program stored in a memory.

A fourth aspect of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the big-data based tag pushing method.

In summary, the big-data-based label pushing method, device, electronic device and storage medium of the present invention have the following effects. On the one hand, the original data collected from different data sources is cleaned according to the preset data cleaning strategy to obtain the sample data of each node, and problem data is deleted, which ensures the consistency and integrity of the sample data and improves its quality. On the other hand, the multi-dimensional target features in the sample data of each node are extracted and input into the preset classification model for classification to obtain the initial label of each node, which improves the efficiency of computing the initial labels. Meanwhile, the initial labels of the nodes are clustered into label systems of different objects, which improves the accuracy of label recommendation.

In addition, the click rate and the conversion rate of each initial label are monitored in real time in a preset period, useless labels are deleted, the whole label system is optimized through training and learning continuously, the timeliness of the initial labels in the label system is guaranteed, and meanwhile the accuracy of recommending the labels is improved.

Drawings

Fig. 1 is a flowchart of a big data based tag pushing method according to an embodiment of the present invention.

Fig. 2 is a structural diagram of a tag pushing apparatus based on big data according to a second embodiment of the present invention.

Fig. 3 is a schematic structural diagram of an electronic device according to a third embodiment of the present invention.

Detailed Description

In order that the above objects, features and advantages of the present invention can be more clearly understood, a detailed description of the present invention will be given below with reference to the accompanying drawings and specific embodiments. It should be noted that the embodiments of the present invention and features of the embodiments may be combined with each other without conflict.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.

Example one

In this embodiment, the big-data-based tag pushing method may be applied to an electronic device. For an electronic device that needs to perform big-data-based tag pushing, the tag pushing function provided by the method of the present invention may be integrated directly on the electronic device, or may run on the electronic device in the form of a Software Development Kit (SDK).

As shown in fig. 1, the big data based tag pushing method specifically includes the following steps, and the order of the steps in the flowchart may be changed and some may be omitted according to different requirements.

S11: collecting a plurality of original data from a plurality of preset data sources, wherein each original data corresponds to a node identifier.

In this embodiment, the original data includes basic information of the executed person, basic information of the case, execution subject information, property information, and the like. The basic information of the executed person mainly includes: name, identification card number, age, sex, occupation, employer, and the like. The execution subject information mainly includes: case number, identity information of the executed person, user information, the process steps involved, return state, operation time, and the like. Property information refers to all property under the name of the executed person, such as bank deposits, real estate, and vehicles; taking real estate as an example, it includes the province where the property is located, floor, orientation, area, and the like. The data source may be an execution service system, and raw data is collected from each process node of the execution service system.

S12: performing data cleaning on each original data according to a preset data cleaning strategy to obtain sample data.

In this embodiment, a data cleaning policy may be set in advance according to a cleaning condition of a tag corresponding to each node, where the preset data cleaning policy may be missing value cleaning, format content cleaning, logic error cleaning, and non-demand data cleaning, and after the raw data is collected, the raw data is cleaned according to the preset data cleaning policy to obtain sample data.

In this embodiment, the preset data cleaning strategy corresponding to missing value cleaning is to directly delete data records with missing values or to complement data records with missing values.

Illustratively, when the target labels of the data records with missing values are concentrated in one class or a few classes, deleting those records would lose a large amount of feature information for the corresponding classes and cause the model to over-fit or classify inaccurately; in that case the preset data cleaning strategy is to complete the data records with missing values.

In this embodiment, the preset data cleaning strategy corresponding to format content cleaning is to clean data whose display format (time, date, numerical value, full-width or half-width characters, and the like) is inconsistent, whose content contains characters that should not be present, or whose content does not match the field it is stored in.

Illustratively, when the display formats of time, date, numerical values, full-width or half-width characters and the like are inconsistent, the preset data cleaning strategy is to convert them into a consistent format; when the content contains characters that should not be present, for example Chinese characters in an identification number, the preset data cleaning strategy is to find the possible problems by a combination of semi-automatic verification and manual checking and to remove the unnecessary characters.
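
A minimal Python sketch of format content cleaning follows. It is an automated stand-in only: the list of accepted date formats, the target format YYYY-MM-DD, and the function names are assumptions, and the fully automatic removal of stray characters replaces the partly manual check described above. Unicode NFKC normalization is used here to fold full-width characters into half-width ones.

```python
import re
import unicodedata
from datetime import datetime

# Assumed set of input date formats; extend as the data sources require.
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%Y.%m.%d", "%Y%m%d")


def to_half_width(text):
    """Fold full-width characters into their half-width equivalents (NFKC)."""
    return unicodedata.normalize("NFKC", text)


def normalize_date(text):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    text = to_half_width(text).strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None


def clean_id_number(text):
    """Remove every character that cannot appear in an identification number."""
    return re.sub(r"[^0-9Xx]", "", to_half_width(text)).upper()
```

Values for which `normalize_date` returns `None` are candidates for the manual part of the check rather than for silent deletion.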

In this embodiment, the preset data cleaning strategy corresponding to logic error cleaning is to deduplicate, remove unreasonable values, and correct contradictory content.

Illustratively, the preset data cleaning strategy for deduplication is to delete repeated fields and retain only one; the strategy for removing unreasonable values is to delete the unreasonable value, for example an age of 200 years; the strategy for correcting contradictory content is to judge, according to the data source of each field, which field provides the more reliable information, and to remove or reconstruct the unreliable field. For example, if the identification number is 1101031980XXXXXXX and the age is 18 years, it is necessary to judge which of the identification number and the age is more reliable, and then to reconstruct or delete the contradictory content.
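
The three logic error checks can be sketched as follows. The field names `age` and `id_number`, the upper bound of 150 years, and the one-year tolerance are assumptions made for illustration. The contradiction check relies on the fact that characters 7 to 10 of an 18-character resident identification number encode the birth year; it only detects the contradiction and leaves the decision about which field is more reliable to the caller.

```python
def deduplicate(records):
    """Keep the first occurrence of each identical record."""
    seen, unique = set(), []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def drop_unreasonable_age(record, max_age=150):
    """Blank out an age outside a plausible range (the bound is an assumption)."""
    age = record.get("age")
    if age is not None and not 0 <= age <= max_age:
        return {**record, "age": None}
    return record


def age_matches_id(record, current_year):
    """Check the stated age against the birth year in the identification number."""
    birth_year = int(record["id_number"][6:10])
    # Allow one year of slack because the birthday may not have passed yet.
    return abs((current_year - birth_year) - record["age"]) <= 1
```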

In this embodiment, the preset data cleansing policy corresponding to the non-required data cleansing means that unnecessary fields are deleted.

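Preferably, performing data cleaning on the original data according to the preset data cleaning strategy to obtain sample data includes:

identifying a node identification for each raw data;

acquiring a preset data cleaning strategy corresponding to the node identification;

cleaning the original data corresponding to the node identification according to the preset data cleaning strategy;

converting the cleaned original data into structured data of a preset type;

classifying the structured data according to the node identification to obtain sample data, and storing the sample data into a preset database.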

In this embodiment, the preset database may be a Hive database. Hive is a Hadoop-based data warehouse tool that can store structured data, provides a complete SQL query function, and can convert SQL statements into MapReduce tasks to run. After the original data is cleaned according to the preset data cleaning strategy, it is converted into structured data of a preset type, the structured data is classified according to the node identifier to obtain sample data, and the sample data is stored in the preset database.

Further, the method further comprises:

placing problem data which does not accord with the preset data cleaning strategy in the original data into a problem database;

when a re-cleaning instruction is not received within a preset time period, finishing the processing of the problem data;

and deleting the problem data.

In this embodiment, in the process of cleaning data through a preset data cleaning strategy, if problem data occurs, the problem data may be stored in a problem database, and if a re-cleaning instruction is not received within a preset time period, it is determined that the problem data may be deleted.
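
The handling of problem data can be pictured as a quarantine store with a retention period. The sketch below is a simplified in-memory illustration: the class `ProblemStore`, its method names, and the use of plain numeric timestamps are assumptions, and a real deployment would use a persistent problem database.

```python
class ProblemStore:
    """In-memory stand-in for the problem database."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._items = {}  # record id -> (record, time it was quarantined)

    def quarantine(self, record_id, record, now):
        """Store a record that failed the preset cleaning strategy."""
        self._items[record_id] = (record, now)

    def request_reclean(self, record_id):
        """Hand a record back for re-cleaning and remove it from the store."""
        record, _ = self._items.pop(record_id)
        return record

    def purge_expired(self, now):
        """Delete records for which no re-cleaning request arrived in time."""
        expired = [rid for rid, (_, t) in self._items.items()
                   if now - t >= self.retention_seconds]
        for rid in expired:
            del self._items[rid]
        return expired
```

Calling `purge_expired` periodically corresponds to finishing the processing of problem data once the preset time period has elapsed.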

In the embodiment, the original data is cleaned through a preset data cleaning strategy, the problem data is deleted, the consistency and the integrity of the obtained sample data are ensured, and the quality of the sample data is improved.

S13: extracting multidimensional target characteristics from sample data corresponding to each node, and classifying the multidimensional target characteristics according to a preset classification model to obtain an initial label of each node.

In this embodiment, because different nodes hold different data, different nodes correspond to different sample data. Multi-dimensional target features are extracted from the sample data corresponding to each node and input into a pre-trained classification model to obtain the initial label of each node, and the initial labels of the nodes are collated to form a label library.

In this embodiment, because the sample data of different nodes differs, the sample data of the corresponding node is read from the preset database according to the syntax rules of the query language HQL, and multi-dimensional target features are extracted from the read sample data of each node by a preset algorithm. The preset algorithm is prior art and is not described in detail herein.
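
To make the per-node read concrete, the sketch below builds a query string for one node. It assumes that HQL here means the SQL-like query language of Hive, since the preset database may be a Hive database; the table name `sample_data` and the column name `node_id` are hypothetical. The function only builds the statement and does not connect to any database.

```python
import re


def build_sample_query(node_id, table="sample_data"):
    """Build an HQL SELECT that fetches the cleaned rows of one node."""
    # Accept only simple identifiers so the value can be embedded safely.
    if not re.fullmatch(r"[A-Za-z0-9_]+", node_id):
        raise ValueError(f"invalid node identifier: {node_id!r}")
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError(f"invalid table name: {table!r}")
    return f"SELECT * FROM {table} WHERE node_id = '{node_id}'"
```

Validating the identifier before embedding it keeps a malformed node identifier from altering the query.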

It is emphasized that the initial label may also be stored in a node of a blockchain in order to further ensure the privacy and security of the initial label.

In the embodiment, the initial label of each node is obtained by inputting the multidimensional target characteristics into a preset classification model for classification, so that the efficiency of obtaining the initial label by calculation is improved.

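Further, extracting the multi-dimensional target features from the read sample data of each node according to the preset algorithm includes:

extracting a first feature from the read sample data of each node according to a preset feature dimension;

processing the read sample data of each node through the trained model to obtain a second feature;

and combining the first feature and the second feature to obtain the multi-dimensional target feature.

In this embodiment, the multi-dimensional target features include basic features and behavior features. The basic features are natural attribute descriptions of the executed object, such as sex and age; the behavior features are features generated by the behavior of the executed object, for example, no property, no house, no car, and the like.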

In this embodiment, different from a conventional label system, the label system obtains sample data of each node by analyzing original data of each node in a business process, and extracts a multidimensional target feature in the sample data of each node to obtain an initial label of each node, thereby ensuring accuracy of the initial label.

S14: performing cluster analysis on the initial label of each node to form label systems of different objects.

In this embodiment, the objects may be classified according to the main dimension to which the initial tag belongs, for example, may be classified into case tags, executed person tags, property information tags, and the like; the objects may also be classified according to a policy model of the application of the initial tags, e.g., may be classified as property control model tags, etc.

In this embodiment, the k-means clustering algorithm is an iterative cluster analysis algorithm. Its steps are: randomly select K objects as the initial cluster centers; calculate the distance between each object and each cluster center, and assign each object to the nearest cluster center, where a cluster center together with the objects assigned to it represents a cluster; after the samples are assigned, recalculate the cluster center of each cluster from the objects currently in it; and repeat this process until a preset termination condition is met. The preset termination condition may be that no object is reassigned to a different cluster, that no cluster center changes, or that the sum of squared errors reaches a local minimum.
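
The iteration described above can be sketched in plain Python. This is a minimal illustration, not the implementation used by the invention: it assumes the initial labels have already been encoded as numeric tuples, it recomputes the centers once per pass over all points, and of the termination conditions it uses only "no object is reassigned" plus a hypothetical `max_iter` safety bound.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iter=100, seed=0):
    """Cluster points into k groups; returns (centers, assignments)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # random initial cluster centers
    assignments = None
    for _ in range(max_iter):
        # Assign every point to its nearest cluster center.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_assignments == assignments:   # no point changed cluster: stop
            break
        assignments = new_assignments
        # Recompute each center as the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:                      # keep the old center if empty
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return centers, assignments
```

Each resulting cluster index plays the role of one "object", and the labels assigned to it form that object's label system.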

In the embodiment, the initial labels are clustered by adopting a k-means clustering algorithm to obtain a plurality of objects, so that the accuracy of obtaining label systems of different objects is improved.

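Further, after performing cluster analysis on the initial label of each node to form label systems of different objects, the method further includes:

monitoring the click rate and the conversion rate of each initial label in a preset period in real time;

judging whether the click rate of each initial label is greater than or equal to a corresponding click rate threshold, and judging whether the conversion rate of each initial label is greater than or equal to a corresponding conversion rate threshold;

when the click rate of an initial label is greater than or equal to the corresponding click rate threshold and the conversion rate of the initial label is greater than or equal to the corresponding conversion rate threshold, classifying the initial label as a hot label;

and when the click rate of an initial label is smaller than the corresponding click rate threshold, or the conversion rate of the initial label is smaller than the corresponding conversion rate threshold, classifying the initial label as a useless label.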

Further, the method further comprises:

when the initial label is a hot label, retaining the initial label;

and when the initial label is a useless label, deleting the initial label.

In this embodiment, the conversion rate refers to the ratio at which an initial label is converted into an advanced label.

In this embodiment, the click rate and the conversion rate of each initial label are monitored in real time within a preset period, useless labels are deleted, and the whole label system is continuously optimized through training and learning, which ensures the timeliness of the initial labels in the label system and improves the accuracy of the recommended labels.

S15: when it is monitored that a target node is triggered, pushing an initial label in the label system corresponding to the target node.

Illustratively, when it is monitored that the property control node is triggered, the property control model labels in the label system corresponding to the property control model are pushed to help the user decide quickly and determine the priority of property control measures, for example, freezing a bank deposit of the executed person or sealing up a local real estate of the executed person, thereby improving the user's case-handling efficiency.
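
The trigger-and-push step can be sketched as a lookup followed by a delivery callback. Everything named here is hypothetical: the node identifiers, the example label texts, the class `LabelPusher`, and the `send` callable that stands in for whatever channel actually delivers labels to the user.

```python
# Hypothetical label systems keyed by node identifier.
LABEL_SYSTEMS = {
    "property_control": ["has bank deposit", "owns local real estate"],
    "case_filing": ["first-time case"],
}


class LabelPusher:
    """Pushes the labels of a node when that node is triggered."""

    def __init__(self, label_systems, send):
        self.label_systems = label_systems
        self.send = send  # callable that delivers labels to the user

    def on_node_triggered(self, node_id):
        """Push the labels of the triggered node; do nothing if it has none."""
        labels = self.label_systems.get(node_id, [])
        if labels:
            self.send(node_id, labels)
        return labels
```

In use, the monitoring component would call `on_node_triggered` with the identifier of the node it observed.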

In the embodiment, different application models are directly given through the labels, and the model strategy operation process is reduced.

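Further, after the pushing of the initial label in the label system corresponding to the target node, the method further includes:

when a reprocessing instruction of a user for the pushed initial label is monitored, analyzing the reprocessing instruction to obtain a reprocessing condition of the user;

inputting the reprocessing condition into the preset classification model to obtain a new label, and combining the new label with the pushed initial label to obtain an advanced label;

and pushing the advanced label.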

In this embodiment, the reprocessing instruction fed back by the user is analyzed to obtain the reprocessing condition, the reprocessing condition is input into the preset classification model to obtain a new label, and the new label is combined with the pushed initial label to obtain an advanced label. The user is thus answered promptly, which improves the timeliness of the recommended labels and the efficiency of case handling.
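
A small sketch of the reprocessing flow follows. The structure of the instruction (a dictionary with a `condition` entry), the `min_amount` field, the threshold of 100000, and the modelling of "combining" as a textual conjunction are all assumptions made for illustration; `toy_classifier` is a stand-in for the preset classification model.

```python
def parse_reprocessing_instruction(instruction):
    """Extract the user's reprocessing condition from the instruction."""
    return instruction["condition"]


def build_advanced_label(instruction, pushed_label, classify):
    """Classify the condition and combine the result with the pushed label."""
    condition = parse_reprocessing_instruction(instruction)
    new_label = classify(condition)
    # Assumption: combining is modelled as a simple conjunction of labels.
    return f"{pushed_label} & {new_label}"


def toy_classifier(condition):
    """Stand-in for the preset classification model."""
    if condition.get("min_amount", 0) >= 100000:
        return "high value"
    return "low value"
```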

In summary, in the tag pushing method based on big data according to this embodiment, original data is collected from a plurality of preset data sources, where the original data corresponds to a node identifier; performing data cleaning on the original data according to a preset data cleaning strategy to obtain sample data of a plurality of nodes; extracting multidimensional target characteristics from sample data of each node, and classifying the multidimensional target characteristics according to a preset classification model to obtain an initial label of each node; performing clustering analysis on the initial label of each node to form label systems of different objects; and when it is monitored that a target node in the plurality of nodes is triggered, pushing an initial label in a label system corresponding to the target node.

According to the big-data-based label pushing method of this embodiment, on the one hand, the original data collected from different data sources is cleaned according to the preset data cleaning strategy to obtain the sample data of each node, and problem data is deleted, which ensures the consistency and integrity of the sample data and improves its quality. On the other hand, the multi-dimensional target features in the sample data of each node are extracted and input into the preset classification model for classification to obtain the initial label of each node, which improves the efficiency of computing the initial labels. Meanwhile, the initial labels of the nodes are clustered into label systems of different objects, which improves the accuracy of label recommendation.

Example two

In some embodiments, the big-data-based tag pushing apparatus 20 may include a plurality of functional modules composed of program code segments. The program code of each program segment in the big-data-based tag pushing apparatus 20 may be stored in a memory of the electronic device and executed by the at least one processor to perform the big-data-based tag pushing (see Fig. 1 and its detailed description).

In this embodiment, the big-data-based tag pushing apparatus 20 may be divided into a plurality of functional modules according to the functions it performs. The functional modules may include: an acquisition module 201, a cleaning module 202, a classification module 203, an analysis module 204, a monitoring module 205, a judgment module 206 and a pushing module 207. A module as referred to herein is a series of computer program segments that is stored in a memory, can be executed by at least one processor, and performs a fixed function. The functions of the modules are described in detail below.

The acquisition module 201: used for collecting a plurality of original data from a plurality of preset data sources, wherein each original data corresponds to a node identifier.

The cleaning module 202: used for performing data cleaning on each original data according to a preset data cleaning strategy to obtain sample data.

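Preferably, the cleaning module 202 performing data cleaning on the original data according to the preset data cleaning strategy to obtain sample data includes:

identifying a node identification for each raw data;

acquiring a preset data cleaning strategy corresponding to the node identification;

cleaning the original data corresponding to the node identification according to the preset data cleaning strategy;

converting the cleaned original data into structured data of a preset type;

classifying the structured data according to the node identification to obtain sample data, and storing the sample data into a preset database.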

Further, in the process of data cleaning, problem data which does not accord with the preset data cleaning strategy in the original data is placed in a problem database; when a re-cleaning instruction is not received within a preset time period, finishing the processing of the problem data; and deleting the problem data.

The classification module 203: used for extracting multi-dimensional target features from the sample data corresponding to each node, and classifying the multi-dimensional target features according to a preset classification model to obtain an initial label of each node.

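Preferably, the classification module 203 extracting multi-dimensional target features from the sample data corresponding to each node and classifying the multi-dimensional target features according to a preset classification model to obtain an initial label of each node includes:

reading the sample data of each node from a preset database according to the node identification of each node and the syntax rules of the query language HQL;

extracting multi-dimensional target features from the read sample data of each node according to a preset algorithm;

and inputting the multi-dimensional target features into the preset classification model for classification to obtain an initial label of each node, wherein the initial label is stored in a blockchain node.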

The analysis module 204: used for performing cluster analysis on the initial label of each node to form label systems of different objects.

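Preferably, the analysis module 204 performing cluster analysis on the initial label of each node to form label systems of different objects includes:

clustering the initial label of each node according to a k-means clustering algorithm to obtain a plurality of objects;

and, taking any one of the plurality of objects as a target object, setting the target object together with its corresponding initial labels as the label system of that target object.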

Further, after the analysis module 204 performs cluster analysis on the initial label of each node to form label systems of different objects, the monitoring module 205 is used for monitoring the click rate and the conversion rate of each initial label in a preset period in real time.

The judgment module 206: used for judging whether the click rate of each initial label is greater than or equal to the corresponding click rate threshold, and judging whether the conversion rate of each initial label is greater than or equal to the corresponding conversion rate threshold.

In this embodiment, when the click rate of each initial tag is greater than or equal to the corresponding click rate threshold and the conversion rate of each initial tag is greater than or equal to the corresponding conversion rate threshold, the initial tags are divided into hot tags.

In this embodiment, when the click rate of each initial tag is smaller than the corresponding click rate threshold, or the conversion rate of each initial tag is smaller than the corresponding conversion rate threshold, the initial tags are divided into useless tags.

Further, after the initial label is divided into a useless label and a hot label, the type of the initial label is judged, and when the initial label is the hot label, the initial label is reserved; and when the initial label is a useless label, deleting the initial label.

The pushing module 207: used for pushing the initial label in the label system corresponding to the target node when it is monitored that the target node is triggered.

Further, after the pushing module 207 pushes the initial label in the label system corresponding to the target node, when a reprocessing instruction of the user for the pushed initial label is monitored, the reprocessing instruction is analyzed to obtain a reprocessing condition of the user; the reprocessing condition is input into the preset classification model to obtain a new label, and the new label is combined with the pushed initial label to obtain an advanced label; and the advanced label is pushed.

In summary, in the tag pushing apparatus based on big data according to this embodiment, original data is collected from a plurality of preset data sources, where the original data corresponds to a node identifier; performing data cleaning on the original data according to a preset data cleaning strategy to obtain sample data of a plurality of nodes; extracting multidimensional target characteristics from sample data of each node, and classifying the multidimensional target characteristics according to a preset classification model to obtain an initial label of each node; performing clustering analysis on the initial label of each node to form label systems of different objects; and when it is monitored that a target node in the plurality of nodes is triggered, pushing an initial label in a label system corresponding to the target node.

EXAMPLE III

Fig. 3 is a schematic structural diagram of an electronic device according to a third embodiment of the present invention. In the preferred embodiment of the present invention, the electronic device 3 comprises a memory 31, at least one processor 32, at least one communication bus 33 and a transceiver 34.

It will be appreciated by those skilled in the art that the configuration of the electronic device shown in fig. 3 does not constitute a limitation of the embodiment of the present invention; the configuration may be bus-type or star-type, and the electronic device 3 may include more or fewer hardware or software components than those shown, or a different arrangement of components.

In some embodiments, the electronic device 3 is an electronic device capable of automatically performing numerical calculation and/or information processing according to instructions set or stored in advance, and the hardware thereof includes but is not limited to a microprocessor, an application specific integrated circuit, a programmable gate array, a digital processor, an embedded device, and the like. The electronic device 3 may also include a client device, which includes, but is not limited to, any electronic product that can interact with a client through a keyboard, a mouse, a remote controller, a touch pad, or a voice control device, for example, a personal computer, a tablet computer, a smart phone, a digital camera, and the like.

It should be noted that the electronic device 3 is only an example; other existing or future electronic products that can be adapted to the present invention should also be included in the scope of protection of the present invention, and are incorporated herein by reference.

In some embodiments, the memory 31 is used for storing program code and various data, such as the big data based tag pushing apparatus 20 installed in the electronic device 3, and provides high-speed, automatic access to programs or data during the operation of the electronic device 3. The memory 31 includes a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), a One-Time Programmable Read-Only Memory (OTPROM), an Electrically-Erasable Programmable Read-Only Memory (EEPROM), a Compact Disc Read-Only Memory (CD-ROM) or other optical disc memory, a magnetic disk memory, a tape memory, or any other computer-readable medium capable of carrying or storing data.

In some embodiments, the at least one processor 32 may be composed of integrated circuits, for example, a single packaged integrated circuit, or a plurality of packaged integrated circuits with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital processing chips, graphics processors, and combinations of various control chips. The at least one processor 32 is the control unit of the electronic device 3; it connects the various components of the electronic device 3 through various interfaces and lines, and executes the functions of the electronic device 3 and processes data, for example, the function of tag pushing based on big data, by running or executing the programs or modules stored in the memory 31 and calling the data stored in the memory 31.

In some embodiments, the at least one communication bus 33 is arranged to enable connection communication between the memory 31 and the at least one processor 32 or the like.

Although not shown, the electronic device 3 may further include a power supply (such as a battery) for supplying power to each component. Preferably, the power supply is logically connected to the at least one processor 32 through a power management device, so that charging, discharging, and power consumption are managed through the power management device. The power supply may also include any one or more of DC or AC power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The electronic device 3 may further include various sensors, a Bluetooth module, a Wi-Fi module, and the like, which are not described herein again.

It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.

The integrated unit implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions that enable a computer device (which may be a personal computer, an electronic device, or a network device) or a processor to execute parts of the methods according to the embodiments of the present invention.

In a further embodiment, in conjunction with fig. 2, the at least one processor 32 may execute the operating system of the electronic device 3 and the various installed application programs (such as the big data based tag pushing device 20), program code, and the like, for example, the above modules.

The memory 31 has program code stored therein, and the at least one processor 32 can call the program code stored in the memory 31 to perform related functions. For example, the modules illustrated in fig. 2 are program code stored in the memory 31 and executed by the at least one processor 32, so as to implement the functions of the modules and thereby achieve big data based tag pushing.

In one embodiment of the present invention, the memory 31 stores a plurality of instructions that are executed by the at least one processor 32 to implement big data based tag pushing.

Specifically, for the manner in which the at least one processor 32 implements the above instructions, reference may be made to the description of the relevant steps in the embodiment corresponding to fig. 1, and details are not repeated here.

The block chain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
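To make the idea of data blocks associated by a cryptographic method concrete, the sketch below links blocks by storing each block's SHA-256 hash in its successor, so that altering any stored payload invalidates the chain. It is only an illustration of hash linking; the block fields and function names are assumptions, and it omits the consensus mechanism, point-to-point transmission, and platform layers mentioned above.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # placeholder predecessor hash for the first block


def block_hash(index, prev_hash, payload):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    body = json.dumps({"index": index, "prev_hash": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Append a block whose prev_hash points at the current last block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV_HASH
    index = len(chain)
    block = {
        "index": index,
        "prev_hash": prev_hash,
        "payload": payload,
        "hash": block_hash(index, prev_hash, payload),
    }
    chain.append(block)
    return block


def verify_chain(chain):
    """Recompute every hash and check that each block links to its predecessor."""
    prev_hash = GENESIS_PREV_HASH
    for i, block in enumerate(chain):
        expected = block_hash(block["index"], block["prev_hash"], block["payload"])
        if block["index"] != i or block["prev_hash"] != prev_hash or block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


if __name__ == "__main__":
    chain = []
    append_block(chain, {"node": "n1", "labels": ["finance"]})
    append_block(chain, {"node": "n2", "labels": ["travel"]})
    print(verify_chain(chain))
```

Storing initial labels as block payloads in this way means a later modification of a stored label can be detected by re-running the verification.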

In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative: the division of the modules is only one logical functional division, and other divisions may be used in practice.

The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.

In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.

It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, it is obvious that the word "comprising" does not exclude other elements or that the singular does not exclude the plural. A plurality of units or means recited in the apparatus claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.

Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims

1. A big data based label pushing method is characterized in that the big data based label pushing method comprises the following steps:

and when the target node is monitored to be triggered, pushing an initial label in a label system corresponding to the target node.

2. The big-data-based label pushing method according to claim 1, wherein the clustering the initial label of each node to form a label system of different objects comprises:

3. The big-data based tag pushing method according to claim 1, wherein after clustering the initial tags of each node to form a tag hierarchy of different objects, the method further comprises:

4. The big-data-based tag pushing method according to claim 1, wherein the data cleaning of the original data according to a preset data cleaning policy, and obtaining sample data comprises:

identifying a node identification for each raw data;

converting the cleaned original data into structured data of a preset type;

5. The big-data-based tag pushing method according to claim 1, wherein the extracting multidimensional target features from the sample data corresponding to each node, and classifying the multidimensional target features according to a preset classification model to obtain an initial tag of each node comprises:

6. The big-data-based tag pushing method according to claim 5, wherein the extracting multi-dimensional target features from the read sample data of each node according to a preset algorithm comprises:

7. The big-data-based tag pushing method according to claim 1, wherein after the pushing of the initial tag in the tag hierarchy corresponding to the target node, the method further comprises:

and pushing the advanced label.

8. A big data based tag pushing apparatus, comprising:

9. An electronic device, comprising a processor configured to implement the big-data based tag pushing method according to any one of claims 1 to 7 when executing a computer program stored in a memory.

10. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the big data based tag pushing method according to any one of claims 1 to 7.