CN111708920B - Internet big data processing method based on artificial intelligence and intelligent cloud service platform

Info

Publication number
CN111708920B
CN111708920B CN202010508583.4A CN202010508583A CN111708920B CN 111708920 B CN111708920 B CN 111708920B CN 202010508583 A CN202010508583 A CN 202010508583A CN 111708920 B CN111708920 B CN 111708920B
Authority
CN
China
Prior art keywords
index
data
target
service
aggregation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010508583.4A
Other languages
Chinese (zh)
Other versions
CN111708920A (en)
Inventor
程涛
谢国柱
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong and state network technology Co.,Ltd.
Original Assignee
Guangdong And State Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong And State Network Technology Co ltd
Priority to CN202011315512.9A (patent CN112464041A)
Priority to CN202010508583.4A (patent CN111708920B)
Priority to CN202011313537.5A (patent CN112417221A)
Publication of CN111708920A
Application granted
Publication of CN111708920B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

Embodiments of the disclosure provide an internet big data processing method based on artificial intelligence and an intelligent cloud service platform. A pre-configured data acquisition script performs corresponding data acquisition and identification operations on a mobile internet terminal, and a feature sample set is obtained from the acquired internet big data information. A corresponding portrait feature vector is then extracted from the feature sample set and can be used as a shared portrait feature vector. On the basis of the shared portrait feature vector, a portrait data area in a first feature sample and a key data area corresponding to the portrait data area in a second feature sample are respectively extracted, and portrait tags are generated from them, so that tag generation speed and accuracy can be significantly improved.

Description

Internet big data processing method based on artificial intelligence and intelligent cloud service platform
Technical Field
The disclosure relates to the technical field of big data and artificial intelligence, in particular to an internet big data processing method based on artificial intelligence and an intelligent cloud service platform.
Background
With the rapid development of mobile internet technology, internet access behaviors of all kinds are increasing, and big data acquisition can provide data support for subsequent user portrait analysis. However, in the conventional portrait tag generation process, both the tag generation speed and the generation accuracy leave room for improvement.
Disclosure of Invention
To overcome at least the above-mentioned deficiencies in the prior art, an object of the present disclosure is to provide an internet big data processing method based on artificial intelligence and an intelligent cloud service platform. A pre-configured data acquisition script performs corresponding data acquisition and identification operations on a mobile internet terminal, a feature sample set is acquired from the acquired internet big data information, and a corresponding portrait feature vector is extracted from the feature sample set. The portrait feature vector can be used as a shared portrait feature vector: on its basis, a portrait data area in a first feature sample and a key data area corresponding to the portrait data area in a second feature sample are extracted and used to generate portrait tags, which can significantly improve tag generation speed and accuracy.
In a first aspect, the present disclosure provides an internet big data processing method based on artificial intelligence, which is applied to an intelligent cloud service platform, wherein the intelligent cloud service platform is in communication connection with a plurality of mobile internet terminals, and the method includes:
performing corresponding data acquisition and identification operations on the mobile internet terminal through a pre-configured data acquisition script, and acquiring a feature sample set from acquired internet big data information, wherein the feature sample set comprises a first feature sample and a second feature sample, and the second feature sample is a feature sample with internet service association in the first feature sample;
sequentially performing portrait feature analysis on each feature sample in the feature sample set according to a pre-configured artificial intelligence model to obtain a corresponding portrait feature vector, determining a portrait data area in the first feature sample based on the portrait feature vector corresponding to the first feature sample, extracting a target feature vector from the portrait feature vector corresponding to the first feature sample according to a target portrait data area corresponding to the portrait data area, and extracting a first candidate feature vector from the portrait feature vector corresponding to the second feature sample, wherein the data area corresponding to the first candidate feature vector covers the data area corresponding to the target feature vector;
searching a feature vector node matched with the target feature vector from the first candidate feature vector, and determining a key data area corresponding to the portrait data area in the second feature sample according to the searched feature vector node;
and generating portrait label information of the mobile internet terminal according to the portrait data area in the first characteristic sample and the key data area corresponding to the portrait data area in the second characteristic sample.
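The matching step above (searching the first candidate feature vector for a node that matches the target feature vector) can be illustrated with a minimal sketch. The disclosure does not say how a match is scored, so the sketch assumes that the candidate is represented as a list of node vectors and that "matching" means highest cosine similarity; the names `cosine_similarity` and `find_matching_node` are hypothetical.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_matching_node(target: Vector, candidate_nodes: Sequence[Vector]) -> int:
    """Return the position of the candidate node most similar to the target.

    Assumption: cosine similarity stands in for the unspecified matching rule.
    """
    if not candidate_nodes:
        raise ValueError("candidate_nodes must not be empty")
    return max(
        range(len(candidate_nodes)),
        key=lambda i: cosine_similarity(target, candidate_nodes[i]),
    )


# Illustrative data only: three nodes taken from the second feature sample.
candidate_nodes = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0], [0.0, 0.0, 1.0]]
target = [0.5, 0.9, 0.0]
best = find_matching_node(target, candidate_nodes)
```

In this reading, the position returned for the best node would then be used to locate the key data area in the second feature sample.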
In a possible implementation manner of the first aspect, the generating of the portrait label information of the mobile internet terminal according to the portrait data area in the first feature sample and the key data area corresponding to the portrait data area in the second feature sample includes:
acquiring a target data area formed by a common data area between the portrait data area in the first feature sample and the key data area corresponding to the portrait data area in the second feature sample;
establishing an index restriction bitmap according to the index restriction relationship among the data index targets in the target data area, and determining the index node of each data index target in the index restriction bitmap;
determining the index service of each data index target according to the index node of each data index target, determining a set formed by the index services of each data index target as a summary index aggregation service, comparing the index nodes of any two data index targets in the summary index aggregation service, and obtaining the mutual dominant relationship of the index services of any two data index targets based on the comparison result;
dividing the summary index aggregation service into at least one index aggregation service sequence based on the mutual leading relationship of the index services of any two data index targets, wherein each index aggregation service sequence has different aggregation quantity levels;
when a hot-spot data index target is added into the target data area, determining a target index node of the hot-spot data index target in the index restriction bitmap, comparing the target index node with the index node of the data index target in the at least one index aggregation service sequence, and determining a target index aggregation service sequence corresponding to the index service where the hot-spot data index target is located based on the comparison result;
and taking the service tag included in the target index aggregation service sequence corresponding to the index service of the hot data index target as portrait tag information of the mobile internet terminal.
In a possible implementation manner of the first aspect, the step of creating an index restriction bitmap according to an index restriction relationship between data index targets in the target data area includes:
acquiring an index sequence formed by data index targets in the target data area;
determining the aggregation quantity level of the index service of each data index target according to the occurrence frequency of each data index target in the index sequence;
sorting index services of data index targets on different nodes in a descending order according to the aggregation quantity level;
determining, as a first trend of the first dimension axial direction of the index restriction bitmap, the trend from the last-ranked index service to the first-ranked index service of the data index targets on a first preset appearance node;
and determining a trend orthogonal to the first trend of the first dimension axial direction as a second dimension axial direction of the index restriction bitmap, wherein the first trend of the second dimension axial direction is the trend from the last-ranked index service to the first-ranked index service of the data index targets on a second preset appearance node.
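The first three steps above (form the index sequence, derive a level from occurrence frequency, sort in descending order) can be sketched as follows. This is only an illustration under stated assumptions: the occurrence count itself is used as the aggregation quantity level, ties keep first-appearance order, and the function name `rank_by_frequency` and the sample target names are hypothetical.

```python
from collections import Counter
from typing import Hashable, List, Sequence, Tuple


def rank_by_frequency(index_sequence: Sequence[Hashable]) -> List[Tuple[Hashable, int]]:
    """Count each data index target and sort by count, highest first.

    Counter keeps first-appearance order and sorted() is stable, so targets
    with equal counts stay in the order they first appeared.
    """
    counts = Counter(index_sequence)
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


# Illustrative index sequence of data index targets.
sequence = ["search", "pay", "search", "video", "pay", "search"]
ranking = rank_by_frequency(sequence)
```

The resulting order, from last-ranked to first-ranked, is what the text uses to orient each axis of the index restriction bitmap.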
In a possible implementation manner of the first aspect, the step of comparing index nodes of any two data index targets in the summary index aggregation service and obtaining a mutual dominant relationship between index services where any two data index targets are located based on a comparison result includes:
comparing the data volume corresponding to the index nodes of any two data index targets in the summary index aggregation service, wherein when the data volume meets a first condition or a second condition, the index service where one data index target in any two data index targets is located can lead the index service where the other data index target is located;
wherein the first condition is that the first trend data volume size value of the one data index target is greater than the first trend data volume size value of the other data index target and the second trend data volume size value of the one data index target is greater than or equal to the second trend data volume size value of the other data index target; and the second condition is that the first trend data volume size value of the one data index target is equal to the first trend data volume size value of the other data index target and the second trend data volume size value of the one data index target is greater than the second trend data volume size value of the other data index target.
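The first and second conditions translate directly into a two-value comparison. The sketch below assumes each index node can be reduced to a pair of numbers, one per trend; the `IndexNode` type and the `dominates` function name are hypothetical and not taken from the disclosure.

```python
from typing import NamedTuple


class IndexNode(NamedTuple):
    """Hypothetical container for the two values compared in the text."""
    first: float   # first trend data volume size value
    second: float  # second trend data volume size value


def dominates(a: IndexNode, b: IndexNode) -> bool:
    """True if the index service of `a` dominates that of `b`."""
    first_condition = a.first > b.first and a.second >= b.second
    second_condition = a.first == b.first and a.second > b.second
    return first_condition or second_condition
```

Taken together, the two conditions say that `a` is at least as large as `b` on both values and strictly larger on at least one, so two identical nodes never dominate each other.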
In a possible implementation manner of the first aspect, the dividing, based on a mutual dominant relationship between index services where any two data index targets are located, the summary index aggregation service into at least one index aggregation service sequence, where each index aggregation service sequence has a different aggregation number level includes:
taking the summary index aggregation service as a first aggregation service, and determining at least one first selected index aggregation service which is not dominated by any other index aggregation service from the first aggregation service according to the mutual dominance relationship of the index services of any two data index targets in the first aggregation service;
determining a set formed by the at least one first selected index aggregation service as a first-level index aggregation service sequence;
when the range of the index aggregation services in the A-th aggregation service other than the A-th level index aggregation service sequence is greater than or equal to a first threshold, determining those other index aggregation services in the A-th aggregation service as the (A+1)-th aggregation service;
determining, from the (A+1)-th aggregation service, at least one (A+1)-th selected index aggregation service which is not dominated by any other index aggregation service, according to the mutual dominance relationship of the index services where any two data index targets in the (A+1)-th aggregation service are located, and determining a set formed by the at least one (A+1)-th selected index aggregation service as the (A+1)-th level index aggregation service sequence;
wherein, when A is equal to N, the range of the index aggregation services in the A-th aggregation service other than the A-th level index aggregation service sequence is equal to the first threshold, and the value corresponding to an aggregation quantity level is inversely related to that aggregation quantity level.
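The level-by-level division described above resembles repeatedly peeling off the non-dominated set. The following is a minimal sketch under assumptions: index nodes are `(first, second)` tuples, the first threshold is treated as "nothing left to divide", and `split_into_levels` is a hypothetical name.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (first trend value, second trend value)


def dominates(a: Point, b: Point) -> bool:
    """First/second condition from the text, applied to value pairs."""
    return (a[0] > b[0] and a[1] >= b[1]) or (a[0] == b[0] and a[1] > b[1])


def split_into_levels(points: Sequence[Point]) -> List[List[Point]]:
    """Peel off non-dominated points level by level.

    levels[0] plays the role of the first-level sequence, levels[1] the
    second-level sequence, and so on. Assumption: division stops when
    nothing remains.
    """
    remaining = list(points)
    levels: List[List[Point]] = []
    while remaining:
        # A point belongs to the current level if nothing remaining dominates it.
        front = [p for p in remaining if not any(dominates(q, p) for q in remaining)]
        levels.append(front)
        remaining = [p for p in remaining if p not in front]
    return levels


# Illustrative index nodes only.
nodes = [(3, 3), (2, 4), (4, 1), (2, 2), (1, 1), (3, 1)]
levels = split_into_levels(nodes)
```

Because the dominance relation is a strict partial order, every non-empty remainder has at least one non-dominated point, so the loop always terminates.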
In a possible implementation manner of the first aspect, the step of comparing the target index node with an index node of a data index target in the at least one index aggregation service sequence, and determining a target index aggregation service sequence corresponding to an index service where the hotspot data index target is located based on a comparison result includes:
comparing the numerical value corresponding to the target index node with the data size corresponding to the index node of the first data index target;
when the data size meets a third condition or a fourth condition, performing degradation processing on the aggregation quantity level of each index aggregation service sequence, and determining the index service where the hot data index target is located as a target first-level index aggregation service sequence, wherein the target first-level index aggregation service sequence is a target index aggregation service sequence corresponding to the index service where the hot data index target is located;
wherein the first data index target is a data index target in the first-level index aggregation service sequence; the third condition is that the second trend data volume size value of the target index node is greater than or equal to the second trend data volume size value of the first data index target and the first trend data volume size value of the target index node is greater than the first trend data volume size value of the first data index target; and the fourth condition is that the second trend data volume size value of the target index node is greater than the second trend data volume size value of the first data index target and the first trend data volume size value of the target index node is equal to the first trend data volume size value of the first data index target;
comparing the value corresponding to the target index node with the data size corresponding to the index node of the second data index target;
when the data size meets a fifth condition or a sixth condition, determining the index service of the hotspot data index target as an N + 2-level index aggregation service sequence, and determining the N + 2-level index aggregation service sequence as a target index aggregation service sequence corresponding to the index service of the hotspot data index target;
wherein the second data index target is a data index target in the (N+1)-th level index aggregation service sequence; the fifth condition is that the second trend data volume size value of the target index node is less than or equal to the second trend data volume size value of the second data index target and the first trend data volume size value of the target index node is less than the first trend data volume size value of the second data index target; and the sixth condition is that the second trend data volume size value of the target index node is less than the second trend data volume size value of the second data index target and the first trend data volume size value of the target index node is equal to the first trend data volume size value of the second data index target;
comparing the numerical value corresponding to the target index node with the data size corresponding to the index node of the third data index target;
when the data size meets a seventh condition or an eighth condition, sorting numerical values corresponding to the aggregation quantity level of each index aggregation service sequence where each third data index target is located in an ascending order, and determining the index aggregation service sequence corresponding to the numerical value with the highest sorting order as a target index aggregation service sequence corresponding to the index service where the hotspot data index target is located;
wherein the aggregation quantity level of the index aggregation service sequence where the third data index target is located is between the aggregation quantity level of the first-level index aggregation service sequence and the aggregation quantity level of the (N+1)-th level index aggregation service sequence; the seventh condition is that the second trend data volume size value of the target index node is greater than or equal to the second trend data volume size value of the third data index target and the first trend data volume size value of the target index node is less than the first trend data volume size value of the third data index target; and the eighth condition is that the second trend data volume size value of the target index node is greater than the second trend data volume size value of the third data index target and the first trend data volume size value of the target index node is equal to the first trend data volume size value of the third data index target.
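The three comparisons above can be sketched as one placement routine. This is one reading of the text rather than a definitive implementation: levels are lists of `(first, second)` tuples ordered from the first level to the last, a condition counts as met if any member of the level satisfies it, the intermediate case picks the lowest-numbered matching level, and `None` is returned when no stated condition applies. The name `place_hotspot` is hypothetical.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (first trend value, second trend value)


def dominates(a: Point, b: Point) -> bool:
    """Third/fourth condition when a is the target; fifth/sixth when b is."""
    return (a[0] > b[0] and a[1] >= b[1]) or (a[0] == b[0] and a[1] > b[1])


def seventh_or_eighth(t: Point, p: Point) -> bool:
    """Seventh or eighth condition for target t against an intermediate point p."""
    return (t[1] >= p[1] and t[0] < p[0]) or (t[1] > p[1] and t[0] == p[0])


def place_hotspot(levels: List[List[Point]], target: Point) -> Optional[int]:
    """Place a hot-spot target; mutate `levels`; return its 1-based level."""
    if not levels:
        raise ValueError("levels must contain at least one level")
    # Third/fourth condition: every existing level is demoted by one and
    # the target forms a new first level.
    if any(dominates(target, p) for p in levels[0]):
        levels.insert(0, [target])
        return 1
    # Fifth/sixth condition: the target forms a new level after the last one.
    if any(dominates(p, target) for p in levels[-1]):
        levels.append([target])
        return len(levels)
    # Seventh/eighth condition: choose the lowest-numbered intermediate level.
    for number, level in enumerate(levels[1:-1], start=2):
        if any(seventh_or_eighth(target, p) for p in level):
            level.append(target)
            return number
    return None  # assumption: no stated condition covers this case


base = [[(3, 3), (2, 4), (4, 1)], [(2, 2), (3, 1)], [(1, 1)]]
top = [list(level) for level in base]
bottom = [list(level) for level in base]
middle = [list(level) for level in base]
top_level = place_hotspot(top, (5, 5))
bottom_level = place_hotspot(bottom, (0, 0))
middle_level = place_hotspot(middle, (2.5, 1.5))
```

Placing the new target by comparison against existing levels avoids recomputing the full division each time a hot-spot target arrives.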
In a possible implementation manner of the first aspect, before the comparing the target index node with an index node of a data index target in the at least one index aggregation service sequence, and determining, based on a comparison result, a target index aggregation service sequence corresponding to an index service where the hotspot data index target is located, the method further includes:
judging whether at least one data index target with the same first trend data volume value or the same second trend data volume value exists in the summary index aggregation service;
if at least one data index target with the same value of the first trend data volume or the same value of the second trend data volume exists, taking the at least one data index target with the same value of the first trend data volume or the same value of the second trend data volume as a candidate data index target;
executing a first strategy or a second strategy on the candidate data index target to obtain an adjusted index node, wherein the first strategy is to increase a first trend data volume size value or a second trend data volume size value of the candidate data index target by a preset value corresponding to the candidate data index target, and the second strategy is to subtract the preset value corresponding to the candidate data index target from the first trend data volume size value or the second trend data volume size value of the candidate data index target;
correspondingly, the comparing the target index node with the index node of the data index target in the at least one index aggregation service sequence, and determining the target index aggregation service sequence corresponding to the index service where the hotspot data index target is located based on the comparison result includes:
and comparing the target index node with the adjusted index node, and determining a target index aggregation service sequence corresponding to the index service where the hotspot data index target is located based on the comparison result.
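The tie adjustment described above can be sketched as a small pre-processing pass. Assumptions in this sketch: index nodes are stored in a dictionary keyed by a target name, each candidate has its own preset value in a second dictionary, a shared first value adjusts the first value and a shared second value adjusts the second value. The name `adjust_ties` and the sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, Tuple

Point = Tuple[float, float]  # (first trend value, second trend value)


def adjust_ties(
    nodes: Dict[str, Point],
    presets: Dict[str, float],
    subtract: bool = False,
) -> Dict[str, Point]:
    """Return adjusted index nodes for targets that share a value.

    subtract=False applies the first strategy (add the preset value);
    subtract=True applies the second strategy (subtract it).
    """
    first_counts = Counter(point[0] for point in nodes.values())
    second_counts = Counter(point[1] for point in nodes.values())
    sign = -1.0 if subtract else 1.0
    adjusted: Dict[str, Point] = {}
    for name, (first, second) in nodes.items():
        delta = sign * presets.get(name, 0.0)
        if first_counts[first] > 1:    # candidate: shares its first value
            first += delta
        if second_counts[second] > 1:  # candidate: shares its second value
            second += delta
        adjusted[name] = (first, second)
    return adjusted


# Illustrative data: "a" and "b" share the same first trend value.
nodes = {"a": (2.0, 3.0), "b": (2.0, 1.0), "c": (4.0, 5.0)}
presets = {"a": 0.25, "b": 0.5}
added = adjust_ties(nodes, presets)
subtracted = adjust_ties(nodes, presets, subtract=True)
```

Giving each candidate a distinct preset value is what separates the tied nodes before the hot-spot target is compared with them.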
In a possible implementation manner of the first aspect, the step of performing corresponding data acquisition and identification operations on the mobile internet terminal through a pre-configured data acquisition script includes:
after page user behavior information corresponding to an extended page object needing big data acquisition is obtained from an internet access process, determining internet function service information matched with the page user behavior information;
generating corresponding data acquisition identification node information according to the internet function service information and the subject domain information corresponding to the internet function service information;
associating the data acquisition identification node information to a data acquisition script of a data uploading path of a data crawling flow of the page user behavior information through a big data acquisition control, configuring the data acquisition script according to the data acquisition identification node information, and then executing big data acquisition;
and carrying out corresponding data acquisition identification operation on the mobile internet terminal through the data acquisition script in the big data acquisition process, wherein in the data acquisition identification operation process, the data acquisition script is continuously updated and configured according to the obtained data acquisition identification node information through the data uploading path.
In a possible implementation manner of the first aspect, the step of generating corresponding data acquisition identification node information according to the internet function service information and subject domain information corresponding to the internet function service information includes:
determining a target internet function service with each service importance priority greater than a set priority in the internet function service information according to the subject domain information corresponding to the internet function service information, and a first identification object and a second identification object which take the target internet function service as service basic areas, wherein the simulation data acquisition process of the first identification object is not overlapped with the simulation data acquisition process of the second identification object, and logical association exists between the simulation data acquisition processes;
determining a subject field object meeting a first target requirement in the first identification object, and determining first sliding component information corresponding to the first identification object according to a field matching definition element of multilevel source matching information between source data table field information of the subject field object meeting the first target requirement and associated preset field verification information; the subject field object meeting the first target requirement is a subject field object of which the source data table field information is matched with the associated preset field verification information;
determining a subject field object meeting a second target requirement in the second identification object, and determining second sliding component information corresponding to the second identification object according to a field matching definition element of multilevel source matching information between source data table field information of the subject field object meeting the second target requirement and associated preset field verification information; the subject field object meeting the second target requirement is a subject field object of which the source data table field information is matched with the associated preset field verification information;
obtaining callback acquisition simulation parameters of the subject field object in each first simulation data acquisition process according to first sliding component information corresponding to the first identification object, and obtaining callback acquisition simulation parameters of the subject field object in each second simulation data acquisition process according to second sliding component information in the second identification object;
according to callback acquisition simulation parameters of each first simulation data acquisition process and each second simulation data acquisition process, respectively carrying out simulation acquisition indexing on the subject field object in each simulation data acquisition process to obtain first simulation acquisition index information of each first simulation data acquisition process and second simulation acquisition index information of each second simulation data acquisition process;
obtaining corresponding simulation acquisition index information according to the first simulation acquisition index information of each first simulation data acquisition process and the second simulation acquisition index information of each second simulation data acquisition process;
and generating corresponding data acquisition identification node information according to the simulation acquisition index information.
In a second aspect, an embodiment of the present disclosure further provides an internet big data processing apparatus based on artificial intelligence, which is applied to an intelligent cloud service platform, where the intelligent cloud service platform is in communication connection with a plurality of mobile internet terminals, and the apparatus includes:
the acquisition module is used for carrying out corresponding data acquisition and identification operations on the mobile internet terminal through a pre-configured data acquisition script and acquiring a feature sample set from acquired internet big data information, wherein the feature sample set comprises a first feature sample and a second feature sample, and the second feature sample is a feature sample of the first feature sample with internet service association;
the analysis module is used for sequentially performing portrait feature analysis on each feature sample in the feature sample set according to a pre-configured artificial intelligence model to obtain a corresponding portrait feature vector, determining a portrait data area in the first feature sample based on the portrait feature vector corresponding to the first feature sample, extracting a target feature vector from the portrait feature vector corresponding to the first feature sample according to a target portrait data area corresponding to the portrait data area, and extracting a first candidate feature vector from the portrait feature vector corresponding to the second feature sample, wherein the data area corresponding to the first candidate feature vector covers the data area corresponding to the target feature vector;
a determining module, configured to search a feature vector node matching the target feature vector from the first candidate feature vector, and determine a key data area corresponding to the portrait data area in the second feature sample according to the searched feature vector node;
and the generating module is used for generating portrait label information of the mobile internet terminal according to the portrait data area in the first characteristic sample and the key data area corresponding to the portrait data area in the second characteristic sample.
In a third aspect, an embodiment of the present disclosure further provides an artificial intelligence based internet big data processing system, where the artificial intelligence based internet big data processing system includes an intelligent cloud service platform and a plurality of mobile internet terminals in communication connection with the intelligent cloud service platform;
the intelligent cloud service platform is used for carrying out corresponding data acquisition and identification operations on the mobile internet terminal through a pre-configured data acquisition script and acquiring a feature sample set from acquired internet big data information, wherein the feature sample set comprises a first feature sample and a second feature sample, and the second feature sample is a feature sample of the first feature sample with internet service association;
the intelligent cloud service platform is used for sequentially performing portrait feature analysis on each feature sample in the feature sample set according to a pre-configured artificial intelligence model to obtain a corresponding portrait feature vector, determining a portrait data area in the first feature sample based on the portrait feature vector corresponding to the first feature sample, extracting a target feature vector from the portrait feature vector corresponding to the first feature sample according to a target portrait data area corresponding to the portrait data area, and extracting a first candidate feature vector from the portrait feature vector corresponding to the second feature sample, wherein the data area corresponding to the first candidate feature vector covers the data area corresponding to the target feature vector;
the intelligent cloud service platform is used for searching a feature vector node matched with the target feature vector from the first candidate feature vector, and determining a key data area corresponding to the portrait data area in the second feature sample according to the searched feature vector node;
the intelligent cloud service platform is used for generating portrait label information of the mobile internet terminal according to the portrait data area in the first characteristic sample and the key data area corresponding to the portrait data area in the second characteristic sample.
In a fourth aspect, an embodiment of the present disclosure further provides an intelligent cloud service platform, where the intelligent cloud service platform includes a processor, a machine-readable storage medium, and a network interface, where the machine-readable storage medium, the network interface, and the processor are connected through a bus system, the network interface is configured to be in communication connection with at least one mobile internet terminal, the machine-readable storage medium is configured to store a program, an instruction, or a code, and the processor is configured to execute the program, the instruction, or the code in the machine-readable storage medium, so as to execute the artificial intelligence based internet big data processing method in any one of the first aspect or the possible designs of the first aspect.
In a fifth aspect, an embodiment of the present disclosure provides a computer-readable storage medium, where instructions are stored, and when executed, cause a computer to perform the artificial intelligence based internet big data processing method in the first aspect or any one of the possible designs of the first aspect.
Based on any one of the above aspects, the present disclosure performs corresponding data acquisition and identification operations on a mobile internet terminal through a pre-configured data acquisition script, acquires a feature sample set from acquired internet big data information, extracts a corresponding portrait feature vector from the feature sample set, wherein the portrait feature vector can be used as a shared portrait feature vector, and extracts a portrait data area in a first feature sample and a key data area corresponding to the portrait data area in a second feature sample on the basis of the shared portrait feature vector, thereby performing portrait label generation, and can significantly improve label generation speed and generation accuracy.
Drawings
To more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present disclosure and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings may be obtained from the drawings without inventive effort.
FIG. 1 is a schematic diagram of an application scenario of an artificial intelligence based Internet big data processing system according to an embodiment of the present disclosure;
FIG. 2 is a schematic flow chart of a method for processing Internet big data based on artificial intelligence according to an embodiment of the disclosure;
FIG. 3 is a functional block diagram of an artificial intelligence based Internet big data processing device provided by the embodiment of the disclosure;
FIG. 4 is a block diagram illustrating the structure of an intelligent cloud service platform for implementing the artificial intelligence based Internet big data processing method according to an embodiment of the present disclosure.
Detailed Description
The present disclosure is described in detail below with reference to the drawings, and the specific operation methods in the method embodiments can also be applied to the device embodiments or the system embodiments.
FIG. 1 is an interactive schematic diagram of an artificial intelligence based Internet big data processing system 10 provided by an embodiment of the disclosure. The artificial intelligence based internet big data processing system 10 may include an intelligent cloud service platform 100 and a mobile internet terminal 200 communicatively connected to the intelligent cloud service platform 100. The artificial intelligence based internet big data processing system 10 shown in fig. 1 is only one possible example, and in other possible embodiments, the artificial intelligence based internet big data processing system 10 may include only one of the components shown in fig. 1 or may also include other components.
In this embodiment, the mobile internet terminal 200 may include a mobile device, a tablet computer, a laptop computer, etc., or any combination thereof. In some embodiments, the mobile device may include a smart home device, a wearable device, a smart mobile device, a virtual reality device, an augmented reality device, or the like, or any combination thereof. In some embodiments, the smart home device may include a control device of a smart electrical device, a smart monitoring device, a smart television, a smart camera, and the like, or any combination thereof. In some embodiments, the wearable device may include a smart bracelet, smart shoelaces, smart glasses, a smart helmet, a smart watch, a smart garment, a smart backpack, a smart accessory, or the like, or any combination thereof. In some embodiments, the smart mobile device may include a smartphone, a personal digital assistant, a gaming device, and the like, or any combination thereof. In some embodiments, the virtual reality device and/or the augmented reality device may include a virtual reality helmet, virtual reality glasses, a virtual reality patch, an augmented reality helmet, augmented reality glasses, an augmented reality patch, or the like, or any combination thereof. For example, the virtual reality device and/or augmented reality device may include various virtual reality products and the like.
In this embodiment, the intelligent cloud service platform 100 and the mobile internet terminal 200 in the artificial intelligence based internet big data processing system 10 may cooperatively execute the artificial intelligence based internet big data processing method described in the following method embodiment, and the detailed description of the following method embodiment may be referred to for the execution steps of the intelligent cloud service platform 100 and the mobile internet terminal 200.
FIG. 2 is a schematic flow chart of the artificial intelligence based Internet big data processing method that an embodiment of the present disclosure provides to solve the technical problem described in the background. The method may be executed by the intelligent cloud service platform 100 shown in FIG. 1 and is described in detail below.
Step S110, performing corresponding data acquisition and identification operations on the mobile internet terminal 200 through a pre-configured data acquisition script, and acquiring a feature sample set from the acquired internet big data information.
Step S120, sequentially performing portrait feature analysis on each feature sample in the feature sample set according to a preconfigured artificial intelligence model to obtain a corresponding portrait feature vector, determining a portrait data area in the first feature sample based on the portrait feature vector corresponding to the first feature sample, extracting a target feature vector from the portrait feature vector corresponding to the first feature sample according to a target portrait data area corresponding to the portrait data area, and extracting a first candidate feature vector from the portrait feature vector corresponding to the second feature sample, wherein the data area corresponding to the first candidate feature vector covers the data area corresponding to the target feature vector.
Step S130, finding a feature vector node matched with the target feature vector from the first candidate feature vector, and determining a key data area corresponding to the portrait data area in the second feature sample according to the found feature vector node.
Step S140, generating portrait label information of the mobile internet terminal 200 according to the portrait data area in the first feature sample and the key data area corresponding to the portrait data area in the second feature sample.
In this embodiment, the feature sample set includes a first feature sample and a second feature sample, where the second feature sample is a feature sample that has an Internet service association with the first feature sample. An Internet service association means that an access relationship exists between Internet services; for example, Internet service A can jump to Internet service B.
In this embodiment, the preconfigured artificial intelligence model may be obtained by collecting, in advance, feature training samples and the portrait feature vector (e.g., feature values in different portrait dimensions) corresponding to each feature training sample, and training on them. The specific training method is prior art and is not described here.
In this embodiment, determining the portrait data area in the first feature sample based on the portrait feature vector corresponding to the first feature sample, extracting the target feature vector from the portrait feature vector corresponding to the first feature sample according to the target portrait data area corresponding to the portrait data area, and extracting the first candidate feature vector from the portrait feature vector corresponding to the second feature sample may specifically be: matching the portrait feature vector corresponding to the first feature sample against the first feature sample, and taking the set of unit areas where the matching nodes are located as the portrait data area in the first feature sample. The target portrait data area corresponding to the portrait data area may refer to a target portrait data area that has a service association with the portrait data area. In addition, the target feature vector corresponding to the target portrait data area may be extracted from the portrait feature vector corresponding to the first feature sample, and the first candidate feature vector corresponding to the target portrait data area may be extracted from the portrait feature vector corresponding to the second feature sample.
In this embodiment, searching for a feature vector node matching the target feature vector in the first candidate feature vector, and determining the key data area corresponding to the portrait data area in the second feature sample according to the found feature vector node, may specifically be: searching the first candidate feature vector for a feature vector node matching each feature vector value in the target feature vector, and then acquiring, from the second feature sample, the data area matching the found feature vector node as the key data area corresponding to the portrait data area.
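As a rough illustration of this lookup, the following Python sketch treats the first candidate feature vector as a list of (value, region) nodes and collects the regions whose node value matches some value of the target feature vector. The node layout, the region identifiers, the tolerance `tol`, and the function name `find_key_regions` are all assumptions made for illustration; the disclosure does not specify the matching criterion.

```python
import math


def find_key_regions(target_vector, candidate_nodes, tol=1e-6):
    """Return regions of the second sample whose node value matches a target value.

    target_vector: list of feature values (hypothetical representation).
    candidate_nodes: list of (value, region_id) pairs (hypothetical representation).
    tol: absolute tolerance for a "match" (an assumption, not from the source).
    """
    regions = []
    for value in target_vector:
        for node_value, region in candidate_nodes:
            # A node matches when its value is numerically close to the target value.
            if math.isclose(value, node_value, abs_tol=tol) and region not in regions:
                regions.append(region)
    return regions


if __name__ == "__main__":
    nodes = [(0.5, "r1"), (0.2, "r2"), (0.9, "r3"), (0.5, "r4")]
    print(find_key_regions([0.5, 0.9], nodes))
```

The returned region list plays the role of the key data area; any other similarity measure could be substituted for the closeness test.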
Based on the design, the embodiment performs corresponding data acquisition and identification operations on the mobile internet terminal through the pre-configured data acquisition script, acquires the feature sample set from the acquired internet big data information, then extracts the corresponding portrait feature vector from the feature sample set, wherein the portrait feature vector can be used as a shared portrait feature vector, and extracts the portrait data area in the first feature sample and the key data area corresponding to the portrait data area in the second feature sample on the basis of the shared portrait feature vector, thereby performing portrait label generation, and being capable of remarkably improving label generation speed and generation accuracy.
In a possible implementation, step S140 may be implemented through the following sub-steps, which further take into account the index restriction relationship between different data regions during label generation and thereby improve the accuracy of label generation.
Sub-step S141, obtaining a target data region formed by the common data region between the portrait data region in the first feature sample and the key data region corresponding to the portrait data region in the second feature sample.
Sub-step S142, establishing an index restriction bitmap according to the index restriction relationship between the data index targets in the target data region, and determining the index node of each data index target in the index restriction bitmap.
Sub-step S143, determining the index service of each data index target according to the index node of each data index target, determining the set formed by the index services of the data index targets as a summary index aggregation service, comparing the index nodes of any two data index targets in the summary index aggregation service, and obtaining the mutual dominance relationship of the index services of the two data index targets based on the comparison result.
Sub-step S144, dividing the summary index aggregation service into at least one index aggregation service sequence based on the mutual dominance relationship of the index services of any two data index targets, wherein each index aggregation service sequence has a different aggregation quantity level.
Sub-step S145, when a hotspot data index target is added to the target data region, determining a target index node of the hotspot data index target in the index restriction bitmap, comparing the target index node with the index nodes of the data index targets in the at least one index aggregation service sequence, and determining, based on the comparison result, the target index aggregation service sequence corresponding to the index service where the hotspot data index target is located.
Sub-step S146, using the service tags included in the target index aggregation service sequence corresponding to the index service where the hotspot data index target is located as the portrait label information of the mobile internet terminal 200.
In a possible implementation manner, the sub-step S142 can be implemented by the following embodiments.
(1) Acquiring an index sequence formed by the data index targets in the target data region.
(2) Determining the aggregation quantity level of the index service of each data index target according to the occurrence frequency of each data index target in the index sequence.
(3) Sorting the index services of the data index targets on different occurrence nodes in descending order of aggregation quantity level.
(4) On a first preset occurrence node, determining the trend from the index service of the last-ranked data index target to the index service of the first-ranked data index target as the first trend, along the first dimension axis, of the index restriction bitmap.
(5) Determining a trend that crosses the first trend of the first dimension axis in the positive direction as the second dimension axis of the index restriction bitmap, wherein the first trend of the second dimension axis is the trend, on a second preset occurrence node, from the index service of the last-ranked data index target to the index service of the first-ranked data index target.
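Steps (1) to (3) can be pictured with the short Python sketch below. It assumes that the aggregation quantity level of an index service is simply the occurrence count of its data index target in the index sequence, and it breaks ties alphabetically; both choices, along with the name `rank_by_frequency`, are illustrative assumptions rather than details given in the disclosure. The axis construction of steps (4) and (5) is not covered.

```python
from collections import Counter


def rank_by_frequency(index_sequence):
    """Count each data index target and sort by count, highest first.

    Ties are ordered by target name so the result is deterministic
    (an assumption; the source does not define tie handling here).
    """
    counts = Counter(index_sequence)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    sequence = ["x", "y", "x", "z", "y", "x"]
    for target, level in rank_by_frequency(sequence):
        print(target, level)
```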
In a possible implementation, for sub-step S143, the data sizes corresponding to the index nodes of any two data index targets in the summary index aggregation service may be compared, and when the data sizes satisfy a first condition or a second condition, the index service where one of the two data index targets is located dominates the index service where the other data index target is located.
Illustratively, the first condition is that the first trend data size value of the one data index target is greater than the first trend data size value of the other data index target, and the second trend data size value of the one data index target is greater than or equal to the second trend data size value of the other data index target. The second condition is that the first trend data size value of the one data index target is equal to the first trend data size value of the other data index target, and the second trend data size value of the one data index target is greater than the second trend data size value of the other data index target.
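The two conditions can be written as a small predicate. The Python sketch below is a minimal illustration in which each index node is reduced to a pair of numbers (its first trend and second trend data size values); the `IndexNode` type and the `dominates` function are hypothetical names introduced here. Taken together, the two conditions amount to the usual Pareto dominance test for two objectives that are both maximized.

```python
from typing import NamedTuple


class IndexNode(NamedTuple):
    """Hypothetical container for the two trend data size values of an index node."""
    first: float
    second: float


def dominates(a: IndexNode, b: IndexNode) -> bool:
    """Return True when the index service of `a` dominates that of `b`."""
    # First condition: strictly larger on the first trend, not smaller on the second.
    first_condition = a.first > b.first and a.second >= b.second
    # Second condition: equal on the first trend, strictly larger on the second.
    second_condition = a.first == b.first and a.second > b.second
    return first_condition or second_condition


if __name__ == "__main__":
    print(dominates(IndexNode(3, 2), IndexNode(2, 2)))
    print(dominates(IndexNode(3, 1), IndexNode(2, 2)))
```

Two nodes that each win on one axis, such as (3, 1) and (2, 2), dominate neither way, which is why the later division step can place several index services in the same level.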
In a possible implementation manner, the sub-step S144 can be implemented in the following embodiments.
(1) Determining, from the first aggregation service, at least one first selected index aggregation service that is not dominated by any other index aggregation service, according to the mutual dominance relationship of the index services of any two data index targets in the first aggregation service.
(2) Determining the set formed by the at least one first selected index aggregation service as a first-level index aggregation service sequence.
(3) When the range of the index aggregation services in the A-th aggregation service other than the A-th-level index aggregation service sequence is greater than or equal to a first threshold, determining those other index aggregation services in the A-th aggregation service as the (A+1)-th aggregation service.
(4) Determining, from the (A+1)-th aggregation service, at least one (A+1)-th selected index aggregation service that is not dominated by any other index aggregation service, according to the mutual dominance relationship of the index services of any two data index targets in the (A+1)-th aggregation service, and determining the set formed by the at least one (A+1)-th selected index aggregation service as an (A+1)-th-level index aggregation service sequence.
When A is equal to N, the range of the index aggregation services in the A-th aggregation service other than the A-th-level index aggregation service sequence is equal to the first threshold, and the value corresponding to an aggregation quantity level is inversely proportional to the aggregation quantity level.
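The level-by-level division above resembles repeatedly peeling off the non-dominated set. The Python sketch below illustrates that idea under simplifying assumptions: each index service is identified by a name and a (first trend, second trend) pair, the dominance test uses the first and second conditions given for sub-step S143, and peeling continues until no index service remains (the first threshold is not modelled). The names `dominates` and `split_into_levels` are hypothetical.

```python
def dominates(a, b):
    """Dominance test on (first_trend, second_trend) pairs."""
    return (a[0] > b[0] and a[1] >= b[1]) or (a[0] == b[0] and a[1] > b[1])


def split_into_levels(nodes):
    """Split index services into levels; level 1 holds the non-dominated ones.

    nodes: dict mapping a service name to its (first_trend, second_trend) pair.
    Returns a list of levels, each a sorted list of service names.
    """
    remaining = dict(nodes)
    levels = []
    while remaining:
        # Select every service that no other remaining service dominates.
        front = [
            name
            for name, point in remaining.items()
            if not any(
                dominates(other_point, point)
                for other, other_point in remaining.items()
                if other != name
            )
        ]
        levels.append(sorted(front))
        # The rest forms the next aggregation service to be divided.
        for name in front:
            del remaining[name]
    return levels


if __name__ == "__main__":
    services = {"a": (3, 3), "b": (2, 3), "c": (3, 1), "d": (1, 1)}
    print(split_into_levels(services))
```

This naive version compares every pair in each round, which is adequate for a sketch; faster non-dominated sorting procedures exist if the number of index services is large.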
In a possible implementation manner, the sub-step S145 can be implemented in the following embodiments.
(1) Comparing the value corresponding to the target index node with the data size corresponding to the index node of a first data index target.
(2) When the data size satisfies a third condition or a fourth condition, downgrading the aggregation quantity level of each index aggregation service sequence, and determining the index service where the hotspot data index target is located as a target first-level index aggregation service sequence, wherein the target first-level index aggregation service sequence is the target index aggregation service sequence corresponding to the index service where the hotspot data index target is located.
The first data index target is a data index target in a first-level index aggregation service sequence, the third condition is that a second trend data volume size value of a target index node is larger than or equal to a second trend data volume size value of the first data index target and a first trend data volume size value of the target index node is larger than a first trend data volume size value of the first data index target, and the fourth condition is that a second trend data volume size value of the target index node is larger than a second trend data volume size value of the first data index target and a first trend data volume size value of the target index node is equal to a first trend data volume size value of the first data index target.
(3) Comparing the value corresponding to the target index node with the data size corresponding to the index node of a second data index target.
(4) When the data size satisfies a fifth condition or a sixth condition, determining the index service where the hotspot data index target is located as an (N+2)-th-level index aggregation service sequence, and determining the (N+2)-th-level index aggregation service sequence as the target index aggregation service sequence corresponding to the index service where the hotspot data index target is located.
The second data index target is a data index target in the N +1 th-level index aggregation service sequence, the fifth condition is that the second trend data size value of the target index node is smaller than or equal to the second trend data size value of the second data index target and the first trend data size value of the target index node is smaller than the first trend data size value of the second data index target, and the sixth condition is that the second trend data size value of the target index node is smaller than the second trend data size value of the second data index target and the first trend data size value of the target index node is equal to the first trend data size value of the second data index target.
(5) Comparing the value corresponding to the target index node with the data size corresponding to the index node of a third data index target.
(6) When the data size satisfies a seventh condition or an eighth condition, sorting in ascending order the values corresponding to the aggregation quantity levels of the index aggregation service sequences where the third data index targets are located, and determining the index aggregation service sequence corresponding to the first-ranked value as the target index aggregation service sequence corresponding to the index service where the hotspot data index target is located.
The aggregation quantity level of the index aggregation service sequence where the third data index target is located is between the aggregation quantity level of the first-level index aggregation service sequence and the aggregation quantity level of the N + 1-th-level index aggregation service sequence, the seventh condition is that the second trend data volume size value of the target index node is greater than or equal to the second trend data volume size value of the third data index target and the first trend data volume size value of the target index node is smaller than the first trend data volume size value of the third data index target, and the eighth condition is that the second trend data volume size value of the target index node is greater than the second trend data volume size value of the third data index target and the first trend data volume size value of the target index node is equal to the first trend data volume size value of the third data index target.
In a possible implementation, before sub-step S145, it may further be determined whether the summary index aggregation service contains at least one data index target with the same first trend data size value or the same second trend data size value. If so, the at least one data index target with the same first trend data size value or the same second trend data size value is taken as a candidate data index target. A first policy or a second policy is then executed on the candidate data index target to obtain an adjusted index node.
It should be noted that the first policy is to increase a first trend data size value or a second trend data size value of the candidate data index target by a preset value corresponding to the candidate data index target, and the second policy is to subtract the preset value corresponding to the candidate data index target from the first trend data size value or the second trend data size value of the candidate data index target.
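A minimal Python sketch of this adjustment is given below. It assumes that candidates are the data index targets sharing a first trend data size value with at least one other target, and that each target carries its own preset value; ties on the second trend value could be handled the same way. The function name `adjust_ties`, the dictionary layout, and the `policy` argument are illustrative assumptions.

```python
from collections import Counter


def adjust_ties(nodes, presets, policy="add"):
    """Shift tied first-trend values by each target's preset value.

    nodes: dict mapping a target name to its (first_trend, second_trend) pair.
    presets: dict mapping a target name to its preset value (hypothetical input).
    policy: "add" for the first policy, "subtract" for the second policy.
    """
    first_counts = Counter(first for first, _ in nodes.values())
    sign = 1 if policy == "add" else -1
    adjusted = {}
    for name, (first, second) in nodes.items():
        # Only targets whose first trend value is shared are candidates.
        if first_counts[first] > 1:
            first = first + sign * presets[name]
        adjusted[name] = (first, second)
    return adjusted


if __name__ == "__main__":
    nodes = {"a": (20, 10), "b": (20, 30), "c": (50, 30)}
    presets = {"a": 1, "b": 2, "c": 3}
    print(adjust_ties(nodes, presets, policy="add"))
```

Giving each candidate a different preset value separates index nodes that would otherwise compare as equal on one axis.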
Accordingly, in sub-step S145, the target index node may be compared with the adjusted index node, and a target index aggregation service sequence corresponding to the index service where the hotspot data index target is located is determined based on the comparison result.
On the basis of the above description, in a possible implementation, step S110 may be implemented through the following sub-steps in order to make big data acquisition more targeted and accurate and to avoid, to a certain extent, noise being introduced into the acquired data by noisy data acquisition identification nodes.
Sub-step S111, after acquiring, from the Internet access process, page user behavior information corresponding to an extended page object that requires big data acquisition, determining Internet function service information matched with the page user behavior information.
Sub-step S112, generating corresponding data acquisition identification node information according to the Internet function service information and the subject domain information corresponding to the Internet function service information.
Sub-step S113, associating, through a big data acquisition control, the data acquisition identification node information with the data acquisition script on the data uploading path of the data crawling flow of the page user behavior information, configuring the data acquisition script according to the data acquisition identification node information, and then executing big data acquisition.
Sub-step S114, performing corresponding data acquisition and identification operations on the mobile internet terminal 200 through the data acquisition script during big data acquisition.
In this embodiment, the extended page object may refer to an accessible page related to the current page in the current page access process.
In this embodiment, the internet function service information may refer to an internet function service that may be associated with page user behavior information based on the extended page object, and the internet function service may refer to a function type of internet access. Correspondingly, the theme domain information may refer to theme data information in a page access process corresponding to the internet function service determined above. The data collection identification node information may refer to configuration information used to generate data collection during the access collection process.
In this embodiment, the page user behavior information may be, but is not limited to, information such as a user configuration behavior, a user click behavior, a user browsing behavior, and the like, and is not limited in detail herein.
In this embodiment, in the process of performing the data acquisition identification operation, the data acquisition script may be continuously updated and configured according to the obtained data acquisition identification node information through the data upload path.
Based on the above steps, after obtaining the page user behavior information corresponding to the extended page object that requires big data acquisition, this embodiment determines the Internet function service information matched with the page user behavior information and generates corresponding data acquisition identification node information according to the Internet function service information and its corresponding subject domain information. Big data acquisition is then executed after the data acquisition script has been configured according to the data acquisition identification node information, so that the corresponding data acquisition and identification operations can be performed on the mobile internet terminal 200 through the data acquisition script during big data acquisition. This makes big data acquisition more targeted and accurate and avoids, to a certain extent, noise being introduced into the acquired data by noisy data acquisition identification nodes.
In a possible implementation manner, step S111 may be specifically implemented by sub-steps, which are described in detail below.
Sub-step S1111, obtaining, from the Internet access process, page user behavior information corresponding to the extended page object that requires big data acquisition.
For example, the page user behavior information may include a reference Internet function service, a number of service acquisition blocks, a user behavior permission interval, and a user behavior extension permission interval. In other possible implementations, the page user behavior information may further include behavior attribute information of the extended page object, such as a behavior operation type, a business type to which the behavior object belongs, a behavior generation time, and the like. The reference Internet function service may be a preset Internet function service determined according to historical conditions, the number of service acquisition blocks may be the number of blocks historically disclosed by various channels (e.g., a chat tool, an e-commerce tool, etc.) of the extended page object, the user behavior permission interval may be a user behavior service associated with the extended page object, and the user behavior extension permission interval may be a user behavior service associated outside the extended page object.
Sub-step S1112, determining the number of service acquisition blocks/service node interval value and the number of service acquisition blocks/user behavior extension permission interval value of the page user behavior information.
Sub-step S1113, constructing an Internet function service matrix according to the number of service acquisition blocks/service node interval value and the number of service acquisition blocks/user behavior extension permission interval value, and determining each first Internet function service corresponding to the page user behavior information in the Internet function service matrix according to the number of service acquisition blocks/service node interval value and the number of service acquisition blocks/user behavior extension permission interval value of the page user behavior information.
Sub-step S1114, determining a service feature interval of each reference Internet function service in the Internet function service matrix according to the service feature vector of each reference Internet function service.
Sub-step S1115, determining an initial service access frequency parameter of each reference Internet function service according to the service feature interval corresponding to each reference Internet function service and a preset correspondence between service feature intervals and initial service access frequency parameters.
Sub-step S1116, for each first Internet function service included in each reference Internet function service, determining a target service access frequency parameter of the first Internet function service according to the initial service access frequency parameter of the reference Internet function service to which the first Internet function service belongs.
Sub-step S1117, determining a target service node interval value, a target service acquisition block number value, and a target user behavior extension permission interval value corresponding to each first Internet function service according to a preset number of service acquisition blocks, a preset service node interval value, and the target service access frequency parameter corresponding to each first Internet function service.
Sub-step S1118, determining the Internet function service information matched with the page user behavior information according to the target service acquisition block number value, the target service node interval value, and the target user behavior extension permission interval value corresponding to each first Internet function service, the number of service acquisition blocks in the page user behavior information, the multi-level source matching information between the user behavior permission interval and the user behavior extension permission interval, and the relationship between the multi-level source matching information and preset multi-level source matching information.
In a possible implementation manner, step S112 may be specifically implemented by sub-steps, which are described in detail below.
Sub-step S1121, determining, according to the subject domain information corresponding to the Internet function service information, each target Internet function service in the Internet function service information whose service importance priority is greater than a set priority, and a first identification object and a second identification object that take the target Internet function service as a service base area, wherein the simulation data acquisition process of the first identification object does not overlap with the simulation data acquisition process of the second identification object, and a logical association exists between the two simulation data acquisition processes.
Sub-step S1122, determining a subject field object in the first identification object that meets a first target requirement, and determining first sliding component information corresponding to the first identification object according to a field matching definition element of the multi-level source matching information between the source data table field information of the subject field object meeting the first target requirement and the associated preset field verification information.
For example, a subject field object meeting the first target requirement may be a subject field object whose source data table field information matches the associated preset field verification information.
Sub-step S1123: determine a subject field object meeting the second target requirement in the second identification object, and determine the second sliding component information corresponding to the second identification object according to a field matching definition element of the multi-level source matching information between the source data table field information of that subject field object and the associated preset field verification information.
For example, a subject field object meeting the second target requirement may be a subject field object whose source data table field information matches the associated preset field verification information.
Sub-step S1124: obtain the callback acquisition simulation parameters of the subject field object in each first simulation data acquisition process according to the first sliding component information corresponding to the first identification object, and obtain the callback acquisition simulation parameters of the subject field object in each second simulation data acquisition process according to the second sliding component information corresponding to the second identification object.
Sub-step S1125: according to the callback acquisition simulation parameters of each first simulation data acquisition process and each second simulation data acquisition process, perform simulation acquisition indexing on the subject field object in each simulation data acquisition process, to obtain first simulation acquisition index information of each first simulation data acquisition process and second simulation acquisition index information of each second simulation data acquisition process.
Sub-step S1126: obtain the corresponding simulation acquisition index information according to the first simulation acquisition index information of each first simulation data acquisition process and the second simulation acquisition index information of each second simulation data acquisition process.
Sub-step S1127: generate the corresponding data acquisition identification node information according to the simulation acquisition index information.
In a possible implementation, step S113 may be implemented by the following sub-steps.
Sub-step S1131: through the big data acquisition control, associate each data acquisition identification unit in the data acquisition identification node information with the corresponding data acquisition control instruction in the data acquisition script of the data uploading path of the data crawling flow of the page user behavior information.
Sub-step S1132: configure the data acquisition identification configuration information of each data acquisition identification unit into the transmission control template of the corresponding data acquisition control instruction in the data acquisition script, and then execute big data acquisition.
Accordingly, in a possible implementation of step S114, during big data acquisition the corresponding data acquisition and identification operations may be performed on the mobile internet terminal 200 through each data acquisition control instruction in the data acquisition script.
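As an illustrative, non-limiting sketch, sub-steps S1131 and S1132 can be pictured as a two-part update of the data acquisition script: each identification unit is first attached to its control instruction, and the unit's configuration is then written into that instruction's transmission control template. The dictionary layout, the function name `configure_script`, and the keys `unit_id`, `instruction_id`, `config`, `units`, and `template` are all assumptions made for illustration; the disclosure does not specify a data format.

```python
def configure_script(script, units):
    """Attach identification units to control instructions and fill templates.

    `script` maps a (hypothetical) control instruction id to a dict;
    `units` is a list of (hypothetical) identification unit records.
    """
    for unit in units:
        instruction = script[unit["instruction_id"]]
        # S1131 (sketch): associate the unit with its control instruction.
        instruction.setdefault("units", []).append(unit["unit_id"])
        # S1132 (sketch): copy the unit's configuration into the template.
        instruction.setdefault("template", {}).update(unit["config"])
    return script


script = {"collect_clicks": {}, "collect_views": {}}
units = [
    {"unit_id": "u1", "instruction_id": "collect_clicks", "config": {"interval_s": 5}},
    {"unit_id": "u2", "instruction_id": "collect_views", "config": {"interval_s": 30}},
]
configured = configure_script(script, units)
```

Once configured in this way, each control instruction carries both the units it serves and the template values used while acquisition runs.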
Fig. 3 is a schematic diagram of the functional modules of an artificial intelligence based internet big data processing apparatus 300 according to an embodiment of the present disclosure. In this embodiment, the functional modules of the apparatus 300 may be divided according to the method embodiments executed by the intelligent cloud service platform 100; that is, the following functional modules may be used to execute each of those method embodiments. The apparatus 300 may include an obtaining module 310, an analysis module 320, a determining module 330, and a generating module 340, whose functions are described in detail below.
The obtaining module 310 is configured to perform corresponding data acquisition and identification operations on the mobile internet terminal 200 through a preconfigured data acquisition script, and to obtain a feature sample set from the acquired internet big data information, where the feature sample set includes a first feature sample and a second feature sample, and the second feature sample is a feature sample that has an internet service association with the first feature sample. The obtaining module 310 may be configured to perform step S110; for its detailed implementation, refer to the detailed description of step S110.
The analysis module 320 is configured to perform image feature analysis on each feature sample in the feature sample set in sequence according to a preconfigured artificial intelligence model to obtain a corresponding image feature vector, determine a portrait data area in the first feature sample based on the image feature vector corresponding to the first feature sample, extract a target feature vector from the image feature vector corresponding to the first feature sample according to a target portrait data area corresponding to the portrait data area, and extract a first candidate feature vector from the image feature vector corresponding to the second feature sample, where the data area corresponding to the first candidate feature vector covers the data area corresponding to the target feature vector. The analysis module 320 may be configured to perform step S120; for its detailed implementation, refer to the detailed description of step S120.
The determining module 330 is configured to search a feature vector node matched with the target feature vector from the first candidate feature vector, and determine a key data area corresponding to the portrait data area in the second feature sample according to the searched feature vector node. The determining module 330 may be configured to perform the step S130, and the detailed implementation of the determining module 330 may refer to the detailed description of the step S130.
The generating module 340 is configured to generate portrait label information of the mobile internet terminal 200 according to the portrait data area in the first feature sample and the key data area corresponding to the portrait data area in the second feature sample. The generating module 340 may be configured to execute the step S140, and the detailed implementation of the generating module 340 may refer to the detailed description of the step S140.
It should be noted that the division of the above apparatus into modules is only a logical division; in an actual implementation the modules may be wholly or partially integrated into one physical entity, or may be physically separated. The modules may all be implemented as software called by a processing element, or all be implemented in hardware, or some may be implemented as software called by a processing element while others are implemented in hardware. For example, the obtaining module 310 may be a separately provided processing element, may be integrated into a chip of the apparatus, or may be stored in a memory of the apparatus in the form of program code that a processing element of the apparatus calls to execute the functions of the obtaining module 310. The other modules are implemented similarly. In addition, all or some of the modules may be integrated together or implemented independently. The processing element described herein may be an integrated circuit having signal processing capability. In implementation, each step of the above method, or each of the above modules, may be carried out by an integrated logic circuit of hardware in a processor element or by instructions in the form of software.
For example, the above modules may be one or more integrated circuits configured to implement the above methods, such as one or more Application Specific Integrated Circuits (ASICs), one or more Digital Signal Processors (DSPs), or one or more Field Programmable Gate Arrays (FPGAs). As another example, when some of the above modules are implemented in the form of program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a Central Processing Unit (CPU) or another processor that can call program code. As yet another example, these modules may be integrated together and implemented in the form of a system-on-a-chip (SOC).
Fig. 4 illustrates a hardware structure diagram of the smart cloud service platform 100 for implementing the control device according to the embodiment of the present disclosure, and as shown in fig. 4, the smart cloud service platform 100 may include a processor 110, a machine-readable storage medium 120, a bus 130, and a transceiver 140.
In a specific implementation, at least one processor 110 executes the computer-executable instructions stored in the machine-readable storage medium 120 (for example, the obtaining module 310, the analysis module 320, the determining module 330, and the generating module 340 included in the artificial intelligence based internet big data processing apparatus 300 shown in Fig. 3), so that the processor 110 can execute the artificial intelligence based internet big data processing method of the above method embodiments. The processor 110, the machine-readable storage medium 120, and the transceiver 140 are connected through the bus 130, and the processor 110 may be configured to control the transceiving actions of the transceiver 140 so as to exchange data with the aforementioned mobile internet terminal 200.
For a specific implementation process of the processor 110, reference may be made to the above-mentioned method embodiments executed by the intelligent cloud service platform 100, and implementation principles and technical effects thereof are similar, and details of this embodiment are not described herein again.
In the embodiment shown in Fig. 4, it should be understood that the processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), or the like. A general-purpose processor may be a microprocessor or any conventional processor. The steps of a method disclosed in connection with the present invention may be performed directly by a hardware processor, or by a combination of hardware and software modules within the processor.
The machine-readable storage medium 120 may comprise high-speed RAM and may also include non-volatile memory (NVM), such as at least one disk memory.
The bus 130 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus 130 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, the buses in the figures of the present application are not limited to only one bus or one type of bus.
In addition, an embodiment of the present disclosure also provides a readable storage medium in which computer-executable instructions are stored; when a processor executes the computer-executable instructions, the above artificial intelligence based internet big data processing method is implemented.
The readable storage medium described above may be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic storage, flash memory, or a magnetic or optical disk. A readable storage medium can be any available medium that can be accessed by a general-purpose or special-purpose computer.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present disclosure, and not for limiting the same; while the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present disclosure.

Claims (9)

1. An internet big data processing method based on artificial intelligence, applied to an intelligent cloud service platform that is in communication connection with a plurality of mobile internet terminals, the method comprising the following steps:
performing corresponding data acquisition and identification operations on the mobile internet terminal through a pre-configured data acquisition script, and acquiring a feature sample set from acquired internet big data information, wherein the feature sample set comprises a first feature sample and a second feature sample, and the second feature sample is a feature sample having an internet service association with the first feature sample;
sequentially carrying out image feature analysis on each feature sample in the feature sample set according to a pre-configured artificial intelligence model to obtain a corresponding image feature vector, determining a portrait data area in the first feature sample based on the image feature vector corresponding to the first feature sample, extracting a target feature vector from the image feature vector corresponding to the first feature sample according to a target portrait data area corresponding to the portrait data area, and extracting a first candidate feature vector from the image feature vector corresponding to the second feature sample, wherein the data area corresponding to the first candidate feature vector covers the data area corresponding to the target feature vector;
searching a feature vector node matched with the target feature vector from the first candidate feature vector, and determining a key data area corresponding to the portrait data area in the second feature sample according to the searched feature vector node;
generating portrait label information of the mobile internet terminal according to the portrait data area in the first feature sample and the key data area corresponding to the portrait data area in the second feature sample;
the step of generating portrait label information of the mobile internet terminal according to the portrait data area in the first feature sample and the key data area corresponding to the portrait data area in the second feature sample comprises:
acquiring a target data area formed by a common data area between the portrait data area in the first feature sample and the key data area corresponding to the portrait data area in the second feature sample;
establishing an index restriction bitmap according to the index restriction relationship among the data index targets in the target data area, and determining the index node of each data index target in the index restriction bitmap;
determining the index service of each data index target according to the index node of each data index target, determining a set formed by the index services of each data index target as a summary index aggregation service, comparing the index nodes of any two data index targets in the summary index aggregation service, and obtaining the mutual dominant relationship of the index services of any two data index targets based on the comparison result;
dividing the summary index aggregation service into at least one index aggregation service sequence based on the mutual leading relationship of the index services of any two data index targets, wherein each index aggregation service sequence has different aggregation quantity levels;
when a hot-spot data index target is added into the target data area, determining a target index node of the hot-spot data index target in the index restriction bitmap, comparing the target index node with the index node of the data index target in the at least one index aggregation service sequence, and determining a target index aggregation service sequence corresponding to the index service where the hot-spot data index target is located based on the comparison result;
and taking the service tag included in the target index aggregation service sequence corresponding to the index service of the hot data index target as portrait tag information of the mobile internet terminal.
2. The internet big data processing method based on artificial intelligence as claimed in claim 1, wherein the step of building an index restriction bitmap according to the index restriction relationship between data index targets in the target data area comprises:
acquiring an index sequence formed by data index targets in the target data area;
determining the aggregation quantity level of the index service of each data index target according to the occurrence frequency of each data index target in the index sequence;
sorting index services of data index targets on different nodes in a descending order according to the aggregation quantity level;
determining the trend from the index service of the data index target which is sequenced last to the index service of the data index target which is sequenced foremost as a first trend of the first dimension axial direction of the index restriction bitmap on a first preset appearance node;
and determining a trend orthogonal to the first trend of the first dimension axial direction as the second dimension axial direction of the index restriction bitmap, wherein the first trend of the second dimension axial direction is the trend from the index service of the data index target ranked last on a second preset appearance node to the index service of the data index target ranked first.
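As an illustrative, non-limiting sketch of the frequency-based ordering recited in claim 2, the index sequence can be counted and the data index targets sorted in descending order of occurrence. Treating the raw occurrence count as the aggregation quantity level is an assumption of this sketch, as are the function name `rank_index_targets` and the sample target names.

```python
from collections import Counter


def rank_index_targets(index_sequence):
    """Return (target, occurrence_count) pairs in descending count order."""
    counts = Counter(index_sequence)
    # sorted() is stable, so targets with equal counts keep first-seen order.
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


ranked = rank_index_targets(["news", "video", "news", "pay", "video", "news"])
```

In this sketch the last element of `ranked` is the least frequent target and the first is the most frequent, which corresponds to the two ends of a trend along one axis of the index restriction bitmap.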
3. The internet big data processing method based on artificial intelligence according to claim 1, wherein the step of comparing the index nodes of any two data index targets in the summary index aggregation service and obtaining the mutual dominance relationship of the index services of any two data index targets based on the comparison result comprises:
comparing the data volume corresponding to the index nodes of any two data index targets in the summary index aggregation service, wherein when the data volume meets a first condition or a second condition, the index service where one data index target in any two data index targets is located can lead the index service where the other data index target is located;
wherein the first condition is that the first trending data volume size value of the one of the data index targets is greater than the first trending data volume size value of the other of the data index targets and the second trending data volume size value of the one of the data index targets is greater than or equal to the second trending data volume size value of the other of the data index targets, and the second condition is that the first trending data volume size value of the one of the data index targets is equal to the first trending data volume size value of the other of the data index targets and the second trending data volume size value of the one of the data index targets is greater than the second trending data volume size value of the other of the data index targets.
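As an illustrative, non-limiting sketch, the first and second conditions of claim 3 can be written as a single comparison of two index nodes, each carrying a value along the first trend and a value along the second trend. The type name `IndexNode`, its field names, and the function name `dominates` are assumptions made for illustration.

```python
from typing import NamedTuple


class IndexNode(NamedTuple):
    first: float   # data volume size value along the first trend
    second: float  # data volume size value along the second trend


def dominates(a: IndexNode, b: IndexNode) -> bool:
    """Return True when the index service of `a` dominates that of `b`."""
    # First condition: strictly larger first value, second value not smaller.
    first_condition = a.first > b.first and a.second >= b.second
    # Second condition: equal first value, strictly larger second value.
    second_condition = a.first == b.first and a.second > b.second
    return first_condition or second_condition
```

Under these two conditions a node never dominates an identical node, and two nodes that each lead on a different trend do not dominate one another.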
4. The internet big data processing method based on artificial intelligence according to claim 3, wherein the step of dividing the summary index aggregation service into at least one index aggregation service sequence based on the mutual dominance relationship between the index services where any two data index targets are located, each index aggregation service sequence having a different aggregation number level includes:
taking the summary index aggregation service as a first aggregation service, and determining, from the first aggregation service, at least one first selected index aggregation service which is not dominated by any other index aggregation service according to the mutual dominance relationship of the index services where any two data index targets in the first aggregation service are located;
determining a set formed by the at least one first selected index aggregation service as a first-level index aggregation service sequence;
when the range of the index aggregation services other than the A-th level index aggregation service sequence in the A-th aggregation service is larger than or equal to a first threshold value, determining the index aggregation services other than the A-th level index aggregation service sequence in the A-th aggregation service as an (A+1)-th aggregation service;
determining, from the (A+1)-th aggregation service, at least one (A+1)-th selected index aggregation service which is not dominated by any other index aggregation service according to the mutual dominance relationship of the index services where any two data index targets in the (A+1)-th aggregation service are located, and determining a set formed by the at least one (A+1)-th selected index aggregation service as an (A+1)-th level index aggregation service sequence;
wherein, when A is equal to N, the range of the index aggregation services other than the A-th level index aggregation service sequence in the A-th aggregation service is equal to the first threshold value, and the numerical value corresponding to an aggregation quantity level is inversely related to that aggregation quantity level.
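As an illustrative, non-limiting sketch, the level-by-level division of claim 4 resembles repeatedly peeling off the entries that no remaining entry dominates. The sketch below represents each index node as a `(first, second)` tuple, reuses the two dominance conditions of claim 3, and simply continues until nothing remains; the first threshold and the range test of the claim are not modelled, which is a simplifying assumption, and the function names are hypothetical.

```python
def dominates(a, b):
    """Dominance test on (first, second) tuples, per the claim 3 conditions."""
    return (a[0] > b[0] and a[1] >= b[1]) or (a[0] == b[0] and a[1] > b[1])


def split_into_levels(nodes):
    """Partition nodes into successive non-dominated levels (level 1 first)."""
    remaining = list(nodes)
    levels = []
    while remaining:
        # Entries not dominated by any other remaining entry form this level.
        front = [p for p in remaining
                 if not any(dominates(q, p) for q in remaining)]
        levels.append(front)
        # What is left becomes the next aggregation service to divide.
        remaining = [p for p in remaining if p not in front]
    return levels


levels = split_into_levels([(3, 3), (1, 4), (2, 2), (1, 1), (3, 1)])
```

Because dominance is never mutual, every pass selects at least one entry, so the loop always terminates.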
5. The internet big data processing method based on artificial intelligence according to claim 4, wherein the step of comparing the target index node with the index nodes of the data index targets in the at least one index aggregation service sequence and determining the target index aggregation service sequence corresponding to the index service where the hotspot data index target is located based on the comparison result comprises:
comparing the numerical value corresponding to the target index node with the data size corresponding to the index node of the first data index target;
when the data size meets a third condition or a fourth condition, performing degradation processing on the aggregation quantity level of each index aggregation service sequence, and determining the index service where the hot data index target is located as a target first-level index aggregation service sequence, wherein the target first-level index aggregation service sequence is a target index aggregation service sequence corresponding to the index service where the hot data index target is located;
the first data index target is a data index target in a first-level index aggregation service sequence, the third condition is that a second trend data size value of the target index node is greater than or equal to a second trend data size value of the first data index target and a first trend data size value of the target index node is greater than a first trend data size value of the first data index target, and the fourth condition is that the second trend data size value of the target index node is greater than the second trend data size value of the first data index target and the first trend data size value of the target index node is equal to the first trend data size value of the first data index target;
comparing the value corresponding to the target index node with the data size corresponding to the index node of the second data index target;
when the data size meets a fifth condition or a sixth condition, determining the index service of the hotspot data index target as an N + 2-level index aggregation service sequence, and determining the N + 2-level index aggregation service sequence as a target index aggregation service sequence corresponding to the index service of the hotspot data index target;
wherein the second data index target is a data index target in an N +1 th-level index aggregation service sequence, the fifth condition is that the second trending data volume size value of the target index node is less than or equal to the second trending data volume size value of the second data index target and the first trending data volume size value of the target index node is less than the first trending data volume size value of the second data index target, the sixth condition is that the second trending data volume size value of the target index node is less than the second trending data volume size value of the second data index target and the first trending data volume size value of the target index node is equal to the first trending data volume size value of the second data index target;
comparing the numerical value corresponding to the target index node with the data size corresponding to the index node of the third data index target;
when the data size meets a seventh condition or an eighth condition, sorting numerical values corresponding to the aggregation quantity level of each index aggregation service sequence where each third data index target is located in an ascending order, and determining the index aggregation service sequence corresponding to the numerical value with the highest sorting order as a target index aggregation service sequence corresponding to the index service where the hotspot data index target is located;
wherein the aggregation number level of the index aggregation service sequence where the third data index target is located is between the aggregation number level of the first-level index aggregation service sequence and the aggregation number level of the N + 1-th-level index aggregation service sequence, the seventh condition is that the second trending data volume size value of the target index node is greater than or equal to the second trending data volume size value of the third data index target and the first trending data volume size value of the target index node is less than the first trending data volume size value of the third data index target, and the eighth condition is that the second trending data volume size value of the target index node is greater than the second trending data volume size value of the third data index target and the first trending data volume size value of the target index node is equal to the first trending data volume size value of the third data index target.
6. The internet big data processing method based on artificial intelligence of claim 1, wherein before the comparing the target index node with the index node of the data index target in the at least one index aggregation service sequence and determining the target index aggregation service sequence corresponding to the index service where the hotspot data index target is located based on the comparison result, the method further comprises:
judging whether at least one data index target with the same first trend data volume value or the same second trend data volume value exists in the summary index aggregation service;
if at least one data index target with the same value of the first trend data volume or the same value of the second trend data volume exists, taking the at least one data index target with the same value of the first trend data volume or the same value of the second trend data volume as a candidate data index target;
executing a first strategy or a second strategy on the candidate data index target to obtain an adjusted index node, wherein the first strategy is to increase a first trend data volume size value or a second trend data volume size value of the candidate data index target by a preset value corresponding to the candidate data index target, and the second strategy is to subtract the preset value corresponding to the candidate data index target from the first trend data volume size value or the second trend data volume size value of the candidate data index target;
correspondingly, the comparing the target index node with the index node of the data index target in the at least one index aggregation service sequence, and determining the target index aggregation service sequence corresponding to the index service where the hotspot data index target is located based on the comparison result includes:
and comparing the target index node with the adjusted index node, and determining a target index aggregation service sequence corresponding to the index service where the hotspot data index target is located based on the comparison result.
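As an illustrative, non-limiting sketch of the adjustment recited in claim 6, data index targets that share a first-trend value or a second-trend value with another target are treated as candidates, and the shared value is shifted by a per-target preset value. The function name `adjust_candidates`, the dictionary layout, the sample preset values, and the choice to adjust the first-trend value when both values are shared are all assumptions of this sketch.

```python
from collections import Counter


def adjust_candidates(nodes, presets, increase=True):
    """Shift tied trend values by each target's preset value.

    `nodes` maps a target name to a (first, second) tuple; `presets` maps a
    target name to its preset value. increase=True sketches the first
    strategy (add); increase=False sketches the second strategy (subtract).
    """
    first_counts = Counter(first for first, _ in nodes.values())
    second_counts = Counter(second for _, second in nodes.values())
    sign = 1 if increase else -1
    adjusted = {}
    for name, (first, second) in nodes.items():
        delta = sign * presets.get(name, 0)
        if first_counts[first] > 1:
            first += delta      # first-trend value shared with another target
        elif second_counts[second] > 1:
            second += delta     # second-trend value shared with another target
        adjusted[name] = (first, second)
    return adjusted


nodes = {"a": (2, 5), "b": (2, 3), "c": (4, 3)}
presets = {"a": 0.25, "b": 0.5, "c": 0.75}
adjusted = adjust_candidates(nodes, presets)
```

Giving each candidate a different preset value is what separates previously equal values; using one shared preset would leave the tie in place.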
7. The artificial intelligence based internet big data processing method according to any one of claims 1-6, wherein the step of performing corresponding data acquisition and identification operations on the mobile internet terminal through a pre-configured data acquisition script comprises:
after page user behavior information corresponding to an extended page object needing big data acquisition is obtained from an internet access process, determining internet function service information matched with the page user behavior information;
generating corresponding data acquisition identification node information according to the internet function service information and the subject domain information corresponding to the internet function service information;
associating the data acquisition identification node information to a data acquisition script of a data uploading path of a data crawling flow of the page user behavior information through a big data acquisition control, configuring the data acquisition script according to the data acquisition identification node information, and then executing big data acquisition;
and carrying out corresponding data acquisition identification operation on the mobile internet terminal through the data acquisition script in the big data acquisition process, wherein in the data acquisition identification operation process, the data acquisition script is continuously updated and configured according to the obtained data acquisition identification node information through the data uploading path.
8. The internet big data processing method based on artificial intelligence according to claim 7, wherein the step of generating corresponding data collection identification node information according to the internet function service information and subject domain information corresponding to the internet function service information comprises:
determining a target internet function service with each service importance priority greater than a set priority in the internet function service information according to the subject domain information corresponding to the internet function service information, and a first identification object and a second identification object which take the target internet function service as service basic areas, wherein the simulation data acquisition process of the first identification object is not overlapped with the simulation data acquisition process of the second identification object, and logical association exists between the simulation data acquisition processes;
determining a subject field object meeting a first target requirement in the first identification object, and determining first sliding component information corresponding to the first identification object according to a field matching definition element of multilevel source matching information between source data table field information of the subject field object meeting the first target requirement and associated preset field verification information; the subject field object meeting the first target requirement is a subject field object of which the source data table field information is matched with the associated preset field verification information;
determining a subject field object meeting a second target requirement in the second identification object, and determining second sliding component information corresponding to the second identification object according to a field matching definition element of multilevel source matching information between source data table field information of the subject field object meeting the second target requirement and associated preset field verification information; the subject field object meeting the second target requirement is a subject field object of which the source data table field information is matched with the associated preset field verification information;
obtaining callback acquisition simulation parameters of the subject field object in each first simulation data acquisition process according to the first sliding component information corresponding to the first identification object, and obtaining callback acquisition simulation parameters of the subject field object in each second simulation data acquisition process according to the second sliding component information corresponding to the second identification object;
according to callback acquisition simulation parameters of each first simulation data acquisition process and each second simulation data acquisition process, respectively carrying out simulation acquisition indexing on the subject field object in each simulation data acquisition process to obtain first simulation acquisition index information of each first simulation data acquisition process and second simulation acquisition index information of each second simulation data acquisition process;
obtaining corresponding simulation acquisition index information according to the first simulation acquisition index information of each first simulation data acquisition process and the second simulation acquisition index information of each second simulation data acquisition process;
and generating corresponding data acquisition identification node information according to the simulation acquisition index information.
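The flow of claim 8 (select target services by priority, keep the field objects that match preset verification information, index them per simulated acquisition process, and merge the two indexes) can be sketched as follows. This is a simplified illustration under stated assumptions: all function names and dictionary shapes are hypothetical, a field "matches" when its type string equals the preset one, and the sliding component information and callback acquisition simulation parameters are omitted because this section does not define them.

```python
from typing import Dict, List


def select_target_services(priorities: Dict[str, int], set_priority: int) -> List[str]:
    """Keep services whose importance priority exceeds the set priority."""
    return sorted(name for name, p in priorities.items() if p > set_priority)


def matching_field_objects(source_fields: Dict[str, str], preset: Dict[str, str]) -> List[str]:
    """Keep field objects whose source-table field info equals the preset verification info."""
    return sorted(n for n, kind in source_fields.items() if preset.get(n) == kind)


def index_processes(processes: Dict[str, List[str]], matched: List[str]) -> Dict[str, List[str]]:
    """Index, for each simulated acquisition process, the matched fields it touches."""
    allowed = set(matched)
    return {proc: [f for f in fields if f in allowed] for proc, fields in processes.items()}


def merge_indexes(first: Dict[str, List[str]], second: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Combine first and second index information; the processes must not overlap."""
    overlap = first.keys() & second.keys()
    if overlap:
        raise ValueError(f"overlapping simulation processes: {sorted(overlap)}")
    return {**first, **second}
```

The overlap check in `merge_indexes` reflects the claim's condition that the simulation data acquisition processes of the two identification objects do not overlap; the merged index is what the final step would turn into data acquisition identification node information.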
9. An intelligent cloud service platform, characterized in that the intelligent cloud service platform comprises a processor, a machine-readable storage medium and a network interface, the machine-readable storage medium, the network interface and the processor are connected through a bus system, the network interface is used for being in communication connection with at least one mobile internet terminal, the machine-readable storage medium is used for storing programs, instructions or codes, and the processor is used for executing the programs, instructions or codes in the machine-readable storage medium to execute the artificial intelligence based internet big data processing method of any one of claims 1 to 8.
CN202010508583.4A 2020-06-06 2020-06-06 Internet big data processing method based on artificial intelligence and intelligent cloud service platform Active CN111708920B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202011315512.9A CN112464041A (en) 2020-06-06 2020-06-06 Internet big data processing method and system based on artificial intelligence
CN202010508583.4A CN111708920B (en) 2020-06-06 2020-06-06 Internet big data processing method based on artificial intelligence and intelligent cloud service platform
CN202011313537.5A CN112417221A (en) 2020-06-06 2020-06-06 Internet big data processing method, system and service platform based on artificial intelligence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010508583.4A CN111708920B (en) 2020-06-06 2020-06-06 Internet big data processing method based on artificial intelligence and intelligent cloud service platform

Related Child Applications (2)

Application Number Priority Date Filing Date Title
CN202011313537.5A Division CN112417221A (en) 2020-06-06 2020-06-06 Internet big data processing method, system and service platform based on artificial intelligence
CN202011315512.9A Division CN112464041A (en) 2020-06-06 2020-06-06 Internet big data processing method and system based on artificial intelligence

Publications (2)

Publication Number Publication Date
CN111708920A CN111708920A (en) 2020-09-25
CN111708920B CN111708920B (en) 2021-01-08

Family

ID=72539094

Family Applications (3)

Application Number Priority Date Filing Date Title
CN202011315512.9A Withdrawn CN112464041A (en) 2020-06-06 2020-06-06 Internet big data processing method and system based on artificial intelligence
CN202011313537.5A Withdrawn CN112417221A (en) 2020-06-06 2020-06-06 Internet big data processing method, system and service platform based on artificial intelligence
CN202010508583.4A Active CN111708920B (en) 2020-06-06 2020-06-06 Internet big data processing method based on artificial intelligence and intelligent cloud service platform

Family Applications Before (2)

Application Number Title Priority Date Filing Date
CN202011315512.9A Withdrawn CN112464041A (en) 2020-06-06 2020-06-06 Internet big data processing method and system based on artificial intelligence
CN202011313537.5A Withdrawn CN112417221A (en) 2020-06-06 2020-06-06 Internet big data processing method, system and service platform based on artificial intelligence

Country Status (1)

Country Link
CN (3) CN112464041A (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112486710B (en) * 2020-12-17 2021-07-09 浙江盘石信息技术股份有限公司 Information acquisition method based on big data and artificial intelligence and digital content service platform
CN113486238A (en) * 2021-06-29 2021-10-08 平安信托有限责任公司 Information pushing method, device and equipment based on user portrait and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101706821A (en) * 2009-12-10 2010-05-12 中兴通讯股份有限公司 Tag-based mobile internet page design system and method
CN107784122A (en) * 2017-11-22 2018-03-09 殷周平 A kind of instance-level image search method represented based on multilayer feature
CN109547477A (en) * 2018-12-27 2019-03-29 石更箭数据科技(上海)有限公司 A kind of data processing method and its device, medium, terminal
CN110148013A (en) * 2019-04-22 2019-08-20 阿里巴巴集团控股有限公司 A kind of user tag distribution forecasting method, apparatus and system
CN110688566A (en) * 2019-09-06 2020-01-14 平安科技(深圳)有限公司 Data pushing method, system, equipment and storage medium based on user portrait
CN111210634A (en) * 2020-02-27 2020-05-29 周国霞 Intelligent traffic information processing method and device, intelligent traffic system and server

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111210198B (en) * 2019-12-30 2020-08-21 广州高企云信息科技有限公司 Information delivery method and device and server
CN111177569B (en) * 2020-01-07 2021-06-11 腾讯科技(深圳)有限公司 Recommendation processing method, device and equipment based on artificial intelligence
CN111241185B (en) * 2020-04-26 2020-10-27 浙江网商银行股份有限公司 Data processing method and device

Also Published As

Publication number Publication date
CN111708920A (en) 2020-09-25
CN112417221A (en) 2021-02-26
CN112464041A (en) 2021-03-09

Similar Documents

Publication Publication Date Title
CN112199375A (en) Cross-modal data processing method and device, storage medium and electronic device
CN111540466B (en) Big data based intelligent medical information pushing method and big data medical cloud platform
CN111708920B (en) Internet big data processing method based on artificial intelligence and intelligent cloud service platform
CN111581625A (en) User identity identification method and device and electronic equipment
CN111930809A (en) Data processing method, device and equipment
CN111159413A (en) Log clustering method, device, equipment and storage medium
CN111611581B (en) Internet of things-based network big data information anti-disclosure method and cloud communication server
CN105653171A (en) Fingerprint identification based terminal control method, terminal control apparatus and terminal
CN104572436B (en) Automatic debugging and error proofing method and device
CN111723227B (en) Data analysis method based on artificial intelligence and Internet and cloud computing service platform
CN111831662B (en) Medical data information processing method and system
CN111708931B (en) Big data acquisition method based on mobile internet and artificial intelligence cloud service platform
CN110909817B (en) Distributed clustering method and system, processor, electronic device and storage medium
CN110888756A (en) Diagnostic log generation method and device
CN111783812A (en) Method and device for identifying forbidden images and computer readable storage medium
CN110929644B (en) Heuristic algorithm-based multi-model fusion face recognition method and device, computer system and readable medium
CN110990834A (en) Static detection method, system and medium for android malicious software
CN111652074B (en) Face recognition method, device, equipment and medium
CN112487421B (en) Android malicious application detection method and system based on heterogeneous network
CN111800790B (en) Information analysis method based on cloud computing and 5G interconnection and man-machine cooperation cloud platform
CN112825122A (en) Ethnicity judgment method, ethnicity judgment device, ethnicity judgment medium and ethnicity judgment equipment based on two-dimensional face image
CN111918137B (en) Push method and device based on video characteristics, storage medium and terminal
CN111539034B (en) Solid state disk dual-protocol encryption method and device and solid state disk encryption chip
CN112347349A (en) Big data-based cosmetic service processing method and cosmetic e-commerce cloud platform
CN116935150A (en) Model training and prop validation determining method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information
Inventor after: Cheng Tao
Inventor after: Xie Guozhu
Inventor before: Xie Guozhu
TA01 Transfer of patent application right
Effective date of registration: 20201217
Address after: No. 53 B2, No. 397, Xingang Middle Road, Haizhu District, Guangzhou City, Guangdong Province (office only)
Applicant after: Guangdong and state network technology Co.,Ltd.
Address before: Room 206, 2 / F, R & D building, No. 6, No. 73, Lishi Avenue, Jinhu Economic Development Zone, Huaian City, Jiangsu Province 211600
Applicant before: Xie Guozhu
GR01 Patent grant