WO2016201933A1 - User data processing method, providing method, system and computer device

Publication number
WO2016201933A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
data
network behavior
attribute information
identifier
Application number
PCT/CN2015/097772
Other languages
French (fr)
Chinese (zh)
Inventor
邵睿
沈剑平
李炫�
莫洋
宋元峰
齐沁芳
Original Assignee
百度在线网络技术(北京)有限公司
Application filed by 百度在线网络技术(北京)有限公司
Publication of WO2016201933A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/958: Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Definitions

  • the present invention relates to information processing technologies, and in particular, to a user data processing method, a providing method, a system, and a computer device.
  • An embodiment of the present invention provides a user data processing method based on user network behavior, including: acquiring network behavior data of a user; identifying the network behavior data by using at least one classification model for identifying user attributes, thereby obtaining at least one item of user attribute information of the user; and adding the obtained user attribute information to the user model data of the user.
  • the user model data of the user includes identification information of the user and at least one of the user attribute information.
  • the method further includes: extracting keywords from the network behavior data of the user, and selecting the classification model for identifying user attributes according to the extracted keywords.
  • The user attribute information includes at least one of age, gender, region, interest, and industry.
  • The embodiment of the present invention further provides a method for providing user data, including: acquiring network behavior data of a user; acquiring a second user identifier of the user from a user identifier mapping table according to a first user identifier of the user, where the user identifier mapping table records a plurality of associated user identifiers of the user; and obtaining the user model data of the user according to the second user identifier, where the user model data includes at least one item of user attribute information of the user.
  • The method further includes: obtaining the first user identifier of the user from the network behavior data.
  • The plurality of user identifiers recorded in association in the user identifier mapping table include at least two of the following: at least one registered account of the user, a MAC address, a called user identification number (CUID), and an International Mobile Equipment Identity (IMEI).
  • The embodiment of the present invention further provides a user data processing system based on user network behavior, comprising: a data acquisition module, configured to acquire network behavior data of the user; a data identification module, configured to identify the network behavior data of the user by using at least one classification model for identifying user attributes, and to acquire at least one item of user attribute information of the user; and an information adding module, configured to add the user attribute information obtained by identification to the user model data of the user.
  • the user model data of the user includes identification information of the user and at least one of the user attribute information.
  • the system further includes: a model selection module, configured to extract keywords from the network behavior data of the user, and select the classification model for identifying user attributes according to the extracted keywords.
  • the user attribute information includes at least one of age, gender, region, interest, and industry.
  • The embodiment of the present invention further provides a user data providing system, including: a data obtaining module, configured to acquire network behavior data of a user; a user identifier obtaining module, configured to obtain a second user identifier of the user from a user identifier mapping table according to a first user identifier of the user, where the user identifier mapping table records a plurality of associated user identifiers of the user; and a user model data obtaining module, configured to acquire the user model data of the user according to the second user identifier, the user model data including at least one item of user attribute information of the user.
  • The user identifier obtaining module is further configured to obtain the first user identifier of the user from the network behavior data.
  • The plurality of user identifiers recorded in association in the user identifier mapping table include at least two of the following: at least one registered account of the user, a MAC address, a called user identification number (CUID), and an International Mobile Equipment Identity (IMEI).
  • Embodiments of the present invention also provide a computer device comprising: one or more processors; a memory; and one or more programs stored in the memory and configured to be executed by the one or more processors. The one or more programs include instructions for executing a user data processing method based on user network behavior: acquiring network behavior data of the user; identifying the network behavior data by using at least one classification model for identifying user attributes, and acquiring at least one item of user attribute information of the user; and adding the acquired user attribute information to the user model data of the user.
  • Embodiments of the present invention also provide a computer device comprising: one or more processors; a memory; and one or more programs stored in the memory and configured to be executed by the one or more processors. The one or more programs include instructions for executing the method for providing user data: acquiring network behavior data of the user; acquiring a second user identifier of the user from the user identifier mapping table according to the first user identifier of the user, where the user identifier mapping table records a plurality of associated user identifiers of the user; and acquiring user model data of the user according to the second user identifier, where the user model data includes at least one item of user attribute information of the user.
  • The user data processing method, providing method, system and computer device provided by the embodiments of the present invention use classification models that can identify user attributes to identify the user's network behavior data, and obtain user model data including at least one item of user attribute information. In addition, the user identifier mapping table is used to identify the user identifier in newly acquired network behavior data of a user and to find the user identifiers that have a mapping relationship with it, so that the user model data corresponding to a mapped user identifier can be obtained.
  • FIG. 1 is a flowchart of a method for processing user data based on user network behavior according to an embodiment of the present invention.
  • FIG. 2 is a flowchart of a method for providing user data according to an embodiment of the present invention.
  • FIG. 3 is a schematic structural diagram of an embodiment of a user data processing system based on user network behavior provided by the present invention.
  • FIG. 4 is a schematic structural diagram of another embodiment of a user data processing system based on user network behavior provided by the present invention.
  • FIG. 5 is a schematic structural diagram of an embodiment of a system for providing user data according to the present invention.
  • FIG. 6 is a logic block diagram of a computer device according to an embodiment of the present invention.
  • FIG. 7 is a logic block diagram of a computer device according to another embodiment of the present invention.
  • The basic inventive concept of the present invention is, first, to identify the user attributes corresponding to a user's network behavior data by using a trained classification model, and to obtain user model data that corresponds to the user and includes at least one item of user attribute information; and second, based on that user model data and a pre-established user identifier mapping table recording the multiple user identifiers of the same user, to identify the user attribute information for newly acquired network behavior data across the whole network system.
  • FIG. 1 is a flowchart of a method for processing user data based on user network behavior according to an embodiment of the present invention.
  • the execution body of the method may be a server with data processing function.
  • In step S110, network behavior data of a user is acquired.
  • The network behavior data of the user may be data generated when the user uses the Internet, and may specifically include data such as logs and cookies. For example, if a user searches for "Lenovo", that keyword can serve as an item of network behavior data. Similarly, links to websites that the user clicks after a Baidu search, and clicks on Baidu partner websites, can serve as the user's network behavior data. This data is generated along with the user's network behavior and is saved by the web browser or web server.
  • In step S120, the user's network behavior data is identified by at least one classification model for identifying user attributes, and at least one item of user attribute information of the user is obtained.
  • the classification model is a preset model for identifying user attribute information.
  • The user attribute information may include at least one of the user's age, gender, region, interest, and industry. Different classification models are used to identify the user attribute information of the user in different dimensions.
  • Each classification model may contain keywords that identify corresponding user attribute information, and by using these keywords to match the user's network behavior data, the most likely user attributes of the user may be identified.
  • For example, the keywords used in the classification model that identifies a user as female may include keywords targeting behavior specific to women, such as “cosmetics”, “beauty”, and “mother and baby”.
  • The keywords used in the classification model that identifies a province/city may be most of the place names within that province/city. By analogy, at least one item of user attribute information of the user can be obtained through the keywords under at least one classification model.
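
The keyword matching described for step S120 can be sketched roughly as follows. This is only an illustration under stated assumptions: the patent does not specify a scoring rule, so the sketch simply counts keyword hits per label; the function name `classify` and the region keyword lists are hypothetical, and only the "female" keywords come from the text.

```python
from collections import Counter

# Hypothetical keyword lists; the "female" keywords come from the text,
# the region keywords are invented purely for illustration.
GENDER_KEYWORDS = {"female": ["cosmetics", "beauty", "mother and baby"]}
REGION_KEYWORDS = {"Beijing": ["Chaoyang", "Haidian"], "Shanghai": ["Pudong", "Xuhui"]}


def classify(behavior_records, keyword_model):
    """Return the label whose keywords match the records most often, or None."""
    hits = Counter()
    for record in behavior_records:
        text = record.lower()
        for label, keywords in keyword_model.items():
            # Count how many of this label's keywords occur in the record.
            hits[label] += sum(1 for kw in keywords if kw.lower() in text)
    if not hits:
        return None
    label, count = hits.most_common(1)[0]
    return label if count > 0 else None


records = ["beauty tips for new mothers", "cosmetics sale", "Haidian weather"]
gender = classify(records, GENDER_KEYWORDS)   # one model per attribute dimension
region = classify(records, REGION_KEYWORDS)
```

Running one keyword model per attribute dimension mirrors the statement that different classification models are used for different dimensions; a production system would presumably use a more robust matcher than substring search.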
  • In step S130, the at least one item of user attribute information obtained by identification is added to the user model data of the user.
  • the corresponding user attribute information may be added to the corresponding attribute location in the user model data of the user. It can be understood that when a large amount of network behavior data of the same user is identified, the multi-dimensional user attribute information corresponding to the user can be obtained.
  • Table 1 shows user model data obtained by the method described in this embodiment.
  • the user model data may include: identification information of the user, such as a user ID and an ID type, and at least one of the foregoing user attribute information.
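
One possible shape for such user model data is sketched below. The class name `UserModel`, its field names, and the sample values are assumptions made for illustration; the patent only states that the data holds identification information (such as a user ID and an ID type) plus attribute information.

```python
from dataclasses import dataclass, field


@dataclass
class UserModel:
    """Identification information plus attribute slots (field names are assumptions)."""
    user_id: str
    id_type: str
    attributes: dict = field(default_factory=dict)

    def add_attribute(self, name, value):
        # Step S130: store the identified value in the matching attribute slot.
        self.attributes[name] = value


# Placeholder values, not data from the patent's Table 1.
model = UserModel(user_id="example_user", id_type="baidu_id")
model.add_attribute("gender", "female")
model.add_attribute("region", "Beijing")
```

As more network behavior data for the same user is identified, further calls to `add_attribute` fill in additional dimensions of the same record.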
  • Before the network behavior data of the user is identified by using at least one classification model for identifying user attributes and at least one item of user attribute information of the user is obtained, a classification model for identifying user attributes may also be selected according to the acquired network behavior data. For example, keywords are extracted from the user's network behavior data, and a classification model for identifying user attributes is selected according to the extracted keywords.
  • Specifically, the network behavior data acquired each time differs: some data is long and some is short, so the number of extractable keywords differs, and the number of classification models adopted may accordingly be larger or smaller. Alternatively, the data source, coverage area, or summary content of the acquired network behavior data can be preliminarily determined from the extracted keywords, and a matching classification model can then be selected for the features those keywords reflect. When particular user attributes are designated for identification, the specific classification models that identify those attributes can be selected.
  • For example, if the keywords extracted from the acquired network behavior data include keywords such as “cosmetics”, “beauty”, and “mother and baby”, a classification model that identifies the “gender” attribute may be selected; if they include keywords containing geographic information, a classification model that identifies the “region” attribute may be selected.
  • In this way, an appropriate classification model can be selected from the keywords extracted from the user's network behavior data, so that the user attribute information for that data is identified quickly and in a targeted manner.
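
The model selection step can be sketched as follows. The names `MODEL_TRIGGERS`, `extract_keywords`, and `select_models` are hypothetical, the extraction is a naive vocabulary lookup rather than whatever extractor the applicant used, and the region trigger words are placeholders.

```python
# Hypothetical trigger keywords per attribute; only the "gender" examples
# appear in the text, the "region" ones are placeholders.
MODEL_TRIGGERS = {
    "gender": {"cosmetics", "beauty", "mother and baby"},
    "region": {"beijing", "shanghai", "guangzhou"},
}


def extract_keywords(behavior_text, vocabulary):
    """Naive extraction: keep the vocabulary terms that occur in the text."""
    text = behavior_text.lower()
    return {term for term in vocabulary if term in text}


def select_models(behavior_text, triggers=MODEL_TRIGGERS):
    """Return the attribute names whose classification models should be run."""
    vocabulary = set().union(*triggers.values())
    found = extract_keywords(behavior_text, vocabulary)
    return sorted(attr for attr, terms in triggers.items() if terms & found)


selected = select_models("Beauty salons in Beijing")
```

Only the selected models then need to be applied to the data, which is the targeting benefit the text describes.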
  • The user data processing method based on user network behavior provided by the embodiment of the present invention identifies the user's network behavior data by using at least one classification model for identifying user attributes, and acquires at least one item of user attribute information of the user, thereby forming user model data of at least one user that contains at least one item of user attribute information.
  • the user model data can be used to perform further audience analysis on users who perform network behavior.
  • the execution body of the method may be a server having a data processing function.
  • The concept of the embodiment shown in FIG. 2 is to use the user model data obtained in the embodiment shown in FIG. 1, together with a user identifier mapping table that records the multiple interrelated user identifiers of at least one user, so that the user attribute information for a user's network behavior data is obtained directly from the user model data.
  • In step S210, network behavior data of the user is acquired.
  • The content of step S210 is similar to that of step S110 described above.
  • The network behavior data of the user acquired in this step may be network behavior data generated by the user anywhere in the whole network system, such as network behavior data generated on Baidu Post Bar, Weibo, and forums.
  • In step S220, the second user identifier of the user is obtained from the user identifier mapping table according to the first user identifier of the user, where the user identifier mapping table records a plurality of associated user identifiers of the user.
  • The user identifier may be an identifier with an identity attribute that marks the user when performing network behavior, for example, the user's account on Baidu Post Bar, the user's Weibo account on Weibo, or the user's poster account on a forum.
  • The user ID in Table 1 above is likewise a specific form of user identifier. Thus each user can have multiple different forms of user identifier, depending on the network behavior performed. By collecting the multiple user identifiers of each user, a user identifier mapping table similar to that shown in Table 2 can be obtained.
  • The user identifiers corresponding to each user may include the user's registered accounts on different websites, such as a Baidu ID and a Weibo ID; the MAC address of the terminal used when performing network behavior (not shown in the table); and identifiers of a mobile terminal such as a mobile phone, for example the called user identification number (CUID) and the International Mobile Equipment Identity (IMEI).
  • The first user identifier may be the user identifier corresponding to the network behavior data of the user acquired in step S210 of this embodiment.
  • the second user identifier may be another user identifier of the same user as the first user identifier in the user identifier mapping table shown in Table 2.
  • the first user identifier of the user corresponding to the foregoing network behavior data may be obtained from the user data recorded by the system, or obtained by identifying the network behavior data.
  • the manner of obtaining the first user identifier is not limited herein.
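
The lookup in step S220 might be sketched as below. The row layout, the helper names `build_index` and `lookup_second_ids`, and every identifier value are assumptions for illustration; they are not taken from the patent's Table 2.

```python
# Each row groups the identifiers believed to belong to one user.
# All values are made-up placeholders, not data from the patent's Table 2.
MAPPING_ROWS = [
    {"baidu_id": "bd_alice", "weibo_id": "wb_alice", "imei": "000000000000001"},
    {"baidu_id": "bd_bob", "cuid": "cuid_bob"},
]


def build_index(rows):
    """Map every (id_type, value) pair to the row that contains it."""
    index = {}
    for row in rows:
        for id_type, value in row.items():
            index[(id_type, value)] = row
    return index


def lookup_second_ids(index, first_type, first_value):
    """Step S220: return the user's other identifiers, or {} if unknown."""
    row = index.get((first_type, first_value), {})
    return {t: v for t, v in row.items() if (t, v) != (first_type, first_value)}


INDEX = build_index(MAPPING_ROWS)
second_ids = lookup_second_ids(INDEX, "weibo_id", "wb_alice")
```

Indexing every identifier back to its row means any form of identifier can act as the first user identifier, which matches the statement that the first and second identifiers are simply different identifiers of the same user.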
  • In step S230, user model data of the user is acquired according to the second user identifier, where the user model data includes at least one item of user attribute information of the user.
  • The aforementioned user model data already includes a user identifier.
  • That user identifier may be the same identifier as the first user identifier or the second user identifier in the user identifier mapping table. Starting from any type of user identifier, the corresponding user model data can be obtained through the association between that identifier and the identifier recorded in the user model data.
  • the user model data may also be applied to characterize user attribute information of a user corresponding to any of the above user identifiers.
  • The user identifier mapping table can be used to project at least one item of user attribute information of the same user from one form of user identifier to another, which eliminates the need to repeatedly rebuild, from the user's network behavior data, the user model data under each form of user identifier. This speeds up the process of providing the user attribute information of the corresponding user according to the user's network behavior data, and improves the identification efficiency of the user attribute information.
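
Steps S220 and S230 together can be sketched as a single lookup that falls back through the mapping table. The stores `USER_MODELS` and `ID_MAPPING`, the function `provide_user_attributes`, and all values are hypothetical; the patent does not prescribe how the model data or the mapping table is stored.

```python
# Hypothetical stores: models are keyed by the identifier they were built under.
USER_MODELS = {("baidu_id", "bd_alice"): {"gender": "female", "region": "Beijing"}}
ID_MAPPING = {("weibo_id", "wb_alice"): [("baidu_id", "bd_alice")]}


def provide_user_attributes(first_id, mapping=ID_MAPPING, models=USER_MODELS):
    """Return stored attributes for first_id, following the mapping if needed."""
    if first_id in models:  # a model already exists under this identifier
        return models[first_id]
    for second_id in mapping.get(first_id, []):  # step S220
        if second_id in models:  # step S230
            return models[second_id]
    return None  # no model reachable; classification would have to run


attrs = provide_user_attributes(("weibo_id", "wb_alice"))
```

Here attributes learned under a Baidu ID are served for a Weibo ID without re-running any classification model, which is the efficiency gain described above.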
  • The method for providing user data according to the embodiment of the present invention uses the user model data generated by the foregoing embodiment and a pre-built user identifier mapping table that records multiple associated user identifiers of a user: the user identifier corresponding to the acquired network behavior data is identified and mapped to an associated user identifier, and the user attribute information of the corresponding user is obtained, so that user attribute information for different network behavior data is provided quickly across the whole network system.
  • FIG. 3 is a schematic structural diagram of an embodiment of a user data processing system based on user network behavior provided by the present invention.
  • the system of Figure 3 can be used to perform the method steps of the embodiment shown in Figure 1.
  • the user data processing system based on user network behavior specifically includes a first data obtaining module 310, a data identifying module 320, and an information adding module 330.
  • The first data obtaining module 310 is configured to acquire network behavior data of the user. The data identifying module 320 is configured to identify the network behavior data of the user by using at least one classification model for identifying user attributes, and to obtain at least one item of user attribute information of the user.
  • The information adding module 330 is configured to add the at least one item of user attribute information obtained by identification to the user model data of the user.
  • the user model data of the user may include identification information of the user and at least one user attribute information.
  • The user data processing system based on user network behavior may further include a model selection module 340, configured to extract keywords from the network behavior data of the user and to select, according to the extracted keywords, the classification model for identifying user attributes.
  • the user attribute information may include at least one of age, gender, region, interest, and industry.
  • The user data processing system based on user network behavior provided by the embodiment of the present invention identifies the user's network behavior data by using at least one classification model for identifying user attributes, and acquires at least one item of user attribute information of the user, thereby forming user model data of at least one user that contains at least one item of user attribute information.
  • the user model data can be used to perform further audience analysis on users who perform network behavior.
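
The division of work among modules 310, 320, and 330 can be sketched as three small classes chained by one function. The class and method names, the keyword-model layout, and the sample log are assumptions; the patent describes the modules only functionally.

```python
class DataAcquisitionModule:  # corresponds to module 310
    def acquire(self, raw_log):
        # Stand-in for reading logs or cookies: split a log into records.
        return [line.strip() for line in raw_log.splitlines() if line.strip()]


class DataIdentificationModule:  # corresponds to module 320
    def __init__(self, models):
        self.models = models  # assumed layout: {attribute: {label: [keywords]}}

    def identify(self, records):
        text = " ".join(records).lower()
        found = {}
        for attribute, labels in self.models.items():
            for label, keywords in labels.items():
                if any(kw in text for kw in keywords):
                    found[attribute] = label  # first matching label wins
                    break
        return found


class InformationAddingModule:  # corresponds to module 330
    def add(self, user_model, attributes):
        user_model.setdefault("attributes", {}).update(attributes)
        return user_model


def process(raw_log, user_model, models):
    records = DataAcquisitionModule().acquire(raw_log)
    attributes = DataIdentificationModule(models).identify(records)
    return InformationAddingModule().add(user_model, attributes)


result = process(
    "searched: cosmetics\nclicked: beauty blog",
    {"user_id": "example_user", "id_type": "baidu_id"},
    {"gender": {"female": ["cosmetics", "beauty"]}},
)
```

A model selection module 340 would sit between acquisition and identification, narrowing `models` before `identify` is called.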
  • FIG. 5 is a schematic structural diagram of an embodiment of a system for providing user data according to the present invention.
  • the system shown in FIG. 5 can be used to perform the method steps of the embodiment shown in FIG. 2.
  • the user data providing system includes a second data acquiring module 510, a user identification acquiring module 520, and a user model data acquiring module 530.
  • The second data obtaining module 510 is configured to obtain the network behavior data of the user. The user identifier obtaining module 520 is configured to obtain the second user identifier of the user from the user identifier mapping table according to the first user identifier of the user, where the user identifier mapping table records a plurality of associated user identifiers of the user. The user model data obtaining module 530 is configured to acquire user model data of the user according to the second user identifier, where the user model data includes at least one item of user attribute information of the user.
  • the user identifier obtaining module 520 is further configured to obtain the first user identifier of the user from the network behavior data.
  • The plurality of user identifiers recorded in association in the user identifier mapping table include at least two of the following: at least one registered account of the user, a MAC address, a called user identification number (CUID), and an International Mobile Equipment Identity (IMEI).
  • The user data providing system provided by the embodiment of the present invention uses the user model data generated by the foregoing embodiment and a pre-built user identifier mapping table that records multiple associated user identifiers of a user: the user identifier corresponding to the acquired network behavior data is identified and mapped to an associated user identifier, and the user attribute information of the corresponding user is obtained, so that user attribute information for different network behavior data is provided quickly across the whole network system.
  • FIG. 6 is a logic block diagram of a computer device according to an embodiment of the present invention.
  • the computer device can be used to implement the user data processing method based on user network behavior provided in the above embodiments. Specifically:
  • Computer devices may vary considerably depending on configuration or performance, and may include one or more processors (e.g., central processing units, CPUs) 610 and memory 620.
  • The memory 620 can be transient or persistent storage.
  • One or more programs may be stored in memory 620, each of which may include a series of instruction operations on a computer device.
  • The processor 610 can be configured to communicate with the memory 620 and to execute, on the computer device, the series of instruction operations stored in the memory 620.
  • The memory 620 also stores one or more operating systems, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and the like.
  • the computer device may also include one or more power sources 630, one or more wired or wireless network interfaces 640, one or more input and output interfaces 650, and the like.
  • The computer device includes one or more processors 610, memory 620, and one or more programs. The one or more programs are stored in memory 620 and configured to be executed by the one or more processors 610, and include instructions for executing a user data processing method based on user network behavior: acquiring network behavior data of the user; identifying the network behavior data by using at least one classification model for identifying user attributes, and acquiring at least one item of user attribute information of the user; and adding the acquired user attribute information to the user model data of the user.
  • the user model data of the user may include identification information of the user and at least one of the user attribute information.
  • The processor 610 further executes instructions to: extract keywords from the network behavior data of the user, and select the classification model for identifying user attributes according to the extracted keywords.
  • the user attribute information may include at least one of age, gender, region, interest, and industry.
  • The computer device provided by the present invention identifies the user's network behavior data by using at least one classification model for identifying user attributes, and acquires at least one item of user attribute information of the user, thereby forming user model data of at least one user that includes at least one item of user attribute information. The user model data can be used to perform further audience analysis on users who perform network behavior.
  • FIG. 7 is a logic block diagram of a computer device according to another embodiment of the present invention.
  • a computer device can be used to implement the method of providing user data provided in the above embodiments. Specifically:
  • The computer device can include a communication unit 710, a memory 720 including one or more computer-readable storage media, an input unit 730, a display unit 740, a processor 750 including one or more processing cores, a power supply 760, and other components.
  • Those skilled in the art will appreciate that the computer device architecture illustrated in the figure does not constitute a limitation on the computer device, which may include more or fewer components than those illustrated, combine certain components, or use a different arrangement of components. Specifically:
  • the communication unit 710 can be used for transmitting and receiving information or receiving and transmitting signals during a call.
  • the communication unit 710 can be a network communication device such as an RF (Radio Frequency) circuit, a router, a modem, or the like.
  • The communication unit 710 can also communicate with the network and other devices via wireless communication.
  • Wireless communication can use any communication standard or protocol, including but not limited to GSM (Global System for Mobile Communications), GPRS (General Packet Radio Service), CDMA (Code Division Multiple Access), WCDMA (Wideband Code Division Multiple Access), LTE (Long Term Evolution), e-mail, SMS (Short Message Service), and the like.
  • the memory 720 can be used to store software programs and data, and the processor 750 executes various functional applications and data processing by running software programs and data stored in the memory 720.
  • The memory 720 may mainly include a program storage area and a data storage area. The program storage area may store an operating system, an application required for at least one function (such as a sound playing function or an image playing function), and the like; the data storage area may store data created through use of the computer device (such as audio data or a phone book).
  • memory 720 can include high speed random access memory, and can also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, memory 720 can also include a memory controller to provide access to memory 720 by processor 750 and input unit 730.
  • Display unit 740 can be used to display information entered by the user or information provided to the user, as well as various graphical user interfaces of the computer device, which can be constructed from graphics, text, icons, video, and any combination thereof.
  • the display unit 740 may include a display panel.
  • the display panel may be configured in the form of an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode), or the like.
  • Although input unit 730 and display unit 740 are described as two separate components that implement input and output functions, in some embodiments input unit 730 can be integrated with display unit 740 to implement input and output.
  • Processor 750 is the control center of the computer device. It connects the various parts of the entire computer device using various interfaces and lines, and it performs the various functions of the computer device and processes data by running or executing software programs and/or modules stored in memory 720 and recalling data stored in memory 720, thereby monitoring the computer device as a whole.
  • the computer device may also include a power source 760 (such as a battery) that supplies power to the various components.
  • the power source may be logically coupled to the processor 750 through a power management system, so that functions such as charging, discharging, and power consumption management are handled by the power management system.
  • Power supply 760 may also include any one or more of a DC or AC power source, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like.
  • the computer device may also include a camera, a Bluetooth module, sensors (such as light sensors, motion sensors, and other sensors), audio circuits, wireless communication units, and the like, which are not described herein.
  • the computer device includes one or more processors 750, memory 720, and one or more programs, the one or more programs being stored in memory 720 and configured to be executed by the one or more processors 750.
  • The one or more programs include instructions for executing a method of providing user data: acquiring network behavior data of a user; acquiring a second user identifier of the user from a user identifier mapping table according to a first user identifier of the user, wherein the user identifier mapping table records a plurality of user identifiers of the user in association with one another; and acquiring user model data of the user according to the second user identifier, the user model data including at least one item of user attribute information of the user.
  • the instructions executed by the processor 750 further include an instruction to obtain the first user identifier of the user from the network behavior data.
  • the plurality of user identifiers recorded in association in the user identifier mapping table may include at least two of the following identifiers: at least one registered account of the user, a MAC address, a called user identification number (CUID), and an International Mobile Equipment Identity (IMEI).
  • the computer device provided by the present invention relies on the user model data generated by the foregoing embodiment and on the pre-built user identifier mapping table that records a plurality of user identifiers of a user in association with one another. It identifies the user identifier corresponding to the acquired network behavior data and maps it to its associated identifiers to obtain the user attribute information of the corresponding user, so that user attribute information can be provided quickly for the users behind different network behavior data across the whole network.
  • the above methods and devices according to the present invention may be implemented in hardware or firmware, or as software or computer code that can be stored in a recording medium (such as a CD-ROM, RAM, floppy disk, hard disk, or magneto-optical disk), or as computer code that is originally stored in a remote recording medium or a non-transitory machine-readable medium, downloaded over a network, and stored in a local recording medium, so that the methods described herein can be carried out by such software stored on a recording medium using a general-purpose computer, a dedicated processor, or programmable or dedicated hardware (such as an ASIC or an FPGA).
  • a computer, processor, microprocessor controller, or programmable hardware includes storage components (e.g., RAM, ROM, flash memory) that can store or receive software or computer code; when the software or computer code is accessed and executed by the computer, processor, or hardware, the processing methods described herein are implemented. Moreover, when a general-purpose computer accesses code for implementing the processing shown herein, the execution of the code converts the general-purpose computer into a special-purpose computer for performing the processing shown herein.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A user data processing method, a providing method, a system and a computer device, wherein the user data processing method based on user network behaviour comprises: acquiring network behaviour data of a user (S110); identifying the network behaviour data of the user respectively by at least one classification model for identifying a user attribute, so as to acquire at least one item of user attribute information about the user (S120); and adding the at least one item of user attribute information about the user, which is acquired through identification, to user model data about the user (S130). By means of the method, the attribute information about a user is acquired and provided on the basis of network behaviour of a user.

Description

User data processing method, providing method, system and computer device
This application claims priority to Chinese Patent Application No. 201510347820.2, entitled "User Data Processing Method, User Data Providing Method and System", filed with the Chinese Patent Office on June 19, 2015, the entire contents of which are incorporated herein by reference.
Technical field
The present invention relates to information processing technologies, and in particular to a user data processing method, a providing method, a system, and a computer device.
Background
To meet the needs of governments and enterprises for monitoring online public opinion, data from across the whole network (including Weibo, Tieba, news sites, forums, and the like) must be used to monitor hot events and events that users care about in real time. In public opinion monitoring, besides the public opinion event itself, data about the people related to the event is also information that the requesting party pays close attention to. As people's awareness of privacy grows, fewer and fewer users voluntarily fill in and publish personal information on the network, so how to obtain user data accurately has become a major problem in public opinion monitoring.
Summary of the invention
An object of the present invention is to provide a user data processing method, a providing method, a system, and a computer device that acquire attribute information of a user based on the user's network behavior.
To achieve the above object, an embodiment of the present invention provides a user data processing method based on user network behavior, including: acquiring network behavior data of a user; identifying the network behavior data of the user by means of each of at least one classification model for identifying user attributes, to obtain at least one item of user attribute information of the user; and adding the at least one item of user attribute information obtained by identification to user model data of the user.
Further, the user model data of the user includes identification information of the user and at least one item of the user attribute information.
Optionally, the method further includes: extracting keywords from the network behavior data of the user, and selecting, according to the extracted keywords, the classification model used to identify user attributes.
Further, the user attribute information includes at least one of age, gender, region, interest, and industry.
An embodiment of the present invention further provides a method for providing user data, including: acquiring network behavior data of a user; acquiring a second user identifier of the user from a user identifier mapping table according to a first user identifier of the user, wherein the user identifier mapping table records a plurality of user identifiers of the user in association with one another; and acquiring user model data of the user according to the second user identifier, the user model data including at least one item of user attribute information of the user.
Optionally, the method further includes: obtaining the first user identifier of the user from the network behavior data.
Further, the plurality of user identifiers recorded in association in the user identifier mapping table include at least two of the following identifiers: at least one registered account of the user, a MAC address, a called user identification number (CUID), and an International Mobile Equipment Identity (IMEI).
An embodiment of the present invention further provides a user data processing system based on user network behavior, including: a data acquisition module, configured to acquire network behavior data of a user; a data identification module, configured to identify the network behavior data of the user by means of each of at least one classification model for identifying user attributes, to obtain at least one item of user attribute information of the user; and an information adding module, configured to add the at least one item of user attribute information obtained by identification to user model data of the user.
Further, the user model data of the user includes identification information of the user and at least one item of the user attribute information.
Optionally, the system further includes: a model selection module, configured to extract keywords from the network behavior data of the user and to select, according to the extracted keywords, the classification model used to identify user attributes.
Further, the user attribute information includes at least one of age, gender, region, interest, and industry.
An embodiment of the present invention further provides a system for providing user data, including: a data acquisition module, configured to acquire network behavior data of a user; a user identifier acquisition module, configured to acquire a second user identifier of the user from a user identifier mapping table according to a first user identifier of the user, wherein the user identifier mapping table records a plurality of user identifiers of the user in association with one another; and a user model data acquisition module, configured to acquire user model data of the user according to the second user identifier, the user model data including at least one item of user attribute information of the user.
Optionally, the user identifier acquisition module is further configured to obtain the first user identifier of the user from the network behavior data.
Further, the plurality of user identifiers recorded in association in the user identifier mapping table include at least two of the following identifiers: at least one registered account of the user, a MAC address, a called user identification number (CUID), and an International Mobile Equipment Identity (IMEI).
An embodiment of the present invention further provides a computer device, including: one or more processors; a memory; and one or more programs stored in the memory and configured to be executed by the one or more processors, the one or more programs containing instructions for executing a user data processing method based on user network behavior: acquiring network behavior data of a user; identifying the network behavior data of the user by means of each of at least one classification model for identifying user attributes, to obtain at least one item of user attribute information of the user; and adding the at least one item of user attribute information obtained by identification to user model data of the user.
An embodiment of the present invention further provides a computer device, including: one or more processors; a memory; and one or more programs stored in the memory and configured to be executed by the one or more processors, the one or more programs containing instructions for executing a method for providing user data: acquiring network behavior data of a user; acquiring a second user identifier of the user from a user identifier mapping table according to a first user identifier of the user, wherein the user identifier mapping table records a plurality of user identifiers of the user in association with one another; and acquiring user model data of the user according to the second user identifier, the user model data including at least one item of user attribute information of the user.
The user data processing method, providing method, system, and computer device provided by the embodiments of the present invention identify user network behavior data by means of classification models capable of identifying user attributes, to obtain user model data containing at least one item of user attribute information. By means of a preset user identifier mapping table, the user identifier in newly acquired network behavior data of a user is identified and the user identifiers mapped to it are found, so that the user model data corresponding to a mapped user identifier can then be obtained.
Brief description of the drawings
FIG. 1 is a flowchart of an embodiment of the user data processing method based on user network behavior provided by the present invention;
FIG. 2 is a flowchart of an embodiment of the method for providing user data provided by the present invention;
FIG. 3 is a schematic structural diagram of an embodiment of the user data processing system based on user network behavior provided by the present invention;
FIG. 4 is a schematic structural diagram of another embodiment of the user data processing system based on user network behavior provided by the present invention;
FIG. 5 is a schematic structural diagram of an embodiment of the system for providing user data provided by the present invention;
FIG. 6 is a logic block diagram of a computer device according to an embodiment of the present invention;
FIG. 7 is a logic block diagram of a computer device according to another embodiment of the present invention.
Detailed description
The basic inventive concept of the present invention is as follows. First, trained classification models are used to identify the user attributes indicated by a user's network behavior data, yielding user model data for that user that includes at least one item of user attribute information. Then, using this user model data together with a pre-built user identifier mapping table that associates multiple user identifiers of the same user, the user attribute information for newly acquired network behavior data of a user is identified anywhere in the whole network.
Embodiment 1
FIG. 1 is a flowchart of an embodiment of the user data processing method based on user network behavior provided by the present invention. The method may be executed by a server with data processing capability.
Referring to FIG. 1, in step S110, network behavior data of a user is acquired.
The network behavior data of the user may be data generated by the user's actions while using the Internet, and may specifically include data such as logs and cookies. For example, if the user searches for "Lenovo", that keyword can serve as an item of network behavior data. Similarly, the website links that the user clicks after a Baidu search, and the user's advertisement clicks on websites that partner with Baidu, can all serve as the user's network behavior data. Such data is generated as the user acts on the network and is saved by the web browser or the web server.
In step S120, the network behavior data of the user is identified by means of each of at least one classification model for identifying user attributes, to obtain at least one item of user attribute information of the user.
The classification model is a preset model for identifying user attribute information. The user attribute information may include at least one of the user's age, gender, region, interest, and industry. Different classification models may be used to identify the user's attribute information in different dimensions. Each classification model may contain keywords that identify the corresponding user attribute information; by matching these keywords against the user's network behavior data, the user's most likely attribute can be identified.
For example, to identify user attribute information in the "gender" dimension, the keywords used in the classification model that identifies a user as female may include keywords tied to behavior characteristic of women, such as "cosmetics", "beauty", and "mother and baby". Similarly, to identify user attribute information in the "region" dimension, the keywords used in the classification model that identifies a particular province or city may be most of the place names within that province or city. By analogy, at least one item of user attribute information can be obtained through the keywords of at least one classification model.
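The keyword matching described for step S120 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the keyword lists, the `classify` function, and the count-based scoring rule are all assumptions introduced here, and the source describes the models as trained in advance without specifying how.

```python
from typing import Dict, List, Optional

# Hypothetical keyword lists per attribute value (assumed for illustration).
GENDER_MODEL: Dict[str, List[str]] = {
    "female": ["cosmetics", "beauty", "mother and baby"],
}
REGION_MODEL: Dict[str, List[str]] = {
    "Beijing": ["Haidian", "Chaoyang"],
    "Guangdong": ["Guangzhou", "Shenzhen"],
}


def classify(model: Dict[str, List[str]], behavior: List[str]) -> Optional[str]:
    """Return the attribute value whose keywords occur most often in the
    behavior data, or None when no keyword matches at all."""
    text = " ".join(behavior).lower()
    scores = {
        value: sum(text.count(keyword.lower()) for keyword in keywords)
        for value, keywords in model.items()
    }
    if not scores:
        return None
    best = max(scores, key=lambda value: scores[value])
    return best if scores[best] > 0 else None
```

Each attribute dimension gets its own model, so calling `classify` once per model yields one attribute per dimension; a dimension with no matching keyword simply contributes nothing.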
In step S130, the at least one item of user attribute information obtained by identification is added to the user model data of the user.
After at least one item of user attribute information of the user has been obtained, it can be added to the corresponding attribute position in that user's user model data. It can be understood that once a large amount of network behavior data of the same user has been identified, multi-dimensional user attribute information for that user is obtained. Table 1 shows user model data obtained by the method of this embodiment.
Table 1: User model data
User ID | ID type | Age      | Gender | Region    | Interest       | Industry | Other
123456  | Baidu   | Under 18 | Male   | Beijing   | IT electronics | Internet | ...
abcd    | Baidu   | 18-25    | Female | Guangdong | Anime          | Medical  | ...
my123   | Baidu   | 18-25    | Male   | Shanghai  | Photography    | Sales    | ...
...     | ...     | ...      | ...    | ...       | ...            | ...      | ...
As shown in Table 1, the user model data may include identification information of the user, such as the user ID and ID type, and at least one item of the user attribute information described above.
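One way to picture a row of Table 1 and the update performed in step S130 is the sketch below. The `UserModel` class, its field names, and `add_attributes` are hypothetical names chosen for illustration; the source only states that identified attributes are added to the corresponding positions of the user model data.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class UserModel:
    """One row of the user model data: identification plus attributes."""
    user_id: str                 # e.g. "abcd"
    id_type: str                 # e.g. "Baidu"
    attributes: Dict[str, str] = field(default_factory=dict)


def add_attributes(model: UserModel, identified: Dict[str, str]) -> UserModel:
    """Step S130 (sketch): merge newly identified attributes into the model.
    A later identification of the same dimension overwrites the earlier one;
    that policy is an assumption, not something the source specifies."""
    model.attributes.update(identified)
    return model
```

Repeated calls accumulate dimensions over time, which mirrors how identifying more of a user's behavior data fills in more columns of the table.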
Optionally, before step S120, in which the network behavior data of the user is identified by means of each of at least one classification model to obtain at least one item of user attribute information, the classification models used to identify user attributes may be selected according to the acquired network behavior data. For example, keywords are extracted from the user's network behavior data, and the classification models used to identify user attributes are selected according to the extracted keywords.
Specifically, the network behavior data acquired for a user differs from one acquisition to the next: some data is long and some is short, so the number of keywords that can be extracted varies, and the number of classification models applied may vary accordingly. Alternatively, the keywords extracted from the acquired network behavior data can give a preliminary indication of the data's source, the fields it covers, or its general content, and a matching classification model can then be selected for the characteristics those keywords reflect. Of course, when only specified user attributes are to be identified, the specific classification models capable of identifying those attributes can be selected.
For example, when the keywords extracted from the acquired network behavior data include keywords such as "cosmetics", "beauty", and "mother and baby", the classification model that identifies the "gender" attribute may be selected; similarly, when keywords containing geographic information are extracted, the classification model that identifies the "region" attribute may be selected.
In the application scenarios above, an appropriate classification model can be selected according to the user's network behavior data, for example according to the keywords extracted from it, so that the user attribute information indicated by the network behavior data is identified quickly and in a targeted manner.
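The optional model selection step can be illustrated with the sketch below. The trigger keywords, `extract_keywords`, and `select_models` are assumptions made for this example; in particular, keyword extraction is reduced to a vocabulary lookup, whereas the source does not say how keywords are extracted.

```python
from typing import Dict, List, Set

# Hypothetical trigger keywords for the model of each attribute dimension.
MODEL_TRIGGERS: Dict[str, Set[str]] = {
    "gender": {"cosmetics", "beauty", "mother and baby"},
    "region": {"beijing", "shanghai", "guangdong"},
}


def extract_keywords(behavior: List[str], vocabulary: Set[str]) -> Set[str]:
    """Return the vocabulary entries that occur in the behavior data."""
    text = " ".join(behavior).lower()
    return {keyword for keyword in vocabulary if keyword in text}


def select_models(behavior: List[str]) -> List[str]:
    """Pick the attribute dimensions whose trigger keywords were extracted."""
    vocabulary: Set[str] = set().union(*MODEL_TRIGGERS.values())
    found = extract_keywords(behavior, vocabulary)
    return sorted(
        dimension
        for dimension, triggers in MODEL_TRIGGERS.items()
        if triggers & found
    )
```

Short behavior data that yields few keywords selects few models, and longer data may select several, which matches the variation the text describes.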
The user data processing method based on user network behavior provided by this embodiment identifies a user's network behavior data by means of at least one classification model for identifying user attributes and obtains at least one item of user attribute information, thereby forming user model data for at least one user, each containing at least one item of user attribute information. This user model data can be used for further audience analysis of users who engage in network behavior.
Embodiment 2
FIG. 2 is a flowchart of an embodiment of the method for providing user data provided by the present invention. The method may be executed by a server with data processing capability. The idea of the embodiment shown in FIG. 2 is to use the user model data from the embodiment shown in FIG. 1, together with a preset user identifier mapping table that records, in association with one another, multiple user identifiers of at least one user, to obtain the user attribute information for a user's network behavior data directly from the user model data.
Referring to FIG. 2, in step S210, network behavior data of the user is acquired. The content of step S210 is similar to that of step S110 described above.
Specifically, the network behavior data acquired in this step may be any network behavior data generated by the user anywhere in the whole network, such as network behavior data generated on Baidu Tieba, Weibo, or forums.
In step S220, a second user identifier of the user is acquired from the user identifier mapping table according to the first user identifier of the user, wherein the user identifier mapping table records a plurality of user identifiers of the user in association with one another.
The user identifier may be any identity-bearing identifier attached to the user when the user engages in network behavior, for example the user's account on Baidu Tieba, the user's Weibo account, or the poster account under which the user publishes posts on a forum. The user ID in Table 1 above is likewise one specific form of user identifier. It follows that each user may have multiple user identifiers of different forms, depending on the network behavior performed. Collating the multiple user identifiers of each user yields a user identifier mapping table like the one shown in Table 2. As shown in Table 2, the user identifiers of each user may include the user's registered accounts on different websites, such as a Baidu ID and a Weibo ID; the MAC address of the terminal used for the network behavior (not shown in the table); and, for a mobile terminal such as a mobile phone, the called user identification number (CUID) and the International Mobile Equipment Identity (IMEI). These identifiers, and the associations between them, can be obtained from information such as the logs or cookies generated when the user engages in network behavior, which yields the user identifier mapping table shown in Table 2.
Table 2: User identifier mapping table
[Table 2 is reproduced as an image in the original publication: Figure PCTCN2015097772-appb-000001]
The first user identifier may be the user identifier corresponding to the network behavior data acquired in step S210 of this embodiment. The second user identifier may be any other user identifier in the mapping table shown in Table 2 that refers to the same user as the first user identifier.
Once the first user identifier corresponding to the user's network behavior data has been obtained, the other (second) user identifiers of that user can be obtained by looking up Table 2.
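The lookup in step S220 can be sketched as a small in-memory table. The `UserIdMappingTable` class, the `(id_type, value)` tuple representation, and all identifier values in the usage note are hypothetical; the actual layout of Table 2 is only available as an image in the publication.

```python
from typing import Dict, List, Set, Tuple

Identifier = Tuple[str, str]  # (id_type, value), e.g. ("weibo", "wb_001")


class UserIdMappingTable:
    """Records, per user, the set of identifiers known to belong together."""

    def __init__(self) -> None:
        self._groups: List[Set[Identifier]] = []
        self._index: Dict[Identifier, int] = {}

    def add_user(self, identifiers: Set[Identifier]) -> None:
        """Register one user's associated identifiers as a group."""
        group_no = len(self._groups)
        self._groups.append(set(identifiers))
        for identifier in identifiers:
            self._index[identifier] = group_no

    def second_identifiers(self, first: Identifier) -> Set[Identifier]:
        """Return every other identifier of the user that `first` refers to,
        or an empty set when `first` is not in the table."""
        group_no = self._index.get(first)
        if group_no is None:
            return set()
        return self._groups[group_no] - {first}
```

For instance, after `add_user({("baidu", "123456"), ("weibo", "wb_001")})`, looking up `("weibo", "wb_001")` returns the Baidu identifier, which can then be used to find the user model data.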
Optionally, the first user identifier of the user corresponding to the network behavior data may be obtained from user data already recorded by the system, or by identifying it from the network behavior data itself. The manner of obtaining the first user identifier is not limited here.
In step S230, user model data of the user is acquired according to the second user identifier, the user model data including at least one item of user attribute information of the user.
The user identifier mapping table shown in Table 2 already contains the user identifier that is included in the user model data described above. In the mapping table, this identifier may be the same as the first user identifier, or it may be a second user identifier. Either way, the corresponding user model data can be obtained from any form of user identifier together with the association between that identifier and the identifier in the user model data. The user model data therefore also characterizes the user attribute information of the user behind any of these user identifiers.
Therefore, the user identifier mapping table makes it possible to project at least one item of user attribute information of the same user from one form of user identifier onto another. This removes the need to rebuild user model data from the user's network behavior data for each different form of user identifier, speeds up the process of providing user attribute information from network behavior data, and improves the efficiency of identifying user attribute information.
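The projection from one identifier form to another in step S230 might look like the following sketch. The function name `provide_user_model`, the plain-dictionary representation of the mapping table and of the user model data, and the sample values in the usage note are assumptions for illustration only.

```python
from typing import Dict, Optional, Set, Tuple

Identifier = Tuple[str, str]  # (id_type, value)


def provide_user_model(
    first: Identifier,
    mapping: Dict[Identifier, Set[Identifier]],
    user_models: Dict[Identifier, Dict[str, str]],
) -> Optional[Dict[str, str]]:
    """Return the user attributes reachable from `first`.

    The model may be keyed by the first identifier itself or by one of the
    second identifiers associated with it, so both are tried in turn."""
    candidates = [first, *sorted(mapping.get(first, set()))]
    for identifier in candidates:
        if identifier in user_models:
            return user_models[identifier]
    return None
```

With a model stored under `("baidu", "123456")` and a mapping from `("weibo", "wb_001")` to that Baidu identifier, a Weibo post alone is enough to retrieve the attributes, without classifying the Weibo behavior data again.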
The method for providing user data according to this embodiment of the present invention relies on the user model data generated in the foregoing embodiment and on a pre-built user identifier mapping table that records multiple user identifiers of a user in association with one another. The method identifies the user identifier corresponding to the acquired network behavior data and maps it through these associations to obtain the user attribute information of the corresponding user, so that user attribute information can be provided quickly across the whole network system for users with different network behavior data.
Embodiment 3
FIG. 3 is a schematic structural diagram of an embodiment of the user data processing system based on user network behavior provided by the present invention. The system shown in FIG. 3 can be used to perform the method steps of the embodiment shown in FIG. 1.
Referring to FIG. 3, the user data processing system based on user network behavior includes a first data acquisition module 310, a data identification module 320, and an information adding module 330.
The first data acquisition module 310 is configured to acquire the user's network behavior data. The data identification module 320 is configured to identify the user's network behavior data with each of at least one classification model for identifying user attributes, thereby obtaining at least one piece of user attribute information of the user. The information adding module 330 is configured to add the identified user attribute information to the user's user model data.
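The cooperation of modules 310, 320, and 330 can be sketched roughly as follows. The sketch is illustrative only: the function names, the sample behavior records, and the rule-based "models" are assumptions that stand in for whatever trained classification models an actual implementation would use.

```python
from typing import Callable, Dict, List, Optional

# A classification model is sketched as a function from behavior records
# to a single attribute label, or None when it cannot decide.
Classifier = Callable[[List[str]], Optional[str]]


def acquire_behavior_data(user_id: str) -> List[str]:
    """Stand-in for the first data acquisition module (310)."""
    logs = {"u1": ["search: hotel deals in Sanya", "click: stock market news"]}
    return logs.get(user_id, [])


def toy_interest_model(records: List[str]) -> Optional[str]:
    # Hypothetical rule standing in for a trained interest classifier.
    return "travel" if any("hotel" in r for r in records) else None


def toy_industry_model(records: List[str]) -> Optional[str]:
    # Hypothetical rule standing in for a trained industry classifier.
    return "finance" if any("stock" in r for r in records) else None


CLASSIFIERS: Dict[str, Classifier] = {
    "interest": toy_interest_model,
    "industry": toy_industry_model,
}


def identify_attributes(records: List[str],
                        classifiers: Dict[str, Classifier]) -> Dict[str, str]:
    """Data identification module (320): apply each model separately."""
    attributes = {}
    for name, model in classifiers.items():
        label = model(records)
        if label is not None:
            attributes[name] = label
    return attributes


def add_to_user_model(user_models: Dict[str, dict], user_id: str,
                      attributes: Dict[str, str]) -> dict:
    """Information adding module (330): merge attributes into the user model data."""
    model = user_models.setdefault(user_id, {"user_id": user_id, "attributes": {}})
    model["attributes"].update(attributes)
    return model
```

Calling `acquire_behavior_data`, `identify_attributes`, and `add_to_user_model` in that order for one user yields a record that holds the user's identification information alongside the identified attributes.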
Further, the user's user model data may include identification information of the user and at least one piece of user attribute information.
Optionally, as shown in FIG. 4, the user data processing system based on user network behavior may further include a model selection module 340, configured to extract keywords from the user's network behavior data and to select, according to the extracted keywords, the classification model used to identify user attributes.
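One simple way to picture the model selection module 340 is a keyword-to-model registry, as in the sketch below. The registry contents, the tokenization rule, and the names `MODEL_KEYWORDS`, `extract_keywords`, and `select_models` are hypothetical; the application does not specify how keywords are extracted or matched.

```python
import re
from typing import Dict, List, Set

# Hypothetical registry: trigger keywords that suggest which attribute model to run.
MODEL_KEYWORDS: Dict[str, Set[str]] = {
    "interest": {"hotel", "flight", "football"},
    "industry": {"stock", "invoice", "compiler"},
    "region": {"beijing", "shanghai"},
}


def extract_keywords(records: List[str]) -> Set[str]:
    """Extract lowercase alphabetic tokens from the behavior records."""
    words: Set[str] = set()
    for record in records:
        words.update(re.findall(r"[a-z]+", record.lower()))
    return words


def select_models(records: List[str]) -> List[str]:
    """Return the names of the models whose trigger keywords appear in the records."""
    keywords = extract_keywords(records)
    return sorted(name for name, triggers in MODEL_KEYWORDS.items()
                  if keywords & triggers)
```

Selecting models this way means that only the classifiers relevant to the observed behavior are run, rather than every available model.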
Further, the user attribute information may include at least one of age, gender, region, interest, and industry.
The user data processing system based on user network behavior provided by this embodiment of the present invention identifies the user's network behavior data by using at least one classification model for identifying user attributes and obtains at least one piece of user attribute information of the user, thereby forming user model data for at least one user that contains at least one piece of user attribute information. This user model data can be used for further audience analysis of users who perform network behavior.
Embodiment 4
FIG. 5 is a schematic structural diagram of an embodiment of the system for providing user data according to the present invention. The system shown in FIG. 5 can be used to perform the method steps of the embodiment shown in FIG. 2.
Referring to FIG. 5, the system for providing user data includes a second data acquisition module 510, a user identifier acquisition module 520, and a user model data acquisition module 530.
The second data acquisition module 510 is configured to acquire the user's network behavior data. The user identifier acquisition module 520 is configured to obtain the user's second user identifier from the user identifier mapping table according to the user's first user identifier, where the user identifier mapping table records multiple user identifiers of the user in association with one another. The user model data acquisition module 530 is configured to obtain the user's user model data according to the second user identifier, where the user model data includes at least one piece of user attribute information of the user.
Optionally, the user identifier acquisition module 520 may further be configured to obtain the user's first user identifier from the network behavior data.
Further, the multiple user identifiers recorded in association in the user identifier mapping table include at least two of the following: a registered account of at least one user, a MAC address, a called user identification number (CUID), and an International Mobile Equipment Identity (IMEI).
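The following sketch shows how modules 510 to 530 might fit together, including the optional step of reading the first user identifier out of a behavior record. It assumes that a behavior record is a dictionary that may carry identifier fields; the class `UserDataProvider`, the field names, and the sample values are illustrative assumptions rather than part of the disclosure.

```python
from typing import Dict, List, Optional, Tuple

Identifier = Tuple[str, str]

# Identifier kinds named in this embodiment; the field names are assumptions.
ID_KINDS = ("account", "mac", "cuid", "imei")


def extract_first_id(record: Dict[str, str]) -> Optional[Identifier]:
    """Read the first user identifier present in a behavior record (module 520)."""
    for kind in ID_KINDS:
        value = record.get(kind)
        if value:
            return (kind, value)
    return None


class UserDataProvider:
    def __init__(self, mapping_table: List[Dict[str, str]],
                 user_models: Dict[Identifier, Dict[str, str]]):
        # Each mapping row maps identifier kind -> value for a single user.
        self.mapping_table = mapping_table
        # User model data is keyed by the identifier it was built under.
        self.user_models = user_models

    def second_ids(self, first_id: Identifier) -> List[Identifier]:
        """Look up the other identifiers associated with first_id (module 520)."""
        kind, value = first_id
        for row in self.mapping_table:
            if row.get(kind) == value:
                return [(k, v) for k, v in row.items() if (k, v) != first_id]
        return []

    def provide(self, record: Dict[str, str]) -> Optional[Dict[str, str]]:
        """Return user model data for the user behind a behavior record (module 530)."""
        first_id = extract_first_id(record)
        if first_id is None:
            return None
        for second_id in self.second_ids(first_id):
            if second_id in self.user_models:
                return self.user_models[second_id]
        return None
```

A record that carries only an IMEI can then be answered with model data that was originally stored under the same user's registered account.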
The system for providing user data according to this embodiment of the present invention relies on the user model data generated in the foregoing embodiment and on a pre-built user identifier mapping table that records multiple user identifiers of a user in association with one another. The system identifies the user identifier corresponding to the acquired network behavior data and maps it through these associations to obtain the user attribute information of the corresponding user, so that user attribute information can be provided quickly across the whole network system for users with different network behavior data.
Embodiment 5
FIG. 6 is a logical block diagram of a computer device according to an embodiment of the present invention.
Referring to FIG. 6, the computer device can be used to implement the user data processing method based on user network behavior provided in the above embodiments. Specifically:
Computer devices may differ considerably depending on configuration or performance, and may include one or more processors (such as central processing units, CPUs) 610 and a memory 620. The memory 620 may be transient or persistent storage. One or more programs may be stored in the memory 620, and each program may include a series of instruction operations for the computer device. Further, the processor 610 may communicate with the memory 620 and execute, on the computer device, the series of instruction operations in the memory 620. In particular, the memory 620 also stores data of one or more operating systems, such as Windows Server™, Mac OS X™, Unix™, Linux™, and FreeBSD™. The computer device may also include one or more power supplies 630, one or more wired or wireless network interfaces 640, one or more input/output interfaces 650, and the like.
Specifically, in this embodiment, the computer device includes one or more processors 610, a memory 620, and one or more programs. The one or more programs are stored in the memory 620 and are configured to be executed by the one or more processors 610. The one or more programs contain instructions for performing the user data processing method based on user network behavior: acquiring network behavior data of a user; identifying the user's network behavior data with each of at least one classification model for identifying user attributes, to obtain at least one piece of user attribute information of the user; and adding the identified user attribute information to the user's user model data.
Here, the user's user model data may include identification information of the user and at least one piece of the user attribute information.
Further, the processor 610 also includes instructions for performing the following processing: extracting keywords from the user's network behavior data, and selecting, according to the extracted keywords, the classification model used to identify user attributes.
It should be noted that the user attribute information may include at least one of age, gender, region, interest, and industry.
The computer device provided by the present invention identifies the user's network behavior data by using at least one classification model for identifying user attributes and obtains at least one piece of user attribute information of the user, thereby forming user model data for at least one user that contains at least one piece of user attribute information. This user model data can be used for further audience analysis of users who perform network behavior.
Embodiment 6
FIG. 7 is a logical block diagram of a computer device according to another embodiment of the present invention.
Referring to FIG. 7, the computer device can be used to implement the method for providing user data provided in the above embodiments. Specifically:
The computer device may include a communication unit 710, a memory 720 including one or more computer-readable storage media, an input unit 730, a display unit 740, a processor 750 including one or more processing cores, a power supply 760, and other components. Those skilled in the art will understand that the computer device structure shown in the figure does not limit the computer device, which may include more or fewer components than shown, combine certain components, or use a different arrangement of components. Specifically:
The communication unit 710 may be used to receive and send signals while transmitting and receiving information or during a call. The communication unit 710 may be a network communication device such as an RF (Radio Frequency) circuit, a router, or a modem. In addition, the communication unit 710 may communicate with networks and other devices through wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to GSM (Global System for Mobile Communications), GPRS (General Packet Radio Service), CDMA (Code Division Multiple Access), WCDMA (Wideband Code Division Multiple Access), LTE (Long Term Evolution), e-mail, and SMS (Short Messaging Service).
The memory 720 may be used to store software programs and data. The processor 750 executes various functional applications and data processing by running the software programs and data stored in the memory 720. The memory 720 may mainly include a program storage area and a data storage area. The program storage area may store an operating system and the application programs required for at least one function (such as a sound playback function or an image playback function). The data storage area may store data created through use of the computer device (such as audio data or a phone book). In addition, the memory 720 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device. Accordingly, the memory 720 may also include a memory controller to provide the processor 750 and the input unit 730 with access to the memory 720.
The display unit 740 may be used to display information entered by the user or provided to the user, as well as the various graphical user interfaces of the computer device. These graphical user interfaces may be composed of graphics, text, icons, video, and any combination thereof. The display unit 740 may include a display panel, which may optionally be configured in the form of an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode) display, or the like. Although the input unit 730 and the display unit 740 are shown in the figure as two separate components that implement the input and output functions, in some embodiments the input unit 730 and the display unit 740 may be integrated to implement the input and output functions.
The processor 750 is the control center of the computer device. It connects the various parts of the entire computer device through various interfaces and lines, and performs the various functions of the computer device and processes data by running or executing the software programs and/or modules stored in the memory 720 and invoking the data stored in the memory 720, thereby monitoring the computer device as a whole.
The computer device may also include a power supply 760 (such as a battery) that supplies power to the various components. Preferably, the power supply may be logically connected to the processor 750 through a power management system, so that functions such as charging, discharging, and power consumption management are handled through the power management system. The power supply 760 may also include any components such as one or more DC or AC power sources, a recharging system, a power failure detection circuit, a power converter or inverter, and a power status indicator.
Although not shown, the computer device may also include a camera, a Bluetooth module, sensors (such as light sensors, motion sensors, and other sensors), an audio circuit, a wireless communication unit, and the like, which are not described in detail here.
In this embodiment, the computer device includes one or more processors 750, a memory 720, and one or more programs. The one or more programs are stored in the memory 720 and are configured to be executed by the one or more processors 750. The one or more programs contain instructions for performing the method for providing user data: acquiring network behavior data of a user; obtaining the user's second user identifier from a user identifier mapping table according to the user's first user identifier, where the user identifier mapping table records multiple user identifiers of the user in association with one another; and obtaining the user's user model data according to the second user identifier, where the user model data includes at least one piece of user attribute information of the user.
Further, the processor 750 also includes instructions for performing the following processing: obtaining the user's first user identifier from the network behavior data.
In addition, the multiple user identifiers recorded in association in the user identifier mapping table may include at least two of the following: a registered account of at least one user, a MAC address, a called user identification number (CUID), and an International Mobile Equipment Identity (IMEI).
The computer device provided by the present invention relies on the user model data generated in the foregoing embodiment and on a pre-built user identifier mapping table that records multiple user identifiers of a user in association with one another. The device identifies the user identifier corresponding to the acquired network behavior data and maps it through these associations to obtain the user attribute information of the corresponding user, so that user attribute information can be provided quickly across the whole network system for users with different network behavior data.
It should be noted that, depending on implementation needs, each step described in this application may be split into more steps, and two or more steps or partial operations of steps may be combined into a new step, to achieve the object of the present invention.
The above methods and apparatuses according to the present invention may be implemented in hardware or firmware, or as software or computer code that can be stored in a recording medium (such as a CD-ROM, RAM, floppy disk, hard disk, or magneto-optical disk), or as computer code that is originally stored in a remote recording medium or a non-transitory machine-readable medium, downloaded over a network, and then stored in a local recording medium. The methods described herein can thus be processed by such software stored on a recording medium, using a general-purpose computer, a dedicated processor, or programmable or dedicated hardware (such as an ASIC or FPGA). It will be understood that a computer, processor, microprocessor controller, or programmable hardware includes storage components (for example, RAM, ROM, or flash memory) that can store or receive software or computer code, and that the processing methods described herein are implemented when the software or computer code is accessed and executed by the computer, processor, or hardware. In addition, when a general-purpose computer accesses code for implementing the processing shown herein, execution of the code converts the general-purpose computer into a special-purpose computer for performing that processing.
The above are merely specific embodiments of the present invention, but the scope of protection of the present invention is not limited to them. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be determined by the scope of protection of the claims.

Claims (16)

  1. A user data processing method based on user network behavior, characterized in that it comprises:
    acquiring network behavior data of a user;
    identifying the network behavior data of the user with each of at least one classification model for identifying user attributes, to obtain at least one piece of user attribute information of the user;
    adding the identified at least one piece of user attribute information of the user to user model data of the user.
  2. The method according to claim 1, characterized in that the user model data of the user comprises identification information of the user and at least one piece of the user attribute information.
  3. The method according to claim 1 or 2, characterized in that the method further comprises:
    extracting keywords from the network behavior data of the user, and selecting, according to the extracted keywords, the classification model for identifying user attributes.
  4. The method according to any one of claims 1 to 3, characterized in that the user attribute information comprises at least one of age, gender, region, interest, and industry.
  5. A method for providing user data, characterized in that it comprises:
    acquiring network behavior data of a user;
    obtaining a second user identifier of the user from a user identifier mapping table according to a first user identifier of the user, wherein the user identifier mapping table records multiple user identifiers of the user in association with one another;
    obtaining user model data of the user according to the second user identifier, the user model data comprising at least one piece of user attribute information of the user.
  6. The method according to claim 5, characterized in that the method further comprises:
    obtaining the first user identifier of the user from the network behavior data.
  7. The method according to claim 5 or 6, characterized in that the multiple user identifiers recorded in association in the user identifier mapping table comprise at least two of the following: a registered account of at least one user, a MAC address, a called user identification number (CUID), and an International Mobile Equipment Identity (IMEI).
  8. A user data processing system based on user network behavior, characterized in that it comprises:
    a data acquisition module, configured to acquire network behavior data of a user;
    a data identification module, configured to identify the network behavior data of the user with each of at least one classification model for identifying user attributes, to obtain at least one piece of user attribute information of the user;
    an information adding module, configured to add the identified at least one piece of user attribute information of the user to user model data of the user.
  9. The system according to claim 8, characterized in that the user model data of the user comprises identification information of the user and at least one piece of the user attribute information.
  10. The system according to claim 8 or 9, characterized in that the system further comprises:
    a model selection module, configured to extract keywords from the network behavior data of the user, and to select, according to the extracted keywords, the classification model for identifying user attributes.
  11. The system according to any one of claims 8 to 10, characterized in that the user attribute information comprises at least one of age, gender, region, interest, and industry.
  12. A system for providing user data, characterized in that it comprises:
    a data acquisition module, configured to acquire network behavior data of a user;
    a user identifier acquisition module, configured to obtain a second user identifier of the user from a user identifier mapping table according to a first user identifier of the user, wherein the user identifier mapping table records multiple user identifiers of the user in association with one another;
    a user model data acquisition module, configured to obtain user model data of the user according to the second user identifier, the user model data comprising at least one piece of user attribute information of the user.
  13. The system according to claim 12, characterized in that the user identifier acquisition module is further configured to obtain the first user identifier of the user from the network behavior data.
  14. The system according to claim 12 or 13, characterized in that the multiple user identifiers recorded in association in the user identifier mapping table comprise at least two of the following: a registered account of at least one user, a MAC address, a called user identification number (CUID), and an International Mobile Equipment Identity (IMEI).
  15. A computer device, characterized in that it comprises:
    one or more processors;
    a memory;
    one or more programs, the one or more programs being stored in the memory and configured to be executed by the one or more processors, the one or more programs containing instructions for performing a search processing method:
    acquiring network behavior data of a user;
    identifying the network behavior data of the user with each of at least one classification model for identifying user attributes, to obtain at least one piece of user attribute information of the user;
    adding the identified at least one piece of user attribute information of the user to user model data of the user.
  16. A computer device, characterized in that it comprises:
    one or more processors;
    a memory;
    one or more programs, the one or more programs being stored in the memory and configured to be executed by the one or more processors, the one or more programs containing instructions for performing a method for displaying search content:
    acquiring network behavior data of a user;
    obtaining a second user identifier of the user from a user identifier mapping table according to a first user identifier of the user, wherein the user identifier mapping table records multiple user identifiers of the user in association with one another;
    obtaining user model data of the user according to the second user identifier, the user model data comprising at least one piece of user attribute information of the user.
PCT/CN2015/097772 2015-06-19 2015-12-17 User data processing method, providing method, system and computer device WO2016201933A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510347820.2A CN104951544A (en) 2015-06-19 2015-06-19 User data processing method and system and method and system for providing user data
CN201510347820.2 2015-06-19

Publications (1)

Publication Number Publication Date
WO2016201933A1 true WO2016201933A1 (en) 2016-12-22

Family

ID=54166202

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/097772 WO2016201933A1 (en) 2015-06-19 2015-12-17 User data processing method, providing method, system and computer device

Country Status (2)

Country Link
CN (1) CN104951544A (en)
WO (1) WO2016201933A1 (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104636473A (en) * 2015-02-13 2015-05-20 百度在线网络技术(北京)有限公司 Data processing method and system based on electronic payment behaviors
CN104951544A (en) * 2015-06-19 2015-09-30 百度在线网络技术(北京)有限公司 User data processing method and system and method and system for providing user data
CN106488493B (en) * 2015-08-24 2020-06-02 阿里巴巴集团控股有限公司 Method and device for identifying network hotspot type of user and electronic equipment
CN105553999B (en) * 2015-12-23 2019-05-31 北京奇虎科技有限公司 Application user behavioural analysis and method of controlling security and its corresponding device
CN106897729B (en) * 2016-06-28 2020-09-11 阿里巴巴集团控股有限公司 Information identification method, model training method, device and processing equipment
CN107818472B (en) * 2016-09-13 2022-03-11 腾讯科技(北京)有限公司 Information processing method and server
CN106998262A (en) * 2016-10-10 2017-08-01 深圳汇网天下科技有限公司 A kind of System and method for for recognizing Internet user
CN107786389A (en) * 2017-10-16 2018-03-09 上海理工大学 A kind of spreading network information device and method thereof
CN107682344A (en) * 2017-10-18 2018-02-09 南京邮数通信息科技有限公司 A kind of ID collection of illustrative plates method for building up based on DPI data interconnection net identifications
CN108038714B (en) * 2017-11-29 2020-12-11 贝壳找房(北京)科技有限公司 Advertisement promotion processing method and device
CN110555451A (en) * 2018-05-31 2019-12-10 北京京东尚科信息技术有限公司 information identification method and device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101957834A (en) * 2010-08-12 2011-01-26 百度在线网络技术(北京)有限公司 Content recommending method and device based on user characteristics
CN102368788A (en) * 2011-12-09 2012-03-07 中国电信股份有限公司 Information pushing method and apparatus thereof
CN103530540A (en) * 2013-09-27 2014-01-22 西安交通大学 User identity attribute detection method based on man-machine interaction behavior characteristics
CN103731284A (en) * 2012-10-11 2014-04-16 腾讯科技(深圳)有限公司 Method and system for correlating a plurality of network accounts
CN103778555A (en) * 2014-01-21 2014-05-07 北京集奥聚合科技有限公司 User attribute mining method and system based on user tags
US20140280221A1 (en) * 2013-03-13 2014-09-18 Google Inc. Tailoring user experience for unrecognized and new users
CN104951544A (en) * 2015-06-19 2015-09-30 百度在线网络技术(北京)有限公司 User data processing method and system and method and system for providing user data

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104090888B (en) * 2013-12-10 2016-05-11 深圳市腾讯计算机***有限公司 A kind of analytical method of user behavior data and device
CN104298719B (en) * 2014-09-23 2018-02-27 新浪网技术(中国)有限公司 Category division, advertisement placement method and the system of user is carried out based on Social behaviors

Cited By (8)

Publication number Priority date Publication date Assignee Title
CN111353001A (en) * 2018-12-24 2020-06-30 杭州海康威视数字技术股份有限公司 Method and device for classifying users
CN111353001B (en) * 2018-12-24 2023-08-18 杭州海康威视数字技术股份有限公司 Method and device for classifying users
CN110347830A (en) * 2019-06-28 2019-10-18 阿里巴巴集团控股有限公司 The implementation method and device of public sentiment early warning
CN110347830B (en) * 2019-06-28 2023-09-05 创新先进技术有限公司 Public opinion early warning implementation method and device
CN111242171A (en) * 2019-12-31 2020-06-05 中移(杭州)信息技术有限公司 Model training, diagnosis and prediction method and device for network fault and electronic equipment
CN111242171B (en) * 2019-12-31 2023-10-31 中移(杭州)信息技术有限公司 Model training and diagnosis prediction method and device for network faults and electronic equipment
CN111340112A (en) * 2020-02-26 2020-06-26 腾讯科技(深圳)有限公司 Classification method, classification device and server
CN111340112B (en) * 2020-02-26 2023-09-26 腾讯科技(深圳)有限公司 Classification method, classification device and classification server

Also Published As

Publication number Publication date
CN104951544A (en) 2015-09-30

Similar Documents

Publication Publication Date Title
WO2016201933A1 (en) User data processing method, providing method, system and computer device
JP6388988B2 (en) Static ranking for search queries in online social networks
US10785510B2 (en) Automatic recognition of entities in media-captured events
JP6377807B2 (en) Rewriting search queries in online social networks
JP6555883B2 (en) Method and apparatus for presenting search results
WO2016107523A1 (en) Access path analysis method and apparatus for website
WO2016095621A1 (en) Information providing method, apparatus, and computer device
US10552422B2 (en) Extended search method and apparatus
US9858342B2 (en) Method and system for searching for applications respective of a connectivity mode of a user device
CN108984650B (en) Computer-readable recording medium and computer device
US20140095308A1 (en) Advertisement distribution apparatus and advertisement distribution method
US20140372403A1 (en) Methods and systems for information matching
WO2017143930A1 (en) Method of sorting search results, and device for same
CN107920103B (en) Information pushing method and system, client and server
US20180025372A1 (en) Deriving audiences through filter activity
US10061806B2 (en) Presenting previously selected search results
US11082800B2 (en) Method and system for determining an occurrence of a visit to a venue by a user
US9633084B2 (en) Information searching method and device, and computer storage medium
US20170185676A1 (en) System and method for profiling a user based on visual content
CN110674394A (en) Knowledge graph-based information recommendation method and device and storage medium
US20160156724A1 (en) Method, apparatus, and system for determining target user for service policy
US10033737B2 (en) System and method for cross-cloud identity matching
JP2014174781A (en) Item recommendation system, method, and program
TWI639093B (en) Object set and processing method and device thereof
KR101545653B1 (en) Apparatus and method for providing searching service

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 15895498; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 15895498; Country of ref document: EP; Kind code of ref document: A1)