CN113743103A - Comment user identity identification method and device, computer equipment and storage medium - Google Patents

Info

Publication number
CN113743103A
Authority
CN
China
Prior art keywords
user
text content
comment
comment text
identified
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202110963292.9A
Other languages
Chinese (zh)
Inventor
王济宣
顾扬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Xingyun Digital Technology Co Ltd
Original Assignee
Nanjing Xingyun Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Xingyun Digital Technology Co Ltd
Priority to CN202110963292.9A
Publication of CN113743103A
Priority to CA3170614A (CA3170614A1)
Current legal status: Withdrawn

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31: Indexing; Data structures therefor; Storage structures
    • G06F16/316: Indexing structures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35: Clustering; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a comment user identity identification method and device, computer equipment and a storage medium. The method comprises the following steps: acquiring the comment text content of a user to be identified and the matched associated user comment text content; obtaining a user comment text content vector set according to the to-be-identified user comment text content and the associated user comment text content; inputting the user comment text content vector set into a target user identity type identification model to obtain a target user identity attribute index; determining, according to the target user identity attribute index, the target user identity type corresponding to the to-be-identified user comment text content; and monitoring the identified user comment text content according to the target user identity type. By identifying the identity of the user behind a comment, the method allows comment text content to be monitored and improves the accuracy of comment text recognition.

Description

Comment user identity identification method and device, computer equipment and storage medium
Technical Field
The application relates to the technical field of computers, in particular to a comment user identity identification method and device, computer equipment and a storage medium.
Background
Natural language processing of comment text is a widely used component of text processing. Existing comment text classification typically segments the text into words using a dictionary, converts the segmented sentence into a sentence vector by weighted averaging, and then applies a classification model to obtain the relevant attributes of the comment text. The major drawback of this approach is that it cannot capture the semantics of the surrounding context. Moreover, similar sentences or identical words can carry different meanings in different scenarios, so the resulting sentence vectors are unreliable and the accuracy of comment text recognition is low.
Disclosure of Invention
Therefore, there is a need for a comment user identity identification method, apparatus, computer device and storage medium that can monitor comment text content by identifying the identity of the user behind it, and thereby improve the accuracy of comment text recognition.
A comment user identification method comprises the following steps:
acquiring the text content of the user comment to be identified and the matched associated text content of the user comment;
obtaining a user comment text content vector set according to the user comment text content to be identified and the associated user comment text content;
inputting the user comment text content vector set into a target user identity type identification model to obtain a target user identity attribute index;
and determining the identity type of the target user corresponding to the comment text content of the user to be identified according to the identity attribute index of the target user, and monitoring the comment text content of the identified user according to the identity type of the target user.
In one embodiment, the obtaining of the text content of the user comment to be identified and the matched text content of the associated user comment includes: acquiring a user identifier to be identified and a comment platform identifier corresponding to the comment text content of the user to be identified; and acquiring matched associated user comment text content according to the user identification to be recognized and the comment platform identification.
In one embodiment, obtaining a user comment text content vector set according to a user comment text content to be identified and an associated user comment text content includes: segmenting the text content of the user comment to be identified and the text content of the associated user comment to obtain a plurality of characters; acquiring a digital vector corresponding to each character; acquiring a preset vector dimension; and forming a user comment text content vector set according to the preset vector dimension and each digital vector.
In one embodiment, forming the user comment text content vector set according to the preset vector dimension and each digital vector comprises: when the number of digital vectors does not reach the preset vector dimension, acquiring a default vector; and forming the user comment text content vector set from the default vector and the digital vectors.
In one embodiment, the training of the target user identity type identification model includes: acquiring training comment text content, wherein the training comment text content is associated with a standard training user identity attribute index; converting the training comment text content into a corresponding training comment text content vector, and inputting the training comment text content vector into an original user identity type identification model to obtain a target training user identity attribute index; calculating a training loss value from the standard training user identity attribute index and the target training user identity attribute index; and adjusting the model parameters of the original user identity type identification model according to the training loss value until a convergence condition is met, thereby obtaining the target user identity type identification model.
In one embodiment, determining a target user identity type corresponding to the comment text content of the user to be identified according to the target user identity attribute index includes: acquiring a preset user identity type mapping table, wherein the user identity type mapping table reflects the mapping relation between a user identity attribute index and a corresponding user identity type; and determining the identity type of the target user corresponding to the identity attribute index of the target user according to a preset user identity type mapping table.
In one embodiment, monitoring the text content of the comment of the identified user according to the identity type of the target user includes: when the target user identity type is a non-legal user identity, deleting the comment text content of the identified user; and when the target user identity type is a legal user identity, carrying out corresponding product adjustment according to the identified user comment text content.
A comment user identification apparatus, the apparatus comprising:
the first acquisition module is used for acquiring the text content of the user comment to be identified and the matched associated text content of the user comment;
the second acquisition module is used for acquiring a user comment text content vector set according to the user comment text content to be identified and the associated user comment text content;
the input module is used for inputting the user comment text content vector set into a target user identity type identification model to obtain a target user identity attribute index;
and the generating module is used for determining the identity type of the target user corresponding to the comment text content of the user to be identified according to the identity attribute index of the target user and monitoring the comment text content of the identified user according to the identity type of the target user.
A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:
acquiring the text content of the user comment to be identified and the matched associated text content of the user comment;
obtaining a user comment text content vector set according to the user comment text content to be identified and the associated user comment text content;
inputting the user comment text content vector set into a target user identity type identification model to obtain a target user identity attribute index;
and determining the identity type of the target user corresponding to the comment text content of the user to be identified according to the identity attribute index of the target user, and monitoring the comment text content of the identified user according to the identity type of the target user.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring the text content of the user comment to be identified and the matched associated text content of the user comment;
obtaining a user comment text content vector set according to the user comment text content to be identified and the associated user comment text content;
inputting the user comment text content vector set into a target user identity type identification model to obtain a target user identity attribute index;
and determining the identity type of the target user corresponding to the comment text content of the user to be identified according to the identity attribute index of the target user, and monitoring the comment text content of the identified user according to the identity type of the target user.
According to the comment user identity identification method and device, the computer equipment and the storage medium described above, the comment text content of the user to be identified and the matched associated user comment text content are obtained; a user comment text content vector set is obtained according to the to-be-identified user comment text content and the associated user comment text content; the vector set is input into a target user identity type identification model to obtain a target user identity attribute index; the target user identity type corresponding to the to-be-identified user comment text content is determined according to the target user identity attribute index; and the identified user comment text content is monitored according to the target user identity type. By analyzing the to-be-identified user comment text content together with the matched associated user comment text content, the corresponding user identity type is determined and the identified user comment text content is handled according to that type. Comment text content is thus monitored by identifying the identity of the user behind it, and the accuracy of comment text recognition is improved.
Drawings
FIG. 1 is a diagram of an application environment of the comment user identity identification method in one embodiment;
FIG. 2 is a flowchart of the comment user identity identification method in one embodiment;
FIG. 3 is a flowchart illustrating a step of obtaining text content of a comment of a user to be recognized in one embodiment;
FIG. 4 is a flowchart illustrating a step of obtaining a vector set of user comment text contents in one embodiment;
FIG. 5 is a flowchart of the step of composing the user comment text content vector set in one embodiment;
FIG. 6 is a flowchart of the training steps of the target user identity type identification model in one embodiment;
FIG. 7 is a flowchart illustrating the target user identity type determination step in one embodiment;
FIG. 8 is a flowchart of the step of monitoring the text content of the identified user comment in one embodiment;
FIG. 9 is a block diagram showing the structure of a comment user identification apparatus in one embodiment;
FIG. 10 is a diagram showing an internal structure of a computer device in one embodiment;
FIG. 11 is a diagram illustrating an internal structure of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The comment user identity identification method provided by the application can be applied to the application environment shown in fig. 1, in which the terminal 102 communicates with the server 104 via a network. The terminal 102 may be, but is not limited to, a personal computer, notebook computer, smart phone, tablet computer or portable wearable device, and the server 104 may be implemented as an independent server or as a server cluster formed by a plurality of servers.
Specifically, the terminal 102 obtains the to-be-identified user comment text content and the matched associated user comment text content and sends them to the server 104. The server 104 receives them, obtains a user comment text content vector set from them, and inputs the vector set into the target user identity type identification model to obtain a target user identity attribute index. The server 104 then determines, according to the target user identity attribute index, the target user identity type corresponding to the to-be-identified user comment text content, and monitors the identified user comment text content according to the target user identity type.
In another embodiment, the terminal 102 obtains the comment text content of the user to be identified and the matched associated user comment text content, obtains a user comment text content vector set according to the comment text content of the user to be identified and the associated user comment text content, inputs the user comment text content vector set into the identification model of the identity type of the target user to obtain an identity attribute index of the target user, determines the identity type of the target user corresponding to the comment text content of the user to be identified according to the identity attribute index of the target user, and monitors the comment text content of the identified user according to the identity type of the target user.
In one embodiment, as shown in fig. 2, a comment user identification method is provided, which is described by taking the method as an example applied to the terminal or the server in fig. 1, and includes the following steps:
Step 202, obtaining the text content of the user comment to be identified and the matched text content of the associated user comment.
The to-be-identified user comment text content is text content related to the user to be identified; it can be the text of a comment by that user, or other text content related to that user on the comment platform where the user is located. The associated user comment text content is user comment text content associated with the to-be-identified user comment text content, and can be text content related to other users of the same comment platform. Specifically, the to-be-identified user comment text content can be obtained according to the user identifier of the user to be identified, and the matched associated user comment text content can then be obtained according to the to-be-identified user comment text content.
For example, the to-be-identified user comment text content may be the comment text entered by the user, and the matched associated user comment text content may be the comment text of the post the user is replying to, the text of the thread's opening post, the forum board name, and the user name.
Step 204, obtaining a user comment text content vector set according to the user comment text content to be identified and the associated user comment text content.
Specifically, after the to-be-identified user comment text content and the associated user comment text content have been obtained, both can be converted into vectors: a preset vector dictionary is obtained, and the user comment text content vectors corresponding to the two texts are looked up in that dictionary to obtain the user comment text content vector set.
Step 206, inputting the user comment text content vector set into a target user identity type identification model to obtain a target user identity attribute index.
The target user identity type identification model is used for identifying a user identity attribute index of a user to be identified corresponding to comment text content of the user to be identified, the user identity attribute index is used for identifying the user identity type, and different user identity attribute indexes correspond to different user identity types. The target user identity type identification model can be obtained through pre-supervised training, and specifically can be obtained through training of sample data according to actual business requirements, product requirements or actual application scenes.
Specifically, after the user comment text content vector set is obtained, it is used as the input of the target user identity type identification model, which identifies the vector set and outputs the corresponding target user identity attribute index.
Step 208, determining the identity type of the target user corresponding to the comment text content of the user to be identified according to the identity attribute index of the target user, and monitoring the comment text content of the identified user according to the identity type of the target user.
The user identity attribute index is used for identifying the user identity type, and different user identity attribute indexes correspond to different user identity types, so the mapping relation between each user identity attribute index and its matched user identity type can be established in advance according to actual service requirements, product requirements or actual application scenes. Specifically, after the target user identity attribute index output by the target user identity type identification model is obtained, the target user identity type corresponding to that index is determined according to the pre-established mapping relation, and the identified user comment text content is then monitored according to the target user identity type. Monitoring user comment text content by target user identity type effectively avoids the adverse effects and losses caused by comments from illegal users.
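The lookup and monitoring described in step 208 can be sketched as follows. This is a minimal illustration only: the index values, the type labels, and the function names `resolve_identity_type` and `monitor_comment` are hypothetical, and the two monitoring actions follow the embodiment stated earlier (delete comments from a non-legal user identity, keep comments from a legal user identity for product adjustment).

```python
# Hypothetical mapping table: identity attribute index -> identity type.
# The source only says such a table is established in advance.
IDENTITY_TYPE_BY_INDEX = {0: "legal", 1: "non-legal"}


def resolve_identity_type(attribute_index, mapping=IDENTITY_TYPE_BY_INDEX):
    """Look up the identity type for a model-produced attribute index."""
    try:
        return mapping[attribute_index]
    except KeyError:
        raise ValueError(f"no identity type mapped to index {attribute_index}") from None


def monitor_comment(comment_text, identity_type):
    """Return an (action, text) pair describing how the comment is handled."""
    if identity_type == "non-legal":
        # Comments from non-legal identities are deleted.
        return ("delete", None)
    # Comments from legal identities are kept as input for product adjustment.
    return ("keep_for_product_adjustment", comment_text)


action, kept_text = monitor_comment("battery drains fast", resolve_identity_type(0))
```

Keeping the mapping in a plain table means the set of identity types can change with business requirements without retraining or touching the lookup code.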
In the comment user identity identification method described above, the comment text content of the user to be identified and the matched associated user comment text content are obtained; a user comment text content vector set is obtained according to the to-be-identified user comment text content and the associated user comment text content; the vector set is input into a target user identity type identification model to obtain a target user identity attribute index; the target user identity type corresponding to the to-be-identified user comment text content is determined according to the target user identity attribute index; and the identified user comment text content is monitored according to the target user identity type. By analyzing the to-be-identified user comment text content together with the matched associated user comment text content, the corresponding user identity type is determined and the identified user comment text content is handled according to that type. Comment text content is thus monitored by identifying the identity of the user behind it, and the accuracy of comment text recognition is improved.
In one embodiment, as shown in fig. 3, acquiring text content of a user comment to be identified and matched text content of an associated user comment includes:
Step 302, obtaining a to-be-identified user identifier and a comment platform identifier corresponding to the to-be-identified user comment text content.
Step 304, acquiring matched associated user comment text content according to the to-be-identified user identifier and the comment platform identifier.
The to-be-identified user identifier identifies the user who published the to-be-identified user comment text content, and the comment platform identifier identifies the platform on which that content was published. Specifically, the publishing user can be determined from the to-be-identified user identifier, and the platform on which the user published can be determined from the comment platform identifier. The publishing user posts content on the platform corresponding to the comment platform identifier, other users of that platform comment on the posted content, and those other users' comments can be determined as the matched associated user comment text content.
In another embodiment, the comment platform corresponding to the comment platform identifier has a plurality of comment sub-platforms, such as forums. The comment sub-platform where the publishing user corresponding to the to-be-identified user identifier is located is obtained, and all related comment texts on that sub-platform are determined as the matched associated user comment text content. For example, the matched associated user comment text content may be the comment text of the post the publishing user replied to, the text of the thread's opening post, the forum board name, the user name, and so on.
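As a rough sketch of steps 302 and 304, the associated content can be selected by filtering a comment store on the two identifiers. The `Comment` record, its field names, the in-memory list used as a store, and the function `find_associated_comments` are all assumptions made for illustration; the source does not specify how comments are stored or linked.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Comment:
    comment_id: str
    user_id: str
    platform_id: str
    text: str
    reply_to: Optional[str] = None  # id of the comment being replied to, if any


def find_associated_comments(user_id: str, platform_id: str,
                             store: List[Comment]) -> List[str]:
    """Texts of other users' replies to this user's posts on the same platform."""
    own_ids = {c.comment_id for c in store
               if c.user_id == user_id and c.platform_id == platform_id}
    return [c.text for c in store
            if c.platform_id == platform_id
            and c.user_id != user_id
            and c.reply_to in own_ids]


store = [
    Comment("c1", "u1", "forumA", "Great phone"),
    Comment("c2", "u2", "forumA", "Agreed", reply_to="c1"),
    Comment("c3", "u3", "forumB", "Nice", reply_to="c1"),
    Comment("c4", "u1", "forumA", "Thanks", reply_to="c2"),
]
associated = find_associated_comments("u1", "forumA", store)
```

Here only the reply posted on `forumA` by a different user is returned; the reply on `forumB` is excluded because its platform identifier does not match.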
In an embodiment, as shown in fig. 4, obtaining a user comment text content vector set according to a to-be-identified user comment text content and an associated user comment text content includes:
Step 402, segmenting the text content of the user comment to be identified and the text content of the associated user comment to obtain a plurality of characters.
Step 404, obtaining a digital vector corresponding to each character.
Step 406, obtaining a preset vector dimension.
Step 408, forming a user comment text content vector set according to the preset vector dimension and each digital vector.
The to-be-identified user comment text content and the associated user comment text content are segmented to obtain a plurality of characters; the segmentation may be performed character by character. These characters are the smallest units that make up the two texts.
Further, the digital vector corresponding to each character is obtained. An association between every character and its digital vector can be established in advance according to actual business requirements, product requirements or actual application scenes, so that once the characters of the to-be-identified user comment text content and the associated user comment text content have been obtained, the digital vector of each character is looked up from that association.
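One possible way to build and use such a character-to-number association is sketched below. The helper names `build_char_index` and `encode_chars`, the choice of numbering characters by first appearance, and the handling of unknown characters are assumptions; the source only states that the association is established in advance.

```python
def build_char_index(corpus, start=1):
    """Assign each distinct character a positive integer, in order of first appearance.

    Numbering starts at 1 so that 0 stays free for use as a filler value.
    """
    index = {}
    for text in corpus:
        for ch in text:
            if ch not in index:
                index[ch] = len(index) + start
    return index


def encode_chars(text, index, unknown=0):
    """Split text character by character and look up each character's number.

    Mapping unseen characters to `unknown` is an assumption; the source does
    not say how characters missing from the association are handled.
    """
    return [index.get(ch, unknown) for ch in text]


char_index = build_char_index(["good item", "bad item"])
encoded = encode_chars("good", char_index)
```

In the method, the text passed to `encode_chars` would be the to-be-identified comment together with its associated content, so that both contribute to the resulting vector set.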
The preset vector dimension is the length of the user comment text content vector set and can be determined according to actual business requirements and product requirements. Finally, the digital vectors are assembled into a user comment text content vector set that conforms to the preset vector dimension.
For example, if the to-be-identified user comment text content and the associated user comment text content are "this good commodity", the converted vector is (203,166,122,150,34,15,0,0,0,0, …,0). The first six numbers correspond to the characters of "this good commodity", and the remaining 194 positions are padded with 0 so that the vector length equals 200, which is the preset vector dimension.
In one embodiment, as shown in fig. 5, composing a user comment text content vector set according to preset vector dimensions and respective digital vectors includes:
Step 502, when the number of digital vectors does not reach the preset vector dimension, obtaining a default vector.
Step 504, forming a user comment text content vector set according to the default vector and the digital vectors.
The preset vector dimension can be determined according to actual business requirements, product requirements or actual application scenes, and refers to the length of the user comment text content vector set. Specifically, it is checked whether the number of digital vectors reaches the preset vector dimension. If it does, the digital vectors already conform to the preset vector dimension, no further operation is needed, and the user comment text content vector set is formed directly from the digital vectors. Otherwise, the digital vectors fall short of the preset vector dimension, and a default vector needs to be obtained.
The default vector is a default digital vector determined according to actual service requirements, product requirements or actual application scenarios. Finally, the user comment text content vector set is formed from the default vector and the digital vectors.
For example, if the to-be-identified user comment text content and the associated user comment text content are "this good commodity", the six digital vectors (203,166,122,150,34,15) do not reach the preset vector dimension of 200. Therefore 194 default vectors with the value 0 are acquired, and the user comment text content vector set (203,166,122,150,34,15,0,0,0,0, …,0) is formed from the six digital vectors followed by the 194 default vectors.
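The padding in steps 502 and 504 can be sketched with the numbers from the example above. The function name `pad_to_dimension` is hypothetical, and leaving inputs longer than the preset dimension unchanged is an assumption, since the source only describes the case where the input is too short.

```python
PRESET_DIM = 200    # preset vector dimension used in the example
DEFAULT_VALUE = 0   # default vector value used in the example


def pad_to_dimension(numbers, dim=PRESET_DIM, default=DEFAULT_VALUE):
    """Append `default` entries until the list has `dim` entries."""
    if len(numbers) >= dim:
        # Already at the preset dimension: no operation is needed.
        # (Inputs longer than `dim` are returned unchanged; this is an assumption.)
        return list(numbers)
    return list(numbers) + [default] * (dim - len(numbers))


vector = pad_to_dimension([203, 166, 122, 150, 34, 15])
```

The result has 200 entries: the six character numbers followed by 194 zeros, matching the vector given in the example.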
In one embodiment, as shown in fig. 6, the step of training the target user identification type recognition model includes:
Step 602, obtaining training comment text content, wherein the training comment text content is associated with a standard training user identity attribute index.
Step 604, converting the training comment text content into a corresponding training comment text content vector, and inputting the training comment text content vector into the original user identity type recognition model to obtain the target training user identity attribute index.
The training comment text content can be obtained by crawling large-scale data or from other business systems, which is not limited herein. The standard training user identity attribute index is the correct user identity attribute index corresponding to the training comment text content; it is used to judge whether the user identity attribute index output by the user identity type recognition model is correct and whether the model has achieved the training purpose.
Further, the training comment text content is converted into a corresponding training comment text content vector. Specifically, the training comment text content is segmented character by character, and the digital vector corresponding to each character is obtained to form the training comment text content vector. Finally, the training comment text content vector is input into the original user identity type recognition model, which analyzes it to obtain the target training user identity attribute index.
Step 606, calculating a training loss value according to the standard training user identity attribute index and the target training user identity attribute index.
Step 608, performing model parameter adjustment on the original user identity type recognition model according to the training loss value until a convergence condition is met, and obtaining the target user identity type recognition model.
Specifically, after the target training user identity attribute index is obtained, the standard training user identity attribute index and the target training user identity attribute index can be used to determine whether the original user identity type recognition model has achieved the training purpose. A training loss value is calculated from the two indexes; the calculation may be, but is not limited to, taking their mean value or their difference. The training loss value is then used to judge whether the original user identity type recognition model has reached the convergence condition. If not, the model parameters of the original user identity type recognition model continue to be adjusted according to the training loss value until the convergence condition is reached, at which point model training is complete and the target user identity type recognition model is obtained. The convergence condition may be determined according to actual service requirements, product requirements, or the actual application scenario. For example, the model training is considered to have reached the convergence condition when the training loss value no longer changes, when the training loss value reaches a minimum, or when the number of training iterations reaches a preset threshold.
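The training loop described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify the model or the loss, so a one-feature linear model, a mean-squared-difference loss, and plain gradient descent are swapped in here, and the names `train`, `tolerance`, and `max_rounds` are hypothetical. The two stopping rules correspond to "the loss no longer changes" and "the number of training iterations reaches a preset threshold".

```python
def train(samples, learning_rate=0.1, max_rounds=1000, tolerance=1e-9):
    """samples: (feature, standard_index) pairs; fits index = w * feature + b."""
    w, b = 0.0, 0.0
    previous_loss = None
    loss = float("inf")
    for round_number in range(1, max_rounds + 1):
        # Forward pass: target training index predicted by the current model.
        predictions = [w * x + b for x, _ in samples]
        errors = [p - y for p, (_, y) in zip(predictions, samples)]
        # Training loss: mean squared difference from the standard indexes.
        loss = sum(e * e for e in errors) / len(samples)
        # Convergence condition: the loss has stopped changing.
        if previous_loss is not None and abs(previous_loss - loss) < tolerance:
            break
        previous_loss = loss
        # Parameter adjustment: one gradient-descent step on the loss.
        grad_w = 2 * sum(e * x for e, (x, _) in zip(errors, samples)) / len(samples)
        grad_b = 2 * sum(errors) / len(samples)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, loss, round_number


# Toy data in which the standard index equals the feature.
w, b, loss, rounds = train([(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)])
print(round(w, 2), round(b, 2), rounds)
```

If the loss never stabilizes, the loop still ends at `max_rounds`, which plays the role of the preset training-count threshold.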
In an embodiment, as shown in fig. 7, determining, according to the target user identity attribute index, a target user identity type corresponding to the comment text content of the user to be identified includes:
Step 702, obtaining a preset user identity type mapping table, wherein the user identity type mapping table reflects the mapping relationship between the user identity attribute index and the corresponding user identity type.
Step 704, determining the target user identity type corresponding to the target user identity attribute index according to the preset user identity type mapping table.
The preset user identity type mapping table is established in advance according to actual service requirements, product requirements, or the actual application scenario, and describes the mapping relationship between each user identity attribute index and its matching user identity type. Specifically, after the target user identity attribute index is obtained, the target user identity type corresponding to it is looked up according to the mapping relationship described by the preset user identity type mapping table.
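A minimal sketch of such a lookup is given below. The index values 0 and 1 and the table contents are hypothetical; the text only states that a preset table maps each attribute index to an identity type.

```python
# Hypothetical preset table: attribute index -> user identity type.
USER_IDENTITY_TYPE_TABLE = {
    0: "legal user identity",
    1: "non-legal user identity",
}


def lookup_identity_type(attribute_index, table=USER_IDENTITY_TYPE_TABLE):
    """Return the identity type mapped to the given attribute index."""
    try:
        return table[attribute_index]
    except KeyError:
        # An index missing from the table is treated as a configuration error.
        raise ValueError(
            f"no identity type mapped for index {attribute_index!r}"
        ) from None


print(lookup_identity_type(1))
```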
In one embodiment, as shown in fig. 8, monitoring the text content of the comment of the identified user according to the identity type of the target user includes:
Step 802, when the target user identity type is a non-legal user identity, deleting the identified user comment text content.
Step 804, when the target user identity type is a legal user identity, performing corresponding product adjustment according to the identified user comment text content.
After the target user identity type is obtained, different processing can be performed depending on whether the target user identity type is legal. Specifically, it is determined whether the target user identity type is a legal user identity. If it is, corresponding product adjustment can be performed according to the identified user comment text content. For example, suppose the identified user comment text content is "The product interface of product A is too complicated." Because the target user identity type is a legal user identity, the product interface of product A can be adjusted with reference to the identified user comment text content.
Conversely, if the target user identity type is a non-legal user identity, the user who published the identified user comment text content may be an illegal user. In that case the identified user comment text content can be deleted, so that such undesirable statements do not adversely affect the product.
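The two branches can be sketched as a small dispatch function. This is an illustration only: the in-memory `comments` dictionary and the `product_feedback` list are hypothetical stand-ins for whatever comment store and product-adjustment workflow a real system would use.

```python
LEGAL = "legal user identity"
NON_LEGAL = "non-legal user identity"


def monitor_comment(comment_id, identity_type, comments, product_feedback):
    """Delete the comment or forward it for product adjustment."""
    if identity_type == NON_LEGAL:
        # Non-legal user identity: remove the comment text content.
        comments.pop(comment_id, None)
        return "deleted"
    if identity_type == LEGAL:
        # Legal user identity: keep the comment and pass it on as feedback.
        product_feedback.append(comments[comment_id])
        return "forwarded"
    raise ValueError(f"unknown identity type: {identity_type!r}")


comments = {
    1: "The product interface of product A is too complicated.",
    2: "bad bad bad",
}
feedback = []
print(monitor_comment(1, LEGAL, comments, feedback))
print(monitor_comment(2, NON_LEGAL, comments, feedback))
```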
In a specific embodiment, a comment user identification method is provided, which specifically includes the following steps:
1. Acquire training comment text content, wherein the training comment text content is associated with a standard training user identity attribute index.
2. Convert the training comment text content into a corresponding training comment text content vector, and input the training comment text content vector into the original user identity type recognition model to obtain the target training user identity attribute index.
3. Calculate a training loss value according to the standard training user identity attribute index and the target training user identity attribute index.
4. Perform model parameter adjustment on the original user identity type recognition model according to the training loss value until a convergence condition is met, obtaining the target user identity type recognition model.
5. Acquire the user comment text content to be identified and the matched associated user comment text content.
5-1. Acquire the to-be-identified user identifier and the comment platform identifier corresponding to the user comment text content to be identified.
5-2. Acquire the matched associated user comment text content according to the to-be-identified user identifier and the comment platform identifier.
6. Obtain a user comment text content vector set according to the user comment text content to be identified and the associated user comment text content.
6-1. Segment the user comment text content to be identified and the associated user comment text content to obtain a plurality of characters.
6-2. Acquire the digital vector corresponding to each character.
6-3. Acquire the preset vector dimension.
6-4. Form the user comment text content vector set according to the preset vector dimension and each digital vector.
6-4-1. When the number of digital vectors does not reach the preset vector dimension, acquire a default vector.
6-4-2. Form the user comment text content vector set according to the default vector and the digital vectors.
7. Input the user comment text content vector set into the target user identity type recognition model to obtain the target user identity attribute index.
8. Determine the target user identity type corresponding to the user comment text content to be identified according to the target user identity attribute index.
8-1. Acquire a preset user identity type mapping table, wherein the user identity type mapping table reflects the mapping relationship between the user identity attribute index and the corresponding user identity type.
8-2. Determine the target user identity type corresponding to the target user identity attribute index according to the preset user identity type mapping table.
9. Monitor the identified user comment text content according to the target user identity type.
9-1. When the target user identity type is a non-legal user identity, delete the identified user comment text content.
9-2. When the target user identity type is a legal user identity, perform corresponding product adjustment according to the identified user comment text content.
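The data flow of steps 5 to 8 can be sketched as one function that chains the stages. Everything passed in is a hypothetical stand-in: the dictionary keys, the `demo_*` helpers, and especially `demo_model`, whose rule (flagging users whose comments are all identical) is an arbitrary placeholder and not the trained recognition model.

```python
def identify_comment_user(comment, fetch_associated, vectorize, model, identity_table):
    """Run steps 5 to 8 for one comment and return its user identity type."""
    # Step 5: gather comments matched by user and platform identifiers.
    associated_texts = fetch_associated(comment["user_id"], comment["platform_id"])
    # Step 6: turn the target comment plus associated comments into a vector set.
    vector_set = vectorize([comment["text"]] + list(associated_texts))
    # Step 7: the model maps the vector set to an identity attribute index.
    attribute_index = model(vector_set)
    # Step 8: the preset mapping table turns the index into an identity type.
    return identity_table[attribute_index]


# Stand-ins for demonstration only; none of these reflect a real system.
def demo_fetch(user_id, platform_id):
    return ["great", "great", "great"] if user_id == "u1" else []


def demo_vectorize(texts):
    return [[ord(ch) for ch in text] for text in texts]


def demo_model(vector_set):
    # Placeholder rule: index 1 when all of several comments are identical.
    unique = {tuple(v) for v in vector_set}
    return 1 if len(vector_set) > 1 and len(unique) == 1 else 0


DEMO_TABLE = {0: "legal user identity", 1: "non-legal user identity"}

result = identify_comment_user(
    {"user_id": "u1", "platform_id": "p1", "text": "great"},
    demo_fetch, demo_vectorize, demo_model, DEMO_TABLE,
)
print(result)
```

Passing the stages in as callables keeps the sketch independent of any particular storage layer or model.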
It should be understood that, although the steps in the above-described flowcharts are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the execution of these steps is not strictly limited to the order shown, and they may be performed in other orders. Moreover, at least some of the steps in the above-described flowcharts may include multiple sub-steps or stages. These sub-steps or stages are not necessarily performed at the same moment and may be performed at different times, and they are not necessarily performed sequentially; they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 9, there is provided a comment user identification apparatus 900 including: a first obtaining module 902, a second obtaining module 904, an input module 906, and a generating module 908, wherein:
The first obtaining module 902 is configured to obtain the user comment text content to be identified and the matched associated user comment text content.
The second obtaining module 904 is configured to obtain a user comment text content vector set according to the user comment text content to be identified and the associated user comment text content.
The input module 906 is configured to input the user comment text content vector set into the target user identity type recognition model to obtain a target user identity attribute index.
The generating module 908 is configured to determine, according to the target user identity attribute index, the target user identity type corresponding to the user comment text content to be identified, and to monitor the identified user comment text content according to the target user identity type.
In an embodiment, the first obtaining module 902 obtains a to-be-identified user identifier and a comment platform identifier corresponding to a to-be-identified user comment text content, and obtains a matched associated user comment text content according to the to-be-identified user identifier and the comment platform identifier.
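One way to picture this lookup is the sketch below. It assumes that "matched" means other comments carrying the same user identifier and the same comment platform identifier, which this passage does not spell out; the `Comment` class and `find_associated_comments` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comment:
    user_id: str
    platform_id: str
    text: str


def find_associated_comments(target, all_comments):
    """Return texts of other comments by the same user on the same platform."""
    return [
        c.text
        for c in all_comments
        if c is not target
        and c.user_id == target.user_id
        and c.platform_id == target.platform_id
    ]


target = Comment("u1", "p1", "this commodity is good")
pool = [
    target,
    Comment("u1", "p1", "fast delivery"),
    Comment("u1", "p2", "posted on another platform"),
    Comment("u2", "p1", "posted by another user"),
]
print(find_associated_comments(target, pool))
```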
In one embodiment, the second obtaining module 904 segments the user comment text content to be identified and the associated user comment text content to obtain a plurality of characters, obtains the digital vector corresponding to each character, obtains a preset vector dimension, and forms the user comment text content vector set according to the preset vector dimension and each digital vector.
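The segmentation and lookup can be sketched as below. The vocabulary contents, the function name `to_digit_vectors`, and the use of a reserved number for characters missing from the vocabulary are all assumptions; the text only says that each character has a corresponding digital vector.

```python
def to_digit_vectors(text, vocabulary, unknown_id=1):
    """Split text into characters and map each one to its number."""
    # Character-by-character segmentation; whitespace is dropped (assumption).
    characters = [ch for ch in text if not ch.isspace()]
    # Characters absent from the vocabulary fall back to `unknown_id`.
    return [vocabulary.get(ch, unknown_id) for ch in characters]


# Hypothetical vocabulary: character -> number.
vocabulary = {"g": 5, "o": 6, "d": 7}
print(to_digit_vectors("good", vocabulary))
print(to_digit_vectors("go od!", vocabulary))
```

The resulting list is what would then be padded with default vectors up to the preset vector dimension.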
In one embodiment, the second obtaining module 904 obtains a default vector when the number corresponding to the digital vector does not reach the preset vector dimension, and forms a user comment text content vector set according to the default vector and the digital vector.
In one embodiment, the comment user identification apparatus 900 obtains training comment text content, the training comment text content being associated with a standard training user identity attribute index; converts the training comment text content into a corresponding training comment text content vector; inputs the training comment text content vector into the original user identity type recognition model to obtain a target training user identity attribute index; calculates a training loss value according to the standard training user identity attribute index and the target training user identity attribute index; and performs model parameter adjustment on the original user identity type recognition model according to the training loss value until a convergence condition is met, obtaining the target user identity type recognition model.
In an embodiment, the generating module 908 obtains a preset user identity type mapping table, where the user identity type mapping table reflects a mapping relationship between a user identity attribute index and a corresponding user identity type, and determines a target user identity type corresponding to the target user identity attribute index according to the preset user identity type mapping table.
In one embodiment, the generating module 908 performs deletion processing on the identified user comment text content when the target user identity type is a non-legal user identity, and performs corresponding product adjustment according to the identified user comment text content when the target user identity type is a legal user identity.
For specific limitations of the comment user identification apparatus, reference may be made to the above limitations on the comment user identification method, which are not repeated here. The modules in the comment user identification apparatus can be implemented wholly or partially in software, hardware, or a combination thereof. The modules can be embedded in, or independent of, a processor in the computer device in hardware form, or can be stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 10. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used for storing the target user identity type recognition model. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by the processor to implement a comment user identification method.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 11. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a comment user identification method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the configurations shown in fig. 10 or 11 are merely block diagrams of some configurations relevant to the present disclosure, and do not constitute a limitation on the computing devices to which the present disclosure may be applied, and that a particular computing device may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program: acquiring the text content of the user comment to be identified and the matched associated text content of the user comment; obtaining a user comment text content vector set according to the user comment text content to be identified and the associated user comment text content; inputting the user comment text content vector set into a target user identity type identification model to obtain a target user identity attribute index; and determining the identity type of the target user corresponding to the comment text content of the user to be identified according to the identity attribute index of the target user, and monitoring the comment text content of the identified user according to the identity type of the target user.
In one embodiment, the processor, when executing the computer program, further performs the steps of: acquiring a user identifier to be identified and a comment platform identifier corresponding to the comment text content of the user to be identified; and acquiring matched associated user comment text content according to the user identification to be recognized and the comment platform identification.
In one embodiment, the processor, when executing the computer program, further performs the steps of: segmenting the text content of the user comment to be identified and the text content of the associated user comment to obtain a plurality of characters; acquiring a digital vector corresponding to each character; acquiring a preset vector dimension; and forming a user comment text content vector set according to the preset vector dimension and each digital vector.
In one embodiment, the processor, when executing the computer program, further performs the steps of: when the number corresponding to the digital vector does not reach the preset vector dimension, acquiring a default vector; and forming a user comment text content vector set according to the default vector and the number vector.
In one embodiment, the processor, when executing the computer program, further performs the steps of: acquiring training comment text contents, wherein the training comment text contents are associated with standard training user identity attribute indexes; converting the training comment text content into a corresponding training comment text content vector, and inputting the training comment text content vector into an original user identity type recognition model to obtain a target training user identity attribute index; calculating according to the standard training user identity attribute index and the target training user identity attribute index to obtain a training loss value; and performing model parameter adjustment on the original user identity type recognition model according to the training loss value until a convergence condition is met, and obtaining a target user identity type recognition model.
In one embodiment, the processor, when executing the computer program, further performs the steps of: acquiring a preset user identity type mapping table, wherein the user identity type mapping table reflects the mapping relation between a user identity attribute index and a corresponding user identity type; and determining the identity type of the target user corresponding to the identity attribute index of the target user according to a preset user identity type mapping table.
In one embodiment, the processor, when executing the computer program, further performs the steps of: when the target user identity type is a non-legal user identity, deleting the comment text content of the identified user; and when the target user identity type is a legal user identity, carrying out corresponding product adjustment according to the identified user comment text content.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of: acquiring the text content of the user comment to be identified and the matched associated text content of the user comment; obtaining a user comment text content vector set according to the user comment text content to be identified and the associated user comment text content; inputting the user comment text content vector set into a target user identity type identification model to obtain a target user identity attribute index; and determining the identity type of the target user corresponding to the comment text content of the user to be identified according to the identity attribute index of the target user, and monitoring the comment text content of the identified user according to the identity type of the target user.
In one embodiment, the processor, when executing the computer program, further performs the steps of: acquiring a user identifier to be identified and a comment platform identifier corresponding to the comment text content of the user to be identified; and acquiring matched associated user comment text content according to the user identification to be recognized and the comment platform identification.
In one embodiment, the processor, when executing the computer program, further performs the steps of: segmenting the text content of the user comment to be identified and the text content of the associated user comment to obtain a plurality of characters; acquiring a digital vector corresponding to each character; acquiring a preset vector dimension; and forming a user comment text content vector set according to the preset vector dimension and each digital vector.
In one embodiment, the processor, when executing the computer program, further performs the steps of: when the number corresponding to the digital vector does not reach the preset vector dimension, acquiring a default vector; and forming a user comment text content vector set according to the default vector and the number vector.
In one embodiment, the processor, when executing the computer program, further performs the steps of: acquiring training comment text contents, wherein the training comment text contents are associated with standard training user identity attribute indexes; converting the training comment text content into a corresponding training comment text content vector, and inputting the training comment text content vector into an original user identity type recognition model to obtain a target training user identity attribute index; calculating according to the standard training user identity attribute index and the target training user identity attribute index to obtain a training loss value; and performing model parameter adjustment on the original user identity type recognition model according to the training loss value until a convergence condition is met, and obtaining a target user identity type recognition model.
In one embodiment, the processor, when executing the computer program, further performs the steps of: acquiring a preset user identity type mapping table, wherein the user identity type mapping table reflects the mapping relation between a user identity attribute index and a corresponding user identity type; and determining the identity type of the target user corresponding to the identity attribute index of the target user according to a preset user identity type mapping table.
In one embodiment, the processor, when executing the computer program, further performs the steps of: when the target user identity type is a non-legal user identity, deleting the comment text content of the identified user; and when the target user identity type is a legal user identity, carrying out corresponding product adjustment according to the identified user comment text content.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described, but any combination of these technical features should be considered to be within the scope of the present specification as long as it contains no contradiction.
The above embodiments express only several implementations of the present application, and although their description is relatively specific and detailed, they are not to be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A comment user identification method, the method comprising:
acquiring the text content of the user comment to be identified and the matched associated text content of the user comment;
obtaining a user comment text content vector set according to the user comment text content to be identified and the associated user comment text content;
inputting the user comment text content vector set into a target user identity type identification model to obtain a target user identity attribute index;
and determining the identity type of the target user corresponding to the comment text content of the user to be identified according to the identity attribute index of the target user, and monitoring the comment text content of the identified user according to the identity type of the target user.
2. The method of claim 1, wherein the obtaining of the text content of the user comment to be identified and the matched text content of the associated user comment comprises:
acquiring a user identifier to be identified and a comment platform identifier corresponding to the comment text content of the user to be identified;
and acquiring matched associated user comment text content according to the user identification to be recognized and the comment platform identification.
3. The method of claim 1, wherein the deriving a set of user comment text content vectors from the to-be-identified user comment text content and the associated user comment text content comprises:
segmenting the user comment text content to be identified and the associated user comment text content to obtain a plurality of characters;
acquiring a digital vector corresponding to each character;
acquiring a preset vector dimension;
and forming a user comment text content vector set according to the preset vector dimension and each digital vector.
4. The method of claim 3, wherein said composing a set of user comment text content vectors from said preset vector dimensions and respective said numeric vectors comprises:
when the number corresponding to the digital vector does not reach the preset vector dimension, acquiring a default vector;
and forming a user comment text content vector set according to the default vector and the digital vector.
5. The method of claim 1, wherein the step of training the target user identity type recognition model comprises:
acquiring training comment text contents, wherein the training comment text contents are associated with standard training user identity attribute indexes;
converting the training comment text content into a corresponding training comment text content vector, and inputting the training comment text content vector into an original user identity type recognition model to obtain a target training user identity attribute index;
calculating to obtain a training loss value according to the standard training user identity attribute index and the target training user identity attribute index;
and performing model parameter adjustment on the original user identity type recognition model according to the training loss value until a convergence condition is met, so as to obtain a target user identity type recognition model.
6. The method according to claim 1, wherein the determining, according to the target user identity attribute index, a target user identity type corresponding to the text content of the comment of the user to be identified includes:
acquiring a preset user identity type mapping table, wherein the user identity type mapping table reflects the mapping relation between a user identity attribute index and a corresponding user identity type;
and determining the target user identity type corresponding to the target user identity attribute index according to the preset user identity type mapping table.
7. The method of claim 1, wherein the monitoring the text content of the identified user comment according to the identity type of the target user comprises:
when the target user identity type is a non-legal user identity, deleting the identified user comment text content;
and when the target user identity type is a legal user identity, carrying out corresponding product adjustment according to the identified user comment text content.
8. An apparatus for identifying a comment user, the apparatus comprising:
the first acquisition module is used for acquiring the text content of the user comment to be identified and the matched associated text content of the user comment;
the second obtaining module is used for obtaining a user comment text content vector set according to the user comment text content to be identified and the associated user comment text content;
the input module is used for inputting the user comment text content vector set into a target user identity type identification model to obtain a target user identity attribute index;
and the generating module is used for determining the identity type of the target user corresponding to the comment text content of the user to be identified according to the identity attribute index of the target user and monitoring the comment text content of the identified user according to the identity type of the target user.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 7 are implemented when the computer program is executed by the processor.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN202110963292.9A 2021-08-20 2021-08-20 Comment user identity identification method and device, computer equipment and storage medium Withdrawn CN113743103A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110963292.9A CN113743103A (en) 2021-08-20 2021-08-20 Comment user identity identification method and device, computer equipment and storage medium
CA3170614A CA3170614A1 (en) 2021-08-20 2022-08-17 Commentating user identification recognizing method, device, computer equipment, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110963292.9A CN113743103A (en) 2021-08-20 2021-08-20 Comment user identity identification method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113743103A true CN113743103A (en) 2021-12-03

Family

ID=78732182

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110963292.9A Withdrawn CN113743103A (en) 2021-08-20 2021-08-20 Comment user identity identification method and device, computer equipment and storage medium

Country Status (2)

Country Link
CN (1) CN113743103A (en)
CA (1) CA3170614A1 (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103853948A (en) * 2012-11-28 2014-06-11 阿里巴巴集团控股有限公司 User identity recognizing and information filtering and searching method and server
CN107015993A (en) * 2016-01-28 2017-08-04 ***通信集团上海有限公司 A kind of user type recognition methods and device
US20190122215A1 (en) * 2017-10-19 2019-04-25 Capital One Services, Llc User account controls for online transactions
CN110096499A (en) * 2019-04-10 2019-08-06 华南理工大学 A kind of the user object recognition methods and system of Behavior-based control time series big data
CN110209795A (en) * 2018-06-11 2019-09-06 腾讯科技(深圳)有限公司 Comment on recognition methods, device, computer readable storage medium and computer equipment
CN110706026A (en) * 2019-09-25 2020-01-17 精硕科技(北京)股份有限公司 Abnormal user identification method, identification device and readable storage medium
CN111191099A (en) * 2019-12-30 2020-05-22 中国地质大学(武汉) User activity type identification method based on social media
CN111382366A (en) * 2020-03-03 2020-07-07 重庆邮电大学 Social network user identification method and device based on language and non-language features
CN111444440A (en) * 2020-06-15 2020-07-24 腾讯科技(深圳)有限公司 Identity information identification method and device, electronic equipment and storage medium
CN111585851A (en) * 2020-04-13 2020-08-25 中国联合网络通信集团有限公司 Method and device for identifying private line user
CN112529629A (en) * 2020-12-16 2021-03-19 北京居理科技有限公司 Malicious user comment brushing behavior identification method and system

Also Published As

Publication number Publication date
CA3170614A1 (en) 2023-02-20

Similar Documents

Publication Publication Date Title
CN108874992B (en) Public opinion analysis method, system, computer equipment and storage medium
CN108595695B (en) Data processing method, data processing device, computer equipment and storage medium
CN111192025A (en) Occupational information matching method and device, computer equipment and storage medium
CN108809718B (en) Network access method, system, computer device and medium based on virtual resources
CN110377558A (en) Document searching method, device, computer equipment and storage medium
CN110888911A (en) Sample data processing method and device, computer equipment and storage medium
CN110781379A (en) Information recommendation method and device, computer equipment and storage medium
CN110674131A (en) Financial statement data processing method and device, computer equipment and storage medium
CN112035611B (en) Target user recommendation method, device, computer equipment and storage medium
CN112632139A (en) Information pushing method and device based on PMIS system, computer equipment and medium
CN111651666A (en) User theme recommendation method and device, computer equipment and storage medium
CN108200087B (en) Web intrusion detection method and device, computer equipment and storage medium
CN110633413A (en) Label recommendation method and device, computer equipment and storage medium
CN111190946A (en) Report generation method and device, computer equipment and storage medium
CN112559526A (en) Data table export method and device, computer equipment and storage medium
CN114399396A (en) Insurance product recommendation method and device, computer equipment and storage medium
CN112988997A (en) Response method and system of intelligent customer service, computer equipment and storage medium
CN110533381B (en) Case jurisdiction auditing method, device, computer equipment and storage medium
CN113743103A (en) Comment user identity identification method and device, computer equipment and storage medium
CN113743129B (en) Information pushing method, system, equipment and medium based on neural network
CN112016297B (en) Intention recognition model testing method and device, computer equipment and storage medium
CN114169331A (en) Address resolution method, device, computer equipment and storage medium
CN110780850B (en) Requirement case auxiliary generation method and device, computer equipment and storage medium
CN110222290B (en) Page generation method and device, computer equipment and storage medium
CN113868516A (en) Object recommendation method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20211203