CN111259207A - Short message identification method, device and equipment - Google Patents

Publication number
CN111259207A
Authority
CN
China
Prior art keywords
information
short message
character string
identification information
label
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811459607.0A
Other languages
Chinese (zh)
Inventor
张翅飞
邱俊凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201811459607.0A
Publication of CN111259207A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90344 Query processing by using string matching techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The embodiments of the invention provide a method, an apparatus and a device for identifying short messages. The method comprises: acquiring a short message character string to be identified; extracting feature identification information of the short message character string; and determining, according to the feature identification information, at least one piece of label information for identifying the short message. Because the feature identification information is extracted from the character string and the label information is determined from it, the short message is identified and audited automatically, without manual intervention. This reduces the influence of manual auditing, helps ensure audit quality and efficiency, improves the user experience, and makes the method more intelligent.

Description

Short message identification method, device and equipment
Technical Field
The present invention relates to the field of information identification technologies, and in particular, to a method, an apparatus, and a device for identifying a short message.
Background
A short message is text or numeric information that a user sends or receives through a terminal. A single short message can generally carry 160 English or numeric characters, or 70 Chinese characters. For text-type short messages, to prevent merchants from arbitrarily sending illegal or non-compliant information to users, short messages generally need to be audited before they are sent.
Disclosure of Invention
The embodiments of the invention provide a short message identification method, apparatus and device, which identify and audit short messages without manual intervention, help ensure audit quality and efficiency, and improve the user experience.
In a first aspect, an embodiment of the present invention provides a method for identifying a short message, including:
acquiring a short message character string to be identified;
extracting the characteristic identification information of the short message character string;
and determining at least one piece of label information for identifying the short message according to the characteristic identification information.
In a second aspect, an embodiment of the present invention provides an apparatus for identifying a short message, including:
the acquisition module is used for acquiring a short message character string to be identified;
the extraction module is used for extracting the characteristic identification information of the short message character string;
and the identification module is used for determining at least one piece of label information for identifying the short message according to the characteristic identification information.
In a third aspect, an embodiment of the present invention provides an electronic device, including: a memory, a processor; the memory is configured to store one or more computer instructions, where the one or more computer instructions, when executed by the processor, implement the method for identifying a short message in the first aspect.
In a fourth aspect, an embodiment of the present invention provides a computer storage medium, which is used for storing a computer program, and when the computer program is executed, the method for identifying a short message in the first aspect is implemented.
In a fifth aspect, an embodiment of the present invention provides a method for identifying a short message, which is applied to group-sending short messages, and includes:
acquiring a short message character string to be identified;
extracting the characteristic identification information of the short message character string;
and determining at least one piece of label information for identifying the short message according to the characteristic identification information.
In a sixth aspect, an embodiment of the present invention provides a short message identification apparatus, applied to group sending of short messages, including:
the acquisition module is used for acquiring a short message character string to be identified;
the extraction module is used for extracting the characteristic identification information of the short message character string;
and the identification module is used for determining at least one piece of label information for identifying the short message according to the characteristic identification information.
In a seventh aspect, an embodiment of the present invention provides an electronic device, including: a memory, a processor; the memory is configured to store one or more computer instructions, where the one or more computer instructions, when executed by the processor, implement the method for identifying a short message in the fifth aspect.
In an eighth aspect, an embodiment of the present invention provides a computer storage medium, which is used for storing a computer program, and when the computer program is executed, the method for identifying a short message in the fifth aspect is implemented.
By extracting the feature identification information of the short message character string to be identified and determining, according to that information, at least one piece of label information for identifying the short message, the short message is identified and audited automatically, without manual intervention. This reduces the influence of manual auditing, helps ensure audit quality and efficiency, improves the user experience, and makes the method more intelligent.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
Fig. 1 is a flowchart of a short message identification method according to an embodiment of the present invention;
Fig. 2 is a flowchart of determining at least one piece of label information for identifying the short message according to the feature identification information, according to an embodiment of the present invention;
Fig. 3 is a flowchart of determining, by using a preset database, at least one piece of target information matching the feature identification information, according to an embodiment of the present invention;
Fig. 4 is a flowchart of another short message identification method according to an embodiment of the present invention;
Fig. 5 is a flowchart of another short message identification method according to an embodiment of the present invention;
Fig. 6 is a flowchart of a method for identifying a short message according to another embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a short message identification apparatus according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of an electronic device corresponding to the short message identification apparatus provided in the embodiment shown in Fig. 7.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to limit the invention. As used in the embodiments of the present invention and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly dictates otherwise; "a plurality of" generally means at least two, but does not exclude the case of at least one.
It should be understood that the term "and/or" as used herein merely describes an association between associated objects and indicates that three relationships may exist. For example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates an "or" relationship between the preceding and following associated objects.
The word "if", as used herein, may be interpreted as "when" or "upon" or "in response to determining" or "in response to detecting", depending on the context. Similarly, the phrases "if determined" or "if detected (a stated condition or event)" may be interpreted as "when determined" or "in response to determining" or "when detected (a stated condition or event)" or "in response to detecting (a stated condition or event)", depending on the context.
It is also noted that the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a product or system that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such a product or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the product or system that includes the element.
In addition, the sequence of steps in each method embodiment described below is only an example and is not strictly limited.
Fig. 1 is a flowchart of a short message identification method according to an embodiment of the present invention. Referring to Fig. 1, this embodiment provides a short message identification method that identifies and audits a short message without manual intervention, thereby helping ensure audit quality and efficiency and improving the user experience.
Optionally, the identification method in this embodiment may be applied to group-sent (mass) short messages, for example short messages that a merchant sends to users in bulk. Of course, those skilled in the art may apply it to other types of mass short messages according to specific application requirements and scenarios, for example short messages sent by an enterprise to its staff, by a school to its students, or by a tourism bureau to individuals in a specific area. A mass short message may be a short message sent by any organization or group to individuals.
Specifically, the identification method may include:
S101: Acquire a short message character string to be identified.
The short message character string in this embodiment may be a character string contained in a text-type short message. The text-type short message may be a short message sent point-to-point, or an enterprise short message sent by an enterprise to an individual. The short message character string to be identified is one that a user wants to send and that needs to be audited. The character string edited by the user may be obtained through a preset server, a preset database, a preset data interface, or a preset data port. Of course, those skilled in the art may also obtain it in other ways, provided that the character string is obtained accurately and reliably; details are not repeated here.
S102: Extract the feature identification information of the short message character string.
The feature identification information includes at least one of: a character string template, character string features, and signature information. Specifically, the character string template is the character string that remains after the variable parts of the short message character string are abstracted away; it may also be called the short message scheme. The character string features are the set of items such as QQ numbers, WeChat IDs, mobile phone numbers, official accounts, group numbers, and URLs that appear in the short message character string. The signature information is generally added at the beginning of a short message character string as a short message identifier, and may specifically consist of a pair of square brackets enclosing the corresponding keywords.
In addition, when the feature identification information includes a character string template, extracting the feature identification information of the short message character string may be implemented in any of the following ways. First, remove first-type information from the short message character string to obtain its character string template, where the first-type information includes at least one of: address, name, nickname, password, order number; that is, the character string template is what remains of the short message character string apart from the first-type information. Second, remove second-type information to obtain the character string template, where the second-type information includes at least one of: numbers, letters, and content matching an address library, a name library, or a nickname library; that is, the character string template is the short message character string without the second-type information. Third, remove third-type information to obtain the character string template, where the third-type information includes at least one of: numbers, letters, symbols. Of course, those skilled in the art may also extract the character string template in other ways, provided that the template is obtained accurately and stably; details are not repeated here.
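The third variant above (removing numbers, letters, and symbols) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, "letters" is taken to mean ASCII letters, "symbols" is taken to mean Unicode punctuation and symbol characters, and the sample message is invented. The patent does not prescribe any of these choices.

```python
import string
import unicodedata

ASCII_LETTERS = set(string.ascii_letters)


def is_third_type(ch: str) -> bool:
    """Return True for digits, ASCII letters, and punctuation/symbol characters.

    Assumption: "letters" means ASCII letters, so Chinese characters are kept.
    """
    if ch.isdigit() or ch in ASCII_LETTERS:
        return True
    # Unicode general categories starting with P (punctuation) or S (symbol).
    return unicodedata.category(ch)[0] in ("P", "S")


def template_without_third_type(sms: str) -> str:
    """Build a character string template by dropping third-type information."""
    kept = "".join(ch for ch in sms if not is_third_type(ch))
    # Normalize any whitespace left behind by the removed characters.
    return " ".join(kept.split())


if __name__ == "__main__":
    # Hypothetical Chinese message: "Your verification code is 1234, do not disclose it."
    print(template_without_third_type("您的验证码是1234，请勿泄露。"))
```

Because every variable digit run disappears, two messages that differ only in the code they carry reduce to the same template, which is what makes the template usable as a lookup key.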
To facilitate understanding of how the feature identification information is extracted, two examples follow.
Example one: the short message character string to be identified is: [SMS verification code platform] Dear customer, your express parcel arrived at our location on 9.17; please collect it as soon as possible. Contact phone: 4001148008, address: 28 New Outer Street, Western District.
For this short message character string, when the feature identification information includes a character string template, the extracted feature identification information may be: "Dear customer, your express parcel arrived at our location on xxx; please collect it as soon as possible. Contact phone: xxxxxxxxxx, address: xxxxxxxxx". Here the date, the contact phone number, and the specific address are extracted as variable parts, and the remaining character string is the character string template. When the feature identification information includes character string features, the extracted feature identification information may be: 4001148008. When the feature identification information includes signature information, the extracted feature identification information may be: [SMS verification code platform].
Example two: the short message character string to be identified is:
[xx Trading] Your order has been delivered; please collect it with pickup code E41. The goods are stored free of charge for 24 hours; for details see http://youhuiquan666.cn/?E41
For this short message character string, when the feature identification information includes a character string template, the extracted feature identification information may be: "Your order has been delivered; please collect it with pickup code xxx. The goods are stored free of charge for 24 hours; for details see xxxxxxxxxxxx". Here the pickup code and the specific URL are extracted as variable parts, and the remaining character string is the character string template. When the feature identification information includes character string features, the extracted feature identification information may be: http://youhuiquan666.cn/?E41. When the feature identification information includes signature information, the extracted feature identification information may be: [xx Trading]. Clearly, the specific character string obtained as feature identification information differs depending on which type of information it includes.
In addition, the extraction methods and results above are only examples; those skilled in the art may use other extraction methods and obtain different results. For instance, in example one, when the feature identification information includes a character string template, the extracted template may instead be: "Dear customer, your express parcel arrived at our location on xxxx; please collect it as soon as possible. Contact phone: 4001148008, address: 28 New Outer Street, Western District". Here only the date is treated as a variable part and all other characters are fixed, which yields a character string template different from the one given for example one above.
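The two examples can be approximated with a few regular expressions. The following Python sketch is an assumption-laden illustration, not the extraction method of the patent: the function name, the regular expressions, and the rule that a digit run of seven or more characters counts as a phone-like feature are all hypothetical, and unlike the examples it does not abstract street addresses.

```python
import re

# Leading bracketed signature, e.g. "[xx Trading] ...".
SIGNATURE_RE = re.compile(r"^\s*\[\s*(.+?)\s*\]\s*")
URL_RE = re.compile(r"https?://\S+")
# Assumption: a run of 7 or more digits is treated as a phone-like number.
PHONE_RE = re.compile(r"\d{7,}")
# Any token containing a digit (dates such as 9.17, codes such as E41).
VARIABLE_RE = re.compile(r"\b[A-Za-z]*\d[\w.]*\b")


def extract_feature_identification(sms: str) -> dict:
    """Split a message into signature, character string features, and template."""
    match = SIGNATURE_RE.match(sms)
    signature = match.group(1) if match else None
    body = sms[match.end():] if match else sms

    urls = URL_RE.findall(body)
    # Search for phone-like numbers outside URLs only.
    phones = PHONE_RE.findall(URL_RE.sub(" ", body))

    # Replace the variable parts with a placeholder to obtain the template.
    template = VARIABLE_RE.sub("xxx", URL_RE.sub("xxx", body))
    return {
        "signature": signature,
        "string_features": urls + phones,
        "template": template,
    }


if __name__ == "__main__":
    sms = "[xx Trading] Pickup code E41, see http://youhuiquan666.cn/?E41"
    print(extract_feature_identification(sms))
```

Changing which patterns are treated as variable changes the resulting template, which mirrors the point made above that different extraction methods give different templates for the same message.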
S103: Determine, according to the feature identification information, at least one piece of label information for identifying the short message.
After the feature identification information is acquired, it may be analyzed, and at least one piece of label information may be determined from the analysis result; the label information is used to identify the short message. Specifically, the label information may include at least: first label information, which identifies the industry to which the short message belongs, and second label information, which identifies the application intention of the short message.
For example, consider the following short message character string: [xx Bank] Your mobile phone number ending in 0856 has successfully signed up for mobile banking. The client download address is: http://www.xxbank.com.cn/mobile/
For this short message character string, the feature identification information may be extracted first and then analyzed. For example, analyzing the signature information [xx Bank] determines that the industry to which the short message belongs is banking, and analyzing the character string template determines that the intention of the short message is to notify the user of the client download. The short message can therefore be classified as a bank notification short message. Regarding the industry, under a preset industry classification, banks are a subclass of the preset finance class, so the first label information, which identifies the industry to which the short message belongs, may be: finance-bank, or bank. Regarding the intention, under a preset intention classification, client download may be a subclass of the notification class, so the second label information of the short message may be: notification-client download, or client download.
For the first label information, a person skilled in the art may choose its representation according to the field or the application requirements. For example, the first label information may be two-level information: finance-bank. In this case, the second-level information "bank" is a subclass of the first-level information "finance", and the first-level information may also include other second-level information, such as: internet finance, securities, insurance, and so on. Alternatively, the first label information may be three-level information: finance-bank-reward part, and so on. In this case, the third-level information "reward part" is a subclass of the second-level information "bank", the second-level information "bank" is a subclass of the first-level information "finance", and the second-level information may also include other third-level information, for example: business, management, and so on. Of course, the first label information may take other forms, provided that it accurately identifies the industry to which the short message belongs; details are not repeated here.
Similarly, for the second label information, a person skilled in the art may choose its representation according to the field or the application requirements. For example, the second label information may be two-level information: marketing-telemarketing, in which the second-level information "telemarketing" is a subclass of the first-level information "marketing"; the first-level information may also include other second-level information, such as: network marketing, topic marketing, and the like. Of course, the second label information may take other forms, provided that it accurately identifies the application intention of the short message; details are not repeated here.
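One way to hold such multi-level labels is as a path from the most general level to the most specific one. The Python sketch below is a hypothetical representation, not a structure defined by the patent; the class name `Label`, its fields, and its methods are assumptions made for illustration.

```python
from typing import NamedTuple, Sequence, Tuple


class Label(NamedTuple):
    """A hierarchical label, e.g. kind="industry", path=("finance", "bank")."""

    kind: str               # "industry" (first label) or "intent" (second label)
    path: Tuple[str, ...]   # levels from most general to most specific

    def render(self, full: bool = True) -> str:
        """Return "finance-bank" (full form) or just "bank" (short form)."""
        return "-".join(self.path) if full else self.path[-1]

    def is_under(self, ancestor: Sequence[str]) -> bool:
        """Check whether this label is a subclass of the given higher level."""
        ancestor = tuple(ancestor)
        return self.path[: len(ancestor)] == ancestor


if __name__ == "__main__":
    industry = Label("industry", ("finance", "bank"))
    intent = Label("intent", ("notification", "client download"))
    print(industry.render(), "|", intent.render(full=False))
```

Keeping the whole path lets the same label be reported either in its full form or in its short form, matching the two renderings given in the text (finance-bank, or bank).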
In addition, this embodiment does not limit how the at least one piece of label information for identifying the short message is determined from the feature identification information; a person skilled in the art may choose an implementation according to specific design requirements. For example, a mapping relationship between feature identification information and at least one piece of label information may be stored in advance, and the label information may be determined through that mapping. Alternatively, a database may be preset that stores the correspondence between standard feature identifiers and standard labels; the standard feature identifier matching the feature identification information is looked up in the database, and the standard label of the matched standard feature identifier is determined as the label information. Of course, those skilled in the art may also determine the label information in other ways; details are not repeated here.
In addition, since the feature identification information may include at least one of a character string template, a character string feature, and signature information, determining at least one tag information for identifying the short message according to the feature identification information may include: determining at least one piece of first label information for identifying the short message according to the character string template; and/or determining at least one piece of second label information for identifying the short message according to the character string characteristics; and/or determining at least one piece of third label information for identifying the short message according to the signature information; and processing the first label information and/or the second label information and/or the third label information according to a preset label principle, and determining at least one final label information for identifying the short message.
The implementation processes of determining the first tag information, the second tag information and the third tag information are independent and do not interfere with each other, and the first tag information, the second tag information and the third tag information may be the same or different, that is: when corresponding tag information is determined according to different feature identification information, one tag information may be acquired, or a plurality of different tag information may also be acquired, so that one short message may correspond to one tag information, or a plurality of tag information may also correspond to one short message.
Further, this embodiment does not limit the specific content of the preset label principle; those skilled in the art may set it according to specific design requirements. For example, the preset label principle may be a label weighting principle, a label priority principle, a label filtering principle, or the like. Under the label weighting principle, a score and a weight are set for each piece of label information, each score is multiplied by its weight, and the label information whose sum of products is greater than a preset threshold is determined as the final label information. Under the label priority principle, a limited number of pieces of label information are selected in a preset priority order to form the final label information. Under the label filtering principle, one or some of the pieces of label information are removed, and the remainder constitutes the final label information. Of course, those skilled in the art may also adopt other label principles, provided that the final label information is obtained accurately; details are not repeated here.
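The three label principles can be sketched as small Python functions. This is one possible reading, stated as an assumption: each candidate label carries a score and comes from one source (template, features, or signature), each source has a weight, and the products are summed per label. The function names, data shapes, and sample values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def weighted_labels(
    candidates: Iterable[Tuple[str, str, float]],
    weights: Dict[str, float],
    threshold: float,
) -> List[str]:
    """Label weighting: keep labels whose sum of score * weight exceeds threshold.

    candidates holds (source, label, score) triples; weights maps source -> weight.
    """
    totals: Dict[str, float] = defaultdict(float)
    for source, label, score in candidates:
        totals[label] += score * weights[source]
    return sorted(label for label, total in totals.items() if total > threshold)


def priority_labels(labels: Iterable[str], priority: List[str], limit: int) -> List[str]:
    """Label priority: take at most `limit` labels in the preset priority order."""
    rank = {label: i for i, label in enumerate(priority)}
    unique = list(dict.fromkeys(labels))  # de-duplicate, keep first-seen order
    # Labels missing from the priority list are ranked last.
    return sorted(unique, key=lambda label: rank.get(label, len(rank)))[:limit]


def filtered_labels(labels: Iterable[str], removed: Set[str]) -> List[str]:
    """Label filtering: drop the listed labels and keep the rest."""
    return [label for label in labels if label not in removed]


if __name__ == "__main__":
    candidates = [
        ("template", "finance-bank", 0.9),
        ("signature", "finance-bank", 1.0),
        ("features", "marketing", 0.4),
    ]
    weights = {"template": 0.5, "features": 0.2, "signature": 0.3}
    print(weighted_labels(candidates, weights, threshold=0.5))
```

With these sample values, "finance-bank" accumulates 0.9 × 0.5 + 1.0 × 0.3 = 0.75 and passes the 0.5 threshold, while "marketing" accumulates 0.08 and is dropped.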
In the short message identification method provided by this embodiment, the feature identification information of the short message character string to be identified is extracted, and at least one piece of label information for identifying the short message is determined according to it. The short message is thus identified and audited automatically, without manual intervention, which reduces the influence of manual auditing, helps ensure audit quality and efficiency, improves the user experience, and makes the method more intelligent.
Fig. 2 is a flowchart of determining at least one piece of label information for identifying a short message according to the feature identification information, according to an embodiment of the present invention; Fig. 3 is a flowchart of determining, by using a preset database, at least one piece of target information matching the feature identification information, according to an embodiment of the present invention. Building on the foregoing embodiment and referring to Figs. 2 and 3, this embodiment does not limit the specific way in which at least one piece of label information for identifying a short message is determined from the feature identification information; a person skilled in the art may choose an implementation according to specific design requirements. Preferably, determining at least one piece of label information for identifying a short message according to the feature identification information in this embodiment may include:
S1031: Determine, by using a preset database, at least one piece of target information matching the feature identification information.
Wherein the database may include at least one of: the system comprises a template library, a feature library and a signature library, wherein the template library stores corresponding relations between a plurality of character string templates and a plurality of first labels, the feature library stores corresponding relations between a plurality of character string features and a plurality of second labels, and the signature library stores corresponding relations between a plurality of signature information and a plurality of third labels.
It is to be understood that, when the extracted feature identification information includes a character string template, the corresponding database includes a template library; when the extracted feature identification information comprises character string features, the corresponding database comprises a feature library; when the extracted feature identification information comprises signature information, the corresponding database comprises a signature library; that is, the database corresponds to the feature identification information.
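The correspondence between the type of feature identification information and the library consulted can be sketched with plain dictionaries. The Python below is a hypothetical layout, not the database schema of the patent: the library contents, the key names, and the use of exact-match lookup are all assumptions made to keep the illustration short.

```python
from typing import Dict, List

# Hypothetical libraries: each maps standard feature information to a label.
DATABASES: Dict[str, Dict[str, str]] = {
    "template": {"your verification code is xxx": "notification-verification code"},
    "feature": {"http://www.xxbank.com.cn/mobile/": "notification-client download"},
    "signature": {"xx Bank": "finance-bank"},
}


def lookup_labels(
    features: Dict[str, List[str]],
    databases: Dict[str, Dict[str, str]],
) -> Dict[str, List[str]]:
    """Consult, for each kind of extracted feature, only its matching library.

    Exact-match lookup is an assumption made for brevity.
    """
    result: Dict[str, List[str]] = {}
    for kind, values in features.items():
        library = databases.get(kind, {})
        result[kind] = [library[value] for value in values if value in library]
    return result


if __name__ == "__main__":
    extracted = {
        "signature": ["xx Bank"],
        "feature": ["http://www.xxbank.com.cn/mobile/"],
    }
    print(lookup_labels(extracted, DATABASES))
```

Only the libraries that correspond to the extracted kinds are consulted, so a message for which no template was extracted never touches the template library.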
Specifically, the determining, by using a preset database, at least one target information matched with the feature identification information may include:
S10311: analyzing and comparing the feature identification information with all standard feature information stored in the database to obtain the similarity between the feature identification information and each piece of standard feature information.
S10312: determining at least one piece of standard feature information whose similarity is greater than or equal to a preset similarity threshold as the target information.
The similarity threshold is preset. This embodiment does not limit its specific value, and those skilled in the art may set it according to specific design requirements; for example, the similarity threshold may be 80%, 85%, 90%, or 95%. It can be understood that the larger the similarity threshold, the stricter the match and the more accurate the identification of the feature identification information. After the similarity between the feature identification information and each piece of standard feature information is obtained, a similarity greater than or equal to the similarity threshold indicates that the standard feature information matches the feature identification information; at least one piece of standard feature information satisfying this condition may then be determined as the target information.
S1032: acquiring a target tag corresponding to the target information.
S1033: determining the at least one target tag as the at least one piece of tag information.
For example, when the feature identification information includes a character string template, the feature identification information of the short message character string may be extracted first, and then at least one target character string template matching the character string template may be determined in a preset template library, specifically, the character string template is analyzed and compared with all standard character string templates pre-stored in the template library to obtain similarity information between the character string template and each standard character string template; and determining at least one standard character string template with the similarity information being greater than or equal to a preset similarity threshold as at least one target information. After the target information is obtained, a first tag corresponding to the target information may be obtained by using a preset mapping relationship, where the first tag is a target tag, so that the obtained at least one target tag may be determined as at least one tag information.
It can be understood that, when the feature identification information includes character string features or signature information, the process of determining the at least one piece of tag information is similar to the process used when the feature identification information includes a character string template; reference may be made to the above description, which is not repeated here.
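Steps S10311 to S1033 can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation of the embodiment: the text does not specify how the similarity is computed, so Python's `difflib.SequenceMatcher` is used as a stand-in, and the library contents, the function names `similarity` and `match_tags`, and the 0.9 threshold are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical template library: standard character string template -> first label.
TEMPLATE_LIBRARY = {
    "your pickup code is , please collect your parcel": "express company / pickup notification",
    "your verification code is , valid for minutes": "internet service / verification code",
}


def similarity(a, b):
    """Return a similarity ratio in [0, 1]. SequenceMatcher is an assumed stand-in."""
    return SequenceMatcher(None, a, b).ratio()


def match_tags(feature, library, threshold=0.9):
    """Return the tags of every standard entry whose similarity reaches the threshold."""
    # S10311 / S10312: compare with every standard entry and keep those that qualify.
    targets = [std for std in library if similarity(feature, std) >= threshold]
    # S1032 / S1033: the tags mapped to the target information become the tag information.
    return [library[std] for std in targets]


tags = match_tags("your pickup code is , please collect your parcel", TEMPLATE_LIBRARY)
```

Raising `threshold` makes the match stricter, which mirrors the remark above that a larger similarity threshold gives a more accurate identification at the cost of fewer matches.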
It should be noted that, after step S10311, the method in this embodiment may further include:
S10313: when the similarity between the feature identification information and every piece of standard feature information is smaller than the similarity threshold, determining at least one piece of tag information for identifying the short message according to other information in the feature identification information.
After the characteristic identification information is analyzed and compared with each standard characteristic information in the database, when the similarity information between the characteristic identification information and each standard characteristic information is smaller than a similarity threshold value, it is indicated that the standard characteristic information meeting preset conditions does not exist in the database at the moment, and further, in order to ensure the accuracy of short message character string identification, at least one piece of label information can be determined by using other information in the characteristic identification information.
A first realizable case: the feature identification information includes a character string template and character string features. When determining at least one piece of tag information for identifying the short message according to the feature identification information, at least one target character string template matching the character string template may first be searched for in the preset template library. When no qualifying target character string template (target information) exists in the template library, that is, when the similarity between the character string template and every standard character string template in the template library is smaller than the similarity threshold, the at least one piece of tag information is determined according to the other information (the character string features) in the feature identification information. That is, at least one target character string feature (target information) matching the character string features is determined in the preset feature library, the target tag corresponding to the target character string feature is acquired, and the at least one target tag is determined as the at least one piece of tag information for identifying the short message.
Therefore, when the feature identification information includes the character string template and the character string features, the character string template may be analyzed first; when no tag information can be determined from the character string template, the at least one piece of tag information for identifying the short message is determined from the other information (the character string features) in the feature identification information. It is conceivable that the implementation is not limited to this order; for example, the character string features may be analyzed first, and when no tag information can be determined from the character string features, the at least one piece of tag information is determined from the other information (the character string template) in the feature identification information.
It is conceivable that, when the feature identification information includes the character string template and the signature information, or includes the character string features and the signature information, the process of determining the at least one piece of tag information for identifying the short message is similar to that of the first case; reference may be made to the above description, which is not repeated here.
A second realizable case: the feature identification information includes a character string template, character string features, and signature information. In this case, at least one target character string template matching the character string template may first be searched for in the preset template library. When no qualifying target character string template exists in the template library, that is, when the similarity between the character string template and every standard character string template in the template library is smaller than the similarity threshold, the at least one piece of tag information for identifying the short message is determined according to other information (the character string features or the signature information) in the feature identification information. Taking the character string features as an example: at least one target character string feature matching the character string features may be searched for in the preset feature library. When no qualifying target character string feature exists in the feature library, that is, when the similarity between the character string features and every standard character string feature in the feature library is smaller than the similarity threshold, the at least one piece of tag information is determined according to the remaining information (the signature information): at least one piece of target signature information is determined in the signature library, the target tag corresponding to the target signature information is acquired, and the at least one target tag is determined as the at least one piece of tag information for identifying the short message.
Therefore, when the characteristic identification information comprises the character string template, the character string characteristics and the signature information, the character string template can be analyzed firstly, and when at least one piece of label information cannot be determined according to the character string template, at least one piece of label information for identifying the short message is determined according to other information (character string characteristics) in the characteristic identification information, so that at least one piece of label information is determined; when the at least one piece of label information can not be determined according to the character string characteristics, the at least one piece of label information for identifying the short message is further determined according to other information (signature information) in the characteristic identification information, so that the at least one piece of label information is determined.
It is conceivable that the process of determining the at least one piece of tag information is not limited to the order described above. For example, the signature information may be analyzed first, and when no tag information can be determined from the signature information, the at least one piece of tag information for identifying the short message is determined from the other information (the character string template and the character string features) in the feature identification information; alternatively, the character string features may be analyzed first, and when no tag information can be determined from them, the at least one piece of tag information is determined from the other information (the character string template and the signature information); and so on, as long as the accuracy and reliability of determining the at least one piece of tag information can be ensured, which is not described herein again.
A third realizable case: when the feature identification information includes a character string template, character string features, and signature information, it is also possible that no tag information for identifying the short message can be determined from any of the feature identification information; that is, no qualifying target character string template exists in the preset template library, no qualifying target character string feature exists in the preset feature library, and no qualifying target signature information exists in the preset signature library.
Determining the at least one piece of tag information in the above manner not only ensures the accuracy and reliability of acquiring the tag information, but also allows multiple implementations, which effectively improves the flexibility of the method and broadens its range of application.
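The three cases above amount to a fallback cascade over the libraries, which can be sketched as follows. This is an illustrative sketch only: `difflib.SequenceMatcher` is an assumed similarity measure, and the names `find_tags` and `identify`, the `order` parameter, and the dictionary layout of the libraries are hypothetical.

```python
from difflib import SequenceMatcher


def find_tags(feature, library, threshold=0.9):
    """Tags of all standard entries whose similarity to `feature` reaches the threshold."""
    return [tag for std, tag in library.items()
            if SequenceMatcher(None, feature, std).ratio() >= threshold]


def identify(features, libraries, order=("template", "feature", "signature"), threshold=0.9):
    """Try each kind of feature identification information in turn.

    Falls back to the next kind when the current library has no qualifying entry.
    An empty result corresponds to the third case, where nothing matches.
    """
    for kind in order:
        if kind not in features or kind not in libraries:
            continue
        tags = find_tags(features[kind], libraries[kind], threshold)
        if tags:
            return tags
    return []


# Hypothetical libraries and extracted feature identification information.
libraries = {
    "template": {"abc": "T"},
    "feature": {"pickup code": "F"},
    "signature": {"xx trade": "S"},
}
features = {"template": "zzz", "feature": "pickup code", "signature": "xx trade"}
result = identify(features, libraries)
```

Passing a different `order`, for example `("signature", "feature", "template")`, reflects the statement that the analysis order is not fixed.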
Fig. 4 is a flowchart of another short message identification method according to an embodiment of the present invention; on the basis of the foregoing embodiment, with reference to fig. 4, after determining at least one piece of tag information for identifying a short message according to the feature identification information, the method in this embodiment may further include:
S201: extracting all feature identification information of the short message character string.
S202: when the feature identification information includes non-warehoused information, that is, information not yet stored in the database, associating the non-warehoused information with the tag information and storing it in the database.
In a specific application, in order to improve the accuracy of the method, the database in this embodiment may learn autonomously and update its data. Specifically, after at least one piece of tag information is determined according to the feature identification information, all feature identification information of the short message character string may be extracted, that is, the character string template, the character string features, the signature information, and the like. The extracted feature identification information is then analyzed to judge whether it contains non-warehoused information, that is, information not yet stored in the corresponding database. When the feature identification information includes a character string template, the non-warehoused information may be some or all of the character string templates that are not stored in the template library; when the feature identification information includes character string features, the non-warehoused information may be some or all of the character string features that are not stored in the feature library; and when the feature identification information includes signature information, the non-warehoused information may be some or all of the signature information that is not stored in the signature library. If non-warehoused information exists, a correspondence between the non-warehoused information and the tag information is established and stored in the corresponding database.
For example, when the feature identification information includes a character string template, at least one piece of tag information may be determined according to the character string template. After the tag information is determined, the other information (the character string features and the signature information) may be extracted, so that all feature identification information of the short message character string, that is, the character string template, the character string features, the signature information, and the like, is acquired. Because the tag information was determined according to the character string template, the character string template need not be analyzed again, and the character string features and the signature information can be analyzed directly. If the character string features include non-warehoused information that is not stored in the feature library, and/or the signature information includes non-warehoused information that is not stored in the signature library, all of the non-warehoused information is associated with the determined at least one piece of tag information and stored in the corresponding database.
It can be understood that, when no tag information can be determined from any of the feature identification information, the short message can be checked manually to determine its tag information. After the tag information is determined, all extracted feature identification information of the short message character string is associated with the tag information and stored in the corresponding databases.
By learning the data automatically or manually, the data is completed and updated, and the probability of manual intervention is further reduced. This improves the quality and efficiency of short message identification and auditing, effectively ensures the stability, reliability, and degree of intelligence of the method, and facilitates its popularization and application in the market.
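The update described in steps S201 and S202 can be sketched as follows. This is a minimal illustration under assumptions: the libraries are modeled as plain dictionaries keyed by kind ("template", "feature", "signature"), exact-key lookup stands in for the check of whether information is already stored, and the function name `store_non_warehoused` is hypothetical.

```python
def store_non_warehoused(features, tags, libraries):
    """Store every extracted item that is missing from its library, mapped to `tags`.

    features:  kind -> list of extracted items for one short message
    tags:      tag information already determined for that short message
    libraries: kind -> {item: tags}, updated in place
    Returns the (kind, item) pairs that were newly stored.
    """
    added = []
    for kind, items in features.items():
        library = libraries.setdefault(kind, {})
        for item in items:
            if item not in library:  # non-warehoused information
                library[item] = list(tags)
                added.append((kind, item))
    return added


# The tag was determined from the template, which is already stored;
# the feature and the signature are new and get stored with that tag.
libraries = {"template": {"pickup template": ["express / pickup notification"]}}
features = {
    "template": ["pickup template"],
    "feature": ["pick-up code"],
    "signature": ["xx trade"],
}
added = store_non_warehoused(features, ["express / pickup notification"], libraries)
```

The same function covers the manual path: once a reviewer has supplied the tag information, all extracted feature identification information can be passed in and stored.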
Fig. 5 is a flowchart of another short message identification method according to an embodiment of the present invention; referring to fig. 5, before acquiring the short message character string to be recognized, the method in this embodiment may further include:
S301: acquiring a plurality of preset short message sample character strings and a sample label corresponding to each short message sample character string.
The short message sample character string and the sample label can be from historical operating data or historical storage data, or can also be sample data directly input by a user.
S302: extracting sample characteristic identification information of each short message sample character string, wherein the sample characteristic identification information comprises: a string template, string characteristics, and signature information.
S303: associating the sample feature identification information with the sample label, and storing them in a preset database.
For example, consider the following short message sample character string: [xx Trade] Please pick up your item with pick-up code E41; storage is free for 24 hours. See http://youhuiquan666.cn/?E41. By identifying this short message sample character string, the corresponding sample labels can be obtained as follows: the first sample label, determined based on the sample character string template, is: industry: express company, intent: pick-up notification; the second sample label, determined based on the sample character string features, is: industry: fraud company, intent: fraud; the third sample label, determined based on the sample signature information, is: industry: foreign trade enterprise, intent: unknown/null.
After the short message sample character string is obtained, the character string template, the character string features, and the signature information in the short message sample character string are extracted. The character string template is associated with the first sample label and stored in the template library; the character string features are associated with the second sample label and stored in the feature library; and the signature information is associated with the third sample label and stored in the signature library. The corresponding databases are thereby established, namely a template library, a feature library, and a signature library, where the template library stores correspondences between a plurality of character string templates and a plurality of first labels, the feature library stores correspondences between a plurality of character string features and a plurality of second labels, and the signature library stores correspondences between a plurality of pieces of signature information and a plurality of third labels.
The data in the database is obtained through the method, so that the accuracy and reliability of identifying the short message by using the database are effectively ensured, and the accuracy of the method is further improved.
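Steps S301 to S303 can be sketched as follows. This is an illustrative sketch, not the extraction logic of the embodiment: the extractor `extract_signature` (which treats text inside a leading pair of square brackets as the signature), the function name `build_libraries`, and the sample data are all hypothetical.

```python
import re


def extract_signature(sms):
    """Hypothetical extractor: text inside a leading [...] is taken as the signature."""
    match = re.match(r"\s*\[\s*(.*?)\s*\]", sms)
    return match.group(1) if match else None


def build_libraries(samples, extractors):
    """Build one library per kind of sample feature identification information.

    samples:    iterable of (sample string, {kind: sample label})     (S301)
    extractors: {kind: function extracting that kind of information}  (S302)
    Returns {kind: {extracted information: sample label}}             (S303)
    """
    libraries = {kind: {} for kind in extractors}
    for text, sample_labels in samples:
        for kind, extract in extractors.items():
            info = extract(text)
            if info is not None and kind in sample_labels:
                libraries[kind][info] = sample_labels[kind]
    return libraries


samples = [
    ("[xx trade] pick-up code E41", {"signature": "foreign trade enterprise / unknown"}),
]
libraries = build_libraries(samples, {"signature": extract_signature})
```

Extractors for the character string template and the character string features would be registered in `extractors` in the same way, each filling its own library.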
Fig. 6 is a flowchart of a method for identifying a short message according to another embodiment of the present invention; on the basis of the foregoing embodiment, as can be seen with reference to fig. 6, in order to further improve the practicability of the method, the method in this embodiment may further include:
S401: identifying whether the short message is legal according to the tag information.
S402: when the short message is legal, sending the short message according to a preset scheduling rule.
S403: when the short message is illegal, intercepting the short message.
After the tag information is determined, whether the short message is legal may be identified according to the tag information. This embodiment does not limit the specific implementation of this identification, and a person skilled in the art may set it according to specific design requirements. For example, the tag information may be analyzed and compared with a preset legal tag library; if no legal tag corresponding to the tag information exists in the legal tag library, the short message is deemed illegal, and otherwise the short message is deemed legal. Alternatively, the tag information may be checked against a preset legality principle; if the tag information conforms to the principle, the short message is legal, and otherwise it is illegal. Of course, a person skilled in the art may also use other manners to identify whether the short message is legal according to the tag information, as long as the accuracy and reliability of the identification can be ensured, which is not described herein again.
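The legal tag library variant of steps S401 to S403 can be sketched as follows. The contents of `LEGAL_TAG_LIBRARY` and the function names are hypothetical, and treating a short message with no tag information as not legal is an assumption of this sketch, not something the text states.

```python
# Hypothetical legal tag library.
LEGAL_TAG_LIBRARY = {"express company", "pickup notification", "verification code"}


def is_legal(tag_info, legal_tags=LEGAL_TAG_LIBRARY):
    """S401: legal only if every tag has a counterpart in the legal tag library.

    Assumption: an empty tag list is treated as not legal.
    """
    tags = list(tag_info)
    return bool(tags) and all(tag in legal_tags for tag in tags)


def handle(tag_info):
    """S402 / S403: decide whether the short message is sent or intercepted."""
    return "send" if is_legal(tag_info) else "intercept"


decision = handle(["express company", "pickup notification"])
```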
When the short message is identified as legal, it may be sent based on a preset scheduling rule. The scheduling rule may differ by operator, region, and field, so that it can be applied flexibly in specific applications. For example, for a legal first short message, the candidate operators may include operator A and operator B, and operator B does not allow the first short message to be sent; operator A may then be used to send the first short message based on the preset scheduling rule. Alternatively, for a legal second short message, the destination regions include Beijing and Shanghai, but the operator does not allow the second short message to be sent to Shanghai; the operator may then be used to send the second short message to Beijing based on the preset scheduling rule. Of course, the scheduling rule in this embodiment may also take other forms, and those skilled in the art may set it according to specific application requirements and use requirements, which are not described herein again.
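The two scheduling examples above can be sketched as a filter over candidate routes. The representation is an assumption made for illustration: a route is an (operator, region) pair, `blocked` lists the pairs that are not allowed for the short message, and `"*"` is a hypothetical wildcard meaning every region.

```python
def allowed_routes(operators, regions, blocked):
    """Return the (operator, region) pairs over which the short message may be sent."""
    routes = []
    for operator in operators:
        for region in regions:
            if (operator, "*") in blocked or (operator, region) in blocked:
                continue  # this operator refuses the message, at least for this region
            routes.append((operator, region))
    return routes


# First example: operator B does not allow the first short message at all.
first = allowed_routes(["A", "B"], ["Beijing"], {("B", "*")})

# Second example: the operator does not allow sending to Shanghai.
second = allowed_routes(["A"], ["Beijing", "Shanghai"], {("A", "Shanghai")})
```

A real scheduling rule could also weigh field or cost; the sketch covers only the operator and region dimensions named in the examples.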
In this embodiment, illegal and prohibited short messages can be identified through the tag information and intercepted, while legal short messages are scheduled and optimized. Multi-dimensional, multi-level control, scheduling, and decision making can thus be realized, which further improves the quality and efficiency of short message auditing.
Fig. 7 is a schematic structural diagram of a short message identification apparatus according to an embodiment of the present invention; referring to fig. 7, the present embodiment provides a short message identification apparatus, which can perform the above-mentioned identification method. Optionally, the identification device in this embodiment may be applied to mass texting, where the mass texting may include a short message sent by a merchant to a user in mass. Of course, those skilled in the art can select other types of mass texting according to specific application requirements and application scenarios, for example: the short messages sent by the enterprise to the staff, the school to the students, the tourist bureau to the individuals in the specific area, and the like, and the mass short messages can be the short messages sent by any organization or group organization to the individuals.
Specifically, the identification means may include: an acquisition module 11, an extraction module 12 and an identification module 13.
The acquisition module 11 is used for acquiring a short message character string to be identified;
the extraction module 12 is used for extracting the feature identification information of the short message character string;
and the identification module 13 is configured to determine at least one piece of tag information for identifying the short message according to the feature identification information.
Wherein the feature identification information includes at least one of: a character string template, character string features, and signature information. In addition, the tag information includes at least: first label information for identifying the industry to which the short message belongs, and second label information for identifying the application intention of the short message.
Optionally, when the identification module 13 determines at least one piece of tag information for identifying the short message according to the feature identification information, the identification module 13 may be configured to perform: determining at least one piece of target information matched with the characteristic identification information by using a preset database; acquiring a target label corresponding to target information; determining at least one target tag as at least one tag information.
Wherein the database comprises at least one of: a template library, a feature library, and a signature library. The template library stores correspondences between a plurality of character string templates and a plurality of first labels, the feature library stores correspondences between a plurality of character string features and a plurality of second labels, and the signature library stores correspondences between a plurality of pieces of signature information and a plurality of third labels.
In addition, when the recognition module 13 determines at least one target information matching the feature identification information by using a preset database, the recognition module 13 may be configured to perform: analyzing and comparing the characteristic identification information with all standard characteristic information stored in a database to obtain the similarity between the characteristic identification information and each standard characteristic information; and determining at least one piece of standard characteristic information with the similarity greater than or equal to a preset similarity threshold as target information.
Optionally, the identification module 13 in the apparatus may be further configured to perform: and when the similarity information between the characteristic identification information and each standard characteristic information is smaller than a similarity threshold value, determining at least one piece of label information for identifying the short message according to other information in the characteristic identification information.
Optionally, after determining at least one piece of tag information for identifying the short message according to the feature identification information, the extraction module 12 is further configured to extract all feature identification information of the short message character string;
at this time, the apparatus may further include a storage module 14, where the storage module 14 is configured to, when the feature identification information includes non-warehoused information that is not stored in the database, associate the non-warehoused information with the tag information and store it in the database.
Optionally, the obtaining module 11 may be further configured to obtain a plurality of preset short message sample character strings and a sample tag corresponding to each short message sample character string before obtaining the short message character string to be identified;
at this time, the extracting module 12 is further configured to extract sample feature identification information of each short message sample character string, where the sample feature identification information includes: a character string template, character string characteristics and signature information;
and the storage module 14 is further configured to correspond the sample characteristic identification information to the sample label, and store the sample characteristic identification information in a preset database.
Optionally, in this embodiment, the identifying module 13 is further configured to: identifying whether the short message is legal or not according to the label information; when the short message is legal, the short message is sent according to a preset scheduling rule; or, when the short message is illegal, the short message is intercepted.
Optionally, when the feature identification information includes a character string template and the extraction module 12 extracts the feature identification information of the short message character string, the extraction module 12 may be configured to perform: removing first-class information in the short message character string to obtain a character string template of the short message character string, wherein the first-class information comprises at least one of the following information: address, name, nickname, password, order number; or removing second type information in the short message character string to obtain a character string template of the short message character string, wherein the second type information comprises at least one of the following information: numbers, letters, address libraries, name libraries, nickname libraries; or removing third type information in the short message character string to obtain a character string template of the short message character string, wherein the third type information comprises at least one of the following information: numbers, letters, symbols.
Optionally, when the identification module 13 determines at least one piece of tag information for identifying the short message according to the feature identification information, the identification module 13 is configured to: determining at least one piece of first label information for identifying the short message according to the character string template; and/or determining at least one piece of second label information for identifying the short message according to the character string characteristics; and/or determining at least one piece of third label information for identifying the short message according to the signature information; and processing the first label information and/or the second label information and/or the third label information according to a preset label principle, and determining at least one final label information for identifying the short message.
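How the first, second, and third label information are combined depends on the preset label principle, which this passage does not spell out. The sketch below therefore shows two hypothetical principles purely for illustration: a priority principle that keeps the labels of the first source that produced any, and a union principle that merges all labels without duplicates. All function names are assumptions.

```python
def priority_principle(first, second, third):
    """Hypothetical principle: template labels win, then features, then signature."""
    for tags in (first, second, third):
        if tags:
            return list(tags)
    return []


def union_principle(first, second, third):
    """Hypothetical principle: merge all labels, dropping duplicates, keeping order."""
    seen, final = set(), []
    for tag in [*first, *second, *third]:
        if tag not in seen:
            seen.add(tag)
            final.append(tag)
    return final


def final_tags(first, second, third, principle=priority_principle):
    """Apply the chosen label principle to obtain the final tag information."""
    return principle(first, second, third)


result = final_tags([], ["fraud"], ["foreign trade enterprise"])
```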
The apparatus shown in fig. 7 can perform the method of the embodiment shown in fig. 1-6, and the detailed description of this embodiment can refer to the related description of the embodiment shown in fig. 1-6. The implementation process and technical effect of the technical solution refer to the descriptions in the embodiments shown in fig. 1 to 6, and are not described herein again.
In one possible design, the structure of the short message identification apparatus shown in fig. 7 may be implemented as an electronic device, which may be a mobile phone, a tablet computer, a server, or other devices. As shown in fig. 8, the electronic device may include: a processor 21 and a memory 22. Wherein, the memory 22 is used for storing a program for supporting the electronic device to execute the identification method of the short message provided in the embodiments shown in fig. 1-6, and the processor 21 is configured to execute the program stored in the memory 22.
The program comprises one or more computer instructions which, when executed by the processor 21, are capable of performing the steps of:
acquiring a short message character string to be identified;
extracting feature identification information of the short message character string;
and determining at least one piece of label information for identifying the short message according to the characteristic identification information.
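The three steps above can be sketched end to end as follows. This is a minimal sketch under stated assumptions: the signature is assumed to be the text inside 【】 brackets, the template is assumed to be the message with digit runs removed, and the lookup is an exact match against in-memory dictionaries. The names `extract_features`, `identify`, and `libraries` are hypothetical.

```python
import re
from typing import Dict, List


def extract_features(sms: str) -> Dict[str, str]:
    """Step 2: derive a signature and a template from the raw string."""
    match = re.search(r"【(.+?)】", sms)  # assumed signature convention
    signature = match.group(1) if match else ""
    template = re.sub(r"\d+", "", sms)  # assumed: drop digit runs only
    return {"signature": signature, "template": template}


def identify(sms: str, libraries: Dict[str, Dict[str, List[str]]]) -> List[str]:
    """Step 3: collect labels from every library with an exact match."""
    labels: List[str] = []
    for kind, value in extract_features(sms).items():
        for label in libraries.get(kind, {}).get(value, []):
            if label not in labels:
                labels.append(label)
    return labels


if __name__ == "__main__":
    libraries = {
        "signature": {"某某银行": ["finance"]},
        "template": {"【某某银行】验证码，请勿泄露": ["verification"]},
    }
    # Step 1: the string to be identified is supplied by the caller.
    print(identify("【某某银行】验证码8842，请勿泄露", libraries))
```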
Optionally, the processor 21 is further configured to perform all or part of the steps in the embodiments of fig. 1-6.
The electronic device may further include a communication interface 23 for communicating with other devices or a communication network.
In addition, an embodiment of the present invention provides a computer storage medium for storing computer software instructions for an electronic device, which includes a program for executing the method for identifying a short message in the method embodiments shown in fig. 1 to 6.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and of course can also be implemented by a combination of hardware and software. Based on this understanding, the above technical solutions, in essence or in the part that contributes to the prior art, may be embodied in the form of a computer program product, which may be carried on one or more computer-usable storage media containing computer-usable program code, including but not limited to disk storage, CD-ROM, and optical storage.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (18)

1. A method for identifying short messages is characterized by comprising the following steps:
acquiring a short message character string to be identified;
extracting the characteristic identification information of the short message character string;
and determining at least one piece of label information for identifying the short message according to the characteristic identification information.
2. The method of claim 1, wherein
the feature identification information includes at least one of: a character string template, character string characteristics, and signature information.
3. The method of claim 2, wherein determining at least one tag information for identifying the short message according to the characteristic identification information comprises:
determining at least one piece of target information matched with the feature identification information by using a preset database;
acquiring a target label corresponding to the target information;
determining at least one of the target tags as at least one of the tag information.
4. The method of claim 3, wherein
the database includes at least one of: a template library, a feature library, and a signature library, wherein the template library stores corresponding relations between a plurality of character string templates and a plurality of first labels, the feature library stores corresponding relations between a plurality of character string features and a plurality of second labels, and the signature library stores corresponding relations between a plurality of pieces of signature information and a plurality of third labels.
5. The method of claim 3, wherein determining at least one target information matching the feature identification information using a predetermined database comprises:
analyzing and comparing the characteristic identification information with all standard characteristic information stored in the database to obtain the similarity between the characteristic identification information and each standard characteristic information;
and determining at least one piece of standard characteristic information with the similarity greater than or equal to a preset similarity threshold as the target information.
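The threshold comparison in claim 5 can be illustrated with the sketch below. The claim does not name a similarity measure; the use of Python's `difflib.SequenceMatcher` ratio, the default threshold of 0.8, and the name `match_targets` are assumptions made only for illustration.

```python
from difflib import SequenceMatcher
from typing import Iterable, List


def match_targets(feature: str,
                  standards: Iterable[str],
                  threshold: float = 0.8) -> List[str]:
    """Keep every stored entry whose similarity reaches the threshold."""
    return [
        standard
        for standard in standards
        # ratio() is in [0, 1]; 1.0 means the two strings are identical.
        if SequenceMatcher(None, feature, standard).ratio() >= threshold
    ]


if __name__ == "__main__":
    stored = ["您的验证码为请勿泄露", "您的订单已发货"]
    print(match_targets("您的验证码是请勿泄露", stored))
```

An empty result corresponds to the case where every similarity falls below the threshold, in which case other feature identification information would be tried.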
6. The method of claim 5, further comprising:
and when the similarity between the characteristic identification information and each piece of standard characteristic information is smaller than the similarity threshold, determining at least one piece of label information for identifying the short message according to other information in the characteristic identification information.
7. The method according to any one of claims 1-6, wherein after determining at least one tag information for identifying the short message according to the feature identification information, the method further comprises:
extracting all characteristic identification information of the short message character string;
and when all the characteristic identification information comprises non-warehousing information which is not stored in a database, associating the non-warehousing information with the label information, and storing the non-warehousing information and the label information in the database.
8. The method according to any one of claims 1-6, wherein before obtaining the short message character string to be identified, the method further comprises:
acquiring a plurality of preset short message sample character strings and a sample label corresponding to each short message sample character string;
extracting sample characteristic identification information of each short message sample character string, wherein the sample characteristic identification information comprises: a character string template, character string characteristics and signature information;
and associating the sample characteristic identification information with the sample label, and storing the sample characteristic identification information and the sample label in a preset database.
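Building the preset database from labelled samples, as in claim 8, might look like the following sketch. The in-memory dictionary standing in for the database, the pluggable `extract` callable, and the name `build_library` are hypothetical choices; only one kind of feature is stored here for brevity.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def build_library(samples: Iterable[Tuple[str, str]],
                  extract: Callable[[str], str]) -> Dict[str, List[str]]:
    """Map each extracted sample feature to the sample labels seen with it."""
    library: Dict[str, List[str]] = defaultdict(list)
    for sms, label in samples:
        key = extract(sms)
        if label not in library[key]:  # avoid storing a label twice
            library[key].append(label)
    return dict(library)


if __name__ == "__main__":
    def drop_digits(text: str) -> str:
        return "".join(ch for ch in text if not ch.isdigit())

    samples = [
        ("code 1234", "verification"),
        ("code 5678", "verification"),
        ("sale 50", "marketing"),
    ]
    print(build_library(samples, drop_digits))
```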
9. The method according to any one of claims 1-6, wherein the tag information comprises at least: the first label information is used for identifying the industry of the short message and the second label information is used for identifying the application intention of the short message.
10. The method according to any one of claims 1-6, further comprising:
identifying whether the short message is legal or not according to the label information;
sending the short message according to a preset scheduling rule when the short message is legal; or
intercepting the short message when the short message is illegal.
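The send-or-intercept decision in claim 10 can be sketched as below. The set of blocked labels, the priority table standing in for the "preset scheduling rule", and the name `schedule` are all hypothetical; the claim does not specify how legality or scheduling is derived from the labels.

```python
from typing import List, Optional

# Hypothetical policy tables, not taken from the claims.
BLOCKED = frozenset({"fraud", "gambling"})
PRIORITY = {"verification": 0, "notification": 1, "marketing": 2}
DEFAULT_PRIORITY = 3  # used for labels without an explicit priority


def schedule(labels: List[str]) -> Optional[int]:
    """Return a send priority (lower sends first), or None to intercept."""
    if any(label in BLOCKED for label in labels):
        return None  # illegal: intercept the short message
    return min(
        (PRIORITY.get(label, DEFAULT_PRIORITY) for label in labels),
        default=DEFAULT_PRIORITY,
    )


if __name__ == "__main__":
    print(schedule(["marketing", "verification"]))
    print(schedule(["marketing", "fraud"]))
```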
11. The method according to any one of claims 2 to 6, wherein when the feature identification information includes a character string template, extracting the feature identification information of the short message character string includes:
removing first type information in the short message character string to obtain a character string template of the short message character string, wherein the first type information comprises at least one of the following information: address, name, nickname, password, order number; or
removing second type information in the short message character string to obtain a character string template of the short message character string, wherein the second type information comprises at least one of the following information: numbers, letters, address libraries, name libraries, nickname libraries; or
removing third type information in the short message character string to obtain a character string template of the short message character string, wherein the third type information comprises at least one of the following information: numbers, letters, symbols.
12. The method according to any one of claims 2-6, wherein determining at least one tag information for identifying the short message according to the feature identification information comprises:
determining at least one piece of first label information for identifying the short message according to the character string template; and/or
determining at least one piece of second label information for identifying the short message according to the character string characteristics; and/or
determining at least one piece of third label information for identifying the short message according to the signature information;
and processing the first label information and/or the second label information and/or the third label information according to a preset label principle to determine at least one piece of final label information for identifying the short message.
13. A short message identification method, applied to mass texting, comprising the following steps:
acquiring a short message character string to be identified;
extracting the characteristic identification information of the short message character string;
and determining at least one piece of label information for identifying the short message according to the characteristic identification information.
14. The method of claim 13, wherein the mass texting comprises mass texting by a merchant to users.
15. An apparatus for recognizing a short message, comprising:
the acquisition module is used for acquiring a short message character string to be identified;
the extraction module is used for extracting the characteristic identification information of the short message character string;
and the identification module is used for determining at least one piece of label information for identifying the short message according to the characteristic identification information.
16. An electronic device, comprising: a memory, a processor; wherein the memory is configured to store one or more computer instructions, wherein the one or more computer instructions, when executed by the processor, implement the method for identifying a short message according to any one of claims 1 to 12.
17. A short message identification device, applied to mass texting, comprising:
the acquisition module is used for acquiring a short message character string to be identified;
the extraction module is used for extracting the characteristic identification information of the short message character string;
and the identification module is used for determining at least one piece of label information for identifying the short message according to the characteristic identification information.
18. An electronic device, comprising: a memory, a processor; the memory is used for storing one or more computer instructions, wherein the one or more computer instructions, when executed by the processor, implement the short message identification method according to any one of claims 13 to 14.
CN201811459607.0A 2018-11-30 2018-11-30 Short message identification method, device and equipment Pending CN111259207A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811459607.0A CN111259207A (en) 2018-11-30 2018-11-30 Short message identification method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811459607.0A CN111259207A (en) 2018-11-30 2018-11-30 Short message identification method, device and equipment

Publications (1)

Publication Number Publication Date
CN111259207A true CN111259207A (en) 2020-06-09

Family

ID=70944869

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811459607.0A Pending CN111259207A (en) 2018-11-30 2018-11-30 Short message identification method, device and equipment

Country Status (1)

Country Link
CN (1) CN111259207A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111783998A (en) * 2020-06-30 2020-10-16 百度在线网络技术(北京)有限公司 Illegal account recognition model training method and device and electronic equipment
CN112261600A (en) * 2020-12-22 2021-01-22 江苏音信通信息技术有限公司 Short message content fast matching method and short message intercepting method based on content
CN113746723A (en) * 2021-08-31 2021-12-03 广州智会云科技发展有限公司 Enterprise instant short message marketing method and system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102572744A (en) * 2010-12-13 2012-07-11 ***通信集团设计院有限公司 Recognition feature library acquisition method and device as well as short message identification method and device
CN105117724A (en) * 2015-07-30 2015-12-02 北京邮电大学 License plate positioning method and apparatus
CN105138611A (en) * 2015-08-07 2015-12-09 北京奇虎科技有限公司 Short message type identification method and device
CN107229638A (en) * 2016-03-24 2017-10-03 北京搜狗科技发展有限公司 A kind of text message processing method and device
CN108650260A (en) * 2018-05-09 2018-10-12 北京邮电大学 A kind of recognition methods of malicious websites and device
CN108875727A (en) * 2018-06-29 2018-11-23 龙马智芯(珠海横琴)科技有限公司 The detection method and device of graph-text identification, storage medium, processor

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111783998A (en) * 2020-06-30 2020-10-16 百度在线网络技术(北京)有限公司 Illegal account recognition model training method and device and electronic equipment
CN111783998B (en) * 2020-06-30 2023-08-11 百度在线网络技术(北京)有限公司 Training method and device for illegal account identification model and electronic equipment
CN112261600A (en) * 2020-12-22 2021-01-22 江苏音信通信息技术有限公司 Short message content fast matching method and short message intercepting method based on content
CN112261600B (en) * 2020-12-22 2021-08-13 江苏音信通信息技术有限公司 Short message content fast matching method and short message intercepting method based on content
CN113746723A (en) * 2021-08-31 2021-12-03 广州智会云科技发展有限公司 Enterprise instant short message marketing method and system

Similar Documents

Publication Publication Date Title
CN108366045B (en) Method and device for setting wind control scoring card
CN107835496B (en) Spam short message identification method and device and server
CN111274782A (en) Text auditing method and device, computer equipment and readable storage medium
US11093774B2 (en) Optical character recognition error correction model
CN109766441B (en) Text classification method, device and system
CN111259207A (en) Short message identification method, device and equipment
CN114244611B (en) Abnormal attack detection method, device, equipment and storage medium
EP3961426A2 (en) Method and apparatus for recommending document, electronic device and medium
CN112241458B (en) Text knowledge structuring processing method, device, equipment and readable storage medium
CN111753090A (en) Document auditing method, device, equipment and medium based on RPA and AI
CN110046188A (en) Method for processing business and its system
CN114818705A (en) Method, electronic device and computer program product for processing data
CN111126071A (en) Method and device for determining questioning text data and data processing method of customer service group
CN110972086A (en) Short message processing method and device, electronic equipment and computer readable storage medium
CN115221893B (en) Quality inspection rule automatic configuration method and device based on rule and semantic analysis
CN113746814B (en) Mail processing method, mail processing device, electronic equipment and storage medium
US20190042653A1 (en) Automatic identification of user information
CN113112323B (en) Abnormal order identification method, device, equipment and medium based on data analysis
CN115544558A (en) Sensitive information detection method and device, computer equipment and storage medium
CN113901817A (en) Document classification method and device, computer equipment and storage medium
CN114329164A (en) Method, apparatus, device, medium and product for processing data
CN112015773A (en) Knowledge base retrieval method and device, electronic equipment and storage medium
CN113536788B (en) Information processing method, device, storage medium and equipment
CN113077272B (en) Communication business marketing scheme optimization method and device
CN116244740B (en) Log desensitization method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination