CN116361591A - Content auditing method, device, electronic equipment and computer readable storage medium - Google Patents

Publication number
CN116361591A
Authority
CN
China
Prior art keywords
content
sub
target
hierarchical
audited
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310323955.XA
Other languages
Chinese (zh)
Inventor
唐天宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202310323955.XA priority Critical patent/CN116361591A/en
Publication of CN116361591A publication Critical patent/CN116361591A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/958Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • G06F16/24564Applying rules; Deductive queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a content auditing method, a device, electronic equipment and a computer readable storage medium, and relates to the field of data processing, in particular to the field of content auditing. The specific implementation scheme is as follows: acquiring to-be-audited content and hierarchical description data of the to-be-audited content, wherein the to-be-audited content comprises a plurality of hierarchical sub-contents, and the hierarchical description data is used for describing hierarchical relations among the sub-contents; acquiring a target hierarchical path of target sub-content based on hierarchical description data, wherein the target sub-content is the sub-content aimed in a content auditing rule of the content to be audited; locating target sub-content from the content to be checked based on the target hierarchical path; and auditing the target sub-content based on the content auditing rule. Based on the scheme, the target sub-content can be effectively positioned from the content to be audited, so that the target sub-content is audited based on the content audit rule, and the effective audit of the content to be audited is facilitated.

Description

Content auditing method, device, electronic equipment and computer readable storage medium
Technical Field
The disclosure relates to the technical field of data processing, in particular to the technical field of content auditing, and specifically relates to a content auditing method, a device, electronic equipment and a computer readable storage medium.
Background
In internet content products, some content is often submitted by users. Before releasing the content submitted by the user, the content needs to be audited according to a preconfigured audit rule.
When auditing contents submitted by a user, the auditing is generally performed on part of specific contents, and how to effectively locate the part of specific contents from the contents submitted by the user so as to perform targeted auditing becomes an important technical problem.
Disclosure of Invention
In order to solve at least one of the defects, the disclosure provides a content auditing method, a content auditing device, an electronic device and a computer readable storage medium.
According to a first aspect of the present disclosure, there is provided a content auditing method, the method comprising:
acquiring to-be-audited content and hierarchical description data of the to-be-audited content, wherein the to-be-audited content comprises a plurality of hierarchical sub-contents, and the hierarchical description data is used for describing hierarchical relations among the sub-contents;
acquiring a target hierarchical path of target sub-content based on hierarchical description data, wherein the target sub-content is the sub-content aimed in a content auditing rule of the content to be audited;
locating the target sub-content from the content to be audited based on the target hierarchical path;
and auditing the target sub-content based on the content auditing rule.
According to a second aspect of the present disclosure, there is provided a content auditing apparatus, the apparatus comprising:
the data acquisition module is used for acquiring the content to be audited and the hierarchical description data of the content to be audited, wherein the content to be audited comprises a plurality of hierarchical sub-contents, and the hierarchical description data is used for describing the hierarchical relationship among the sub-contents;
the target hierarchical path determining module is used for acquiring a target hierarchical path of target sub-content based on the hierarchical description data, wherein the target sub-content is the sub-content aimed at in the content auditing rule of the content to be audited;
the target sub-content positioning module is used for positioning the target sub-content from the content to be audited based on the target hierarchical path;
and the content auditing module is used for auditing the target sub-content based on the content auditing rule.
According to a third aspect of the present disclosure, there is provided an electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the content auditing method.
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the above-described content auditing method.
According to a fifth aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the above-described content auditing method.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a flow chart of a method for auditing contents according to an embodiment of the present disclosure;
FIG. 2 is a flow chart of a method for executing content location instructions according to an embodiment of the present disclosure;
FIG. 3 is a flow chart of a specific implementation of a content auditing method provided by an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of a content auditing apparatus according to an embodiment of the present disclosure;
FIG. 5 is a block diagram of an electronic device for implementing a content auditing method of an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
When content submitted by a user is audited, the audit is generally performed on a specific part of that content. Taking encyclopedia entry content as an example, the audit is generally performed on the directory content, the overview content, and the like in the entry. How to effectively locate this specific content within the content submitted by the user, so that it can be audited in a targeted manner based on the auditing rules, becomes an important technical problem.
In the related art, a plurality of rules are generally formulated according to the normalization requirements on the content, and the rules exist in the form of code. A rule can therefore be modified only by redeveloping the code, which makes rule adjustment inconvenient. In addition, rules implemented in code may be logically coupled to one another.
In the related art, auditing rules generally support only simple logical decisions on single-value data, such as comparing numerical values or checking whether character strings are identical, and therefore cannot meet actual requirements.
The embodiment of the disclosure provides a content auditing method, a device, an electronic device and a computer readable storage medium, which aim to solve at least one of the above technical problems in the prior art.
FIG. 1 shows a flow chart of a content auditing method according to an embodiment of the present disclosure. As shown in FIG. 1, the method may mainly include:
step S110: acquiring to-be-audited content and hierarchical description data of the to-be-audited content, wherein the to-be-audited content comprises a plurality of hierarchical sub-contents, and the hierarchical description data is used for describing hierarchical relations among the sub-contents;
step S120: acquiring a target hierarchical path of target sub-content based on hierarchical description data, wherein the target sub-content is the sub-content aimed in a content auditing rule of the content to be audited;
step S130: locating the target sub-content from the content to be audited based on the target hierarchical path;
step S140: and auditing the target sub-content based on the content auditing rule.
The content to be audited can be structured data with a hierarchical structure, including but not limited to encyclopedia entry content submitted by a user, structured business data and the like.
The content to be audited typically includes multiple items of sub-content. Taking encyclopedia entry content as an example, the sub-content may be the entry name, semantic item description, directory, summary, text, and the like.
The hierarchical description data is used to describe hierarchical relationships between the sub-contents. As one example, the hierarchical description data may be a tree structure, and each sub-content corresponds to each node in the tree structure, thereby realizing a description of a hierarchical relationship between each sub-content.
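The tree-structured hierarchical description data described above could be sketched as follows. This is only an illustration under assumptions: the HierarchyNode type, the walk helper, and the identifiers in entry_hierarchy are hypothetical and are not defined by the disclosure.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class HierarchyNode:
    """One node of the hierarchical description tree (hypothetical type)."""

    content_id: str  # content identifier of the sub-content
    children: List["HierarchyNode"] = field(default_factory=list)


def walk(node: HierarchyNode, depth: int = 0) -> Iterator[Tuple[int, str]]:
    """Yield (depth, content_id) pairs in depth-first order."""
    yield depth, node.content_id
    for child in node.children:
        yield from walk(child, depth + 1)


# Hypothetical hierarchy of an encyclopedia entry.
entry_hierarchy = HierarchyNode("entry", [
    HierarchyNode("entry_name"),
    HierarchyNode("summary"),
    HierarchyNode("content", [
        HierarchyNode("index", [HierarchyNode("uuid"), HierarchyNode("name")]),
    ]),
])

levels = list(walk(entry_hierarchy))
```

Each node carries only an identifier and its children, so the tree records the nesting of the sub-contents and nothing else.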
In the embodiment of the disclosure, the content auditing rule of the content to be audited can be set according to actual requirements.
The content auditing rules are generally set for a portion of the sub-content in the content to be audited, which may be referred to as target sub-content. For example, for encyclopedia entry content, corresponding content auditing rules may be set separately for target sub-content such as the directory and the entry name.
Since the hierarchical description data describes the hierarchical relationship between the sub-contents of the content to be audited, a target hierarchical path of the target sub-content can be extracted from it.
The target hierarchical path may involve sub-content of multiple levels, and the sub-content involved in the target hierarchical path is referred to as path sub-content. The target sub-content is nested in the path sub-content of the level above it, and each path sub-content is in turn nested in the path sub-content of the level above it.
The target hierarchical path represents the location of the target sub-content in the content to be audited in the form of a path. The target hierarchical path may begin with the content identifier of the first-level path sub-content in the content to be audited, list the content identifiers of the path sub-content of each subsequent level in hierarchical order, and end with the content identifier of the target sub-content.
For example, the target hierarchical path is: text-directory-username. The user name is the target sub-content, the directory is the path sub-content one level above the user name, and the text is the path sub-content one level above the directory.
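Extracting a target hierarchical path from tree-shaped hierarchical description data might look like the sketch below. It assumes the tree is held as nested dictionaries keyed by content identifier; the function name find_target_path and the sample hierarchy are hypothetical, and the identifiers content, index, and name stand in for text, directory, and user name.

```python
from typing import Dict, List, Optional


def find_target_path(tree: Dict[str, dict], target_id: str) -> Optional[List[str]]:
    """Depth-first search for target_id; return the identifiers from the
    first level down to the target, or None if the target is absent."""
    for content_id, children in tree.items():
        if content_id == target_id:
            return [content_id]
        sub_path = find_target_path(children, target_id)
        if sub_path is not None:
            return [content_id] + sub_path
    return None


# Hypothetical hierarchical description data for an encyclopedia entry.
hierarchy = {
    "summary": {},
    "content": {"index": {"uuid": {}, "name": {}}},
}

target_path = find_target_path(hierarchy, "name")
```

The returned list starts with the first-level path sub-content and ends with the target sub-content, matching the text-directory-username form.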
Because the target hierarchical path represents the location of the target sub-content in the content to be audited, the target sub-content can be located in the content to be audited based on the target hierarchical path and then audited based on the content auditing rule, thereby auditing the content to be audited.
According to the method provided by the embodiment of the disclosure, the content to be audited and the hierarchical description data of the content to be audited are acquired, wherein the content to be audited comprises a plurality of hierarchical sub-contents and the hierarchical description data is used for describing the hierarchical relations among the sub-contents; a target hierarchical path of the target sub-content is acquired based on the hierarchical description data, wherein the target sub-content is the sub-content targeted by a content auditing rule of the content to be audited; the target sub-content is located from the content to be audited based on the target hierarchical path; and the target sub-content is audited based on the content auditing rule. Based on this scheme, the target sub-content can be effectively located from the content to be audited and then audited based on the content auditing rule, which facilitates effective auditing of the content to be audited.
In an alternative manner of the present disclosure, locating the target sub-content from the content to be audited based on the target hierarchical path includes:
generating a content locating instruction based on the target hierarchy path;
and locating the target sub-content from the content to be audited based on the content locating instruction.
In the embodiment of the disclosure, the target level path can provide an indication for positioning the target sub-content, so that the content positioning instruction can be generated based on the target level path, and the target sub-content is positioned from the content to be audited by executing the content positioning instruction.
By generating the content locating instruction based on the target hierarchical path, the target sub-content can be more quickly and efficiently located from the content to be checked.
In one alternative of the present disclosure, generating the content locating instruction based on the target hierarchical path includes:
generating sub-content positioning instructions for each sub-content involved in the target hierarchical path respectively;
determining an execution sequence of each sub-content positioning instruction based on a hierarchical order of each sub-content involved in the target hierarchical path;
and combining the sub-content positioning instructions based on the execution sequence to obtain the content positioning instruction.
In the embodiment of the disclosure, the target hierarchical path represents the location of the target sub-content in the content to be audited in the form of a path. The target hierarchical path lists the content identifiers of the path sub-content of each level involved, in hierarchical order.
When the content locating instruction is generated, the content identifier of each path sub-content in the target hierarchical path can be used as a search keyword to generate a sub-content positioning instruction for the path sub-content of each level. The execution sequence of the sub-content positioning instructions can then be set according to the hierarchical order of the path sub-contents, and the sub-content positioning instructions are combined according to the execution sequence to obtain the content locating instruction.
Taking encyclopedia entry content as an example, the target hierarchical path is: text-directory-username. The text, the directory, and the user name are the content identifiers of the sub-contents involved in the path; the directory is located one level above the user name, and the text is located one level above the directory. A first sub-content positioning instruction for the text content is generated by using the text as a search keyword, and is used for locating the text content from the encyclopedia entry content. A second sub-content positioning instruction for the directory content is generated by using the directory as a search keyword, and is used for locating the directory content from the text content. A third sub-content positioning instruction for the user name is generated by using the user name as a search keyword, and is used for locating the user name from the directory content.
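The generation steps above (one sub-content positioning instruction per path element, an execution order derived from the hierarchy level, then combination) can be sketched as follows. The SubLocateInstruction type and build_locate_instruction function are hypothetical names chosen for illustration; the disclosure does not prescribe an instruction format.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class SubLocateInstruction:
    """A single sub-content positioning instruction (hypothetical format)."""

    order: int    # execution position, derived from the hierarchy level
    keyword: str  # content identifier used as the search keyword


def build_locate_instruction(target_path: Sequence[str]) -> List[SubLocateInstruction]:
    """Create one instruction per path element and combine them in level order."""
    steps = [
        SubLocateInstruction(order=level, keyword=content_id)
        for level, content_id in enumerate(target_path, start=1)
    ]
    # Sorting makes the execution sequence explicit even if steps were
    # produced in another order.
    return sorted(steps, key=lambda step: step.order)


instruction = build_locate_instruction(["content", "index", "name"])
```

Here the first instruction searches for content, the second for index, and the third for name, mirroring the text, directory, and user name example.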
In an optional manner of the disclosure, locating the target sub-content from the content to be audited based on the content locating instruction includes:
positioning first-level sub-content from the content to be audited by executing the first sub-content positioning instruction in the content positioning instruction;
and sequentially executing each sub-content positioning instruction according to the execution sequence, and respectively positioning the sub-content of the next level from the sub-content of the previous level in the target level path until the target sub-content is positioned.
In the embodiment of the disclosure, each sub-content positioning instruction may be sequentially executed according to the execution sequence, and sub-content of a subsequent level is positioned from sub-content positioned in a previous level until the target sub-content is positioned.
As an example, a flowchart of executing content location instructions provided by an embodiment of the present disclosure is shown in fig. 2.
As shown in FIG. 2, content represents the content identifier of the text content, index represents the content identifier of the directory content, uuid represents the content identifier of the user code content, and name represents the content identifier of the user name content. In FIG. 2, 1, 2, and 3 each represent a value of the user code, and a, b, and c each represent a value of the user name. The positioning logic 1 corresponds to the first sub-content positioning instruction in the content positioning instruction, the positioning logic 2 corresponds to the second sub-content positioning instruction, and the positioning logic 3 corresponds to the third sub-content positioning instruction.
By executing the first sub-content positioning instruction (i.e., positioning logic 1) in the content positioning instruction, the text content can be located from the encyclopedia entry content. By executing the second sub-content positioning instruction (i.e., positioning logic 2), the directory content can be located from the text content. By executing the third sub-content positioning instruction (i.e., positioning logic 3), the value of the user code and the value of the user name can be located from the directory content.
In this example, the located target sub-content may be format converted. In this example, the value of the user code and the value of the user name may be constructed in the form of key-value (K-V) pairs.
In the embodiment of the disclosure, the target data is searched by sequentially executing the sub-content positioning instructions, and the data amount to be traversed each time gradually decreases as the positioning operation goes deep, so that the target data can be positioned quickly and accurately. According to the scheme, the hierarchical structure of the content to be audited is effectively utilized, the target data is rapidly and accurately positioned, and the data processing efficiency can be effectively improved.
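The step-by-step execution and the key-value conversion can be illustrated with the sketch below. It assumes the content to be audited is a nested dictionary in which the directory is a list of records; that layout, the sample entry, and the helper names locate and execute are assumptions for illustration, while the identifiers content, index, uuid, and name follow the description of FIG. 2.

```python
from typing import Any, Sequence


def locate(current: Any, keyword: str) -> Any:
    """Execute one sub-content positioning step inside the current scope."""
    if isinstance(current, list):
        # A list-valued level: look the keyword up in every member.
        return [item[keyword] for item in current]
    return current[keyword]


def execute(content: Any, keywords: Sequence[str]) -> Any:
    """Run the positioning steps in order, narrowing the scope each time."""
    current = content
    for keyword in keywords:
        current = locate(current, keyword)
    return current


# Hypothetical encyclopedia entry using the identifiers from FIG. 2.
entry = {
    "entry_name": "Example",
    "content": {
        "index": [
            {"uuid": 1, "name": "a"},
            {"uuid": 2, "name": "b"},
            {"uuid": 3, "name": "c"},
        ],
    },
}

directory = execute(entry, ["content", "index"])  # positioning logic 1 and 2
uuids = locate(directory, "uuid")                 # positioning logic 3
names = locate(directory, "name")
user_map = dict(zip(uuids, names))                # key-value (K-V) form
```

Each step searches only inside the result of the previous step, which is why the amount of data to traverse shrinks as the positioning goes deeper.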
In an optional manner of the disclosure, the method further includes:
displaying each sub-content positioning instruction, and the sub-content located by each sub-content positioning instruction, to a user through a content positioning instruction configuration interface;
in response to detecting an instruction modification operation of the user in the content location instruction configuration interface, the content location instruction is adjusted based on the instruction modification operation.
In the embodiments of the present disclosure, to facilitate configuration, debugging, and modification of content positioning instructions, a positioning instruction configuration interface may be provided.
And displaying each piece of sub-content positioning instruction and the sub-content respectively positioned based on each piece of sub-content positioning instruction in the positioning instruction configuration interface, so that a user can intuitively observe each piece of sub-content positioning instruction and positioning result.
When the positioning result of the sub-content positioning instruction does not meet the requirement, the user can modify the instruction in the content positioning instruction configuration interface to adjust the content positioning instruction.
As an example, the code corresponding to each sub-content positioning instruction and the corresponding positioning result may be respectively displayed in the instruction configuration interface, and the user may edit the code corresponding to the sub-content positioning instruction.
In one alternative of the present disclosure, the hierarchy description data is determined based on metadata of the content to be audited.
In embodiments of the present disclosure, the content to be audited generally has corresponding metadata that describes its attributes. Metadata generally contains many kinds of description information and does not specifically describe the hierarchical relationships of the sub-content in the content to be audited.
In the embodiment of the disclosure, the hierarchical description data may be generated as follows: extracting each sub-content identifier and the hierarchical relationships among the sub-contents from the metadata of the content to be audited, using the sub-content identifiers as nodes of a tree structure, and constructing the node relationships among the nodes of the tree structure based on the hierarchical relationships among the sub-contents, thereby obtaining the hierarchical description data.
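One way to picture this generation step is the sketch below. The metadata layout is an assumption: each record is taken to carry an id, a parent identifier, and unrelated attributes such as type, none of which is specified by the disclosure. The build_hierarchy function is likewise a hypothetical name.

```python
from typing import Dict, List, Optional

# Hypothetical metadata: each record mixes hierarchy information (id, parent)
# with other descriptive attributes that the audit does not need.
metadata: List[Dict[str, Optional[str]]] = [
    {"id": "content", "parent": None, "type": "object"},
    {"id": "index", "parent": "content", "type": "array"},
    {"id": "name", "parent": "index", "type": "string"},
    {"id": "summary", "parent": None, "type": "string"},
]


def build_hierarchy(records: List[Dict[str, Optional[str]]]) -> Dict[str, dict]:
    """Keep only identifiers and parent links, arranged as a nested-dict tree."""
    nodes: Dict[str, dict] = {rec["id"]: {} for rec in records}
    roots: Dict[str, dict] = {}
    for rec in records:
        parent = rec["parent"]
        container = roots if parent is None else nodes[parent]
        container[rec["id"]] = nodes[rec["id"]]
    return roots


hierarchy_description = build_hierarchy(metadata)
```

The resulting tree drops every attribute other than the identifier and its position, which is the sense in which the hierarchical description data carries no redundant data.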
In the embodiment of the disclosure, if the hierarchical relationship is extracted by using metadata, the diversified data contained in the metadata may cause data redundancy, and the hierarchical relationship described in the metadata may not be intuitive enough, so that the quick extraction is inconvenient. The hierarchical description data in the scheme can be created in a content auditing scene, redundant data is not contained, the hierarchical description data can intuitively describe the hierarchical relationship among the sub-contents, and the hierarchical relationship can be extracted rapidly and effectively based on the hierarchical description data, so that the processing efficiency is improved.
In the embodiment of the disclosure, the hierarchical description data may describe the hierarchical relationships between all sub-contents in the content to be audited; to reduce the data amount, the hierarchical description data may instead describe the hierarchical relationships between only part of the sub-contents in the content to be audited.
Because only the hierarchical relationship of the target sub-content is needed when the content to be audited is audited, only the hierarchical relationship of the target sub-content can be described in the hierarchical description data, namely, the hierarchical description data can only contain the hierarchical relationship among the sub-contents involved in the target hierarchical path of the target sub-content, thereby avoiding data redundancy in the hierarchical description data and reducing the data volume.
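Restricting the hierarchical description data to the target hierarchical path could be done as in the following sketch, again assuming a nested-dictionary tree. The prune_to_target function and the sample tree are hypothetical.

```python
from typing import Dict, Optional


def prune_to_target(tree: Dict[str, dict], target_id: str) -> Optional[Dict[str, dict]]:
    """Return a copy of the tree that keeps only the branch leading to
    target_id (anything nested below the target is dropped as well)."""
    for content_id, children in tree.items():
        if content_id == target_id:
            return {content_id: {}}
        pruned = prune_to_target(children, target_id)
        if pruned is not None:
            return {content_id: pruned}
    return None


# Hypothetical full hierarchy of an encyclopedia entry.
full_hierarchy = {
    "entry_name": {},
    "summary": {},
    "content": {"index": {"uuid": {}, "name": {}}, "references": {}},
}

pruned_hierarchy = prune_to_target(full_hierarchy, "name")
```

Only the content, index, and name nodes survive, so the stored description is no larger than the path it needs to express.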
In the embodiment of the disclosure, when a content auditing rule is configured for a content to be audited, target sub-content aimed by the content auditing rule is obtained, then, hierarchical description data is generated based on metadata of the content to be audited, the hierarchical description data only comprises a hierarchical relation of the target sub-content, a target hierarchical path of the target sub-content is extracted based on the hierarchical description data, and the target sub-content is positioned from the content to be audited based on the target hierarchical path; and auditing the target sub-content based on the content auditing rule.
In the embodiment of the disclosure, the content to be audited may correspond to different content audit rules under different scenes, and the adaptive hierarchical description data can be respectively generated for the different content audit rules and stored for later use in content audit.
In an alternative manner of the present disclosure, locating the target sub-content from the content to be audited based on the target hierarchical path includes:
extracting candidate content from the content to be audited based on a preset extraction rule;
locating the target sub-content from the candidate content based on the target hierarchical path.
In the embodiment of the disclosure, the content to be audited may take multiple forms in different scenes; by preprocessing the content to be audited, candidate content with a uniform structure can be extracted for subsequent processing.
In the embodiment of the disclosure, the extracted candidate content may be general content or high-frequency use content in the content to be audited. Such as entry names, sense item descriptions, catalogues, summaries, text, etc. in the encyclopedia entry content.
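A minimal sketch of such preprocessing is shown below, assuming the preset extraction rule is simply a fixed list of commonly used fields. The CANDIDATE_FIELDS tuple, the extract_candidates function, and the sample submission are all hypothetical.

```python
from typing import Any, Dict, Sequence

# Hypothetical preset extraction rule: the general or frequently used fields.
CANDIDATE_FIELDS = ("entry_name", "sense_description", "index", "summary", "content")


def extract_candidates(raw: Dict[str, Any],
                       fields: Sequence[str] = CANDIDATE_FIELDS) -> Dict[str, Any]:
    """Return candidate content with a uniform structure: every preset field
    is present (None when the submission lacks it), everything else is dropped."""
    return {key: raw.get(key) for key in fields}


# Hypothetical submission carrying an extra field the audit does not need.
submission = {
    "entry_name": "Example",
    "summary": "A short overview.",
    "client_info": {"app_version": "1.0"},
}

candidates = extract_candidates(submission)
```

Because every submission is reduced to the same set of keys, later positioning steps can run against one structure regardless of the original form.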
In an alternative mode of the disclosure, auditing the target sub-content based on the content auditing rules includes at least one of:
determining whether the target sub-content belongs to a preset data type;
determining whether the content value in the target sub-content belongs to a preset numerical range;
determining whether the target sub-content contains preset content;
performing data conversion on the target sub-content based on a preset conversion rule, and determining whether a conversion result of the data conversion meets a preset condition;
and carrying out logic operation on the target sub-content based on a preset logic operation rule, and determining whether an operation result of the logic operation meets a preset condition.
In the embodiment of the disclosure, the content auditing rule may be used to determine whether the target sub-content belongs to a preset data type, and if so, the target sub-content may be considered to pass the content auditing rule.
The content auditing rule may be to judge whether the content value in the target sub-content belongs to a preset numerical range, and if so, the content value may be considered to pass the content auditing rule. The preset numerical range may be set in the form of a numerical interval.
The content auditing rule may be used to determine whether the target sub-content includes the preset content, and if so, it may be considered that the target sub-content passes the content auditing rule.
In the embodiment of the disclosure, the data conversion may be further performed on the target sub-content based on a preset conversion rule, and at this time, the content auditing rule may determine whether a conversion result of the data conversion satisfies a preset condition, and if so, may consider that the content auditing rule is passed.
As one example, the preset conversion rule may be to convert the target sub-content into a preset data format.
In the embodiment of the disclosure, a logic operation rule may be preset to perform a logic operation on the target sub-content. In this case, the content auditing rule may be to determine whether the operation result of the logic operation satisfies a preset condition, and if so, the content auditing rule may be considered passed.
As one example, the logical operation rule may be an operation on at least two target sub-contents, such as addition, subtraction, multiplication, or division.
As one example, the data types of the target sub-content may generally include a value type, a string type, and an array type. Logical judgments for the value type include, for example, whether one value is greater than or equal to another; for the string type, whether the string contains a certain substring, whether it is present in a certain collection, and the like; and for the array type, whether member elements are duplicated, whether an intersection exists with another array, and the like. Some judgments on the string type and the array type can be turned into value-type judgments after a conversion calculation, for example by counting the number of members of an array and then performing a value-type judgment on the count.
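The kinds of checks listed above can be expressed as small predicate functions, as in the sketch below. The function names and sample values are hypothetical; the disclosure does not define a rule API.

```python
from typing import Any, Callable, Sequence


def is_type(value: Any, expected: type) -> bool:
    """Does the target sub-content belong to the preset data type?"""
    return isinstance(value, expected)


def in_range(value: float, low: float, high: float) -> bool:
    """Does the value fall within the preset closed interval [low, high]?"""
    return low <= value <= high


def contains(value: Sequence, preset: Any) -> bool:
    """Does the target sub-content contain the preset content?"""
    return preset in value


def converted_satisfies(value: Any, convert: Callable[[Any], Any],
                        condition: Callable[[Any], bool]) -> bool:
    """Apply a conversion rule, then test the conversion result."""
    return condition(convert(value))


def has_duplicates(items: Sequence) -> bool:
    """Array-type check: are any member elements repeated?"""
    return len(set(items)) != len(items)


directory_names = ["History", "Products", "History"]  # hypothetical sub-content

type_ok = is_type(directory_names, list)
substring_ok = contains("Company history", "history")
# Array judgment turned into a value judgment by counting the members.
count_ok = converted_satisfies(directory_names, len, lambda n: in_range(n, 1, 10))
duplicated = has_duplicates(directory_names)
```

Keeping each check as an independent function is one way to avoid the logical coupling between rules that the related art suffers from.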
In one alternative of the present disclosure, the content auditing rules include hierarchical content auditing rules for target sub-content within the same hierarchy, and auditing the target sub-content based on the content auditing rules includes:
acquiring target sub-content in a hierarchy to be audited, wherein the hierarchy to be audited is the hierarchy corresponding to the hierarchical content auditing rule;
and determining whether the content condition of the target sub-content in the hierarchy to be audited meets the hierarchical content auditing rule.
In the embodiments of the disclosure, a hierarchical content audit rule can be configured for auditing the condition of the target sub-content within a hierarchy.
For example, the hierarchies corresponding to the hierarchical content audit rule are the hierarchy of the primary directory and the hierarchy of the secondary directory. The rule is to judge whether repeated directory names exist among all directory names under the primary directory and all directory names under the secondary directory; if not, the content is considered to pass the hierarchical content audit rule.
For another example, the hierarchies corresponding to the hierarchical content audit rule are the hierarchy in which the primary directory is located and the hierarchy in which the secondary directory is located. The rule is to judge whether the primary directory contains at least two secondary directories; if it does, the content is considered to pass the hierarchical content audit rule.
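The two directory examples above can be sketched as follows. The sketch assumes, purely for illustration, that the directory structure is available as a mapping from each primary directory name to the list of its secondary directory names, and that duplicates are checked across both levels together; the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def duplicate_directory_names(catalog: Dict[str, List[str]]) -> List[str]:
    """Return names that occur more than once among primary and secondary directories."""
    names = list(catalog.keys())
    for children in catalog.values():
        names.extend(children)
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


def primaries_with_too_few_children(
    catalog: Dict[str, List[str]], minimum: int = 2
) -> List[str]:
    """Return primary directories that contain fewer than `minimum` secondary directories."""
    return [name for name, children in catalog.items() if len(children) < minimum]


# Assumed sample data for an encyclopedia entry.
catalog = {
    "History": ["Origins", "Modern era"],
    "Career": ["Origins"],
}
duplicates = duplicate_directory_names(catalog)
too_few = primaries_with_too_few_children(catalog)
```

Returning the offending names, rather than a bare pass/fail flag, makes it straightforward to report which sub-content failed the rule; the rule passes when the returned list is empty.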
In an optional manner of the disclosure, after auditing the target sub-content based on the content auditing rule, the method further includes:
and in response to the target sub-content not meeting the content auditing rule, displaying the target sub-content in the to-be-audited content in a distinguishing manner.
In the embodiment of the disclosure, when the auditing of the target sub-content is completed, the auditing result can be displayed in the page for displaying the content to be audited.
If a target sub-content does not meet the content auditing rule, which indicates that the target sub-content may not meet the content specification, the target sub-content may be displayed in a distinguishing manner in the content to be audited, for example, in a font different from that of the other content in the content to be audited.
In the embodiments of the disclosure, details of how the target sub-content fails the content auditing rule can also be displayed, so that the user is prompted to modify the target sub-content based on the content auditing rule.
As an example, fig. 3 shows a schematic flow chart of a specific implementation of the content auditing method provided by the embodiments of the present disclosure.
The content auditing system in this solution may include a loader (Loader), a locator (Locator), and a judger (Judger).
The Loader handles the various forms of content to be audited that are input in different scenarios; it extracts candidate content with a uniform structure and provides it to the Locator. The Locator is responsible for locating and extracting the target sub-content, and its output serves as the input of the Judger. The Judger is responsible for performing logical judgment on the target sub-content based on the content auditing rule.
As shown in fig. 3, the loader outputs structured content; that is, the loader extracts structurally uniform candidate content from the content to be audited and provides it to the locator.
The locator outputs the content to be judged; that is, the locator locates the target sub-content from the candidate content.
The content to be judged is input into the judger; that is, the judger performs logical judgment on the target sub-content based on the content auditing rule and determines whether the target sub-content meets the content auditing rule.
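The three-stage flow of fig. 3 can be sketched as a small pipeline. This is a minimal sketch under stated assumptions: the input is assumed to be a JSON string, the target hierarchical path is assumed to be a list of keys, and the function names `load`, `locate`, `judge`, and `audit` are hypothetical.

```python
import json
from typing import Any, Callable, List


def load(raw: str) -> Any:
    """Loader: turn raw input into structurally uniform candidate content (JSON assumed)."""
    return json.loads(raw)


def locate(candidate: Any, path: List[str]) -> Any:
    """Locator: walk the hierarchical path level by level to reach the target sub-content."""
    node = candidate
    for key in path:
        node = node[key]
    return node


def judge(target: Any, rule: Callable[[Any], bool]) -> bool:
    """Judger: apply the content auditing rule to the located target sub-content."""
    return rule(target)


def audit(raw: str, path: List[str], rule: Callable[[Any], bool]) -> bool:
    """Run the loader, the locator, and the judger in sequence."""
    return judge(locate(load(raw), path), rule)


# Assumed sample input: an entry whose summary must not exceed 200 characters.
raw_entry = '{"entry": {"summary": {"text": "A short summary."}}}'
summary_ok = audit(raw_entry, ["entry", "summary", "text"], lambda t: len(t) <= 200)
```

Because each stage only depends on the output of the previous one, a different loader (for example, one that parses HTML) could be substituted without touching the locator or the judger.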
The scheme can be applied to the editing, auditing, and quality stages of encyclopedia entry content. In the editing stage, the editing user is prompted about non-compliant content in the entry content, which improves editing efficiency, avoids content risk, and reduces the subsequent auditing workload by reducing submissions of non-compliant content. In the auditing stage, the non-compliant content in the entry content is listed and marked, which improves the working efficiency of auditors. Marks for some problems that affect the quality of the entry content can be recorded in an encyclopedia quality center, which makes it convenient to generate further content auditing rules and to continuously improve the content.
The content auditing rule in the scheme can be managed through a visual configuration platform, so that the rule can be conveniently added and modified, and the dependence on code development is reduced.
Based on the same principle as the method shown in fig. 1, fig. 4 shows a schematic structural diagram of a content auditing apparatus provided by an embodiment of the disclosure, as shown in fig. 4, the content auditing apparatus 40 may include:
the data acquisition module 410 is configured to acquire content to be audited and hierarchical description data of the content to be audited, where the content to be audited includes sub-content of multiple hierarchies, and the hierarchical description data is used to describe hierarchical relationships between the sub-content;
the target hierarchical path determining module 420 is configured to obtain a target hierarchical path of target sub-content based on the hierarchical description data, where the target sub-content is a sub-content targeted in a content auditing rule of the content to be audited;
a target sub-content locating module 430 for locating target sub-content from the content to be checked based on the target hierarchical path;
the content auditing module 440 is configured to audit the target sub-content based on the content auditing rule.
The device provided by the embodiments of the disclosure acquires content to be audited and hierarchical description data of the content to be audited, wherein the content to be audited comprises sub-content of a plurality of hierarchies and the hierarchical description data is used for describing the hierarchical relationships among the sub-content; acquires a target hierarchical path of target sub-content based on the hierarchical description data, wherein the target sub-content is the sub-content targeted by a content auditing rule of the content to be audited; locates the target sub-content from the content to be audited based on the target hierarchical path; and audits the target sub-content based on the content auditing rule. Based on this scheme, the target sub-content can be effectively located in the content to be audited and then audited based on the content auditing rule, which facilitates effective auditing of the content to be audited.
Optionally, the target sub-content positioning module is specifically configured to:
generating a content locating instruction based on the target hierarchy path;
and locating the target sub-content from the content to be audited based on the content locating instruction.
Optionally, the target sub-content locating module is specifically configured to, when generating the content locating instruction based on the target hierarchical path:
generating sub-content positioning instructions for each sub-content involved in the target hierarchical path respectively;
determining an execution sequence of each sub-content positioning instruction based on a hierarchical order of each sub-content involved in the target hierarchical path;
and combining the sub-content positioning instructions based on the execution sequence to obtain the content positioning instruction.
Optionally, the target sub-content positioning module is specifically configured to, when positioning the target sub-content from the content to be audited based on the content positioning instruction:
positioning first-level sub-content from the content to be audited by executing a first-term sub-content positioning instruction in the content positioning instruction;
and sequentially executing each sub-content positioning instruction according to the execution sequence, and respectively positioning the sub-content of the next level from the sub-content of the previous level in the target level path until the target sub-content is positioned.
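Generating one positioning instruction per level and executing them in hierarchical order can be sketched as follows. The representation is an assumption made for illustration: each sub-content positioning instruction is modeled as a function from a parent sub-content to a child sub-content, the content is modeled as nested dictionaries, and all names are hypothetical.

```python
from typing import Any, Callable, List

# A sub-content positioning instruction maps a parent sub-content to a child sub-content.
Instruction = Callable[[Any], Any]


def make_instruction(name: str) -> Instruction:
    """Build the instruction that selects the child called `name` from its parent."""
    def instruction(parent: Any) -> Any:
        return parent[name]
    return instruction


def build_locating_instruction(target_path: List[str]) -> List[Instruction]:
    """Combine per-level instructions in hierarchical order (top level first)."""
    return [make_instruction(name) for name in target_path]


def execute_with_trace(content: Any, instructions: List[Instruction]) -> List[Any]:
    """Run each instruction on the previous result and record what every level located."""
    trace = []
    current = content
    for instruction in instructions:
        current = instruction(current)
        trace.append(current)
    return trace


# Assumed sample content with three levels below the root.
content = {"catalog": {"Career": {"Early work": "text of the section"}}}
instructions = build_locating_instruction(["catalog", "Career", "Early work"])
trace = execute_with_trace(content, instructions)
target_sub_content = trace[-1]
```

Recording the intermediate result of every instruction, instead of only the final target, keeps the sub-content located at each level available for inspection when an instruction needs to be checked or adjusted.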
Optionally, the apparatus further includes:
the interface display module is used for displaying, to a user through a content positioning instruction configuration interface, each sub-content positioning instruction and the sub-content respectively positioned based on each sub-content positioning instruction;
and the content positioning instruction adjusting module is used for responding to the detection of the instruction modification operation of the user in the content positioning instruction configuration interface and adjusting the content positioning instruction based on the instruction modification operation.
Optionally, the hierarchy description data is determined based on metadata of the content to be audited.
Optionally, the target sub-content positioning module is specifically configured to:
extracting candidate content from the content to be checked based on a preset extraction rule;
target sub-content is located from the candidate content based on the target hierarchical path.
Optionally, the content auditing module is specifically configured to at least one of:
determining whether the target sub-content belongs to a preset data type;
determining whether the content value in the target sub-content belongs to a preset numerical range;
determining whether the target sub-content contains preset content;
performing data conversion on the target sub-content based on a preset conversion rule, and determining whether a conversion result of the data conversion meets a first preset condition;
And carrying out logic operation on the target sub-content based on a preset logic operation rule, and determining whether an operation result of the logic operation meets a second preset condition.
Optionally, the content auditing rule includes a hierarchical content auditing rule for target sub-content within the same hierarchy, and the content auditing module is specifically configured to:
acquiring target sub-content in a hierarchy to be checked, wherein the hierarchy to be checked is a hierarchy corresponding to a hierarchy content checking rule;
and determining whether the content condition of the target sub-content in the hierarchy to be checked meets the hierarchy content checking rule.
Optionally, the apparatus further includes:
and the target sub-content display module is used for responding to the fact that the target sub-content does not meet the content auditing rule after auditing the target sub-content based on the content auditing rule, and displaying the target sub-content in the content to be audited in a distinguishing way.
It will be appreciated that the above-described modules of the content auditing apparatus in the embodiments of the present disclosure have functions to implement the corresponding steps of the content auditing method in the embodiment shown in fig. 1. The functions can be realized by hardware, and can also be realized by executing corresponding software by hardware. The hardware or software includes one or more modules corresponding to the functions described above. The modules may be software and/or hardware, and each module may be implemented separately or may be implemented by integrating multiple modules. The functional description of each module of the content auditing apparatus may be specifically referred to the corresponding description of the content auditing method in the embodiment shown in fig. 1, and will not be repeated here.
In the technical scheme of the disclosure, the collection, storage, use, processing, transmission, provision, disclosure, and other handling of users' personal information comply with the relevant laws and regulations and do not violate public order and good customs.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
The electronic device includes: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a content auditing method as provided by embodiments of the present disclosure.
Compared with the prior art, the electronic equipment acquires the content to be audited and the hierarchical description data of the content to be audited, wherein the content to be audited comprises a plurality of hierarchical sub-contents, and the hierarchical description data is used for describing the hierarchical relationship among the sub-contents; acquiring a target hierarchical path of target sub-content based on hierarchical description data, wherein the target sub-content is the sub-content aimed in a content auditing rule of the content to be audited; locating target sub-content from the content to be checked based on the target hierarchical path; and auditing the target sub-content based on the content auditing rule. Based on the scheme, the target sub-content can be effectively positioned from the content to be audited, so that the target sub-content is audited based on the content audit rule, and the effective audit of the content to be audited is facilitated.
The readable storage medium is a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform a content auditing method as provided by embodiments of the present disclosure.
Compared with the prior art, the readable storage medium enables acquiring content to be audited and hierarchical description data of the content to be audited, wherein the content to be audited comprises sub-content of a plurality of hierarchies and the hierarchical description data is used for describing the hierarchical relationships among the sub-content; acquiring a target hierarchical path of target sub-content based on the hierarchical description data, wherein the target sub-content is the sub-content targeted by a content auditing rule of the content to be audited; locating the target sub-content from the content to be audited based on the target hierarchical path; and auditing the target sub-content based on the content auditing rule. Based on this scheme, the target sub-content can be effectively located in the content to be audited and then audited based on the content auditing rule, which facilitates effective auditing of the content to be audited.
The computer program product comprises a computer program which, when executed by a processor, implements a content auditing method as provided by embodiments of the present disclosure.
Compared with the prior art, the computer program product enables acquiring content to be audited and hierarchical description data of the content to be audited, wherein the content to be audited comprises sub-content of a plurality of hierarchies and the hierarchical description data is used for describing the hierarchical relationships among the sub-content; acquiring a target hierarchical path of target sub-content based on the hierarchical description data, wherein the target sub-content is the sub-content targeted by a content auditing rule of the content to be audited; locating the target sub-content from the content to be audited based on the target hierarchical path; and auditing the target sub-content based on the content auditing rule. Based on this scheme, the target sub-content can be effectively located in the content to be audited and then audited based on the content auditing rule, which facilitates effective auditing of the content to be audited.
Fig. 5 shows a schematic block diagram of an example electronic device 50 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 5, the electronic device 50 includes a computing unit 510 that can perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM) 520 or a computer program loaded from a storage unit 580 into a Random Access Memory (RAM) 530. In RAM 530, various programs and data required for the operation of device 50 may also be stored. The computing unit 510, ROM 520, and RAM 530 are connected to each other by a bus 540. An input/output (I/O) interface 550 is also connected to bus 540.
Various components in the device 50 are connected to the I/O interface 550, including: an input unit 560 such as a keyboard, a mouse, etc.; an output unit 570 such as various types of displays, speakers, and the like; a storage unit 580 such as a magnetic disk, an optical disk, or the like; and a communication unit 590 such as a network card, a modem, a wireless communication transceiver, etc. The communication unit 590 allows the device 50 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.
The computing unit 510 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 510 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 510 performs the content auditing method provided in the embodiments of the present disclosure. For example, in some embodiments, performing the content auditing methods provided in embodiments of the present disclosure may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 580. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 50 via the ROM 520 and/or the communication unit 590. One or more steps of the content auditing method provided in embodiments of the present disclosure may be performed when a computer program is loaded into RAM 530 and executed by computing unit 510. Alternatively, in other embodiments, the computing unit 510 may be configured to perform the content auditing methods provided in embodiments of the present disclosure in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions of the disclosure are achieved; no limitation is imposed herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (20)

1. A content auditing method, comprising:
acquiring to-be-audited content and hierarchical description data of the to-be-audited content, wherein the to-be-audited content comprises a plurality of hierarchical sub-contents, and the hierarchical description data is used for describing hierarchical relations among the sub-contents;
acquiring a target hierarchical path of target sub-content based on the hierarchical description data, wherein the target sub-content is the sub-content aimed in a content auditing rule of the content to be audited;
Positioning the target sub-content from the content to be audited based on the target hierarchical path;
and auditing the target sub-content based on the content auditing rule.
2. The method of claim 1, wherein the locating the target sub-content from the content to be audited based on the target hierarchical path comprises:
generating a content locating instruction based on the target hierarchy path;
and positioning the target sub-content from the content to be audited based on the content positioning instruction.
3. The method of claim 2, wherein the generating content locating instructions based on the target hierarchical path comprises:
generating sub-content positioning instructions for each sub-content involved in the target hierarchical path respectively;
determining the execution sequence of each sub-content positioning instruction based on the hierarchical sequence of each sub-content related in the target hierarchical path;
and combining the sub-content positioning instructions based on the execution sequence to obtain the content positioning instructions.
4. A method according to claim 3, wherein said locating the target sub-content from the content to be audited based on the content locating instruction comprises:
Positioning first-level sub-content from the content to be audited by executing the first-item sub-content positioning instruction in the content positioning instruction;
and sequentially executing each sub-content positioning instruction according to the execution sequence, and respectively positioning the sub-content of the next level from the sub-content of the previous level in the target level path until the target sub-content is positioned.
5. The method of claim 4, further comprising:
displaying each piece of sub-content positioning instructions and sub-content respectively positioned based on each piece of sub-content positioning instructions to a user through a content positioning instruction configuration interface;
in response to detecting an instruction modification operation of a user in the content location instruction configuration interface, the content location instruction is adjusted based on the instruction modification operation.
6. The method of any of claims 1-5, wherein the hierarchical description data is determined based on metadata of the content to be audited.
7. The method of any of claims 1-6, wherein the locating the target sub-content from the content to be audited based on the target hierarchical path comprises:
extracting candidate content from the content to be checked based on a preset extraction rule;
The target sub-content is located from the candidate content based on the target hierarchical path.
8. The method of any of claims 1-7, wherein the auditing the target sub-content based on the content auditing rules includes at least one of:
determining whether the target sub-content belongs to a preset data type;
determining whether the content value in the target sub-content belongs to a preset numerical range;
determining whether the target sub-content contains preset content or not;
performing data conversion on the target sub-content based on a preset conversion rule, and determining whether a conversion result of the data conversion meets a first preset condition;
and carrying out logic operation on the target sub-content based on a preset logic operation rule, and determining whether an operation result of the logic operation meets a second preset condition.
9. The method of any of claims 1-8, wherein the content auditing rules include hierarchical content auditing rules for target sub-content within a same hierarchy, the auditing of the target sub-content based on the content auditing rules including:
acquiring the target sub-content in a hierarchy to be checked, wherein the hierarchy to be checked is a hierarchy corresponding to the hierarchy content checking rule;
And determining whether the content condition of the target sub-content in the hierarchy to be checked meets the hierarchy content checking rule.
10. The method of any of claims 1-9, after the auditing the target sub-content based on the content auditing rules, the method further comprising:
and responding to the target sub-content not meeting the content auditing rule, and displaying the target sub-content in the content to be audited in a distinguishing way.
11. A content auditing apparatus, comprising:
the data acquisition module is used for acquiring to-be-audited content and hierarchical description data of the to-be-audited content, wherein the to-be-audited content comprises a plurality of hierarchical sub-contents, and the hierarchical description data is used for describing hierarchical relations among the sub-contents;
the target hierarchical path determining module is used for acquiring a target hierarchical path of target sub-content based on the hierarchical description data, wherein the target sub-content is the sub-content aimed at in the content auditing rule of the content to be audited;
the target sub-content positioning module is used for positioning the target sub-content from the content to be audited based on the target hierarchical path;
and the content auditing module is used for auditing the target sub-content based on the content auditing rule.
12. The apparatus of claim 11, wherein the target sub-content locating module is specifically configured to:
generating a content locating instruction based on the target hierarchy path;
and positioning the target sub-content from the content to be audited based on the content positioning instruction.
13. The apparatus of claim 12, wherein the target sub-content locating module, when generating content locating instructions based on the target hierarchical path, is specifically configured to:
generating sub-content positioning instructions for each sub-content involved in the target hierarchical path respectively;
determining the execution sequence of each sub-content positioning instruction based on the hierarchical sequence of each sub-content related in the target hierarchical path;
and combining the sub-content positioning instructions based on the execution sequence to obtain the content positioning instructions.
14. The apparatus of claim 13, wherein the target sub-content locating module is configured to, when locating the target sub-content from the content to be audited based on the content locating instruction:
positioning first-level sub-content from the content to be audited by executing the first-item sub-content positioning instruction in the content positioning instruction;
And sequentially executing each sub-content positioning instruction according to the execution sequence, and respectively positioning the sub-content of the next level from the sub-content of the previous level in the target level path until the target sub-content is positioned.
15. The apparatus of claim 14, further comprising:
the interface display module is used for displaying, to a user through a content positioning instruction configuration interface, each sub-content positioning instruction and the sub-content respectively positioned based on each sub-content positioning instruction;
and the content positioning instruction adjusting module is used for responding to the detection of the instruction modifying operation of the user in the content positioning instruction configuration interface and adjusting the content positioning instruction based on the instruction modifying operation.
16. The apparatus of any of claims 11-15, wherein the hierarchical description data is determined based on metadata of the content to be audited.
17. The apparatus of any of claims 11-16, wherein the target sub-content locating module is specifically configured to:
extracting candidate content from the content to be audited based on a preset extraction rule;
and positioning the target sub-content from the candidate content based on the target hierarchical path.
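The two-stage flow of claim 17 (extract candidate content first, then locate within it) can be sketched as follows. The claim does not specify what form the preset extraction rule takes; this sketch assumes a regular expression that pulls an embedded JSON payload out of a raw page. `EXTRACTION_RULE`, `extract_candidate`, and `locate` are hypothetical names.

```python
import json
import re
from typing import Any, List, Union

# Assumption: the preset extraction rule is a regex capturing embedded JSON.
EXTRACTION_RULE = re.compile(
    r'<script type="application/json">(.*?)</script>', re.DOTALL
)


def extract_candidate(raw: str) -> Any:
    """Apply the preset extraction rule to the content to be audited."""
    match = EXTRACTION_RULE.search(raw)
    if match is None:
        raise ValueError("extraction rule matched nothing")
    return json.loads(match.group(1))


def locate(candidate: Any, path: List[Union[str, int]]) -> Any:
    """Walk the target hierarchical path inside the candidate content."""
    current = candidate
    for key in path:
        current = current[key]
    return current


# Hypothetical raw content to be audited.
raw_page = (
    '<html><script type="application/json">'
    '{"post": {"title": "hello", "comments": ["a", "b"]}}'
    "</script></html>"
)

candidate = extract_candidate(raw_page)
target = locate(candidate, ["post", "comments", 0])
print(target)
```

Narrowing the input to candidate content first keeps the hierarchical path short and independent of the surrounding page markup.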
18. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-10.
19. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-10.
20. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any of claims 1-10.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310323955.XA CN116361591A (en) 2023-03-29 2023-03-29 Content auditing method, device, electronic equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310323955.XA CN116361591A (en) 2023-03-29 2023-03-29 Content auditing method, device, electronic equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN116361591A true CN116361591A (en) 2023-06-30

Family

ID=86941483

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310323955.XA Pending CN116361591A (en) 2023-03-29 2023-03-29 Content auditing method, device, electronic equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN116361591A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116861463A (en) * 2023-07-25 2023-10-10 江苏中卫信软件科技有限公司 Processing method for SaaS transformation of general information system
CN116861463B (en) * 2023-07-25 2024-01-23 江苏中卫信软件科技有限公司 Processing method for SaaS transformation of general information system

Similar Documents

Publication Publication Date Title
JP2021174516A (en) Knowledge graph construction method, device, electronic equipment, storage medium, and computer program
CN113792154B (en) Method and device for determining fault association relationship, electronic equipment and storage medium
CN112989235B (en) Knowledge base-based inner link construction method, device, equipment and storage medium
CN113660541B (en) Method and device for generating abstract of news video
CN114579104A (en) Data analysis scene generation method, device, equipment and storage medium
CN116361591A (en) Content auditing method, device, electronic equipment and computer readable storage medium
CN113378091A (en) Visual project generation method and device, electronic equipment and storage medium
CN114676678A (en) Structured query language data parsing method and device and electronic equipment
CN114064925A (en) Knowledge graph construction method, data query method, device, equipment and medium
CN113836316A (en) Processing method, training method, device, equipment and medium for ternary group data
CN111274353B (en) Text word segmentation method, device, equipment and medium
CN117171296A (en) Information acquisition method and device and electronic equipment
CN111666417A (en) Method and device for generating synonyms, electronic equipment and readable storage medium
CN114995719B (en) List rendering method, device, equipment and storage medium
CN114168119B (en) Code file editing method, device, electronic equipment and storage medium
CN116009847A (en) Code generation method, device, electronic equipment and storage medium
CN116185389A (en) Code generation method and device, electronic equipment and medium
CN115687717A (en) Method, device and equipment for acquiring hook expression and computer readable storage medium
CN114860872A (en) Data processing method, device, equipment and storage medium
CN112989066A (en) Data processing method and device, electronic equipment and computer readable medium
CN114417862A (en) Text matching method, and training method and device of text matching model
CN113377922B (en) Method, device, electronic equipment and medium for matching information
CN113254826B (en) Dump file processing method and device
CN115828915B (en) Entity disambiguation method, device, electronic equipment and storage medium
CN113377921B (en) Method, device, electronic equipment and medium for matching information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination