CN113033217B - Automatic shielding translation method and device for subtitle sensitive information - Google Patents

Automatic shielding translation method and device for subtitle sensitive information

Info

Publication number
CN113033217B
Authority
CN
China
Prior art keywords
subtitle
sensitive
word
sensitive information
sentences
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110416698.5A
Other languages
Chinese (zh)
Other versions
CN113033217A (en)
Inventor
杨雨薇 (Yang Yuwei)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Huanwang Technology Co Ltd
Original Assignee
Guangdong Huanwang Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Huanwang Technology Co Ltd filed Critical Guangdong Huanwang Technology Co Ltd
Priority to CN202110416698.5A priority Critical patent/CN113033217B/en
Publication of CN113033217A publication Critical patent/CN113033217A/en
Application granted granted Critical
Publication of CN113033217B publication Critical patent/CN113033217B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/42 Data-driven translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/247 Thesauruses; Synonyms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The application relates to a method and device for automatic shielding translation of subtitle sensitive information. The method comprises the following steps: first, a subtitle source file is acquired and parsed to obtain a subtitle parsing result; next, the association degree of each line in the subtitle is calculated based on a preset semantic recognition algorithm and the subtitle parsing result, and the word meanings and sentence meanings in the subtitle are judged; sensitivity judgment is then performed on the words and sentences in the subtitle through a preset sensitive semantic recognition algorithm and the word meanings and sentence meanings determined above, so that the sensitive information is determined; finally, the sensitive information is replaced with near-synonyms to generate a desensitized subtitle. The automatic shielding translation method for subtitle sensitive information can therefore automatically identify sensitive information and replace it with near-synonyms, realizing automatic shielding and translation of sensitive information and solving the problems in the prior art that the screening and processing of sensitive information must both be done manually, which imposes a heavy workload on staff and a low processing speed.

Description

Automatic shielding translation method and device for subtitle sensitive information
Technical Field
The application relates to the technical field of computers, in particular to an automatic shielding translation method and device for subtitle sensitive information.
Background
The video content that television users can watch today spans different genres, different countries and different languages, and video services deliver a large volume of program resources, which brings a considerable operational workload. When browsing programs in the languages of other countries and regions, users need the corresponding subtitles and lines to help them understand what the video programs intend to express. However, subtitle information may contain sensitive words and similar problems, which affect the user's viewing experience.
In the prior art, subtitles for films and television dramas containing lines spoken by actors and other performers are generally edited and calibrated manually in post-production. These traditional approaches depend heavily on manual work, involve a large workload, and are not fast enough.
Disclosure of Invention
To address the problems of heavy manual dependence, large manual workload and low processing speed in the existing screening and editing of sensitive subtitles, the application provides an automatic shielding translation method and device for subtitle sensitive information, which realize automatic shielding and translation of subtitle sensitive information, thereby freeing up manpower and increasing the processing speed.
The above object of the present application is achieved by the following technical solutions:
In a first aspect, an embodiment of the present application provides an automatic shielding translation method for subtitle sensitive information, including:
acquiring a subtitle source file;
parsing the subtitle source file to obtain a subtitle parsing result;
performing association degree calculation on each line in the subtitle based on a preset semantic recognition algorithm and the subtitle parsing result, and judging the word meaning of the words in the subtitle and the sentence meaning of the sentences in the subtitle;
performing sensitivity judgment on the words and sentences in the subtitle through a preset sensitive semantic recognition algorithm based on the word meanings and sentence meanings, and determining sensitive information;
and replacing the sensitive information with near-synonyms to generate a desensitized subtitle.
Optionally, parsing the subtitle source file includes parsing the subtitle file attributes; the subtitle file attributes include the file format, the line time points and the file size.
Optionally, performing association degree calculation on each line in the subtitle based on the preset semantic recognition algorithm and the subtitle parsing result, and judging the word meaning of the words in the subtitle and the sentence meaning of the sentences in the subtitle includes:
performing association degree calculation on the subtitle lines through the preset semantic recognition algorithm;
determining a context according to a plurality of lines with a high association degree;
judging the sentence meaning of the sentences that fit that context closely;
and judging the word meaning of the words that fit that context closely.
Optionally, performing sensitivity judgment on the words and sentences in the subtitle through a preset sensitive semantic recognition algorithm based on the word meanings and sentence meanings, and determining the sensitive information includes:
performing sensitivity judgment on the words in the subtitle through the preset sensitive semantic recognition algorithm based on the word meanings, and determining sensitive words;
and performing sensitivity judgment on the sentences in the subtitle through the preset sensitive semantic recognition algorithm based on the sentence meanings, and determining sensitive sentences.
Optionally, replacing the sensitive information with near-synonyms to generate the desensitized subtitle includes:
finding a near-synonym of the sensitive word in a preset word stock, replacing the sensitive word, and generating the desensitized subtitle;
and finding near-synonyms of the words in the sensitive sentence in the preset word stock, composing the near-synonyms into a sentence whose meaning is similar to that of the sensitive sentence, and generating a desensitized sentence.
Optionally, the preset word stock is a local word stock in a preset system.
Optionally, the preset word stock is an online third party word stock.
In a second aspect, an embodiment of the present application provides an automatic shielding translation device for subtitle sensitive information, comprising:
an acquisition module, used for acquiring a subtitle source file;
a parsing module, used for parsing the file format, line time points and file size of the subtitle source file to obtain a subtitle parsing result;
a recognition and judgment module, used for performing recognition and association degree calculation on the parsed subtitle text through a preset semantic recognition algorithm, judging word meanings and sentence meanings, and judging sensitive information based on the word meanings and sentence meanings;
and a replacement module, used for replacing the sensitive information with near-synonyms to generate a desensitized subtitle.
Optionally, the device further comprises a storage module;
the storage module is used for storing the desensitized subtitle.
Optionally, the device further comprises a communication module;
the communication module is in communication connection with the storage module and with an external playing device respectively, and is used for sending the desensitized subtitle to the external playing device for use.
The technical scheme provided by the embodiment of the application can comprise the following beneficial effects:
In the technical solution provided by the embodiment of the application, a subtitle source file is first acquired and parsed to obtain a subtitle parsing result; the association degree of each line in the subtitle is then calculated based on a preset semantic recognition algorithm and the subtitle parsing result, and the word meanings of the words and the sentence meanings of the sentences in the subtitle are judged; sensitivity judgment is then performed on the words and sentences in the subtitle through a preset sensitive semantic recognition algorithm based on the word meanings and sentence meanings determined above, so that the sensitive information is determined; finally, the sensitive information is replaced with near-synonyms to generate a desensitized subtitle. With the method provided by the embodiment of the application, sensitive information can thus be identified automatically and replaced with near-synonyms, so that the sensitive information is automatically shielded and translated, which solves the prior-art problems of heavy workload and low processing speed caused by the screening and processing of sensitive information being done manually.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
Fig. 1 is a schematic flow chart of an automatic subtitle sensitive information shielding and translating method according to an embodiment of the present application;
fig. 2 is a schematic flow chart of an automatic subtitle sensitive information shielding and translating method according to another embodiment of the present application;
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with aspects of the application as detailed in the appended claims.
Examples:
Referring to fig. 1, fig. 1 is a schematic flow chart of an automatic subtitle sensitive information shielding and translating method according to an embodiment of the present application. As shown in fig. 1, the method includes:
s101, acquiring a caption source file;
specifically, the acquired subtitle source file may be subtitles in different languages in different countries, files only containing the subtitles may be directly acquired, or subtitle files acquired by extracting the subtitles in the video to be played through a preset system or device, where the subtitles mainly include dialogues, bystanders, and the like of characters in the video.
S102, analyzing the caption source file to obtain a caption analysis result;
specifically, after the subtitle source file is acquired, the original subtitle file needs to be parsed, that is, after the subtitle source file is fully imported, file data of the subtitle source file is parsed, wherein file attributes can include file format, line time point, file size and the like, and the subtitle source file is subjected to preliminary parsing, so that subsequent semantic recognition and sensitive information judgment processing are facilitated, and the processing speed is improved.
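By way of a non-limiting illustration of this parsing step, the following Python sketch reads a subtitle source file in the common SubRip (.srt) format and collects the attributes mentioned above (file format, line time points, file size) together with the caption text. The file format, the SubtitleCue structure and the regular expression are assumptions made for illustration; the embodiment does not prescribe a concrete subtitle format or parser.
```python
import os
import re
from dataclasses import dataclass
from typing import List

@dataclass
class SubtitleCue:
    index: int   # cue number within the .srt file
    start: str   # line time point, e.g. "00:01:02,500"
    end: str
    text: str    # the caption line itself

# One .srt cue: an index line, a "start --> end" line, then the caption text.
_CUE = re.compile(
    r"(\d+)\s*\n(\d{2}:\d{2}:\d{2},\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2},\d{3})\s*\n(.*?)(?:\n\n|\Z)",
    re.S,
)

def parse_subtitle_source(path: str) -> dict:
    """Return a subtitle parsing result: file attributes plus the list of cues."""
    with open(path, encoding="utf-8") as f:
        raw = f.read()
    cues: List[SubtitleCue] = [
        SubtitleCue(int(i), start, end, " ".join(text.split()))
        for i, start, end, text in _CUE.findall(raw)
    ]
    return {
        "file_format": os.path.splitext(path)[1].lstrip("."),  # e.g. "srt"
        "file_size": os.path.getsize(path),                    # in bytes
        "cues": cues,                                          # line time points + text
    }
```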
S103, calculating the association degree of each line in the subtitle based on a preset semantic recognition algorithm and the subtitle parsing result, and judging the word meaning of the words in the subtitle and the sentence meaning of the sentences in the subtitle;
Specifically, association degree calculation is performed on the subtitle through semantic recognition and similar algorithms, and scene recognition and word calibration are carried out on the subtitle lines. The association degree between the words and sentences is judged, and the words and sentences with a high association degree are analyzed and recognized so as to determine a usage scene; the intended use of the lines in the subtitle within that scene is then analyzed, and accurate word meanings and sentence meanings are finally determined.
In a specific embodiment, if character A and character B in a scene hold a one-to-one dialogue, the coherence of each line can be calculated through semantic recognition. When sentences with a higher association degree are identified, the context in which the words are used can be determined, and once the overall context is determined, the exact meaning of each sentence and each word can be analyzed more accurately. Through association degree calculation and context analysis, word meanings and sentence meanings can be analyzed more precisely, providing an accurate basis for the shielding translation of sensitive words.
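The embodiment does not specify a particular semantic recognition algorithm. As a minimal stand-in for the association degree calculation and context determination described above, the sketch below scores neighbouring caption lines with a simple word-overlap (Jaccard) measure and groups consecutive lines whose score exceeds a threshold into one context window; a production system would substitute a genuine semantic model. The function names and the threshold value are assumptions for illustration.
```python
def association_degree(line_a: str, line_b: str) -> float:
    """Crude stand-in for semantic relatedness: word-overlap (Jaccard) score."""
    a, b = set(line_a.lower().split()), set(line_b.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def split_into_contexts(lines: list, threshold: float = 0.2) -> list:
    """Group consecutive caption lines with a high mutual association degree
    into context windows, approximating the scene/context determination above."""
    contexts, current = [], [lines[0]] if lines else []
    for prev, cur in zip(lines, lines[1:]):
        if association_degree(prev, cur) >= threshold:
            current.append(cur)       # still the same dialogue context
        else:
            contexts.append(current)  # association dropped: a new context begins
            current = [cur]
    if current:
        contexts.append(current)
    return contexts
```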
S104, performing sensitivity judgment on the words and sentences in the subtitle through a preset sensitive semantic recognition algorithm based on the word meanings and sentence meanings, and determining sensitive information;
Specifically, after the exact meanings of the words and sentences in the subtitle lines have been analyzed and sorted out, sensitivity calculation is performed on the sentences and vocabulary through a sensitive semantic recognition algorithm, and the sensitive information is screened out. The sensitive semantic recognition algorithm may be a dedicated sensitive information recognition algorithm that judges the sensitivity of the words or sentences to be detected according to criteria such as the frequency and association degree of preset words and sentences within the detected text, and determines sensitive information when the sensitivity reaches a certain level. Alternatively, the relevant sensitive vocabulary may be pre-stored in a preset system, and the subtitle is determined to contain sensitive information whenever a pre-stored sensitive word appears in a subtitle line.
It should be noted that pre-storing the relevant sensitive vocabulary in the preset system can be replaced by other approaches; for example, the preset system may run a sensitive semantic recognition algorithm and connect to an external third-party word stock that provides the basis for sensitive word recognition, which saves storage space in the preset system.
In addition, in practical applications the above semantic recognition algorithm and sensitive semantic recognition algorithm can be combined in various ways: several recognition algorithms may be used to analyze the word meanings and sentence meanings of the subtitle lines and to screen the sensitive information, which improves the recognition and detection results; alternatively, a single algorithm covering both functions may be used, which reduces the load on system resources. The specific implementation can be chosen according to the actual situation.
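Continuing the sketch above (and reusing its association_degree helper), the following fragment illustrates one possible sensitivity judgment that combines the two criteria mentioned here: the frequency of preset sensitive words within a sentence and the association of that sentence with other sensitive lines in its context. The sensitive vocabulary, the scoring formula and the threshold are illustrative assumptions only, not the embodiment's prescribed algorithm.
```python
# Illustrative pre-stored sensitive vocabulary; in practice this could live in a
# local word stock or be obtained from an online third-party word stock.
SENSITIVE_WORDS = {"violence", "gore"}

def sentence_sensitivity(sentence: str, context: list) -> float:
    """Score a caption sentence by the frequency of preset sensitive words in it
    and by its association with other sensitive lines in the same context."""
    words = sentence.lower().split()
    if not words:
        return 0.0
    frequency = sum(w in SENSITIVE_WORDS for w in words) / len(words)
    related = [
        association_degree(sentence, other)
        for other in context
        if other != sentence
        and any(w in SENSITIVE_WORDS for w in other.lower().split())
    ]
    return frequency + (max(related) if related else 0.0)

def find_sensitive_information(contexts: list, threshold: float = 0.1):
    """Return the sensitive words and sensitive sentences found in all contexts."""
    sensitive_words, sensitive_sentences = set(), []
    for context in contexts:
        for sentence in context:
            sensitive_words |= {
                w for w in sentence.lower().split() if w in SENSITIVE_WORDS
            }
            if sentence_sensitivity(sentence, context) >= threshold:
                sensitive_sentences.append(sentence)
    return sensitive_words, sensitive_sentences
```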
S105, replacing the sensitive information with near-synonyms to generate the desensitized subtitle.
Specifically, after sensitive information is detected in the subtitle, it is replaced with near-synonyms from the word stock of the preset system, thereby generating the desensitized subtitle. On the one hand, the word stock of the preset system may be stored locally by the system on hardware such as a server and can be continually expanded through daily use and according to rules; on the other hand, it may be a third-party word stock, with the preset system connecting to the third-party word stock so that information can be obtained from it, as with a network input method. Several third-party word stocks can be connected at the same time, which expands the word stock content and ensures the coverage of the shielding translation of sensitive information.
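As an illustration of this replacement step, the sketch below looks a sensitive word up first in a local word stock and then, optionally, in a third-party word stock before substituting a near-synonym into the caption line. The local dictionary contents and the third_party callable are placeholders; the embodiment does not define a concrete word stock interface.
```python
# Hypothetical local word stock mapping sensitive words to neutral near-synonyms.
LOCAL_WORD_STOCK = {"violence": "conflict", "gore": "injury"}

def lookup_near_synonym(word: str, third_party=None) -> str:
    """Look the word up in the local word stock first, then in an optional
    third-party word stock represented here as a plain callable."""
    if word in LOCAL_WORD_STOCK:
        return LOCAL_WORD_STOCK[word]
    if third_party is not None:
        replacement = third_party(word)  # supplied by the integrator, if any
        if replacement:
            return replacement
    return word  # no near-synonym found: keep the original word

def desensitize_line(line: str, sensitive_words: set, third_party=None) -> str:
    """Replace every sensitive word in a caption line with a near-synonym."""
    return " ".join(
        lookup_near_synonym(w.lower(), third_party) if w.lower() in sensitive_words else w
        for w in line.split()
    )
```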
In the scheme provided by this embodiment, the usage scene and intended use are judged from the meanings of the words and sentences in the subtitle lines, so that word meanings are recognized more accurately; when a sensitive word is detected, it is replaced with one of its near-synonyms, so that the translation of the sensitive word is shielded, while the sentence meaning only plays an auxiliary role. In practical applications a sentence may also be sensitive as a whole, and in that case the vocabulary and the sentences can be processed separately, as illustrated below.
For example, association degree calculation is performed on the subtitle lines through a preset semantic recognition algorithm; after the context has been determined from several lines with a high association degree, the sentence meaning of the sentences with a high association degree in that context is judged, and then the word meaning of the words with a high association degree in that context is judged. After the word meanings and sentence meanings have been judged, sensitive words are determined on this basis by performing sensitivity judgment on the words in the subtitle through a sensitive word recognition algorithm according to the word meanings; at the same time, sensitive sentences are determined by performing sensitivity judgment on the sentences in the subtitle through the sensitive word recognition algorithm according to the sentence meanings. The sensitive information then comprises both sensitive words and sensitive sentences. Finally, the sensitive words are replaced with near-synonyms, and each sensitive sentence is recombined from several near-synonyms into a new sentence that replaces it. Translating and replacing both the sensitive vocabulary and the sensitive sentences produces the desensitized subtitle and ensures more strictly that it contains no sensitive information.
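Building on the earlier sketches (SubtitleCue, lookup_near_synonym and desensitize_line), the fragment below shows one way to handle the two kinds of sensitive information separately: word-level replacement for ordinary lines and whole-sentence recombination from near-synonyms for sensitive sentences. The function names are assumptions for illustration.
```python
def desensitize_sentence(sentence: str, third_party=None) -> str:
    """Rebuild a sensitive sentence: every word that has a near-synonym in the
    word stock is swapped, so the new sentence keeps a similar overall meaning."""
    return " ".join(
        lookup_near_synonym(w.lower(), third_party) for w in sentence.split()
    )

def desensitize_subtitle(cues, sensitive_words, sensitive_sentences, third_party=None):
    """Apply word-level or sentence-level replacement to every caption cue."""
    for cue in cues:
        if cue.text in sensitive_sentences:
            cue.text = desensitize_sentence(cue.text, third_party)   # whole sentence
        else:
            cue.text = desensitize_line(cue.text, sensitive_words, third_party)
    return cues
```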
Fig. 2 is a flow chart of an automatic subtitle sensitive information shielding and translating method according to another embodiment of the present application, as shown in fig. 2:
In a specific implementation, the source file, namely the subtitle source file, is first injected; the semantic recognition algorithm then analyzes the word association degree, the usage context and so on, and analyzes the intended use of the words so as to accurately judge word meanings and sentence meanings; sensitivity screening is then carried out through the semantic recognition algorithm to detect whether sensitive vocabulary exists in the subtitle; once sensitive vocabulary is detected, vocabulary semantically similar to the sensitive vocabulary is matched automatically and used to replace it, generating the desensitized subtitle; finally, the desensitized subtitle is saved. The whole process realizes automatic recognition and automatic replacement of sensitive vocabulary, and solves the prior-art problems of heavy staff workload and low processing speed caused by sensitive vocabulary in subtitles being recognized and processed only manually.
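Putting the earlier sketches together, the following function strings the steps of fig. 2 into one end-to-end flow, from injecting the source file to saving the desensitized subtitle. It is a composition of the illustrative helpers defined above, not the embodiment's prescribed implementation; the output file naming is an assumption.
```python
def automatic_shielding_translation(path: str, third_party=None) -> str:
    """End-to-end flow of fig. 2: inject the source file, analyse semantics,
    screen sensitive information, replace it, and save the desensitized subtitle."""
    parsed = parse_subtitle_source(path)                      # S101 + S102
    lines = [cue.text for cue in parsed["cues"]]
    contexts = split_into_contexts(lines)                     # S103
    words, sentences = find_sensitive_information(contexts)   # S104
    cues = desensitize_subtitle(parsed["cues"], words, sentences, third_party)  # S105
    out_path = path + ".desensitized.srt"                     # assumed output naming
    with open(out_path, "w", encoding="utf-8") as f:          # save the result
        for cue in cues:
            f.write(f"{cue.index}\n{cue.start} --> {cue.end}\n{cue.text}\n\n")
    return out_path
```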
Based on the same inventive concept, an embodiment of the application also provides an automatic shielding translation device for subtitle sensitive information, comprising:
an acquisition module, used for acquiring a subtitle source file;
a parsing module, used for parsing the file format, line time points and file size of the subtitle source file to obtain a subtitle parsing result;
a recognition and judgment module, used for performing recognition and association degree calculation on the parsed subtitle text through a preset semantic recognition algorithm, judging word meanings and sentence meanings, and judging sensitive information based on the word meanings and sentence meanings;
and a replacement module, used for replacing the sensitive information with near-synonyms to generate a desensitized subtitle.
Furthermore, the automatic shielding translation device for subtitle sensitive information provided by the embodiment of the application further comprises a storage module;
the storage module is used for storing the desensitized subtitle.
In addition, the automatic shielding translation device for subtitle sensitive information provided by the embodiment of the application may further comprise a communication module;
the communication module is in communication connection with the storage module and with an external playing device respectively, and is used for sending the desensitized subtitle to the external playing device for use.
The specific manner in which each module of the device in the above embodiments performs its operations has been described in detail in the embodiments of the method and will not be elaborated here.
It is to be understood that the same or similar parts of the above embodiments may refer to one another, and content that is not described in detail in one embodiment may refer to the same or similar content in other embodiments.
It should be noted that in the description of the present application, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Furthermore, in the description of the present application, unless otherwise indicated, the meaning of "plurality" means at least two.
Any process or method description in a flow chart, or otherwise described herein, may be understood as representing a module, segment or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process. Further implementations are included within the scope of the preferred embodiments of the present application, in which functions may be executed out of the order shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art to which the present application pertains.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented with software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following techniques known in the art may be used: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
Those of ordinary skill in the art will appreciate that all or a portion of the steps carried out in the method of the above-described embodiments may be implemented by a program to instruct related hardware, where the program may be stored in a computer readable storage medium, and where the program, when executed, includes one or a combination of the steps of the method embodiments.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules may also be stored in a computer readable storage medium if implemented in the form of software functional modules and sold or used as a stand-alone product.
The above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, or the like.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and not to be construed as limiting the application, and that variations, modifications, alternatives and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the application.

Claims (8)

1. An automatic shielding translation method for subtitle sensitive information, characterized by comprising the following steps:
acquiring a subtitle source file;
parsing the subtitle source file to obtain a subtitle parsing result;
performing association degree calculation on each line in the subtitle based on a preset semantic recognition algorithm and the subtitle parsing result, and judging the word meaning of the words in the subtitle and the sentence meaning of the sentences in the subtitle;
performing sensitivity judgment on the words and sentences in the subtitle through a preset sensitive semantic recognition algorithm based on the word meanings and sentence meanings, and determining sensitive information, which comprises:
performing sensitivity judgment on the words in the subtitle through the preset sensitive semantic recognition algorithm based on the word meanings and determining sensitive words, and performing sensitivity judgment on the sentences in the subtitle through the preset sensitive semantic recognition algorithm based on the sentence meanings and determining sensitive sentences; wherein the sensitive semantic recognition algorithm judges the sensitivity of the words or sentences to be detected according to the frequency and association degree criteria of preset words and sentences within the detected text, and thereby determines the sensitive information;
and replacing the sensitive information with near-synonyms to generate a desensitized subtitle, which comprises:
finding a near-synonym of the sensitive word in a preset word stock, replacing the sensitive word, and generating the desensitized subtitle;
and finding near-synonyms of the words in the sensitive sentence in the preset word stock, composing the near-synonyms into a sentence whose meaning is similar to that of the sensitive sentence, and generating a desensitized sentence.
2. The automatic shielding translation method for subtitle sensitive information according to claim 1, wherein parsing the subtitle source file includes parsing the subtitle file attributes; the subtitle file attributes include the file format, the line time points and the file size.
3. The automatic shielding translation method for subtitle sensitive information according to claim 1, wherein performing association degree calculation on each line in the subtitle based on the preset semantic recognition algorithm and the subtitle parsing result, and judging the word meaning of the words in the subtitle and the sentence meaning of the sentences in the subtitle includes:
performing association degree calculation on the subtitle lines through the preset semantic recognition algorithm;
determining a context according to a plurality of lines with a high association degree;
judging the sentence meaning of the sentences that fit that context closely;
and judging the word meaning of the words that fit that context closely.
4. The automatic shielding translation method for subtitle sensitive information according to claim 3, wherein the preset word stock is a local word stock in a preset system.
5. The automatic shielding translation method for subtitle sensitive information according to claim 3, wherein the preset word stock is an online third-party word stock.
6. An automatic shielding translation device for subtitle sensitive information, comprising:
an acquisition module, used for acquiring a subtitle source file;
a parsing module, used for parsing the file format, line time points and file size of the subtitle source file to obtain a subtitle parsing result;
a recognition and judgment module, used for performing recognition and association degree calculation on the parsed subtitle through a preset semantic recognition algorithm, judging word meanings and sentence meanings, and judging sensitive information based on the word meanings and sentence meanings, which comprises:
performing sensitivity judgment on the words in the subtitle through a preset sensitive semantic recognition algorithm based on the word meanings and determining sensitive words, and performing sensitivity judgment on the sentences in the subtitle through the preset sensitive semantic recognition algorithm based on the sentence meanings and determining sensitive sentences; wherein the sensitive semantic recognition algorithm judges the sensitivity of the words or sentences to be detected according to the frequency and association degree criteria of preset words and sentences within the detected text, and thereby determines the sensitive information;
and a replacement module, used for replacing the sensitive information with near-synonyms to generate a desensitized subtitle, which comprises:
finding a near-synonym of the sensitive word in a preset word stock, replacing the sensitive word, and generating the desensitized subtitle;
and finding near-synonyms of the words in the sensitive sentence in the preset word stock, composing the near-synonyms into a sentence whose meaning is similar to that of the sensitive sentence, and generating a desensitized sentence.
7. The automatic shielding translation device for subtitle sensitive information according to claim 6, further comprising a storage module;
the storage module is used for storing the desensitized subtitle.
8. The automatic shielding translation device for subtitle sensitive information according to claim 7, further comprising a communication module;
the communication module is in communication connection with the storage module and with an external playing device respectively, and is used for sending the desensitized subtitle to the external playing device for use.
CN202110416698.5A 2021-04-19 2021-04-19 Automatic shielding translation method and device for subtitle sensitive information Active CN113033217B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110416698.5A CN113033217B (en) 2021-04-19 2021-04-19 Automatic shielding translation method and device for subtitle sensitive information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110416698.5A CN113033217B (en) 2021-04-19 2021-04-19 Automatic shielding translation method and device for subtitle sensitive information

Publications (2)

Publication Number Publication Date
CN113033217A CN113033217A (en) 2021-06-25
CN113033217B (en) 2023-09-15

Family

ID=76456778

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110416698.5A Active CN113033217B (en) 2021-04-19 2021-04-19 Automatic shielding translation method and device for subtitle sensitive information

Country Status (1)

Country Link
CN (1) CN113033217B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115544240B (en) * 2022-11-24 2023-04-07 闪捷信息科技有限公司 Text sensitive information identification method and device, electronic equipment and storage medium
CN116701614B (en) * 2023-08-02 2024-07-19 南京壹行科技有限公司 Sensitive data model building method for intelligent text collection
CN117998145B (en) * 2024-04-03 2024-06-18 海看网络科技(山东)股份有限公司 Subtitle real-time monitoring method, system and equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105183761A (en) * 2015-07-27 2015-12-23 网易传媒科技(北京)有限公司 Sensitive word replacement method and apparatus
CN108984530A (en) * 2018-07-23 2018-12-11 北京信息科技大学 A kind of detection method and detection system of network sensitive content
CN109600681A (en) * 2018-11-29 2019-04-09 南昌与德软件技术有限公司 Caption presentation method, device, terminal and storage medium
CN109740053A (en) * 2018-12-26 2019-05-10 广州灵聚信息科技有限公司 Sensitive word screen method and device based on NLP technology
CN111368535A (en) * 2018-12-26 2020-07-03 珠海金山网络游戏科技有限公司 Sensitive word recognition method, device and equipment
CN112001174A (en) * 2020-08-10 2020-11-27 深圳中兴网信科技有限公司 Text desensitization method, apparatus, electronic device and computer-readable storage medium
CN112417103A (en) * 2020-12-02 2021-02-26 百度国际科技(深圳)有限公司 Method, apparatus, device and storage medium for detecting sensitive words

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8713420B2 (en) * 2011-06-30 2014-04-29 Cable Television Laboratories, Inc. Synchronization of web applications and media

Also Published As

Publication number Publication date
CN113033217A (en) 2021-06-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant