CN107071553B

CN107071553B - Method, device and computer readable storage medium for modifying video and voice

Info

Publication number: CN107071553B
Application number: CN201710411693.7A
Authority: CN
Inventors: 张声联
Original assignee: Guangdong Genius Technology Co Ltd
Current assignee: Guangdong Genius Technology Co Ltd
Priority date: 2017-06-05
Filing date: 2017-06-05
Publication date: 2020-02-07
Anticipated expiration: 2037-06-05
Also published as: CN107071553A

Abstract

The invention is applicable to the technical field of electronics, and provides a method, a device and a computer-readable storage medium for modifying video speech, wherein the method comprises the following steps: acquiring a video to be modified according to a video name input by a user, and analyzing the video to be modified; acquiring voice information to be modified, and searching for the voice information to be modified in the analyzed video to be modified; and if the voice information to be modified is found in the analyzed video to be modified, batch-replacing, according to target voice information, the voice information to be modified that appears multiple times in the video to be modified. According to the invention, the voice information to be modified is searched for in the video to be modified, so that when it is found, the correct target voice information is used to replace it in batch, which improves the efficiency of modifying video speech.

Description

Method, device and computer readable storage medium for modifying video and voice

Technical Field

The present invention relates to the field of electronic technologies, and in particular, to a method and an apparatus for modifying video speech, and a computer-readable storage medium.

Background

With the development of science and technology, video teaching has gradually become a routine part of people's study and life. However, existing teaching videos often suffer from problems such as blurred pictures, speech errors, and inaccurate knowledge points. As for speech errors, if an error occurs at a single point, the affected video region can be modified and replaced directly; but if the same speech error occurs across a large number of videos, each affected region has to be modified manually by an editor, which greatly increases the editor's workload and modification time and results in low modification efficiency.

Disclosure of Invention

In view of the above, embodiments of the present invention provide a method, an apparatus, and a computer-readable storage medium for modifying a video speech, so as to solve the problem of low efficiency in modifying a video speech error in the prior art.

A first aspect of an embodiment of the present invention provides a method for modifying video and speech, where the method includes:

acquiring a video to be modified according to a video name input by a user, and analyzing the video to be modified;

acquiring voice information to be modified, and searching the voice information to be modified in the analyzed video to be modified;

and if the voice information to be modified is found in the analyzed video to be modified, batch-replacing, according to target voice information, the voice information to be modified that appears multiple times in the video to be modified.

A second aspect of an embodiment of the present invention provides an apparatus for modifying video and audio, where the apparatus includes:

the analysis module is used for acquiring a video to be modified according to a video name input by a user and analyzing the video to be modified;

the acquisition module is used for acquiring the voice information to be modified and searching the voice information to be modified in the analyzed video to be modified;

and the first replacement module is used for batch-replacing, according to the target voice information, the voice information to be modified that appears multiple times in the video to be modified, if the voice information to be modified is found in the analyzed video to be modified.

A third aspect of an embodiment of the present invention provides an apparatus for modifying video and audio, including: a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the above method of modifying video speech when executing the computer program.

A fourth aspect of embodiments of the present invention provides a computer-readable storage medium, which stores a computer program that, when executed by a processor, implements the steps of the above-described method for modifying video speech.

Compared with the prior art, the embodiment of the invention has the following beneficial effects: according to the invention, the voice information to be modified is searched in the video to be modified, so that when the voice information to be modified is searched, the correct target voice information is adopted to replace the voice information to be modified in batch, and the modification efficiency of the video voice modification is improved.

Drawings

In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from these drawings without inventive effort.

Fig. 1 is a schematic flow chart of an implementation of a method for modifying video speech according to an embodiment of the present invention;

Fig. 2 is a schematic flow chart of another implementation of a method for modifying video speech according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of an apparatus for modifying video speech according to an embodiment of the present invention;

Fig. 4 is another schematic diagram of an apparatus for modifying video speech according to an embodiment of the present invention;

Fig. 5 is a further schematic diagram of an apparatus for modifying video speech according to an embodiment of the present invention.

Detailed Description

In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.

It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.

As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon" or "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [ described condition or event ] is detected" may be interpreted contextually to mean "upon determining" or "in response to determining" or "upon detecting [ described condition or event ]" or "in response to detecting [ described condition or event ]".

In order to explain the technical means of the present invention, the following description will be given by way of specific examples.

Referring to fig. 1, a schematic flowchart of a method for modifying video and speech according to an embodiment of the present invention is shown. As shown in fig. 1, the method for modifying video speech may include the steps of:

Step S101: Acquire the video to be modified according to the video name input by the user, and analyze the video to be modified.

In the embodiment of the present invention, a large number of videos are imported in advance into the device for modifying video speech and stored in its storage unit. The storage unit may be implemented with a non-volatile memory, such as an EPROM (Erasable Programmable Read-Only Memory), an EEPROM (Electrically Erasable Programmable Read-Only Memory), or a flash memory. When a video contains an error, the device for modifying video speech determines the video to be modified according to the video name input by the user and analyzes that video.

In a specific implementation, the way in which the user inputs the video name is not limited. For example, the user may directly type the video name of the video to be modified; or the user may operate a key of the device for modifying video speech, and the device obtains the video name from that key operation; or, after entering a video list interface, the user may perform a touch operation on the display screen of the device to select the video to be modified from the video list, and the device obtains the video name from that touch operation.
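As a rough illustration of step S101, the following Python sketch looks up a video by name in a pre-imported library and returns it in analyzed form. The embodiment does not specify what the analysis produces; the sketch assumes it yields a list of time-stamped words. The names `Word`, `ParsedVideo`, `VIDEO_LIBRARY`, and `acquire_and_parse` are hypothetical and do not come from the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video


@dataclass
class ParsedVideo:
    name: str
    words: list  # list of Word, in playback order


# Hypothetical pre-imported library: video name -> time-stamped transcript.
# A real device would store media files and run speech recognition instead.
VIDEO_LIBRARY = {
    "fourier_lecture": [
        Word("the", 0.0, 0.2),
        Word("euler", 0.2, 0.6),
        Word("theorem", 0.6, 1.1),
    ],
}


def acquire_and_parse(video_name: str) -> ParsedVideo:
    """Return the analyzed video stored under `video_name` (sketch of step S101)."""
    try:
        words = VIDEO_LIBRARY[video_name]
    except KeyError:
        raise LookupError(f"no video named {video_name!r}") from None
    return ParsedVideo(name=video_name, words=list(words))
```

Whether the name arrives from typing, a key press, or a touch selection, the lookup itself is the same; only the way `video_name` is obtained differs.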

Step S102: Acquire the voice information to be modified, and search for it in the analyzed video to be modified.

In the embodiment of the present invention, after the device for modifying video speech has acquired and analyzed the video to be modified, it can acquire the voice information to be modified from the user's input.

In a specific implementation, a sound collection device such as a microphone is integrated into the device for modifying video speech. When a user needs to modify erroneous audio information in a video, the user can simply speak the voice information to be modified. The device collects this voice information, its voice content recognition unit recognizes it, and the device then searches the analyzed video to be modified for the voice information to be modified according to the recognition result. It should be noted that the sound collection device may also be a peripheral of the device for modifying video speech; in that case, after collecting the spoken voice information to be modified, the sound collection device sends it to the device for modifying video speech.

For example, suppose the user needs to correct the voice information "Euler's theorem", which was spoken in error in a Fourier lecture video. The user inputs the voice information to be modified, that is, "Euler's theorem", to the device for modifying video speech; the device collects and recognizes it, and then searches the analyzed Fourier lecture video for the "Euler's theorem" voice information.
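The search in step S102 can be sketched as phrase matching over a time-stamped transcript. This is only an assumption about one possible realization: the embodiment does not say how recognition results are represented, and the function `find_phrase` and the `(text, start, end)` tuple layout are hypothetical.

```python
def find_phrase(words, phrase):
    """Find every occurrence of `phrase` in a time-stamped transcript.

    `words` is a list of (text, start_seconds, end_seconds) tuples in
    playback order. Returns a list of (start_seconds, end_seconds) spans,
    one per occurrence. Matching is case-insensitive and word-by-word.
    """
    target = phrase.lower().split()
    n = len(target)
    hits = []
    if n == 0:
        return hits
    for i in range(len(words) - n + 1):
        window = words[i:i + n]
        if [text.lower() for text, _, _ in window] == target:
            # The span runs from the first word's start to the last word's end.
            hits.append((window[0][1], window[-1][2]))
    return hits


# Toy transcript in which the phrase occurs twice.
transcript = [
    ("the", 0.0, 0.2), ("euler", 0.2, 0.6), ("theorem", 0.6, 1.1),
    ("states", 1.1, 1.5), ("euler", 5.0, 5.4), ("theorem", 5.4, 5.9),
]
spans = find_phrase(transcript, "Euler theorem")
```

An empty result corresponds to the case in which the voice information to be modified is not found, so no replacement is triggered.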

Step S103: If the voice information to be modified is found in the analyzed video to be modified, batch-replace, according to the target voice information, the voice information to be modified that appears multiple times in the video to be modified.

In the embodiment of the present invention, if the device for modifying video speech finds the voice information to be modified in the video to be modified, it obtains the target voice information corresponding to the voice information to be modified. The target voice information may be voice information input by the user that has a one-to-one correspondence with the voice information to be modified. Once the device has obtained the target voice information, it can use it to batch-replace the voice information to be modified that appears multiple times in the video to be modified. Erroneous speech is thus located, modified, and batch-processed automatically, which improves the quality of the teaching video to the greatest extent, saves a large amount of manual modification cost, and improves modification efficiency.

In the embodiment of the invention, the voice information to be modified is searched for in the video to be modified, so that when it is found, the correct target voice information is used to replace it in batch, which improves the efficiency of modifying video speech.

Referring to fig. 2, a schematic flow chart of another method for modifying video and speech according to an embodiment of the present invention is shown. As shown in fig. 2, the method for modifying video speech may include the steps of:

Step S201: Acquire the video to be modified according to the video name input by the user, and analyze the video to be modified.

Step S202: Acquire the voice information to be modified, and search for it in the analyzed video to be modified.

Step S203: If the voice information to be modified is found in the analyzed video to be modified, batch-replace, according to the target voice information, the voice information to be modified that appears multiple times in the video to be modified.

In the embodiment of the present invention, if the device for modifying video and sound finds the voice information to be modified in the video to be modified, the target voice information is obtained according to the voice information to be modified, where the target voice information may be the voice information which is input by the user and has a one-to-one correspondence relationship with the voice information to be modified. When the device for modifying the video sound acquires the target voice information, the target voice information can be adopted to replace the voice information to be modified which appears in the video to be modified for many times in batch.

Further, if the voice information to be modified is found in the analyzed video to be modified, the step of replacing the voice information to be modified, which appears in the video to be modified for multiple times, in batches according to the target voice information specifically includes:

determining a plurality of time nodes at which the voice information to be modified appears in the analyzed video to be modified;

and acquiring target voice information, and batch-replacing the voice information to be modified at the plurality of time nodes with the target voice information.

In the embodiment of the invention, the device for modifying video speech is provided with a voice content time positioning device and a batch processing device. When the device for modifying video speech finds the voice information to be modified in the analyzed video to be modified, the voice content time positioning device determines the time nodes at which the voice information to be modified appears in the analyzed video. There are a plurality of such time nodes; the voice information to be modified appears once at each time node, and it is the same at every time node.

After the voice content time positioning device has determined the time nodes of the voice information to be modified in the video to be modified, the device for modifying video speech acquires the target voice information and replaces the voice information to be modified at each time node. That is, the device uses the target voice information to batch-replace the voice information to be modified at the plurality of time nodes. Erroneous speech is thus located, modified, and batch-processed automatically, which improves the quality of the teaching video to the greatest extent, saves a large amount of manual modification cost, and improves modification efficiency.
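The batch replacement at a plurality of time nodes can be sketched as splicing the same target clip into an audio sample sequence at every located span. This is a minimal sketch under stated assumptions: audio is modeled as a plain Python list of samples, each time node is a `(start, end)` span in seconds, and `replace_segments` is a hypothetical name rather than a component described in the embodiment.

```python
def replace_segments(samples, sample_rate, spans, target):
    """Return a new sample list with every span replaced by `target`.

    `samples` is the original audio as a list of samples, `spans` is a list
    of (start_seconds, end_seconds) time nodes, and `target` is the list of
    samples for the correct speech. All spans receive the same target clip.
    """
    out = []
    cursor = 0  # index of the next original sample still to be copied
    for start_s, end_s in sorted(spans):
        start = round(start_s * sample_rate)
        end = round(end_s * sample_rate)
        if start < cursor:
            raise ValueError("time nodes must not overlap")
        out.extend(samples[cursor:start])  # keep audio before the error
        out.extend(target)                 # splice in the correct speech
        cursor = end                       # skip the erroneous audio
    out.extend(samples[cursor:])           # keep audio after the last error
    return out


# Toy example: 10 samples at 1 sample per second, two erroneous spans.
original = list(range(10))
fixed = replace_segments(original, 1, [(2, 4), (6, 7)], [-1, -1, -1])
```

Note that when the target clip and the replaced span differ in length, the total duration changes, so a real implementation would also need to keep the picture track aligned; the embodiment does not describe how this is handled.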

In a specific implementation, the voice content time positioning device and the batch processing device may be implemented in software or hardware, which is not limited herein. Video modification software is installed in the device for modifying video speech. When the device has determined the time nodes of the voice information to be modified and acquired the target voice information, its voice content recognition unit issues a modification instruction to the video modification software, which then replaces the voice information to be modified with the target voice information.

Further, the method for modifying the video sound further comprises the following steps:

Step S204: If the voice information to be modified is found in the analyzed video to be modified, acquire the picture information to be modified that corresponds to the voice information to be modified and appears multiple times in the video to be modified, and replace the picture information to be modified according to the target picture information.

In the embodiment of the present invention, if the device for modifying video and sound finds the voice information to be modified in the video to be modified, the video content inspection unit in the device for modifying video and sound performs content identification on the analyzed video to be modified, so as to obtain the picture information to be modified, which corresponds to the voice information to be modified and appears in the video to be modified for multiple times.

When the device for modifying the video sound acquires the picture information to be modified, the device for modifying the video sound can further acquire target picture information, and the target picture information can be stored in the device for modifying the video sound in advance. When the device for modifying the video sound acquires the target picture information, the target picture information can be adopted to replace the picture information to be modified which appears in the video to be modified for many times.

It should be noted that, in the embodiment of the present invention, the picture information to be modified refers to picture information in which text corresponding to the voice information to be modified appears in the picture content, or picture information in which the lecturer's mouth shape corresponds to the voice information to be modified. In addition, the instances of picture information to be modified that appear multiple times in the video to be modified may be partially the same, completely the same, or completely different, depending on the specific content of the video to be modified, which is not specifically limited herein.

Further, if the voice information to be modified is found in the analyzed video to be modified, acquiring the picture information to be modified which corresponds to the voice information to be modified and appears in the video to be modified for multiple times, and replacing the picture information to be modified according to the target picture information specifically comprises:

acquiring a plurality of pieces of picture information to be modified corresponding to the voice information to be modified;

determining a plurality of time nodes of a plurality of pieces of picture information to be modified in the analyzed video to be modified;

and acquiring target picture information, and replacing the plurality of to-be-modified picture information at the plurality of time nodes with the target picture information.

In the embodiment of the present invention, the obtaining of the multiple pieces of to-be-modified picture information corresponding to the to-be-modified voice information may refer to the description in step S204, which is not repeated herein.

Further, after the device for modifying video speech has obtained the picture information to be modified, the voice content time positioning device determines the time nodes at which the picture information to be modified, corresponding to the voice information to be modified, appears in the analyzed video to be modified. Specifically, having located the time nodes at which the voice information to be modified appears in the video to be modified and having obtained the picture information to be modified, the voice content time positioning device may determine the time nodes of the picture information to be modified from the time nodes of the voice information to be modified.

It should be noted that there are a plurality of time nodes, the picture information to be modified appears once at each time node, and the picture information to be modified at different time nodes may be completely the same, partially the same, or completely different, which is not limited herein.

After the voice content time positioning device has determined the time nodes of the picture information to be modified in the video to be modified, the device for modifying video speech acquires the target picture information and replaces the picture information to be modified at each time node. That is, the device uses the target picture information to replace the picture information to be modified at the plurality of time nodes, where the target picture information corresponds to the picture information to be modified and is pre-stored in the storage unit of the device for modifying video speech.
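The picture replacement can be sketched by mapping each time node to a range of frame indices and overwriting those frames. The sketch makes simplifying assumptions that the embodiment does not state: the picture time nodes are taken to coincide with the speech time nodes, a whole frame is replaced rather than a region of it, and a single target frame is used for every node. The names `frames_in_span` and `replace_frames` are hypothetical.

```python
import math


def frames_in_span(start_s, end_s, fps):
    """Return the range of frame indices that overlap [start_s, end_s)."""
    first = math.floor(start_s * fps)
    last = math.ceil(end_s * fps)  # exclusive upper bound
    return range(first, last)


def replace_frames(frames, fps, spans, target_frame):
    """Return a copy of `frames` with every frame inside a span replaced.

    `frames` is a list of frame objects in playback order, `spans` is a
    list of (start_seconds, end_seconds) time nodes, and `target_frame`
    is the pre-stored correct picture.
    """
    out = list(frames)
    for start_s, end_s in spans:
        for i in frames_in_span(start_s, end_s, fps):
            if 0 <= i < len(out):  # ignore spans that run past the video
                out[i] = target_frame
    return out


# Toy example: ten frames at 2 frames per second, two time nodes.
video_frames = [f"f{i}" for i in range(10)]
patched = replace_frames(video_frames, 2, [(1.0, 2.0), (4.0, 4.5)], "T")
```

Because the picture information at different time nodes may differ, a fuller version would accept one target picture per node instead of a single `target_frame`.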

In the embodiment of the invention, the voice information to be modified is searched for in the video to be modified, so that when it is found, the correct target voice information is used to replace it in batch. This improves the quality of the teaching video to the greatest extent, saves a large amount of manual modification cost, and improves the efficiency of modifying video speech.

In addition, the picture information to be modified can also be searched for in the video to be modified, so that when it is found, it is replaced with the correct target picture information, which improves the quality of the teaching video, saves a large amount of manual modification cost, and improves modification efficiency.

Referring to fig. 3, it is a schematic block diagram of an apparatus 3 for modifying video and speech according to an embodiment of the present invention. The modules included in the apparatus 3 for modifying video and speech according to the embodiment of the present invention are used to execute the steps in the embodiment corresponding to fig. 1, please refer to fig. 1 and the related description in the embodiment corresponding to fig. 1 specifically, which are not described herein again. The apparatus 3 for modifying video and audio provided by the embodiment of the present invention includes a parsing module 300, an obtaining module 301, and a first replacing module 302.

The parsing module 300 is configured to obtain a video to be modified according to a video name input by a user, and parse the video to be modified.

The obtaining module 301 is configured to obtain the voice information to be modified, and search the analyzed video to be modified for the voice information to be modified.

The first replacement module 302 is configured to, if the voice information to be modified is found in the analyzed video to be modified, perform batch replacement on the voice information to be modified appearing in the video to be modified for multiple times according to the target voice information.

In the embodiment of the invention, the device 3 for modifying video speech searches for the voice information to be modified in the video to be modified, so that when it is found, the correct target voice information is used to replace it in batch. This improves the quality of the teaching video to the greatest extent, saves a large amount of manual modification cost, and improves the efficiency of modifying video speech.
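The division into three modules can be sketched as a small class that wires together three interchangeable functions, one per module role. This is only an illustrative arrangement: the class `VideoSpeechModifier`, the toy functions, and the representation of a video as a list of words are all assumptions made for the sketch, and word substitution stands in for real audio replacement.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class VideoSpeechModifier:
    parse: Callable[[str], List[str]]                          # role of parsing module 300
    find: Callable[[List[str], str], List[int]]                # role of obtaining module 301
    replace: Callable[[List[str], List[int], str], List[str]]  # role of first replacement module 302

    def run(self, video_name: str, wrong: str, target: str) -> List[str]:
        parsed = self.parse(video_name)
        hits = self.find(parsed, wrong)
        if not hits:
            # Nothing found: the video is returned unchanged.
            return parsed
        return self.replace(parsed, hits, target)


# Toy stand-ins: a "video" is just a list of recognized words.
LIBRARY = {"lecture": ["the", "euler", "theorem", "euler"]}


def toy_parse(name):
    return list(LIBRARY[name])


def toy_find(words, wrong):
    return [i for i, w in enumerate(words) if w == wrong]


def toy_replace(words, hits, target):
    out = list(words)
    for i in hits:
        out[i] = target
    return out


modifier = VideoSpeechModifier(toy_parse, toy_find, toy_replace)
result = modifier.run("lecture", "euler", "fourier")
```

Keeping each role behind a function boundary mirrors the statement that the modules may be realized in software or hardware; any of the three can be swapped without touching the others.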

Referring to fig. 4, a schematic block diagram of an apparatus 4 for modifying video speech according to an embodiment of the present invention is shown. The modules included in the apparatus 4 are used to execute the steps in the embodiment corresponding to fig. 2; for details, refer to fig. 2 and the related description in the embodiment corresponding to fig. 2, which are not repeated here. The apparatus 4 includes a parsing module 400, an obtaining module 401, a first replacing module 402, and a second replacing module 403.

The parsing module 400 is configured to obtain a video to be modified according to a video name input by a user, and parse the video to be modified.

The obtaining module 401 is configured to obtain the voice information to be modified, and search the analyzed video to be modified for the voice information to be modified.

The first replacing module 402 is configured to, if the voice information to be modified is found in the analyzed video to be modified, perform batch replacement on the voice information to be modified appearing in the video to be modified for multiple times according to the target voice information.

Further, the first replacement module 402 includes a first determination unit and a first replacement unit.

The first determining unit is used for determining a plurality of time nodes of the voice information to be modified, which appear for a plurality of times in the analyzed video to be modified.

The first replacing unit is used for obtaining target voice information and replacing a plurality of voice information to be modified at a plurality of time nodes with the target voice information in batch.

The second replacing module 403 is configured to, if the voice information to be modified is found in the analyzed video to be modified, obtain picture information to be modified that corresponds to the voice information to be modified and appears in the video to be modified multiple times, and replace the picture information to be modified according to the target picture information.

Further, the second replacement module 403 includes an acquisition unit, a second determination unit, and a second replacement unit.

The acquisition unit is used for acquiring a plurality of pieces of picture information to be modified corresponding to the voice information to be modified;

the second determining unit is used for determining a plurality of time nodes of the plurality of pieces of picture information to be modified in the analyzed video to be modified;

the second replacement unit is used for acquiring the target picture information and replacing the plurality of pieces of picture information to be modified at the plurality of time nodes with the target picture information.

In the embodiment of the invention, the device 4 for modifying video speech searches for the voice information to be modified in the video to be modified, so that when it is found, the correct target voice information is used to replace it in batch. This improves the quality of the teaching video to the greatest extent, saves a large amount of manual modification cost, and improves the efficiency of modifying video speech.

In addition, the device 4 for modifying video speech can also search for the picture information to be modified in the video to be modified, so that when a plurality of pieces of picture information to be modified are found, they are replaced with the correct target picture information, which improves the quality of the teaching video, saves a large amount of manual modification cost, and improves modification efficiency.

It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.

Fig. 5 is a schematic diagram of an apparatus 5 for modifying video speech according to an embodiment of the present invention. As shown in fig. 5, the apparatus 5 of this embodiment includes: a processor 50, a memory 51, and a computer program 52 stored in the memory 51 and executable on the processor 50, such as a program implementing the method for modifying video speech. When executing the computer program 52, the processor 50 implements the steps in the above method embodiments, such as steps S101 to S103 shown in fig. 1 and steps S201 to S204 shown in fig. 2. Alternatively, when executing the computer program 52, the processor 50 implements the functions of the modules/units in the above device embodiments, such as the functions of the modules 300 to 302 shown in fig. 3 and the functions of the modules 400 to 403 shown in fig. 4.

Illustratively, the computer program 52 may be partitioned into one or more modules/units that are stored in the memory 51 and executed by the processor 50 to implement the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions for describing the execution of the computer program 52 in the modified video sound 5. For example, the computer program 52 may be divided into a synchronization module, a summary module, an acquisition module, and a return module (a module in a virtual device), and each module has the following specific functions:

The parsing module 400 is configured to obtain a video to be modified according to the video name input by the user, and parse the video to be modified.

The apparatus 5 for modifying video sound may be a desktop computer, a notebook, a palmtop computer, a cloud server, or another computing device. The apparatus 5 for modifying video sound may include, but is not limited to, a processor 50 and a memory 51. It will be understood by those skilled in the art that fig. 5 is merely an example of the apparatus 5 for modifying video sound and does not constitute a limitation of it; the apparatus may include more or fewer components than those shown, or combine some components, or use different components. For example, the apparatus 5 for modifying video sound may further include an input-output device, a network access device, a bus, etc.

The processor 50 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor or any conventional processor.

The memory 51 may be an internal storage unit of the apparatus 5 for modifying video sound, such as a hard disk or an internal memory of the apparatus 5. The memory 51 may also be an external storage device of the apparatus 5 for modifying video sound, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the apparatus 5. Further, the memory 51 may comprise both an internal storage unit and an external storage device of the apparatus 5 for modifying video sound. The memory 51 is used for storing the computer program and other programs and data required by the apparatus 5 for modifying video sound. The memory 51 may also be used to temporarily store data that has been output or is to be output.

It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.

In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.

Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

In the embodiments provided in the present invention, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. The above-described embodiments of the apparatus/terminal device are merely illustrative. For example, the division of the modules or units is only one logical division, and other divisions are possible in an actual implementation: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be in an electrical, mechanical, or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium; when the computer program is executed by a processor, the steps of the method embodiments may be implemented. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be appropriately increased or decreased as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, in accordance with legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunications signals.

The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.

Claims

1. A method for modifying video speech, the method comprising:

if the voice information to be modified is found in the analyzed video to be modified, replacing the voice information to be modified which appears in the video to be modified for multiple times in batch according to target voice information, acquiring the picture information to be modified which corresponds to the voice information to be modified and appears in the video to be modified for multiple times, and replacing the picture information to be modified according to the target picture information.

2. The method according to claim 1, wherein the performing the batch replacement of the voice information to be modified, which appears in the video to be modified for a plurality of times, according to the target voice information specifically comprises:

and acquiring target voice information, and replacing the plurality of voice information to be modified at the plurality of time nodes with the target voice information in batch.

3. The method according to claim 1, wherein the obtaining of the to-be-modified picture information corresponding to the to-be-modified voice information and appearing in the to-be-modified video multiple times, and the replacing of the to-be-modified picture information according to the target picture information specifically comprises:

determining a plurality of time nodes of the plurality of pieces of picture information to be modified in the analyzed video to be modified;

4. An apparatus for modifying video speech, the apparatus comprising:

the first replacement module is used for replacing the voice information to be modified which appears in the video to be modified for many times in batch according to target voice information if the voice information to be modified is found in the analyzed video to be modified;

and the second replacement module is used for acquiring the picture information to be modified which corresponds to the voice information to be modified and appears in the video to be modified for multiple times if the voice information to be modified is found in the analyzed video to be modified, and replacing the picture information to be modified according to the target picture information.

5. The apparatus of claim 4, wherein the first replacement module comprises:

a first determining unit, configured to determine multiple time nodes where the voice information to be modified appears multiple times in the analyzed video to be modified;

the first replacing unit is used for obtaining target voice information and replacing the voice information to be modified at the time nodes with the target voice information in batch.

6. The apparatus of claim 4, wherein the second replacement module comprises:

an acquisition unit configured to acquire a plurality of pieces of to-be-modified picture information corresponding to the voice information to be modified;

a second determining unit, configured to determine a plurality of time nodes where the plurality of pieces of picture information to be modified appear in the analyzed video to be modified;

and the second replacing unit is used for acquiring target picture information and replacing the plurality of pieces of picture information to be modified at the plurality of time nodes with the target picture information.

7. An apparatus for modifying video speech, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 3 when executing the computer program.

8. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 3.