WO2019023953A1 - Video editing method and video editing system based on a smart terminal - Google Patents


Info

Publication number
WO2019023953A1
Authority
WO
WIPO (PCT)
Prior art keywords
video, portrait, feature, character, picture
Application number
PCT/CN2017/095540
Other languages
English (en), French (fr)
Inventor
覃桐
Original Assignee
深圳传音通讯有限公司 (Shenzhen Transsion Communication Co., Ltd.)
Application filed by 深圳传音通讯有限公司 (Shenzhen Transsion Communication Co., Ltd.)
Priority to PCT/CN2017/095540
Publication of WO2019023953A1


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/44: Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs

Definitions

  • the present invention relates to the field of smart devices, and in particular, to a video editing method and a video editing system based on a smart terminal.
  • the image recognition algorithm distinguishes the back view of the target person from the front view and compares each separately, providing a video editing method that fits the user's needs and is applicable not only to video programs but also to videos from the user's daily life.
  • a video containing a specific character can be edited, as can a video containing multiple characters or one excluding a certain character; video segments containing the back view of the target person can also be provided, saving time while remaining accurate. Accuracy is further improved by letting the user filter the extracted content.
  • an object of the present invention is to provide a video editing method and a video editing system based on a smart terminal, which can produce a dedicated clip containing or excluding one or more specified characters according to the user's needs, quickly, conveniently, accurately, and with a saving of time.
  • the invention discloses a video editing method based on a smart terminal, comprising the following steps:
  • the step of acquiring a person portrait picture having a portrait element and extracting the portrait feature of the portrait element comprises:
  • extracting the body contour feature of the portrait element as a first body contour feature;
  • extracting the facial portrait feature of the portrait element as a first facial portrait feature;
  • the step of acquiring the video segments of the video to be edited that contain a character matching the portrait feature comprises:
  • the video segments, or the remaining segments of the video to be edited other than those segments, are spliced;
  • the subsequent steps include:
  • between the step of acquiring the video segments of the video to be edited that contain a character matching the portrait feature and the step of splicing the video segments or the remaining segments, the video editing method further includes:
  • pushing the video segments to the user, who filters them to remove irrelevant segments.
  • the invention also discloses a video editing system based on a smart terminal, comprising:
  • a video acquisition module, which acquires the video file to be edited and stores it in the smart terminal;
  • a portrait feature extraction module, which acquires a person portrait picture having a portrait element and extracts the portrait feature of the portrait element;
  • a video segment acquisition module, connected to the video acquisition module and the portrait feature extraction module, which acquires the segments of the video to be edited that contain a character matching the portrait feature;
  • a video splicing module, connected to the video segment acquisition module, which splices the video segments or the remaining segments of the video to be edited other than those segments.
  • the portrait feature extraction module comprises:
  • a picture acquisition unit, which acquires a person portrait picture having a portrait element and stores it in the smart terminal;
  • a portrait element identification unit, connected to the picture acquisition unit, which identifies the portrait element in the portrait picture;
  • a portrait feature extraction unit, connected to the portrait element identification unit, which extracts the body contour feature of the portrait element as a first body contour feature and the facial portrait feature of the portrait element as a first facial portrait feature.
  • the video segment obtaining module includes:
  • an element extraction unit, which splits the video to be edited into frames, obtains each frame picture, and extracts the portrait elements to be acquired in each frame, including character back-view elements and character face elements;
  • a feature extraction unit, connected to the element extraction unit, which extracts the body contour feature of a character back-view element as a second body contour feature and the facial portrait feature of a character face element as a second facial portrait feature;
  • a back-view acquisition unit, connected to the feature extraction unit, which compares the second body contour feature with the first body contour feature and, when the similarity is greater than or equal to a first similarity threshold, takes the picture corresponding to the second body contour feature as a character back-view picture;
  • a front-view acquisition unit, connected to the feature extraction unit, which compares the second facial portrait feature with the first facial portrait feature and, when the similarity is greater than or equal to a second similarity threshold, takes the picture corresponding to the second facial portrait feature as a character front-view picture;
  • a cutting unit, connected to the back-view acquisition unit and the front-view acquisition unit, which cuts the character back-view pictures and character front-view pictures from the video to be edited to form video segments.
  • the video splicing module comprises:
  • a separation unit, which separates the audio information and video information in the video segments to form an audio part and a video part;
  • a splicing unit, connected to the separation unit, which splices the audio parts and the video parts separately to form a complete audio part and a complete video part;
  • a synchronization unit, connected to the splicing unit, which synchronizes the complete audio part with the complete video part.
  • the video editing system further includes:
  • a video segment screening module, which pushes the video segments to the user so that the user can filter them and remove irrelevant segments.
  • the back view of the target person can be identified, so that video segments containing, or excluding, one or more characters can be obtained;
  • FIG. 1 is a flow chart showing a video editing method in accordance with a preferred embodiment of the present invention
  • FIG. 2 is a flow chart showing a method for extracting a portrait feature of a video editing method in accordance with a preferred embodiment of the present invention
  • FIG. 3 is a flow chart showing a method for acquiring a video clip by a video editing method in accordance with a preferred embodiment of the present invention
  • FIG. 4 is a schematic flowchart of the step of splicing the video segments, or the remaining segments of the video to be edited other than those segments, in accordance with a preferred embodiment of the present invention;
  • FIG. 5 is a schematic flow chart of a video editing method according to another preferred embodiment of the present invention.
  • FIG. 6 is a block diagram showing the structure of a video editing system in accordance with a preferred embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of a portrait feature extraction module of a video editing system in accordance with a preferred embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of a video clip acquiring module of a video editing system in accordance with a preferred embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of a video splicing module of a video editing system in accordance with a preferred embodiment of the present invention.
  • FIG. 10 is a schematic structural diagram of a system of a video editing system in accordance with another preferred embodiment of the present invention.
  • the mobile terminal can be implemented in various forms.
  • the terminal described in the present invention may include mobile terminals such as a mobile phone, a smart phone, a notebook computer, a PDA (Personal Digital Assistant), a tablet (PAD), a PMP (Portable Multimedia Player), or a navigation device, as well as fixed terminals such as a digital TV or a desktop computer.
  • FIG. 1 is a schematic flowchart of a video editing method based on a smart terminal according to a preferred embodiment of the present invention.
  • the video editing method specifically includes the following steps:
  • S100: Acquire the video file to be edited and store it in the smart terminal.
  • in order to implement the video clip, the video file to be edited must first be obtained.
  • the methods for obtaining the video file to be edited include importing a video already in the smart terminal and importing a video from outside the smart terminal and storing it in the smart terminal.
  • the imported video file must contain the target person that the user wants to edit. If the user imports the wrong video, i.e., one that does not contain the target person, subsequent image recognition will produce no result when acquiring video segments; the user is then reminded that no relevant video was obtained and asked to check whether the video file to be edited or the target portrait picture was imported incorrectly.
  • the methods for obtaining a portrait picture likewise include importing a picture already in the smart terminal and importing a picture from outside the smart terminal and storing it in the smart terminal.
  • the imported portrait picture must match the user's needs: if the user wants video segments showing the target person's front, the picture must contain the target person's facial element; if the user wants segments containing the target person's back view, the picture must contain the target person's back.
  • when there is a single target person, the portrait picture must show the target person alone; when there is more than one target person,
  • the user must provide a corresponding portrait picture for each target person, and no one other than the target persons may appear in any of the pictures, although a single picture containing only several of the target persons together is acceptable.
  • the characters in the video frames are then compared against the extracted portrait features to obtain video segments containing the corresponding characters.
  • a strategy must be established according to the user's needs, covering the distinction between the target person's front view and back view, the number of target persons, and the distinction between the target persons and other characters.
  • if the user wants segments containing only the target person's front view, only front-view pictures need to be obtained; if the user wants only back-view segments, only back-view pictures need to be obtained; if both are needed, the picture containing the target person's front view and the picture containing the target person's back view are in a logical OR relationship.
  • when there are multiple target persons, the relationship between them must be considered.
  • when the user requires all target persons to appear together, the portrait features of the target persons are in a logical AND relationship; when the user requires any one of the target persons to appear, they are in a logical OR relationship.
  • the logical relationship between target persons can also be mixed according to the user's needs, for example two of them in an AND relationship and a third in an OR relationship with those two. Take a TV drama as an example:
  • if the user wants segments in which two characters appear at the same time,
  • the relationship between the two should be logical AND, so that only pictures in which both characters appear, whether from the back or the front, are selected.
  • in addition, the number of characters extracted from a picture should be consistent with the number of target persons expected in that picture.
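  • the AND/OR selection strategy above can be sketched as follows. This is an illustrative sketch, not code from the patent; the function and parameter names are ours, and per-character match flags are assumed to come from the recognition steps described later.

```python
# Sketch: combining per-character match flags for one frame under the
# user's chosen logical relation (names are illustrative, not from the
# patent).

def frame_selected(matches, relation="and"):
    """matches: dict mapping target-person name -> bool (appears in frame).
    relation: "and" = all target persons must appear together,
              "or"  = any one of them suffices."""
    if relation == "and":
        return all(matches.values())
    return any(matches.values())

# Example: the user wants frames where characters A and B appear together.
frame_selected({"A": True, "B": True})                  # both present
frame_selected({"A": True, "B": False})                 # B missing
frame_selected({"A": True, "B": False}, relation="or")  # any one suffices
```

Mixed relations (two persons in AND, a third in OR) can be built by nesting calls over subsets of the flags.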
  • the process of obtaining the video segments containing a character matching the target person's portrait feature is as follows: the video to be edited is split into frames, and each frame picture is acquired; the portrait elements in each frame are extracted through image transform, image enhancement, image recognition, and image segmentation techniques; the portrait features are extracted from those elements by sampling and compared with the portrait features extracted from the target person's picture; if the two match, the frame is a picture containing the target person, and consecutive matching pictures form a video segment.
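  • the frame-by-frame loop above can be sketched in a few lines. This is a hedged sketch, not the patent's implementation: the frames would in practice come from a decoder (e.g. OpenCV's VideoCapture), and `matches_target` stands in for the whole recognition-and-comparison pipeline.

```python
# Sketch: collect the indices of frames whose picture matches the target
# person. `frames` is any iterable of decoded frame pictures;
# `matches_target` is a callable frame -> bool standing in for the image
# recognition and feature-comparison steps described above.

def matching_frame_indices(frames, matches_target):
    return [i for i, frame in enumerate(frames) if matches_target(frame)]
```

Runs of consecutive indices returned here correspond to the video segments that are later cut out as a whole.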
  • after the video segments containing the matching character are obtained, they need to be spliced, and the splicing should follow a definite order: chronological order; the order of the number of characters on screen, from few to many or many to few (a change from few to many characters is a change from the target person alone to the target person with others, and vice versa); or the order of the target person's proportion of the video picture, from small to large or large to small. The latter two orderings should use chronological order as a tiebreaker. Take the ordering by the target person's proportion of the picture as an example.
  • the proportion of a character in the video picture can be calculated by dividing the area of the portrait element in the picture by the area of the whole picture; this calculation should be done after each frame has been identified.
  • when a picture with a relatively high character proportion lies in a selected video segment, the segment is treated as one whole and spliced according to that ratio, regardless of the proportions of the other pictures in the segment.
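  • the proportion calculation and the ratio-based ordering above can be sketched as follows; the data layout (a `start` time and a list of per-frame ratios per segment) is our assumption for illustration, not specified by the patent.

```python
# Sketch: the character's proportion of the picture, and ordering of
# segments by the maximum proportion with chronological tiebreaking.

def portrait_ratio(portrait_area, frame_area):
    """Area of the portrait element divided by the area of the picture."""
    return portrait_area / frame_area

def splice_order(segments):
    """segments: list of dicts with 'start' (seconds) and 'ratios'
    (per-frame portrait/frame area ratios). Each segment is treated as
    one whole, represented by its maximum ratio; ties fall back to
    chronological order."""
    return sorted(segments, key=lambda s: (-max(s["ratios"]), s["start"]))
```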
  • in this way the user can quickly and accurately obtain a dedicated video containing or excluding one or more characters according to the user's needs, and video segments containing the target person's back view can be identified accurately.
  • the step of acquiring a portrait image of a person with a portrait element and extracting the portrait feature of the portrait element includes:
  • S201: Acquire a person portrait picture having a portrait element and store it in the smart terminal.
  • the methods for obtaining the portrait picture include both importing a picture already in the smart terminal and importing a picture from outside the smart terminal and storing it in the smart terminal.
  • image transforms such as the Fourier transform, the Walsh-Hadamard transform, and the discrete Karhunen-Loève transform convert the picture from the time domain to the frequency domain; image enhancement then strengthens the high-frequency abrupt components of the frequency-domain picture, enhancing the edges of the image.
  • after enhancement, image recognition is used to extract the portrait elements in the picture through feature extraction, index building, and query steps, and the portrait elements are then separated out by image segmentation.
  • the feature extraction here relies on an external portrait database: recognition models for different portrait elements are built by sampling the portrait elements in the database, so that different portrait elements can be distinguished. For example, the many facial portraits in the database
  • are sampled to build a facial portrait recognition model, and that model is used during recognition;
  • a matching portion of the picture is then considered a facial portrait element.
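  • the transform-then-enhance idea can be illustrated with a minimal frequency-domain sketch. This is our illustration of the general technique (using the Fourier transform via NumPy), not the patent's implementation; the cutoff radius and gain are arbitrary illustrative values.

```python
# Minimal sketch: move the picture to the frequency domain, amplify the
# high-frequency components (which carry the edges), and transform back.
import numpy as np

def enhance_edges(image, gain=2.0):
    """image: 2D grayscale array. Returns an edge-enhanced copy."""
    f = np.fft.fftshift(np.fft.fft2(image))
    rows, cols = image.shape
    r, c = np.ogrid[:rows, :cols]
    # Distance from the spectrum centre; far from centre == high frequency.
    dist = np.hypot(r - rows / 2, c - cols / 2)
    mask = np.where(dist > min(rows, cols) / 8, gain, 1.0)
    return np.fft.ifft2(np.fft.ifftshift(f * mask)).real
```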
  • after the portrait elements are identified, the portrait features need to be extracted.
  • back-view portraits and front-view portraits should be distinguished.
  • for a back-view portrait, the contour features of the character's back should be extracted, including the body contour
  • and the proportions of each body part, forming the first body contour feature;
  • for a front-view portrait, the facial portrait features should be extracted, including the facial skin color; the shapes, sizes, and relative positions of the facial features; and recognizable marks on the face, such as a black mole at the corner of the mouth. The face can also be uniformly sampled, and the size of the facial portrait in the picture recorded.
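  • the "uniform sampling" mentioned above can be sketched as taking a fixed grid of sample points over the face region and recording the region's size alongside them, giving a simple comparable feature. This is an illustrative sketch under our own assumptions about the data layout, not the patent's method.

```python
# Sketch: uniformly sample a face region on a grid-by-grid lattice and
# record the region's size, forming a crude facial feature vector.

def sample_face(region, grid=4):
    """region: 2D list/array of pixel values covering the face
    (grid must be >= 2). Returns (samples, (height, width))."""
    h, w = len(region), len(region[0])
    ys = [int(i * (h - 1) / (grid - 1)) for i in range(grid)]
    xs = [int(j * (w - 1) / (grid - 1)) for j in range(grid)]
    samples = [region[y][x] for y in ys for x in xs]
    return samples, (h, w)
```

Two faces sampled this way (after scaling to the same size, as described for S303/S304 below) can be compared point by point.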
  • the step of acquiring the video segments of the video to be edited that contain a character matching the portrait feature may specifically include:
  • S301: Split the video to be edited into frames, obtain each frame picture, and extract the portrait elements to be acquired in each frame, including character back-view elements and character face elements.
  • to identify the portraits of persons in the video to be edited, the video must first be split into frames to form one frame picture after another, and the portrait elements extracted from each frame.
  • first, image transforms such as the Fourier transform,
  • the Walsh-Hadamard transform, and the discrete Karhunen-Loève transform convert each picture from the time domain to the frequency domain, and image enhancement strengthens the high-frequency abrupt components of the frequency-domain picture, enhancing the image edges.
  • after enhancement, image recognition is used to extract the portrait elements through feature extraction, index building, and query steps, and image segmentation then separates out the portrait elements, which include character back-view elements and character face elements.
  • the feature extraction here again relies on the external portrait database: recognition models for different portrait elements are built by sampling the portrait elements in the database. For example, the many facial portraits in the database are sampled to build a facial portrait recognition model, which is used during recognition, and a matching portion of the picture is considered a facial portrait element.
  • the extraction method here should be consistent with the method used to extract the character's first body contour feature; after the character's face is extracted from a picture of the video to be edited, the face is sampled, and the facial skin color, the shapes of the facial features, their relative positions, and the recognizable facial marks are extracted.
  • if the face is partly occluded, for example by sunglasses, the occluding portion can be deleted: by sampling a large sunglasses database, features such as contour lines and colors are formed into a sunglasses model,
  • and the constructed sunglasses model is indexed so that the sunglasses portion can be located and removed.
  • S303: Compare the second body contour feature with the first body contour feature, and when the similarity is greater than or equal to the first similarity threshold, obtain the picture corresponding to the second body contour feature.
  • an index is built from the first body contour feature, the second body contour feature is scaled to the same size as the first, and the scaled second feature is sampled and searched against the index,
  • checking for example whether the body contour lines are consistent and whether the proportions of each body part agree. The first similarity threshold is taken as 90%; when the degree of agreement is greater than or equal to this
  • threshold, the similarity is considered greater than or equal to the first similarity threshold, the person corresponding to the second body contour feature is considered the same person as the one corresponding to the first body contour feature, and the picture corresponding to the second body contour feature is obtained as a character back-view picture.
  • the first similarity threshold can be adjusted up or down to meet a given recognition accuracy. Because back-view recognition is difficult and error-prone, a higher threshold improves precision; to avoid missing pictures when the threshold is set high, a picture whose similarity falls between 85% and 90% can be popped up for the user to decide whether to keep it, reducing omissions.
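  • the decision rule above, with its automatic-accept threshold and a borderline band that defers to the user, can be sketched as follows (function and label names are ours; the 90% and 85% values are the ones given in the text for the back-view comparison):

```python
# Sketch: three-way decision for a candidate back-view picture.
# At or above the threshold the picture is accepted automatically; in
# a borderline band just below it, the picture is shown to the user.

def back_view_decision(similarity, threshold=0.90, ask_floor=0.85):
    if similarity >= threshold:
        return "accept"
    if similarity >= ask_floor:
        return "ask_user"   # pop up the picture for the user to confirm
    return "reject"
```

The same rule applies to the front-view comparison in S304, with the thresholds shifted down accordingly.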
  • S304: Compare the second facial portrait feature with the first facial portrait feature, and when the similarity is greater than or equal to the second similarity threshold, obtain the picture corresponding to the second facial portrait feature.
  • an index is built from the first facial portrait feature, the second facial portrait feature is scaled to the same size as the first, and the scaled
  • second facial portrait feature is compared against the index, checking for example whether the facial skin color is the same, whether the shapes and relative positions of the facial features are consistent, and whether the recognizable facial marks, such as a black mole at the corner of the mouth, agree; when the degree of agreement is greater than or equal to the second similarity threshold,
  • the similarity of the second facial portrait feature to the first facial
  • portrait feature is considered greater than or equal to the second similarity threshold, the person corresponding to the second facial portrait feature is considered the same person as the one corresponding to the first, and the picture corresponding to the second facial portrait feature is obtained as a character front-view picture.
  • the second similarity threshold can likewise be adjusted up or down to meet a given recognition accuracy. Because facial portrait features are more numerous and the comparison is more accurate, the second similarity threshold is slightly lower than the first similarity threshold used for the body contour comparison.
  • a picture whose similarity falls between 80% and 85% can be popped up for the user to decide whether to keep it, reducing omissions.
  • consecutive matching frame pictures are treated as a whole and cut from the video to be edited to form a video segment.
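  • grouping consecutive matching frames into whole segments can be sketched as follows (an illustrative helper of our own, operating on the frame indices produced by the matching step):

```python
# Sketch: runs of consecutive matching frame indices become one video
# segment each, represented as (first_frame, last_frame) pairs.

def group_consecutive(indices):
    segments = []
    for i in sorted(indices):
        if segments and i == segments[-1][1] + 1:
            segments[-1][1] = i          # extend the current run
        else:
            segments.append([i, i])      # start a new run
    return [tuple(s) for s in segments]

group_consecutive([3, 4, 5, 9, 10, 20])  # three separate segments
```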
  • the step of splicing the video segments, or the remaining segments of the video to be edited other than those segments, includes:
  • S401: Separate the audio information and the video information in the video segments (or in the remaining segments of the video to be edited) to form an audio part and a video part.
  • the audio information in each segment must be separated from the video information, so that the video information contains no audio; the positional relationship between the audio information and
  • the video information must be recorded, and the audio information and video information are extracted to form an audio part and a video part.
  • the audio parts and the video parts are then spliced in order to form a complete audio part consisting entirely of audio and a complete video part consisting entirely of video.
  • finally, the complete audio part is synchronized with the complete video part according to the recorded positional relationship between audio and video, forming the final complete video.
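  • the separate-splice-synchronize flow above can be modelled schematically. This is a pure-Python sketch of the bookkeeping only (a real system would use a media framework such as ffmpeg to demux and mux streams); all names and the segment layout are our assumptions.

```python
# Sketch: separate each segment into audio and video parts while
# recording their positional relationship, splice the parts in order,
# then re-pair complete audio with complete video.

def separate(segments):
    """segments: list of dicts like {"audio": ..., "video": ...}."""
    audio = [seg["audio"] for seg in segments]
    video = [seg["video"] for seg in segments]
    positions = list(range(len(segments)))   # recorded pairing
    return audio, video, positions

def splice_and_sync(audio_parts, video_parts, positions):
    """Splice the parts in order and synchronize them using the
    recorded positions, yielding (audio, video) pairs of the final video."""
    complete_audio = [audio_parts[p] for p in positions]
    complete_video = [video_parts[p] for p in positions]
    return list(zip(complete_audio, complete_video))
```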
  • S500: Push the video segments to the user, who screens them to remove irrelevant segments.
  • by pushing the obtained video segments to the user, the user can screen them and delete any segments identified as irrelevant.
  • a smart terminal-based video editing system 100 in accordance with a preferred embodiment of the present invention specifically includes the following components:
  • to implement the video clip, the video acquisition module 11 must first obtain the video file to be edited; the methods for obtaining it include importing a video already in the smart terminal and importing a video from outside the smart terminal and storing it in the smart terminal.
  • the imported video file must contain the target person that the user wants to edit. If the user imports the wrong video, i.e., one that does not contain the target person, subsequent image recognition will produce no result when acquiring video segments; the user is then reminded that no relevant video was obtained and asked to check whether the video file to be edited or the target portrait picture was imported incorrectly.
  • the portrait feature extraction module 13 is configured to acquire a person portrait picture having a portrait element after the video to be edited has been acquired, and to extract the portrait feature of the portrait element in the picture.
  • the methods for obtaining a portrait picture include importing a picture already in the smart terminal and importing a picture from outside the smart terminal and storing it in the smart terminal.
  • the imported portrait picture must match the user's needs: if the user wants front-view video segments of the target person, the picture must contain the target person's facial element; if the user wants segments containing the target person's back view, the picture must contain the target person's back.
  • when there is a single target person, the portrait picture must show the target person alone; when there is more than one target person,
  • the user must provide a corresponding portrait picture for each target person, and no one other than the target persons may appear in any of the pictures, although a single picture containing only several of the target persons together is acceptable.
  • the video segment acquisition module 12 is connected to the video acquisition module 11 and the portrait feature extraction module 13; after the video file to be edited and the portrait picture are obtained and the portrait feature extracted, the characters in the video
  • frames are compared against the extracted portrait feature to obtain the video segments containing the corresponding characters.
  • a strategy must be established according to the user's needs, covering the distinction between the target person's front view and back view, the number of target persons, and the distinction between the target persons and other characters.
  • if the user wants segments containing only the target person's front view, only front-view pictures need to be obtained.
  • if the user wants only back-view segments, only back-view
  • pictures need to be obtained; if both are needed, the picture containing the target person's front view and the picture containing the target person's back view are in a logical OR relationship.
  • when there are multiple target persons, the relationship between them must be considered.
  • when the user requires all target persons to appear together, the portrait features of the target persons are in a logical AND relationship; when the user requires any one of them to appear, they are in a logical OR relationship.
  • the logical relationship between target persons can also be mixed according to the user's needs, for example two of them in an AND relationship and a third in an OR relationship with those two.
  • the process of obtaining the video segments containing a matching character is as follows: the video to be edited is split into frames, each frame picture is acquired, the portrait elements in each frame are extracted through image transform, image enhancement, image recognition, and image segmentation techniques, the portrait features are extracted from those elements by sampling and compared with the features extracted from the target person's picture, and if the two match, the frame is a picture containing the target person; consecutive matching pictures form a video segment.
  • the video splicing module 14 is connected to the video segment obtaining module 12, and after acquiring the video segment of the character to be clipped that matches the character of the target person, the spliced segment needs to be spliced, and the splicing should be in a certain order. It can be in chronological order or in the order of the characters in the picture from less to more or from more to less. The change of characters from small to many is the change from only the target person to other people, from more to less. Conversely, the order of the target characters in the video screen may be in the order of small to large or large to small, and the latter two sequences shall be supplemented by chronological order. For example, the order of the target person's proportion of the video screen is taken as an example.
  • the ratio of the character to the video screen can be calculated by dividing the area of the portrait element in the video screen by the area of the video screen, and calculating the ratio. It should be done after identifying each frame of the picture.
  • when a frame with a high character ratio lies in a selected video segment, that segment is treated as a whole: regardless of how high or low the ratios of the segment's other frames are, its frames are spliced in chronological order, preventing splicing purely by ratio from breaking the picture apart. This is equivalent to comparing segments by the maximum ratio among their frames and splicing them in descending order of that maximum; segments whose maximum ratios are equal are likewise spliced in chronological order.
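The ordering rule above (rank whole segments by their maximum character-to-frame area ratio, breaking ties chronologically) can be sketched as follows; the segment representation is a simplifying assumption.

```python
# Sketch of the splicing order: each segment carries its start time and the
# per-frame ratios (portrait-element area / frame area). Segments are sorted
# by the largest ratio among their frames, descending, then by start time.

def splice_order(segments):
    """segments: list of (start_time, [per-frame ratios]) tuples."""
    return sorted(segments, key=lambda s: (-max(s[1]), s[0]))

segments = [
    (30.0, [0.20, 0.35]),   # max ratio 0.35
    (10.0, [0.50, 0.10]),   # max ratio 0.50
    (20.0, [0.50, 0.40]),   # max ratio 0.50 -> tie broken by start time
]
ordered = splice_order(segments)
print([s[0] for s in ordered])  # -> [10.0, 20.0, 30.0]
```

Sorting on the tuple `(-max_ratio, start_time)` implements both the descending-ratio order and the chronological tiebreak in one pass.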
  • the portrait feature extraction module 13 specifically includes:
  • the picture acquisition unit: after the video to be edited is acquired, realizing a video clip centered on the target character requires a portrait picture containing that character's portrait element. The portrait picture may either be one already stored in the smart terminal or one imported from outside the smart terminal and stored in it.
  • the portrait element identification unit is connected to the picture acquisition unit. After a portrait picture containing a character portrait element is acquired, the portrait element must be extracted, because the picture may contain a background with interfering factors. First, an image transform such as the Fourier transform, the Walsh-Hadamard transform, or the discrete Karhunen-Loève transform converts the image from the time domain to the frequency domain; image enhancement then strengthens the high-frequency components of the frequency-domain image, sharpening the image edges.
  • after the image edges are strengthened, image recognition identifies the character portrait element in the picture through feature extraction, index building, and query steps, and image segmentation then extracts the portrait element.
  • the feature extraction operation here relies on an external portrait database: recognition models for the different portrait elements are built by sampling the portrait elements in the database, so that the different elements can be distinguished. For example, sampling a large number of facial portraits in the database builds a facial-portrait recognition model; that model is then used during recognition, and when a region of the picture is found to match the model, the region is taken to be a facial portrait element.
  • the portrait feature extraction unit is connected to the portrait element identification unit. After the portrait element is extracted, the portrait features must be extracted, distinguishing two cases. When the portrait element in the picture is the back view of the character, the body contour features of the back view should be extracted, including the body outline and the proportions of its parts, forming the first body contour feature. When the portrait element is the front portrait of the character, the facial portrait features of the front portrait should be extracted, including skin color, the size and shape of the facial features, their positional and distance relationships, and distinguishing facial marks such as a mole at the corner of the mouth; the face may also be sampled uniformly, and the area of the facial portrait within the picture recorded.
  • the video segment obtaining module 12 specifically includes:
  • the element extraction unit: to recognize the character portraits in the video to be edited, the video must first be split into individual frames and the portrait elements in each frame extracted. First, an image transform such as the Fourier transform, the Walsh-Hadamard transform, or the discrete Karhunen-Loève transform converts the image from the time domain to the frequency domain; image enhancement then strengthens the high-frequency components of the frequency-domain image, sharpening the image edges. After the edges are strengthened, image recognition identifies the portrait elements through feature extraction, index building, and query steps, and image segmentation finally extracts them. The extracted character portrait elements include both back-view elements and facial elements.
  • the feature extraction operation here relies on an external portrait database: recognition models for the different portrait elements are built by sampling the portrait elements in the database, so that the different elements can be distinguished. For example, sampling a large number of facial portraits in the database builds a facial-portrait recognition model; that model is then used during recognition, and when a region of the picture is found to match the model, the region is taken to be a facial portrait element.
  • the feature extraction unit is connected to the element extraction unit. After a character back-view element is extracted from a frame of the video to be edited, the element is sampled, and body contour features such as the body outline and the proportions of its parts are extracted to form the second body contour feature; the extraction method here should be consistent with the method used to extract the character's first body contour feature. After a character facial element is extracted from a frame, the element is sampled, and facial portrait features are extracted, including skin color, the size and shape of the facial features, their positional and distance relationships, and distinguishing facial marks such as a mole at the corner of the mouth, forming the second facial portrait feature; the extraction method here should be consistent with the method used to extract the character's first facial portrait feature.
  • the target character may be wearing sunglasses, so the sunglasses region must be deleted when extracting facial portrait features, and only the remaining facial region is considered. Deleting the sunglasses region involves sampling a large sunglasses database to form a sunglasses model from features such as contour lines and colors, building an index from that model, and querying the facial element against it; when a region matching the sunglasses model is found, that region is taken to be the sunglasses region and is deleted.
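The sunglasses-removal step above can be sketched at a very high level: regions of the facial element that match a sunglasses model are dropped, and only the remaining facial features are kept for comparison. The dict-of-labeled-regions representation and the `SUNGLASSES_MODEL` set are purely illustrative assumptions, not the patent's actual model.

```python
# Sketch of sunglasses removal: facial regions matching a (hypothetical)
# sunglasses model are deleted before facial-feature comparison.

SUNGLASSES_MODEL = {"eye_region"}   # labels the toy model matches

def remove_sunglasses(facial_features):
    """Keep only facial regions that do not match the sunglasses model."""
    return {k: v for k, v in facial_features.items() if k not in SUNGLASSES_MODEL}

face = {"eye_region": "sunglasses", "mouth": "mole", "skin": "pale"}
print(sorted(remove_sunglasses(face)))  # -> ['mouth', 'skin']
```

In a real pipeline the match would come from the contour-and-color model queried against the image, not from region labels.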
  • the back picture acquisition unit is connected to the feature extraction unit. After the second body contour feature and the first body contour feature are acquired, an index is built from the first body contour feature, the second body contour feature is scaled to the same size as the first, and the scaled second body contour feature is sampled and queried against the index, checking, for example, whether the body outlines coincide and whether the proportions of the body parts match. Taking a first similarity threshold of 90%: when the degree of coincidence is greater than or equal to this threshold, the similarity between the second and first body contour features is considered to meet the threshold, the characters corresponding to the two features are considered to be the same person, and the frame corresponding to the second body contour feature is acquired as a back-view frame of the character.
  • the standard for the first similarity threshold can be adjusted up or down, as long as a given recognition accuracy is met. Because back-view recognition is difficult and error-prone, setting a higher similarity threshold improves accuracy; to avoid missing frames when the threshold is set too high, frames whose similarity falls between 85% and 90% can be popped up for the user to decide whether to keep them, reducing omissions.
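The two-band rule above reduces to a small decision function: accept at or above the first threshold (90%), ask the user in the 85%–90% band, and reject below it. A minimal sketch (the string return values are illustrative, not part of the patent):

```python
# Sketch of the back-view similarity decision: auto-accept, user-confirm
# band, or reject, per the thresholds described in the text.

def back_view_decision(similarity, accept=0.90, ask=0.85):
    if similarity >= accept:
        return "accept"
    if similarity >= ask:
        return "ask_user"   # pop up the frame for the user to decide
    return "reject"

print(back_view_decision(0.93))  # -> accept
print(back_view_decision(0.87))  # -> ask_user
print(back_view_decision(0.70))  # -> reject
```

The front-view comparison described later follows the same shape with the thresholds shifted down to 85% and 80%.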
  • the front picture acquisition unit is connected to the feature extraction unit. After the second facial portrait feature and the first facial portrait feature are acquired, an index is built from the first facial portrait feature, the second facial portrait feature is scaled to the same size as the first, and the scaled second facial portrait feature is compared against the index: whether the skin color is the same, whether the size, shape, and positional and distance relationships of the facial features match, and whether distinguishing facial marks, such as a mole at the corner of the mouth, are the same. Taking a second similarity threshold of 85%: when the degree of coincidence is greater than or equal to this threshold, the similarity between the second and first facial portrait features is considered to meet the threshold, the characters corresponding to the two features are considered to be the same person, and the frame corresponding to the second facial portrait feature is acquired as a front-view frame of the character.
  • the standard for the second similarity threshold can likewise be adjusted up or down, as long as a given recognition accuracy is met. Because facial portraits offer more features and compare more accurately, the facial comparison threshold (the second similarity threshold) is slightly lower than the body contour comparison threshold (the first similarity threshold). As with body contours, setting a higher similarity threshold improves accuracy; to avoid missing frames, frames whose similarity falls between 80% and 85% can be popped up for the user to decide whether to keep them, reducing omissions.
  • the cutting unit is connected to the back picture acquisition unit and the front picture acquisition unit. After back-view and front-view frames of the character are acquired, the unit checks whether the adjacent frames were also acquired; when they were, the consecutive frames are treated as a whole and cut from the video to be edited to form a video segment.
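The cutting step above amounts to grouping consecutive matched frame indices into runs, each run becoming one video segment. A minimal sketch (the index-list representation is an assumption):

```python
# Sketch of the cutting unit: group sorted matched-frame indices into
# (first, last) runs of consecutive frames; each run is cut as one segment.

def group_into_segments(matched_frames):
    segments = []
    for idx in matched_frames:
        if segments and idx == segments[-1][1] + 1:
            segments[-1] = (segments[-1][0], idx)   # extend the current run
        else:
            segments.append((idx, idx))             # start a new run
    return segments

print(group_into_segments([3, 4, 5, 9, 10, 14]))
# -> [(3, 5), (9, 10), (14, 14)]
```

Treating each run as a unit is what keeps a segment intact during the ratio-based splicing described earlier.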
  • the video splicing module 14 specifically includes:
  • the separation unit: after the video segments, or the remaining segments of the video to be edited other than those segments, are acquired, the audio information in each segment must be separated from the video information (the video information here excludes the audio information). The positional relationship between the audio information and the video information must be recorded, and the audio and video information extracted to form an audio part and a video part.
  • the splicing unit: after the audio part and video part of each segment are acquired, they must be spliced in order, forming a complete audio part composed entirely of the audio parts and a complete video part composed entirely of the video parts.
  • the synchronization unit: after the complete audio part and the complete video part are acquired, they are synchronized according to the recorded positional relationship between the audio information and the video information, forming the final complete video.
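The separate-splice-synchronize pipeline above can be sketched as follows. Segments are modeled as plain dicts and positions as cumulative time offsets; this is a structural illustration only, since real media handling would use a multimedia library.

```python
# Sketch of the splicing pipeline: separate each segment's audio and video
# (recording positions), splice the parts separately, then re-pair them by
# the recorded offsets to synchronize the final video.

def splice_segments(segments):
    audio_part, video_part, positions = [], [], []
    offset = 0.0
    for seg in segments:
        positions.append(offset)          # recorded audio/video position
        audio_part.append(seg["audio"])
        video_part.append(seg["video"])
        offset += seg["duration"]
    # Synchronize: pair each video chunk with its audio chunk at its offset.
    return [{"at": pos, "audio": a, "video": v}
            for pos, a, v in zip(positions, audio_part, video_part)]

final = splice_segments([
    {"audio": "a1", "video": "v1", "duration": 2.0},
    {"audio": "a2", "video": "v2", "duration": 3.0},
])
print([c["at"] for c in final])  # -> [0.0, 2.0]
```

Recording the offsets before splicing is what makes the final synchronization step possible after the two tracks have been joined independently.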
  • the video editing system 100 further includes the following components:
  • the video segment screening module 15: after the video segments are acquired, accuracy is further improved by pushing the acquired segments to the user, who screens them and may delete irrelevant segments that were identified in error.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

Provided are a smart-terminal-based video editing method and video editing system. The video editing method comprises the following steps: acquiring a video file to be edited and storing it in the smart terminal; acquiring a portrait picture containing a character portrait element and extracting the portrait features of that element; acquiring video segments of the video to be edited that contain characters matching the portrait features; and splicing the video segments, or the remaining segments of the video to be edited other than those segments. With this technical solution, character pictures imported by the user can be combined with the user's needs to intelligently edit and form a complete video that contains, or excludes, one or more characters; an interactive interface is also provided through which the user screens the extracted video segments again, improving accuracy.

Description

一种基于智能终端的视频剪辑方法及视频剪辑*** 技术领域
本发明涉及智能设备领域,尤其涉及一种基于智能终端的视频剪辑方法及视频剪辑***。
背景技术
随着视频节目的多元蓬勃发展,电视剧、电影等视频节目成为了人们生活中不可或缺的一部分,而人们对视频节目中的情节、人物都存在一定的偏好,经常出现想看某一演员但由不想看完全剧或整个电影的情况,人们一般都会采取快进的手段,但此方法不仅浪费时间、而且容易错过故事情节。现有技术中存在对感兴趣人物的视频片段进行提取的算法,但准确度不高。
因此,本发明通过图像识别算法,将目标人物的背影与正面画面相区分,分别比对,并结合用户的需求提供一种视频剪辑的方法,不仅适用与视频节目,同样适用于用户拍摄的生活视频,可以剪辑包含某一人物的专属视频,也可剪辑同时包含某多个人物的视频,或不包含某一人物的视频,同时可以提供包含目标人物背影的视频,节省时间,并且精确度高,并通过与用户交互对提取内容进行筛选,再次提高精确度。
发明内容
为了克服上述技术缺陷,本发明的目的在于提供一种基于智能终端的视频剪辑方法及视频剪辑***,可根据用户的需求进行包含或不包含某一人物或某多个人物的专属视频剪辑,且快捷方便,精确度高,节省时间。
本发明公开了一种基于智能终端的视频剪辑方法,包括以下步骤:
获取待剪辑视频文件,并存储于所述智能终端内;
获取具有人物肖像元素的人物肖像图片,并提取所述人物肖像元素的人物肖像特征;
获取所述待剪辑视频中包含与所述人物肖像特征相符的人物的视频片段;
将所述视频片段或待剪辑视频中除所述视频片段外的剩余视频片段进行拼接。
优选地,获取具有人物肖像元素的人物肖像图片,并提取所述人物肖像元素的人物肖像特征的步骤包括:
获取具有人物肖像元素的人物肖像图片,并存储于所述智能终端内;
识别所述人物肖像图片中的人物肖像元素;
提取所述人物肖像元素的身形轮廓特征为第一身形轮廓特征,提取所述人物肖像元素的面部肖像特征为第一面部肖像特征。
优选地,获取所述待剪辑视频中包含与所述人物肖像特征相符的人物的视频片段的步骤包括:
将待剪辑视频进行拆分,获取每一帧画面,提取所述每一帧画面中的待比对人物肖像元素,包括人物背影元素与人物面部元素;
提取所述人物背影元素的身形轮廓特征为第二身形轮廓特征,提取所述人物面部元素的面部肖像特征为第二面部肖像特征;
对所述第二身形轮廓特征与所述第一身形轮廓特征进行比对,获取相似度大于等于第一相似度阈值时所述第二身形轮廓特征对应的画面为人物背影画面;
对所述第二面部肖像特征与所述第一面部肖像特征进行比对,获取相似度大于等于第二相似度阈值时所述第二面部肖像特征对应的画面为人物正面画面;
从所述待剪辑视频中剪切所述人物背影画面与所述人物正面画面,形成视频片段。
优选地,将所述视频片段或待剪辑视频中除所述视频片段外的剩余视频片段进行拼 接的步骤包括:
分离所述视频片段或待剪辑视频中除所述视频片段外的剩余视频片段中的音频信息与视频信息,形成音频部分与视频部分;
将所述音频部分与所述视频部分单独进行拼接形成完整音频部分与完整视频部分;
将所述完整音频部分与所述完整视频部分进行同步。
优选地,在获取所述待剪辑视频中包含与所述人物肖像特征相符的人物的视频片段的步骤与将所述视频片段或待剪辑视频中除所述视频片段外的剩余视频片段进行拼接的步骤之间,所述视频剪辑方法还包括:
向用户推送所述视频片段,由用户进行筛选,剔除无关的视频片段。
本发明还公开了一种基于智能终端的视频剪辑***,包括:
视频获取模块,获取待剪辑视频文件,并存储于所述智能终端内;
肖像特征提取模块,获取具有人物肖像元素的人物肖像图片,并提取所述人物肖像元素的人物肖像特征;
视频片段获取模块,与所述视频获取模块及所述肖像特征提取模块连接,获取所述待剪辑视频中包含与所述人物肖像特征相符的人物的视频片段;
视频拼接模块,与所述视频片段获取模块连接,将所述视频片段或待剪辑视频中除所述视频片段外的剩余视频片段进行拼接。
优选地,所述肖像特征提取模块包括:
图片获取单元,获取具有人物肖像元素的人物肖像图片,并存储于所述智能终端内;
肖像元素识别单元,与所述图片获取单元连接,识别所述人物肖像图片中的人物肖像元素;
肖像特征提取单元,与所述肖像元素识别单元连接,提取所述人物肖像元素的身形轮廓特征为第一身形轮廓特征,提取所述人物肖像元素的面部肖像特征为第一面部肖像特征。
优选地,所述视频片段获取模块包括:
元素提取单元,将待剪辑视频进行拆分,获取每一帧画面,提取所述每一帧画面中的待比对人物肖像元素,包括人物背影元素与人物面部元素;
特征提取单元,与所述元素提取单元连接,提取所述人物背影元素的身形轮廓特征为第二身形轮廓特征,提取所述人物面部元素的面部肖像特征为第二面部肖像特征;
背影画面获取单元,与所述特征提取单元连接,对所述第二身形轮廓特征与所述第一身形轮廓特征进行比对,获取相似度大于等于第一相似度阈值时所述第二身形轮廓特征对应的画面为人物背影画面;
正面画面获取单元,与所述特征提取单元连接,对所述第二面部肖像特征与所述第一面部肖像特征进行比对,获取相似度大于等于第二相似度阈值时所述第二面部肖像特征对应的画面为人物正面画面;
剪切单元,与所述背影画面获取单元及所述正面画面获取单元连接,从所述待剪辑视频中剪切所述人物背影画面与所述人物正面画面,形成视频片段。
优选地,所述视频拼接模块包括:
分离单元,分离所述视频片段中的音频信息与视频信息,形成音频部分与视频部分;
拼接单元,与所述分离单元连接,将所述音频部分与所述视频部分单独进行拼接形成完整音频部分与完整视频部分;
同步单元,与所述拼接单元连接,将所述完整音频部分与所述完整视频部分进行同步。
优选地,在视频片段获取模块与视频拼接模块之间,所述视频剪辑***还包括:
视频片段筛选模块,向用户推送所述视频片段,由用户进行筛选,剔除无关的视频 片段。
采用了上述技术方案后,与现有技术相比,具有以下有益效果:
1.满足用户对获取包含或不包含某一人物或某多个人物的专属视频的需求;
2.可识别目标人物的背影,获取包含或不包含某一人物或某多个人物背影的视频片段;
3.快捷方便,准确度高;
附图说明
图1为符合本发明一优选实施例中视频剪辑方法的流程示意图;
图2为符合本发明一优选实施例中,视频剪辑方法的提取人物肖像特征的方法的流程示意图;
图3为符合本发明一优选实施例中,视频剪辑方法的获取视频片段的方法的流程示意图;
图4为符合本发明一优选实施例中,视频剪辑方法的将视频片段或待剪辑视频中除视频片段外的剩余视频片段进行拼接的方法的流程示意图;
图5为符合本发明另一优选实施例中视频剪辑方法的流程示意图;
图6为符合本发明一优选实施例视频剪辑***的***结构示意图。
图7为符合本发明一优选实施例中,视频剪辑***的肖像特征提取模块的结构示意图。
图8为符合本发明一优选实施例中,视频剪辑***的视频片段获取模块的结构示意图。
图9为符合本发明一优选实施例中,视频剪辑***的视频拼接模块的结构示意图。
图10为符合本发明另一优选实施例视频剪辑***的***结构示意图。
附图标记:
100-视频剪辑***;11-视频获取模块;12-视频片段获取模块;13-肖像特征提取模块;14-视频拼接模块;15-视频片段筛选模块。
具体实施方式
这里将详细地对示例性实施例进行说明,其示例表示在附图中。下面的描述涉及附图时,除非另有表示,不同附图中的相同数字表示相同或相似的要素。以下示例性实施例中所描述的实施方式并不代表与本公开相一致的所有实施方式。相反,它们仅是与如所附权利要求书中所详述的、本公开的一些方面相一致的装置和方法的例子。
在本公开使用的术语是仅仅出于描述特定实施例的目的,而非旨在限制本公开。在本公开和所附权利要求书中所使用的单数形式的“一种”、“所述”和“该”也旨在包括多数形式,除非上下文清楚地表示其他含义。还应当理解,本文中使用的术语“和/或”是指并包含一个或多个相关联的列出项目的任何或所有可能组合。
在本发明的描述中,除非另有规定和限定,对于本领域的普通技术人员而言,可以根据具体情况理解上述术语的具体含义。
在后续的描述中,使用用于表示元件的诸如“模块”、“部件”或“单元”的后缀仅为了有利于本发明的说明,其本身并没有特定的意义。因此,“模块”与“部件”可以混合地使用。
移动终端可以以各种形式来实施。例如,本发明中描述的终端可以包括诸如移动电话、智能电话、笔记本电脑、PDA(个人数字助理)、PAD(平板电脑)、PMP(便携式多媒体播放器)、导航装置等等的移动终端以及诸如数字TV、台式计算机等等的固定终端。
参阅图1,为本发明一优选实施例中基于智能终端的视频剪辑方法的流程示意图。该实施例中,视频剪辑方法具体包括以下步骤:
S100:获取待剪辑视频文件,并存储于所述智能终端内
为了实现视频剪辑,首先必须要获取待剪辑的视频文件,获取待剪辑视频文件的方式既包括导入智能终端内的视频,也包括从智能终端外部导入视频,并存储在智能终端内。此处导入的待剪辑视频文件必须包含用户想要剪辑的目标人物,若用户导入视频错误,即不包含其想要剪辑的目标人物,则在后续图像识别获取视频片段时将会没有结果,并提醒用户未获取相关视频,请用户核对待剪辑视频文件或目标人物肖像图片是否导入错误。
S200:获取具有人物肖像元素的人物肖像图片,并提取所述人物肖像元素的人物肖像特征
获取待剪辑视频后,为了实现以目标人物为中心的视频剪辑,需要获取具有人物肖像元素的人物肖像图片,并提取图片中人物肖像元素的人物肖像特征。获取人物肖像图片的方式既包括导入智能终端内的图片,也包括从智能终端外部导入图片,并存储在智能终端内。此处导入的人物肖像图片必须与用户的需求紧密结合,如果用户需要目标人物的正面肖像视频片段,则用户提供的人物肖像图片必须包含目标人物的面部元素,如果用户需要包含目标人物背影的视频片段,则用户提供的人物肖像图片必须包含目标人物的背影,同时,当目标人物数量为1时,用户提供的人物肖像图片也必须为包含目标人物单独一人的图片;当目标人物数量大于1时,用户需提供目标人物的相应的人物肖像图片,所有图片中均不可出现除目标人物以外的其他人,但可为同时并仅包含多个目标人物的人物肖像图片。为提供结果的精确度,将建议用户从待剪辑视频中截取满足上述要求的图片作为人物肖像图片,比对结果更加准确。
S300:获取所述待剪辑视频中包含与所述人物肖像特征相符的人物的视频片段
获取待剪辑视频文件以及人物肖像图片并提取人物肖像特征后,需要根据提取的人物肖像特征对视频中的画面的人物进行比对,获取包含与提取的人物肖像特征相符的人物的视频片段。在此过程中,需要根据用户的需求建立策略,既包括目标人物正面画面与目标人物背影画面的区分,目标人物数量的区分,也包括目标人物与其他人物的区分。首先,用户需要的视频片段若为只含目标人物正面肖像的画面则只需要获取含目标人物正面肖像的画面,用户需要的视频片段若为只含目标人物背影的画面则只需要获取含目标人物背影的画面,若二者都需要,则含目标人物正面肖像的画面与含目标人物背影的画面之间应为逻辑或的关系。其次,用户的目标人物数量超过1时,则需要考虑各人物之间的关系,当用户需要各目标人物同时出现的画面时,各目标人物肖像特征之间的关系应为逻辑与,当用户需要各目标人物任一出现的画面即可时,各目标人物肖像特征之间的关系应为逻辑或,除此之外,各目标人物之间的逻辑关系均可用户需求确定,如其中两者是逻辑与关系,另一者与该两者为逻辑或的关系。以电视剧为例,若用户需要某一男配角与女主角的所有对戏集锦,则两人物应同时出现,二者之间应为逻辑与的关系,则只获取二者人物肖像均出现的画面,既包括背影也包括正面。最后,关于目标人物与其他人物的区分,若用户需要只包含目标人物的画面,则画面中提取的人物数量应与画面中目标人物数量一致。
获取待剪辑视频中包含与目标人物肖像特征相符的人物的视频片段的过程如下,对待剪辑视频进行分帧,获取每一帧画面,通过图像变换技术、图像增强技术、图像识别技术以及图像分割技术将每一帧画面中的人物肖像元素提取出来,再通过取样提取人物肖像元素中的人物肖像特征,一一与目标人物图片中提取的人物肖像元素比对,二者相一致时,则该画面即为含该目标人物的画面,相连的画面即形成一个视频片段。
在上述电视剧例子中,将人物肖像特征分为面部肖像特征与身形轮廓特征,若男配角的面部肖像特征为A1,身形轮廓特征为A2,女主角的面部肖像特征为B1,身形轮廓特征为B2,画面人物肖像元素数量为N,则只有该男配角与该女主角对戏,既包括正面 也包括背影,且不包含其他人的画面应包含的特征的逻辑关系应为(A1 orA2)and(B1 or B2)and(N=2)。
S400:将所述视频片段或待剪辑视频中除所述视频片段外的剩余视频片段进行拼接
获取待剪辑视频中包含与目标人物肖像特征相符的人物的视频片段后,需要将获取的片段进行拼接,拼接时应按照一定的顺序,既可以按照时间顺序,也可以按照画面中人物由少到多或由多到少的顺序,其中人物由少到多的变化即为仅含目标人物到含其他人的变化,由多到少与之相反,也可以按照目标人物占视频画面的比例由小到大或由大到小的顺序,后两种顺序均应辅以时间顺序。以按目标人物占视频画面的比例为标准的顺序为例,该人物占视频画面的比例可通过该人物肖像元素在视频画面中的面积除以视频画面的面积计算,记为占比率,此计算应在在识别每一帧画面后进行。当某一画面的人物占比率较高时,若该画面在某一被选取的视频片段中,则该视频片段视为一体,不论该视频片段其它画面中占比率高低,均按时间顺序拼接,防止单纯由占比率进行拼接造成画面断裂不连续。相当于用视频片段中画面的最大占比率进行比较,并按最大占比率大小顺序进行拼接,同时,占比率相同时,也按时间顺序拼接。
除了将获取的视频片段进行拼接外,当用户需求为不含某一人物或某多个人物的视频,则需要将待剪辑视频中除获取的视频片段外的剩余视频片段进行拼接,去除获取的视频片段。
具有上述配置后,用户可根据用户的需求快速准确获取包含或不包含某一人物或某多个人物的专属视频,同时可准确识别含目标人物背影的视频片段。
参阅图2,在一优选实施例中,获取具有人物肖像元素的人物肖像图片,并提取所述人物肖像元素的人物肖像特征的步骤,具体包括有:
S201:获取具有人物肖像元素的人物肖像图片,并存储于所述智能终端内
获取待剪辑视频后,为了实现以目标人物为中心的视频剪辑,需要获取具有人物肖像元素的人物肖像图片,获取人物肖像图片的方式既包括导入智能终端内的图片,也包括从智能终端外部导入图片,并存储在智能终端内。
S202:识别所述人物肖像图片中的人物肖像元素
获取具有人物肖像元素的人物肖像图片后,由于图片中可能存在具有干扰因素的背景,因此,需要将图片中的人物肖像元素提取出来,此处首先需要通过图像变换,如傅里叶变换、沃尔什-阿达玛变换以及离散卡夫纳-勒维变换将图像从时域变换到频域,再通过图像增强技术将频域图像中的高频突变分量强化,强化图像边缘,图像边缘被强化后,则需要通过图像识别技术通过提取特征、建立索引以及查询步骤识别图片中的人物肖像元素,通过图像分割技术提取人物肖像元素。此处的提取特征操作需以外部的相应人物肖像数据库为基础,通过对人物肖像数据库中肖像元素的采样建立不同肖像元素的识别模型,以区分不同的肖像元素,如对数据库中大量面部肖像的采样建立面部肖像的识别模型,在对人物肖像进行识别过程中利用该模型进行识别,当在图片中识别到与该模型一致的部分时,即认为该部分为面部肖像元素。
S203:提取所述人物肖像元素的身形轮廓特征为第一身形轮廓特征,提取所述人物肖像元素的面部肖像特征为第一面部肖像特征
提取人物肖像元素后,需要提取其中的肖像特征,此处应对人物肖像图片进行区分,当人物肖像图片中的人物肖像元素为人物背影时,应提取人物背影的身形轮廓特征,包括身体轮廓,各部分的比例等特征,形成第一身形轮廓特征;当人物肖像图片中的人物肖像元素为人物正面肖像时,应提取人物正面肖像的面部肖像特征,包括面部肤色、五官形状大小、位置距离关系以及面部具有识别意义的特征,如嘴角的黑痣等特征,也可以在面部进行均匀取样,并记录面部肖像在画面中的面积大小。
参阅图3,在一优选实施例中,获取所述待剪辑视频中包含与所述人物肖像特征相符 的人物的视频片段的步骤可具体包括:
S301:将待剪辑视频进行拆分,获取每一帧画面,提取所述每一帧画面中的待比对人物肖像元素,包括人物背影元素与人物面部元素
为了对待剪辑视频中的人物肖像进行识别,需要首先对视频进行分帧,形成一帧一帧的画面,并提取每一帧画面中的人物肖像元素,首先需要通过图像变换,如傅里叶变换、沃尔什-阿达玛变换以及离散卡夫纳-勒维变换将图像从时域变换到频域,再通过图像增强技术将频域图像中的高频突变分量强化,强化图像边缘,图像边缘被强化后,则需要通过图像识别技术通过提取特征、建立索引以及查询步骤识别图片中的人物肖像元素,最后通过图像分割技术提取人物肖像元素,提取的人物肖像元素包括人物背影元素与人物面部元素。此处的提取特征操作需以外部的相应人物肖像数据库为基础,通过对人物肖像数据库中肖像元素的采样建立不同肖像元素的识别模型,以区分不同的肖像元素,如对数据库中大量面部肖像的采样建立面部肖像的识别模型,在对人物肖像进行识别过程中利用该模型进行识别,当在图片中识别到与该模型一致的部分时,即认为该部分为面部肖像元素。
S302:提取所述人物背影元素的身形轮廓特征为第二身形轮廓特征,提取所述人物面部元素的面部肖像特征为第二面部肖像特征
在待剪辑视频的画面中提取到人物背影元素后,需对人物背影元素进行采样,提取身体轮廓、各部分的比例等身形轮廓特征形成第二身形轮廓特征,此处的提取方法应与提取人物第一身形轮廓特征中的方法保持一致;在待剪辑视频的画面中提取到人物面部元素后,需对人物面部元素进行采样,提取包括面部肤色、五官形状大小、位置距离关系以及面部具有识别意义的特征,如嘴角的黑痣等特征的面部肖像特征形成第二面部肖像特征,此处的提取方法应与提取人物第一面部肖像特征中的方法保持一致。存在目标人物有可能会戴墨镜,因此在提取面部肖像特征时需要对墨镜部分删除,只考虑剩余面部肖像部分,删除墨镜部分的步骤包括通过对大量墨镜数据库的采样形成由轮廓线条与颜色等特征构成的墨镜模型,通过墨镜模型建立索引对面部元素进行查询,当查询到与墨镜模型一致的部分时,认为该部分为墨镜部分,删除该部分。
S303:对所述第二身形轮廓特征与所述第一身形轮廓特征进行比对,获取相似度大于等于第一相似度阈值时所述第二身形轮廓特征对应的画面为人物背影画面
获取第二身形轮廓特征与第一身形轮廓特征后,根据第一身形轮廓特征建立索引,将第二身形轮廓特征缩放至与第一身形轮廓特征相同的尺寸,并根据索引对缩放后的第二身形轮廓特征进行采样查询,如身体的轮廓线条是否吻合,身体各部分的比例是否吻合等等,取第一相似度阈值为90%,当吻合度均大于等于第一相似度阈值时认为第二身形轮廓特征与第一身形轮廓特征的相似度大于等于第一相似度阈值,认为该第二身形轮廓特征对应的人物与第一身形轮廓特征对应的人物为同一人,获取该第二身形轮廓特征对应的画面,为人物背影画面。此处第一相似度阈值的标准可上下进行调整,满足一定的识别准确度均可。由于背影识别的难度较大,容易识别错误,因此,设置较高的相似度阈值有利于提高准确度,为了防止相似度阈值太高时遗漏画面,可设置相似度在85%-90%之间时弹出该画面,由用户进行选择是否获取该画面,以此减少画面的遗漏。
S304:对所述第二面部肖像特征与所述第一面部肖像特征进行比对,获取相似度大于等于第二相似度阈值时所述第二面部肖像特征对应的画面为人物正面画面
获取第二面部肖像特征与第一面部肖像特征后,根据第一面部肖像特征建立索引,将第二面部肖像特征缩放至与第一面部肖像特征相同的尺寸,并根据索引对缩放后的第二面部肖像特征进行比对,如面部肤色是否相同、五官形状大小、位置距离关系是否吻合以及面部具有识别意义的特征,如嘴角的黑痣是否相同等等,取第二相似度阈值为85%,当吻合度均大于等于第二相似度阈值时,认为第二面部肖像特征与第一面部肖像 特征的相似度大于等于第二相似度阈值,认为该第二面部肖像特征对应的人物与第一面部肖像特征对应的人物为同一人,获取该第二面部肖像特征对应的画面,为人物正面画面。此处第二相似度阈值的标准可上下进行调整,满足一定的识别准确度均可。由于面部肖像特征较多,比对更准确,因此面部肖像特征比对时阈值即第二相似度阈值比身形轮廓特征比对时阈值即第一相似度阈值稍低。同样,对于面部肖像特征的比对,设置较高的相似度阈值有利于提高准确度,为了防止相似度阈值太高时遗漏画面,可设置相似度在80%-85%之间时弹出该画面,由用户进行选择是否获取该画面,以此减少画面的遗漏。
S305:从所述待剪辑视频中剪切所述人物背影画面与所述人物正面画面,形成视频片段
获取人物背影画面与人物正面画面后,查询相邻帧的画面是否也被获取,当相邻帧的画面被获取时,将连续帧画面视为一体从待剪辑视频中剪切下来,形成视频片段。
参阅图4,在一优选实施例中,将所述视频片段或待剪辑视频中除所述视频片段外的剩余视频片段进行拼接的步骤具体包括:
S401:分离所述视频片段或待剪辑视频中除所述视频片段外的剩余视频片段中的音频信息与视频信息,形成音频部分与视频部分
获取视频片段或待剪辑视频中除所述视频片段外的剩余视频片段后,需将每一片段中的音频信息与视频信息分离,此处的视频信息不包括音频信息,同时需要记录音频信息与视频信息的位置关系,并将音频信息与视频信息提取出来,形成音频部分与视频部分。
S402:将所述音频部分与所述视频部分单独进行拼接形成完整音频部分与完整视频部分
获取每一片段的音频部分与视频部分后,需要将音频部分与视频部分单元按顺序进行拼接,形成完全由音频部分构成的完整音频部分与完全由视频部分构成的完整视频部分。
S403:将所述完整音频部分与所述完整视频部分进行同步
获取完整音频部分与完整视频部分后,根据所记录的音频信息与视频信息的位置关系,将完整音频部分与完整视频部分进行同步,形成最终完整的视频。
参阅图5,为本发明另一优选实施例中基于智能终端的视频剪辑方法的流程示意图。该实施例中,在获取所述待剪辑视频中包含与所述人物肖像特征相符的人物的视频片段的步骤与将所述视频片段或待剪辑视频中除所述视频片段外的剩余视频片段进行拼接的步骤之间,该视频剪辑方法还包括以下步骤:
S500:向用户推送所述视频片段,由用户进行筛选,剔除无关的视频片段
获取视频片段后,为进一步提高准确度,通过向用户推送获取的视频片段,由用户进行筛选,并可进行删除操作剔除识别错误的无关视频片段。
参阅图6,为符合本发明一优选实施例中基于智能终端的视频剪辑***100,其具体包括以下部件:
视频获取模块11,为了实现视频剪辑,首先必须要获取待剪辑的视频文件,获取待剪辑视频文件的方式既包括导入智能终端内的视频,也包括从智能终端外部导入视频,并存储在智能终端内。此处导入的待剪辑视频文件必须包含用户想要剪辑的目标人物,若用户导入视频错误,即不包含其想要剪辑的目标人物,则在后续图像识别获取视频片段时将会没有结果,并提醒用户未获取相关视频,请用户核对待剪辑视频文件或目标人物肖像图片是否导入错误。
肖像特征提取模块13,获取待剪辑视频后,为了实现以目标人物为中心的视频剪辑,需要获取具有人物肖像元素的人物肖像图片,并提取图片中人物肖像元素的人物肖像特 征。获取人物肖像图片的方式既包括导入智能终端内的图片,也包括从智能终端外部导入图片,并存储在智能终端内。此处导入的人物肖像图片必须与用户的需求紧密结合,如果用户需要目标人物的正面肖像视频片段,则用户提供的人物肖像图片必须包含目标人物的面部元素,如果用户需要包含目标人物背影的视频片段,则用户提供的人物肖像图片必须包含目标人物的背影,同时,当目标人物数量为1时,用户提供的人物肖像图片也必须为包含目标人物单独一人的图片;当目标人物数量大于1时,用户需提供目标人物的相应的人物肖像图片,所有图片中均不可出现除目标人物以外的其他人,但可为同时并仅包含多个目标人物的人物肖像图片。为提供结果的精确度,将建议用户从待剪辑视频中截取满足上述要求的图片作为人物肖像图片,比对结果更加准确。
视频片段获取模块12,与所述视频获取模块11及所述肖像特征提取模块13连接,获取待剪辑视频文件以及人物肖像图片并提取人物肖像特征后,需要根据提取的人物肖像特征对视频中的画面的人物进行比对,获取包含与提取的人物肖像特征相符的人物的视频片段。在此过程中,需要根据用户的需求建立策略,既包括目标人物正面画面与目标人物背影画面的区分,目标人物数量的区分,也包括目标人物与其他人物的区分。首先,用户需要的视频片段若为只含目标人物正面肖像的画面则只需要获取含目标人物正面肖像的画面,用户需要的视频片段若为只含目标人物背影的画面则只需要获取含目标人物背影的画面,若二者都需要,则含目标人物正面肖像的画面与含目标人物背影的画面之间应为逻辑或的关系。其次,用户的目标人物数量超过1时,则需要考虑各人物之间的关系,当用户需要各目标人物同时出现的画面时,各目标人物肖像特征之间的关系应为逻辑与,当用户需要各目标人物任一出现的画面即可时,各目标人物肖像特征之间的关系应为逻辑或,除此之外,各目标人物之间的逻辑关系均可用户需求确定,如其中两者是逻辑与关系,另一者与该两者为逻辑或的关系。以电视剧为例,若用户需要某一男配角与女主角的所有对戏集锦,则两人物应同时出现,二者之间应为逻辑与的关系,则只获取二者人物肖像均出现的画面,既包括背影也包括正面。最后,关于目标人物与其他人物的区分,若用户需要只包含目标人物的画面,则画面中提取的人物数量应与画面中目标人物数量一致。
获取待剪辑视频中包含与目标人物肖像特征相符的人物的视频片段的过程如下,对待剪辑视频进行分帧,获取每一帧画面,通过图像变换技术、图像增强技术、图像识别技术以及图像分割技术将每一帧画面中的人物肖像元素提取出来,再通过取样提取人物肖像元素中的人物肖像特征,一一与目标人物图片中提取的人物肖像元素比对,二者相一致时,则该画面即为含该目标人物的画面,相连的画面即形成一个视频片段。
在上述电视剧例子中,将人物肖像特征分为面部肖像特征与身形轮廓特征,若男配角的面部肖像特征为A1,身形轮廓特征为A2,女主角的面部肖像特征为B1,身形轮廓特征为B2,画面人物肖像元素数量为N,则只有该男配角与该女主角对戏,既包括正面也包括背影,且不包含其他人的画面应包含的特征的逻辑关系应为(A1 orA2)and(B1 or B2)and(N=2)。
视频拼接模块14,与所述视频片段获取模块12连接,获取待剪辑视频中包含与目标人物肖像特征相符的人物的视频片段后,需要将获取的片段进行拼接,拼接时应按照一定的顺序,既可以按照时间顺序,也可以按照画面中人物由少到多或由多到少的顺序,其中人物由少到多的变化即为仅含目标人物到含其他人的变化,由多到少与之相反,也可以按照目标人物占视频画面的比例由小到大或由大到小的顺序,后两种顺序均应辅以时间顺序。以按目标人物占视频画面的比例为标准的顺序为例,该人物占视频画面的比例可通过该人物肖像元素在视频画面中的面积除以视频画面的面积计算,记为占比率,此计算应在在识别每一帧画面后进行。当某一画面的人物占比率较高时,若该画面在某一被选取的视频片段中,则该视频片段视为一体,不论该视频片段其它画面中占比率高 低,均按时间顺序拼接,防止单纯由占比率进行拼接造成画面断裂不连续。相当于用视频片段中画面的最大占比率进行比较,并按最大占比率大小顺序进行拼接,同时,占比率相同时,也按时间顺序拼接。
除了将获取的视频片段进行拼接外,当用户需求为不含某一人物或某多个人物的视频,则需要将待剪辑视频中除获取的视频片段外的剩余视频片段进行拼接,去除获取的视频片段。
参阅图7,在一优选实施例中,肖像特征提取模块13具体包括:
图片获取单元,获取待剪辑视频后,为了实现以目标人物为中心的视频剪辑,需要获取具有人物肖像元素的人物肖像图片,获取人物肖像图片的方式既包括导入智能终端内的图片,也包括从智能终端外部导入图片,并存储在智能终端内。
肖像元素识别单元,与所述图片获取单元连接,获取具有人物肖像元素的人物肖像图片后,由于图片中可能存在具有干扰因素的背景,因此,需要将图片中的人物肖像元素提取出来,此处首先需要通过图像变换,如傅里叶变换、沃尔什-阿达玛变换以及离散卡夫纳-勒维变换将图像从时域变换到频域,再通过图像增强技术将频域图像中的高频突变分量强化,强化图像边缘,图像边缘被强化后,则需要通过图像识别技术通过提取特征、建立索引build以及查询步骤识别图片中的人物肖像元素,通过图像分割技术提取人物肖像元素。此处的提取特征操作需以外部的相应人物肖像数据库为基础,通过对人物肖像数据库中肖像元素的采样建立不同肖像元素的识别模型,以区分不同的肖像元素,如对数据库中大量面部肖像的采样建立面部肖像的识别模型,在对人物肖像进行识别过程中利用该模型进行识别,当在图片中识别到与该模型一致的部分时,即认为该部分为面部肖像元素。
肖像特征提取单元,与所述肖像元素识别单元连接,提取人物肖像元素后,需要提取其中的肖像特征,此处应对人物肖像图片进行区分,当人物肖像图片中的人物肖像元素为人物背影时,应提取人物背影的身形轮廓特征,包括身体轮廓,各部分的比例等特征,形成第一身形轮廓特征;当人物肖像图片中的人物肖像元素为人物正面肖像时,应提取人物正面肖像的面部肖像特征,包括面部肤色、五官形状大小、位置距离关系以及面部具有识别意义的特征,如嘴角的黑痣等特征,也可以在面部进行均匀取样,并记录面部肖像在画面中的面积大小。
参阅图8,一优选实施例中,视频片段获取模块12具体包括:
元素提取单元,为了对待剪辑视频中的人物肖像进行识别,需要首先对视频进行分帧,形成一帧一帧的画面,并提取每一帧画面中的人物肖像元素,首先需要通过图像变换,如傅里叶变换、沃尔什-阿达玛变换以及离散卡夫纳-勒维变换将图像从时域变换到频域,再通过图像增强技术将频域图像中的高频突变分量强化,强化图像边缘,图像边缘被强化后,则需要通过图像识别技术通过提取特征、建立索引build以及查询步骤识别图片中的人物肖像元素,最后通过图像分割技术提取人物肖像元素,提取的人物肖像元素包括人物背影元素与人物面部元素。此处的提取特征操作需以外部的相应人物肖像数据库为基础,通过对人物肖像数据库中肖像元素的采样建立不同肖像元素的识别模型,以区分不同的肖像元素,如对数据库中大量面部肖像的采样建立面部肖像的识别模型,在对人物肖像进行识别过程中利用该模型进行识别,当在图片中识别到与该模型一致的部分时,即认为该部分为面部肖像元素。
特征提取单元,与所述元素提取单元连接,在待剪辑视频的画面中提取到人物背影元素后,需对人物背影元素进行采样,提取身体轮廓、各部分的比例等身形轮廓特征形成第二身形轮廓特征,此处的提取方法应与提取人物第一身形轮廓特征中的方法保持一致;在待剪辑视频的画面中提取到人物面部元素后,需对人物面部元素进行采样,提取包括面部肤色、五官形状大小、位置距离关系以及面部具有识别意义的特征,如嘴角的 黑痣等特征的面部肖像特征形成第二面部肖像特征,此处的提取方法应与提取人物第一面部肖像特征中的方法保持一致。存在目标人物有可能会戴墨镜,因此在提取面部肖像特征时需要对墨镜部分删除,只考虑剩余面部肖像部分,删除墨镜部分的步骤包括通过对大量墨镜数据库的采样形成由轮廓线条与颜色等特征构成的墨镜模型,通过墨镜模型建立索引对面部元素进行查询,当查询到与墨镜模型一致的部分时,认为该部分为墨镜部分,删除该部分。
背影画面获取单元,与所述特征提取单元连接,获取第二身形轮廓特征与第一身形轮廓特征后,根据第一身形轮廓特征建立索引,将第二身形轮廓特征缩放至与第一身形轮廓特征相同的尺寸,并根据索引对缩放后的第二身形轮廓特征进行采样查询,如身体的轮廓线条是否吻合,身体各部分的比例是否吻合等等,取第一相似度阈值为90%,当吻合度均大于等于第一相似度阈值时认为第二身形轮廓特征与第一身形轮廓特征的相似度大于等于第一相似度阈值,认为该第二身形轮廓特征对应的人物与第一身形轮廓特征对应的人物为同一人,获取该第二身形轮廓特征对应的画面,为人物背影画面。此处第一相似度阈值的标准可上下进行调整,满足一定的识别准确度均可。由于背影识别的难度较大,容易识别错误,因此,设置较高的相似度阈值有利于提高准确度,为了防止相似度阈值太高时遗漏画面,可设置相似度在85%-90%之间时弹出该画面,由用户进行选择是否获取该画面,以此减少画面的遗漏。
正面画面获取单元,与所述特征提取单元连接,获取第二面部肖像特征与第一面部肖像特征后,根据第一面部肖像特征建立索引,将第二面部肖像特征缩放至与第一面部肖像特征相同的尺寸,并根据索引对缩放后的第二面部肖像特征进行比对,如面部肤色是否相同、五官形状大小、位置距离关系是否吻合以及面部具有识别意义的特征,如嘴角的黑痣是否相同等等,取第二相似度阈值为85%,当吻合度均大于等于第二相似度阈值时,认为第二面部肖像特征与第一面部肖像特征的相似度大于等于第二相似度阈值,认为该第二面部肖像特征对应的人物与第一面部肖像特征对应的人物为同一人,获取该第二面部肖像特征对应的画面,为人物正面画面。此处第二相似度阈值的标准可上下进行调整,满足一定的识别准确度均可。由于面部肖像特征较多,比对更准确,因此面部肖像特征比对时阈值即第二相似度阈值比身形轮廓特征比对时阈值即第一相似度阈值稍低。同样,对于面部肖像特征的比对,设置较高的相似度阈值有利于提高准确度,为了防止相似度阈值太高时遗漏画面,可设置相似度在80%-85%之间时弹出该画面,由用户进行选择是否获取该画面,以此减少画面的遗漏。
剪切单元,与所述背影画面获取单元及所述正面画面获取单元连接,获取人物背影画面与人物正面画面后,查询相邻帧的画面是否也被获取,当相邻帧的画面被获取时,将连续帧画面视为一体从待剪辑视频中剪切下来,形成视频片段。
参阅图9,在一优选实施例中,视频拼接模块14具体包括:
分离单元,获取视频片段或待剪辑视频中除所述视频片段外的剩余视频片段后,需将每一片段中的音频信息与视频信息分离,此处的视频信息不包括音频信息,同时需要记录音频信息与视频信息的位置关系,并将音频信息与视频信息提取出来,形成音频部分与视频部分。
拼接单元,获取每一片段的音频部分与视频部分后,需要将音频部分与视频部分单元按顺序进行拼接,形成完全由音频部分构成的完整音频部分与完全由视频部分构成的完整视频部分。
同步单元,获取完整音频部分与完整视频部分后,根据所记录的音频信息与视频信息的位置关系,将完整音频部分与完整视频部分进行同步,形成最终完整的视频。
参阅图10,为符合本发明另一优选实施例中基于智能终端的视频剪辑***100,在视频片段获取模块12与视频拼接模块14之间,所述视频剪辑***100还包括以下部件:
视频片段筛选模块15,获取视频片段后,为进一步提高准确度,通过向用户推送获取的视频片段,由用户进行筛选,并可进行删除操作剔除识别错误的无关视频片段。
本领域技术人员在考虑说明书及实践这里公开的发明后,将容易想到本公开的其它实施方案。本公开旨在涵盖本公开的任何变型、用途或者适应性变化,这些变型、用途或者适应性变化遵循本公开的一般性原理并包括本公开未公开的本技术领域中的公知常识或惯用技术手段。说明书和实施例仅被视为示例性的,本公开的真正范围和精神由权利要求指出。
应当注意的是,本发明的实施例有较佳的实施性,且并非对本发明作任何形式的限制,任何熟悉该领域的技术人员可能利用上述揭示的技术内容变更或修饰为等同的有效实施例,但凡未脱离本发明技术方案的内容,依据本发明的技术实质对以上实施例所作的任何修改或等同变化及修饰,均仍属于本发明技术方案的范围内。

Claims (10)

  1. 一种基于智能终端的视频剪辑方法,其特征在于,包括以下步骤:
    获取待剪辑视频文件,并存储于所述智能终端内;
    获取具有人物肖像元素的人物肖像图片,并提取所述人物肖像元素的人物肖像特征;
    获取所述待剪辑视频中包含与所述人物肖像特征相符的人物的视频片段;
    将所述视频片段或待剪辑视频中除所述视频片段外的剩余视频片段进行拼接。
  2. 如权利要求1所述的视频剪辑方法,其特征在于,
    获取具有人物肖像元素的人物肖像图片,并提取所述人物肖像元素的人物肖像特征的步骤包括:
    获取具有人物肖像元素的人物肖像图片,并存储于所述智能终端内;
    识别所述人物肖像图片中的人物肖像元素;
    提取所述人物肖像元素的身形轮廓特征为第一身形轮廓特征,提取所述人物肖像元素的面部肖像特征为第一面部肖像特征。
  3. 如权利要求2所述的视频剪辑方法,其特征在于,
    获取所述待剪辑视频中包含与所述人物肖像特征相符的人物的视频片段的步骤包括:
    将待剪辑视频进行拆分,获取每一帧画面,提取所述每一帧画面中的待比对人物肖像元素,包括人物背影元素与人物面部元素;
    提取所述人物背影元素的身形轮廓特征为第二身形轮廓特征,提取所述人物面部元素的面部肖像特征为第二面部肖像特征;
    对所述第二身形轮廓特征与所述第一身形轮廓特征进行比对,获取相似度大于等于第一相似度阈值时所述第二身形轮廓特征对应的画面为人物背影画面;
    对所述第二面部肖像特征与所述第一面部肖像特征进行比对,获取相似度大于等于第二相似度阈值时所述第二面部肖像特征对应的画面为人物正面画面;
    从所述待剪辑视频中剪切所述人物背影画面与所述人物正面画面,形成视频片段。
  4. 如权利要求1所述的视频剪辑方法,其特征在于,
    将所述视频片段或待剪辑视频中除所述视频片段外的剩余视频片段进行拼接的步骤包括:
    分离所述视频片段或待剪辑视频中除所述视频片段外的剩余视频片段中的音频信息与视频信息,形成音频部分与视频部分;
    将所述音频部分与所述视频部分单独进行拼接形成完整音频部分与完整视频部分;
    将所述完整音频部分与所述完整视频部分进行同步。
  5. 如权利要求1-4任一所述的视频剪辑方法,其特征在于,在获取所述待剪辑视频中包含与所述人物肖像特征相符的人物的视频片段的步骤与将所述视频片段或待剪辑视频中除所述视频片段外的剩余视频片段进行拼接的步骤之间,所述视频剪辑方法还包括:
    向用户推送所述视频片段,由用户进行筛选,剔除无关的视频片段。
  6. 一种基于智能终端的视频剪辑***,其特征在于,包括:
    视频获取模块,获取待剪辑视频文件,并存储于所述智能终端内;
    肖像特征提取模块,获取具有人物肖像元素的人物肖像图片,并提取所述人物肖像元素的人物肖像特征;
    视频片段获取模块,与所述视频获取模块及所述肖像特征提取模块连接,获取所述待剪辑视频中包含与所述人物肖像特征相符的人物的视频片段;
    视频拼接模块,与所述视频片段获取模块连接,将所述视频片段或待剪辑视频中除所述视频片段外的剩余视频片段进行拼接。
  7. 如权利要求6所述的视频剪辑***,其特征在于,
    所述肖像特征提取模块包括:
    图片获取单元,获取具有人物肖像元素的人物肖像图片,并存储于所述智能终端内;;
    肖像元素识别单元,与所述图片获取单元连接,识别所述人物肖像图片中的人物肖像元素;
    肖像特征提取单元,与所述肖像元素识别单元连接,提取所述人物肖像元素的身形轮廓特征为第一身形轮廓特征,提取所述人物肖像元素的面部肖像特征为第一面部肖像特征。
  8. 如权利要求7所述的视频剪辑***,其特征在于,
    所述视频片段获取模块包括:
    元素提取单元,将待剪辑视频进行拆分,获取每一帧画面,提取所述每一帧画面中的待比对人物肖像元素,包括人物背影元素与人物面部元素;
    特征提取单元,与所述元素提取单元连接,提取所述人物背影元素的身形轮廓特征为第二身形轮廓特征,提取所述人物面部元素的面部肖像特征为第二面部肖像特征;
    背影画面获取单元,与所述特征提取单元连接,对所述第二身形轮廓特征与所述第一身形轮廓特征进行比对,获取相似度大于等于第一相似度阈值时所述第二身形轮廓特征对应的画面为人物背影画面;
    正面画面获取单元,与所述特征提取单元连接,对所述第二面部肖像特征与所述第一面部肖像特征进行比对,获取相似度大于等于第二相似度阈值时所述第二面部肖像特征对应的画面为人物正面画面;
    剪切单元,与所述背影画面获取单元及所述正面画面获取单元连接,从所述待剪辑视频中剪切所述人物背影画面与所述人物正面画面,形成视频片段。
  9. 如权利要求6所述的视频剪辑***,其特征在于,
    所述视频拼接模块包括:
    分离单元,分离所述视频片段中的音频信息与视频信息,形成音频部分与视频部分;
    拼接单元,与所述分离单元连接,将所述音频部分与所述视频部分单独进行拼接形成完整音频部分与完整视频部分;
    同步单元,与所述拼接单元连接,将所述完整音频部分与所述完整视频部分进行同步。
  10. 如权利要求6-9任一所述的视频剪辑***,其特征在于,在视频片段获取模块与视频拼接模块之间,所述视频剪辑***还包括:
    视频片段筛选模块,向用户推送所述视频片段,由用户进行筛选,剔除无关的视频片段。
PCT/CN2017/095540 2017-08-02 2017-08-02 一种基于智能终端的视频剪辑方法及视频剪辑*** WO2019023953A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/095540 WO2019023953A1 (zh) 2017-08-02 2017-08-02 一种基于智能终端的视频剪辑方法及视频剪辑***

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/095540 WO2019023953A1 (zh) 2017-08-02 2017-08-02 一种基于智能终端的视频剪辑方法及视频剪辑***

Publications (1)

Publication Number Publication Date
WO2019023953A1 true WO2019023953A1 (zh) 2019-02-07

Family

ID=65232285

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/095540 WO2019023953A1 (zh) 2017-08-02 2017-08-02 一种基于智能终端的视频剪辑方法及视频剪辑***

Country Status (1)

Country Link
WO (1) WO2019023953A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111953919A (zh) * 2019-05-17 2020-11-17 成都鼎桥通信技术有限公司 视频单呼中手持终端的视频录制方法和装置

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102521565A (zh) * 2011-11-23 2012-06-27 浙江晨鹰科技有限公司 低分辨率视频的服装识别方法及***
JP2013196518A (ja) * 2012-03-21 2013-09-30 Casio Comput Co Ltd 画像処理装置、画像処理方法及びプログラム
CN103577063A (zh) * 2012-07-23 2014-02-12 Lg电子株式会社 移动终端及其控制方法
CN103827913A (zh) * 2011-09-27 2014-05-28 三星电子株式会社 用于在便携式终端中剪辑和共享内容的装置和方法
CN104820711A (zh) * 2015-05-19 2015-08-05 深圳久凌软件技术有限公司 复杂场景下对人形目标的视频检索方法
CN106021496A (zh) * 2016-05-19 2016-10-12 海信集团有限公司 视频搜索方法及视频搜索装置
CN106534967A (zh) * 2016-10-25 2017-03-22 司马大大(北京)智能***有限公司 视频剪辑方法及装置


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111953919A (zh) * 2019-05-17 2020-11-17 成都鼎桥通信技术有限公司 Video recording method and apparatus for a handheld terminal in a video individual call
CN111953919B (zh) * 2019-05-17 2022-11-04 成都鼎桥通信技术有限公司 Video recording method and apparatus for a handheld terminal in a video individual call

Similar Documents

Publication Publication Date Title
US10872416B2 (en) Object oriented image editing
Dhall et al. Emotion recognition in the wild challenge 2013
US6578040B1 (en) Method and apparatus for indexing of topics using foils
US20170147869A1 (en) Digital image processing method and apparatus, and storage medium
US10229323B2 (en) Terminal and method for managing video file
KR20090010855A (ko) System and method for classifying and storing digital content by person
WO2013060269A1 (zh) Method and apparatus for establishing an association relationship
CN105513007A (zh) Photo beautification method and system based on a mobile terminal, and mobile terminal
RU2667802C2 (ru) Method and terminal for matching images with an address book
WO2010027481A1 (en) Indexing related media from multiple sources
CN103604271A (zh) Food recognition method based on a smart refrigerator
CN104756188A (zh) Apparatus and method for changing lip shape based on automatic word translation
US20110305437A1 (en) Electronic apparatus and indexing control method
CN110110147A (zh) Video retrieval method and apparatus
TWI472936B (zh) Person photo search system
JP2006079458A (ja) Image transmission system, image transmission method, and image transmission program
CN105247606B (zh) Photo display method and user terminal
US20130308864A1 (en) Information processing apparatus, information processing method, computer program, and image display apparatus
WO2013152682A1 (zh) News video subtitle annotation method
WO2019023953A1 (zh) Video editing method and video editing system based on a smart terminal
US20110304644A1 (en) Electronic apparatus and image display method
US8494347B2 (en) Electronic apparatus and movie playback method
WO2016145827A1 (zh) Terminal control method and apparatus
CN104021215B (zh) Picture organizing method and apparatus
JP2011164949A (ja) Information processing apparatus, search condition setting method, program, and recording medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 17920352; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE
122 Ep: pct application non-entry in european phase
    Ref document number: 17920352; Country of ref document: EP; Kind code of ref document: A1