CN107509021B

CN107509021B - Shooting method, shooting device and storage medium

Info

Publication number: CN107509021B
Application number: CN201710584382.0A
Authority: CN
Inventors: 刘昕; 廖宇
Original assignee: Migu Cultural Technology Co Ltd; MIGU Music Co Ltd
Current assignee: Migu Cultural Technology Co Ltd; MIGU Music Co Ltd
Priority date: 2017-07-18
Filing date: 2017-07-18
Publication date: 2020-08-07
Anticipated expiration: 2037-07-18
Also published as: CN107509021A

Abstract

The invention discloses a shooting method, which comprises the steps of acquiring state information of a target object in a state of acquiring an audio signal input by the target object; judging whether the state information meets shooting conditions or not; and if the state information accords with shooting conditions, controlling the camera device to shoot the multimedia aiming at the target object. The invention also discloses a shooting device and a storage medium.

Description

Shooting method, shooting device and storage medium

Technical Field

The present invention relates to the field of shooting technologies, and in particular, to a shooting method, a shooting device, and a storage medium.

Background

As living standards continue to improve, people have more and more entertainment options to choose from, and going to a karaoke room to sing is a very common one.

Recently, a small, compact and convenient type of entertainment equipment has appeared in places with heavy foot traffic, such as shopping malls, cinemas and restaurants. The equipment is a glass booth that integrates functions such as singing, listening to songs and recording songs, and its appearance is similar to that of an enclosed telephone booth, so it is called a 'mobile KTV', a 'mini song practice room' or a 'mini K song room' (hereinafter referred to as the 'mini K song room'). In a mini K song room, people can sing without being disturbed by others, and they freely show all kinds of rich expressions while singing.

However, to capture the user's expressions in an existing mini K song room, the user usually has to take out a terminal device such as a mobile phone or a camera and shoot manually. This is inefficient and cumbersome, so the user experience is poor.

Disclosure of Invention

In view of the above, embodiments of the present invention are intended to provide a shooting method, device and storage medium, which can implement automatic shooting.

In order to achieve the above purpose, the technical solution of the embodiment of the present invention is realized as follows:

the embodiment of the invention provides a shooting method, which is applied to equipment provided with a camera device and comprises the following steps:

acquiring state information of a target object in a state of acquiring an audio signal input by the target object;

judging whether the state information meets shooting conditions or not;

and if the state information accords with shooting conditions, controlling the camera device to shoot the multimedia aiming at the target object.

In the foregoing solution, before the obtaining the state information of the target object, the method further includes:

adjusting the shooting angle of at least one camera device to enable the target object to be in the shooting range of the at least one camera device; or prompting the target object to change the position if the target object is located outside the designated area of the preview image shot by the camera device.

In the above solution, the adjusting a shooting angle of at least one image capturing apparatus includes:

automatically adjusting the shooting angle of the camera device; or,

and responding to the received shooting angle adjusting instruction, and adjusting the shooting angle of the camera device.

In the foregoing solution, the obtaining of the state information of the target object includes obtaining at least one of the following information:

a pitch of the target object;

a volume of the target object;

facial expression features of the target object.

In the foregoing solution, when the state information of the target object includes a tone of the target object, the determining whether the state information meets a shooting condition includes: when the tone of the target object in the audio signal output process is larger than a tone threshold, judging that the shooting condition is met;

when the state information of the target object includes the volume of the target object, the determining whether the state information meets a shooting condition includes: when the volume of the target object in the audio signal output process is larger than a volume threshold, judging that the shooting condition is met;

when the state information of the target object includes the facial expression feature of the target object, the determining whether the state information meets a shooting condition includes: judging whether preset facial expression features with similarity larger than a similarity threshold value with the facial expression features of the target object exist in an expression library or not; if yes, judging that the shooting condition is met; the expression library is a database used for storing preset facial expression characteristics; the preset facial expression features are preset facial expression features used for representing that the state information of the target object meets shooting conditions.

In the above scheme, the method further comprises: playing preset multimedia;

and if the state information does not accord with the shooting condition, controlling the camera device to shoot the multimedia aiming at the target object when the preset multimedia playing progress reaches a preset time point.

In the above scheme, the equipment provided with the camera device is applied to a mini K song room.

In the above solution, the multimedia for the target object includes one of the following:

a picture; video;

the method further comprises the following steps:

preprocessing the multimedia aiming at the target object, and storing the preprocessed multimedia;

wherein the preprocessing comprises at least one of the following modes: generating an expression package according to the acquired multimedia; and making the acquired multimedia into a personalized video.

In the above scheme, the method further comprises:

if the state information accords with the shooting condition, controlling the equipment to record the audio signal currently input by the target object;

and synthesizing the recorded audio signal and the multimedia aiming at the target object into a video.

In the above scheme, the method further comprises:

and replacing the preset multimedia by using the synthesized video.

In the above scheme, the method further comprises:

and sending the multimedia and/or the video aiming at the target object to a specified terminal device.

An embodiment of the present invention further provides a shooting device, where the device includes an acquisition module, a judgment module and a shooting module; wherein,

the acquisition module is used for acquiring the state information of the target object in the state of acquiring the audio signal input by the target object;

the judging module is used for judging whether the state information meets shooting conditions or not;

and the shooting module is used for controlling the camera device to shoot the multimedia aiming at the target object if the state information accords with the shooting condition.

In the above scheme, the apparatus further includes an adjusting module, configured to adjust a shooting angle of at least one camera device, so that the target object is within a shooting range of the at least one camera device; or prompting the target object to change the position if the target object is located outside the designated area of the preview image shot by the camera device.

In the foregoing solution, the adjusting module is specifically configured to:

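automatically adjusting the shooting angle of the camera device; or,

responding to the received shooting angle adjusting instruction, and adjusting the shooting angle of the camera device.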

In the foregoing scheme, the obtaining module is specifically configured to: obtaining at least one of the following information:

a pitch of the target object;

a volume of the target object;

facial expression features of the target object.

In the foregoing solution, the determining module is specifically configured to: when the state information of the target object acquired by the acquisition module comprises the tone of the target object and the tone of the target object in the audio signal output process is greater than a tone threshold, judging that the shooting condition is met;

it is also specifically used for: when the state information of the target object acquired by the acquisition module comprises the volume of the target object and the volume of the target object in the audio signal output process is greater than a volume threshold, judging that the shooting condition is met;

it is also specifically used for: when the state information of the target object acquired by the acquisition module comprises the facial expression features of the target object, judging whether preset facial expression features with the similarity larger than a similarity threshold value exist in an expression library or not; if yes, judging that the shooting condition is met; the expression library is a database used for storing preset facial expression characteristics; the preset facial expression features are preset facial expression features used for representing that the state information of the target object meets shooting conditions.

In the above scheme, the apparatus further includes a preset module specifically configured to:

playing preset multimedia;

In the above solution, the apparatus further includes a preprocessing module, configured to preprocess the multimedia of the target object, and store the preprocessed multimedia; wherein the preprocessing comprises at least one of the following modes: generating an expression package according to the acquired multimedia; and making the acquired multimedia into a personalized video.

In the foregoing scheme, the preprocessing module is specifically configured to: if the state information accords with the shooting condition, controlling the equipment to record the audio signal currently input by the target object; and synthesizing the recorded audio signal and the multimedia aiming at the target object into a video.

In the above scheme, the apparatus further includes a replacing module, configured to replace the preset multimedia with the synthesized video.

In the foregoing solution, the apparatus further includes a sending module, configured to send the multimedia and/or the video for the target object to a specified terminal device.

The embodiment of the present invention further provides a storage medium, on which an executable program is stored, and the executable program implements the steps in the above technical solution when executed by a processor.

The embodiment of the invention also provides a shooting device, which comprises a memory, a processor and an executable program which is stored on the memory and can be run by the processor, wherein the processor executes the steps in the technical scheme when running the executable program.

In any of the above schemes provided by the embodiments of the present invention, the state information of the target object is acquired in a state of acquiring an audio signal input by the target object, and the state information of the target object is then evaluated. When the state information of the target object meets the shooting condition, a camera device shoots the multimedia for the target object. Therefore, throughout the shooting process, the user does not need to take out a terminal device such as a mobile phone or a camera and shoot manually, and automatic shooting of the target object (the user) is realized. Compared with the prior art, this scheme is more efficient, and because the user does not need to take out a device and shoot manually, the shooting process becomes more convenient.

Drawings

Fig. 1 is a schematic flow chart of an implementation of a shooting method according to an embodiment of the present invention;

fig. 2 is a detailed flowchart of a shooting method according to an embodiment of the present invention;

fig. 3 is a schematic structural diagram of a shooting device according to an embodiment of the present invention;

fig. 4 is a schematic diagram of a hardware structure of the shooting device according to the embodiment of the present invention.

Detailed Description

Example I,

In the embodiment of the present invention, an implementation flow diagram of a shooting method is shown in fig. 1, and the implementation flow diagram includes the following steps:

step 101: acquiring state information of a target object in a state of acquiring an audio signal input by the target object;

step 102: judging whether the state information meets shooting conditions or not;

step 103: and if the state information accords with shooting conditions, controlling the camera device to shoot the multimedia aiming at the target object.
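As a minimal sketch of how steps 101 to 103 fit together, the flow can be written as one function that acquires the state, judges it and shoots only when the condition holds. The names `StateInfo`, `shooting_step`, `acquire`, `meets_condition` and `shoot` are hypothetical and do not come from the patent; the callbacks stand in for whatever the multimedia processing system actually provides.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class StateInfo:
    """State information of the target object (hypothetical container)."""
    pitch: Optional[float] = None              # pitch estimate, unit chosen by the caller
    volume: Optional[float] = None             # volume estimate, unit chosen by the caller
    expression: Optional[List[float]] = None   # facial expression feature vector


def shooting_step(acquire: Callable[[], StateInfo],
                  meets_condition: Callable[[StateInfo], bool],
                  shoot: Callable[[], str]) -> Optional[str]:
    """Run steps 101-103 once and return the captured media, or None."""
    state = acquire()               # step 101: acquire state information
    if meets_condition(state):      # step 102: judge against the shooting condition
        return shoot()              # step 103: control the camera device to shoot
    return None
```

In a real system the function would presumably be called repeatedly while the audio signal is being acquired; the patent does not specify a polling interval.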

Here, the shooting method is applied to a device provided with a camera device. The device provided with the camera device can be applied to a mini K song room.

Here, the target object is a user who, while outputting audio, is shot by the camera device under the control of the multimedia processing system. For example, while a user sings in a mini K song room, the karaoke system controls a camera to shoot pictures or videos. Further, when the target object outputs an audio signal, the multimedia processing system automatically acquires the audio signal, and the multimedia processing system is then in the state of acquiring the audio signal input by the target object. For example, when a target object starts singing in a mini K song room, the karaoke system automatically acquires the sound signal of the target object. Before the obtaining of the state information of the target object, the method further includes: adjusting the shooting angle of at least one camera device so that the target object is within the shooting range of the at least one camera device; or prompting the target object to change position if the target object is located outside the designated area of the preview image shot by the camera device. Specifically, when the target object is within the shooting range of the camera device but the shooting angle is poor and the shooting range needs to be adjusted, the shooting angle of the camera device is adjusted so that the target object is within the shooting range of at least one camera device; and when the target object is located outside the designated area of the preview image shot by the camera device, prompt information appears to prompt the target object to change position, so that the target object is within the shooting range of at least one camera device. Adjusting the shooting angle of at least one camera device includes: automatically adjusting the shooting angle of the camera device; or adjusting the shooting angle of the camera device in response to a received shooting angle adjusting instruction.

Here, the acquiring the state information of the target object includes acquiring at least one of: a pitch of the target object; a volume of the target object; facial expression features of the target object. The pitch (also referred to as the tone) of the target object is obtained by the multimedia processing system through voice pitch recognition after the audio signal of the target object is obtained; the volume of the target object is obtained by the multimedia processing system through voice volume recognition after the audio signal of the target object is obtained; the facial expression features of the target object are monitored and acquired by the multimedia processing system through the camera device in its working state.
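The patent does not say how pitch or volume recognition is performed. As one possible illustration, and purely as an assumption, the volume of a short audio frame can be taken as its RMS level in decibels relative to full scale, and the pitch as the lag of the largest autocorrelation peak. The function names and the 80 to 1000 Hz search range are hypothetical choices for this sketch.

```python
import math


def rms_level_db(samples):
    """RMS level of a frame of floats in [-1, 1], in dB relative to full scale."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")          # digital silence
    return 10.0 * math.log10(mean_square)


def estimate_pitch_hz(samples, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate the fundamental frequency from the strongest autocorrelation lag.

    Returns None when no positive correlation is found (for example, silence).
    """
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(samples) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None
```

A level in dB relative to full scale is not a sound pressure level, so comparing it with a threshold stated in dB would require microphone calibration that this sketch does not attempt.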

Further, when the state information of the target object includes a tone of the target object, the determining whether the state information meets a shooting condition includes: when the tone of the target object in the audio signal output process is larger than a tone threshold, judging that the shooting condition is met;

Furthermore, the method further comprises: playing preset multimedia; and if the state information does not accord with the shooting condition, controlling the camera device to shoot the multimedia aiming at the target object when the preset multimedia playing progress reaches a preset time point. Wherein the preset multimedia is the existing multimedia. Optionally, at least one preset time point may be preset in the preset multimedia. The reason why the preset time point is set in the multimedia will be described later, and will not be described herein.

Further, the multimedia for the target object includes one of the following: a picture; video; specifically, the target object may be subjected to picture shooting in a state of acquiring an audio signal input by the target object; alternatively, the target object may be subjected to video recording by the camera device in a state where an audio signal input by the target object is captured.

Further, the method further comprises: preprocessing the multimedia for the target object, and storing the preprocessed multimedia; wherein the preprocessing comprises at least one of the following modes: generating an expression package according to the acquired multimedia; and making the acquired multimedia into a personalized video. An expression package is a group of pictures used to express specific emotions; it is made by taking captured pictures of the target object as material and adding text, symbols and the like that match the pictures. A personalized video takes captured pictures of the target object as material, and audio, text and the pictures are combined using synthesis software to produce a video that meets the producer's intent. Further, the method further comprises: if the state information meets the shooting condition, controlling the device to record the audio signal currently input by the target object; and synthesizing the recorded audio signal and the multimedia for the target object into a video. Further, the method further comprises: replacing the preset multimedia with the synthesized video. Further, the method further comprises: sending the multimedia for the target object and/or the video to a specified terminal device.

According to the scheme provided by the embodiment, the state information of the target object is acquired in the state of acquiring the audio signal input by the target object, whether the state information of the target object meets the shooting condition or not is judged, and when the state information of the target object is judged to meet the shooting condition, the camera device is controlled to shoot the multimedia aiming at the target object; or, under the condition that it is determined that the state information of the target object does not meet the shooting condition, when the preset playing progress of the multimedia reaches a preset time point, controlling the camera device to shoot the multimedia aiming at the target object. In this way, automatic photographing of the target object is achieved.

Example II,

In the embodiment of the present invention, an implementation flow of a shooting method is shown in fig. 1, and includes the following steps:

In step 101 of the embodiment of the present invention, at least one camera device is disposed in the space where the target object is located, and the camera device is connected to a multimedia processing system in a wired or wireless manner. Specifically, the camera device may be connected to the multimedia processing system wirelessly, for example over a wireless local area network (WLAN), Bluetooth or Near Field Communication (NFC), or may be connected to the multimedia processing system via a wired network.

Here, the shooting method is applied to a device provided with a camera device. The device provided with the camera device is applied to a mini K song room.

Here, the acquiring the state information of the target object includes acquiring at least one of: a pitch of the target object; a volume of the target object; facial expression features of the target object. The multimedia includes images, video and the like. In addition, before the obtaining of the state information of the target object, the method further includes: adjusting the shooting angle of at least one camera device so that the target object is within the shooting range of the at least one camera device; or prompting the target object to change position if the target object is located outside the designated area of the preview image shot by the camera device. Specifically, when the target object is within the shooting range of the camera device but the shooting angle is poor and the shooting range needs to be adjusted, the shooting angle of the camera device is adjusted so that the target object is within the shooting range of at least one camera device; and when the target object is located outside the designated area of the preview image shot by the camera device, prompt information appears in the multimedia processing system to prompt the target object to change position, so that the target object is within the shooting range of at least one camera device.

Wherein, the shooting angle of adjusting at least one camera device includes: the multimedia processing system automatically adjusts the shooting angle of the camera device; or responding to the received shooting angle adjusting instruction, and adjusting the shooting angle of the camera device. Furthermore, the shooting angle of the camera device can be adjusted in at least one of the following ways: the multimedia processing system automatically adjusts the angle of the camera device; or after receiving the adjusting instruction, adjusting the shooting angle on a display screen of the multimedia processing system; alternatively, after receiving the adjustment instruction, the imaging angle is adjusted directly on the imaging device. Specifically, the multimedia processing system sends the multimedia shot by the camera device to the terminal equipment, the target object can obtain shot preview information in a screen of the terminal equipment, and the multimedia processing system can automatically adjust a shooting angle according to the preview information; or after receiving the adjustment instruction, the camera device directly adjusts the shooting angle according to the preview information displayed in the screen of the terminal equipment; or, according to the preview information displayed in the screen of the terminal equipment, the multimedia processing system adjusts the shooting angle of the camera device according to the received adjusting instruction.

In addition, after the shooting angle of the camera device is adjusted, the terminal equipment can close the preview state; alternatively, the preview state may remain open.

In step 102, the multimedia processing system determines whether the state information meets a shooting condition. Specifically, the determination that the state information meets the shooting condition may be performed in at least one of the following manners:

when the state information of the target object comprises the tone of the target object, the multimedia processing system judges whether the state information meets the shooting condition, and the method comprises the following steps: when the tone of the target object in the audio signal output process is larger than a tone threshold, judging that the shooting condition is met; when the state information of the target object comprises the volume of the target object, the multimedia processing system judges whether the state information meets the shooting condition, and the method comprises the following steps: when the volume of the target object in the audio signal output process is larger than a volume threshold, judging that the shooting condition is met;

when the state information of the target object comprises the facial expression features of the target object, the multimedia processing system judges whether the state information meets the shooting condition, and the method comprises the following steps: judging whether preset facial expression features with similarity larger than a similarity threshold value with the facial expression features of the target object exist in an expression library or not; if yes, judging that the shooting condition is met; the expression library is a database used for storing preset facial expression characteristics; the preset facial expression features are preset facial expression features used for representing that the state information of the target object meets shooting conditions.
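The expression library lookup described above can be sketched as a nearest-match search over stored feature vectors. The patent does not name a similarity measure, so cosine similarity is used here as an assumption; the function names, the dictionary layout of the library and the default threshold of 0.8 are likewise illustrative only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_expression(features, library, threshold=0.8):
    """Return the name of the best-matching preset expression, or None.

    `library` maps an expression name to its preset feature vector. A match
    counts only if its similarity is strictly greater than the threshold.
    """
    best_name, best_sim = None, threshold
    for name, preset in library.items():
        sim = cosine_similarity(features, preset)
        if sim > best_sim:
            best_name, best_sim = name, sim
    return best_name
```

The shooting condition is then met whenever `match_expression` returns a name rather than `None`.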

Of course, whether the state information meets the shooting condition may be determined by two or more of the above-described modes. For example, when the state information of the target object includes a tone and a volume of the target object, the multimedia processing system determines whether the state information meets a photographing condition, including: and when the volume of the target object in the audio signal output process is larger than a volume threshold and the tone of the target object in the audio signal output process is larger than a tone threshold, judging that the shooting condition is met.

Specifically, the pitch and volume of the voice signal output by the target object may change continuously during the audio signal output process, and at a higher pitch the target object may show distinct state information, such as rich facial expressions and various body movements. Therefore, when the pitch of the target object during the audio signal output process is greater than the pitch threshold, it is determined that the shooting condition is met. The pitch threshold is an empirical value, and a certain pitch value may be set as the pitch threshold, such as 2000 mels. Alternatively, when the volume of the target object changes noticeably during the audio signal output process and is greater than a volume threshold, it is determined that the shooting condition is met. The volume threshold is an empirical value, and a certain volume value may be set as the volume threshold, for example 40 dB. Alternatively, it is determined whether a preset facial expression feature whose similarity to the facial expression feature of the target object is greater than a similarity threshold exists in the expression library; if so, it is determined that the shooting condition is met. The similarity threshold is an empirical value and may be set, for example, to 80%. The preset expressions may be anxious expressions, happy expressions, ferocious expressions and the like.
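The threshold checks can be illustrated with a short sketch. Reading the pitch example as a value on the mel scale is an assumption, as is the use of the common conversion m = 2595 log10(1 + f/700) from hertz to mels. The constant and function names are hypothetical. Every criterion that is supplied must pass, which covers both the single-criterion cases and the combined pitch-and-volume case.

```python
import math

PITCH_THRESHOLD_MEL = 2000.0   # example value from the text, read as mels (assumption)
VOLUME_THRESHOLD_DB = 40.0     # example value from the text


def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to mels using a widely used approximation."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def meets_shooting_condition(pitch_hz=None, volume_db=None):
    """Return True if every supplied measurement exceeds its threshold.

    Measurements left as None are treated as not part of the state
    information. With no measurements at all the condition is not met.
    """
    checks = []
    if pitch_hz is not None:
        checks.append(hz_to_mel(pitch_hz) > PITCH_THRESHOLD_MEL)
    if volume_db is not None:
        checks.append(volume_db > VOLUME_THRESHOLD_DB)
    return bool(checks) and all(checks)
```

Because both thresholds are described as empirical values, a deployment would presumably tune them rather than hard-code them as done here.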

Furthermore, the method further comprises: playing preset multimedia; and if the state information does not meet the shooting condition, controlling the camera device to shoot the multimedia for the target object when the playing progress of the preset multimedia reaches a preset time point. The preset multimedia is existing multimedia, and each preset multimedia item is configured based on statistics of its historical information so that it includes at least one preset time point. The preset time point may be a time point at which a smiling expression, a sad expression or the like appears.
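The fallback to preset time points can be sketched as a small decision function. The function name, the tolerance window around each time point and the set used to avoid shooting twice at the same point are assumptions made for this illustration; the patent only states that shooting occurs when the playing progress reaches a preset time point.

```python
def shooting_trigger(condition_met, playback_pos_s, preset_points_s,
                     used_points, tolerance_s=0.5):
    """Decide whether to shoot now and report why.

    Returns "state" when the state information meets the shooting condition,
    "preset_time" when playback is within `tolerance_s` seconds of a preset
    time point that has not been used yet, and None otherwise.
    `used_points` is a set that is updated in place.
    """
    if condition_met:
        return "state"
    for point in preset_points_s:
        if point not in used_points and abs(playback_pos_s - point) <= tolerance_s:
            used_points.add(point)   # shoot at most once per preset time point
            return "preset_time"
    return None
```

For example, with a single preset point at 30 seconds, the function returns "preset_time" once when playback passes that point and `None` on later calls near the same position.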

In step 103, if the state information matches the shooting condition, the imaging device is controlled to shoot the multimedia aiming at the target object. Wherein the multimedia for the target object comprises one of: a picture; video; specifically, the target object may be subjected to picture shooting in a state of acquiring an audio signal input by the target object; alternatively, the target object may be subjected to video recording by the camera device in a state where an audio signal input by the target object is captured.

Here, after the multimedia processing system controls the camera device to shoot the multimedia for the target object, the multimedia for the target object is preprocessed, and the preprocessed multimedia is stored; wherein the preprocessing comprises at least one of the following modes: generating an expression package according to the acquired multimedia; or performing image processing and sound effect processing on the acquired photo or video of the target object; or making the acquired multimedia into a personalized video. Specifically, the multimedia processing system controls the device to record the audio signal currently input by the target object, and synthesizes the recorded audio signal and the multimedia for the target object into a video. Further, the multimedia processing system replaces the preset multimedia with the synthesized video. Further, the multimedia processing system sends the multimedia for the target object and/or the video to a specified terminal device.

Example III,

In the following, the shooting method of the embodiment of the present invention is described in further detail, taking as an example the capture of pictures or the recording of video of target object A while target object A sings in the mini K song room.

In the embodiment of the present invention, a detailed flow diagram of the shooting method is shown in fig. 2, and the method includes the following steps:

step 201: adjusting the shooting angle of a camera or prompting the target object A to change the position;

Here, at least one camera is provided in the mini K song room, and the camera is connected to the karaoke system in a wired or wireless manner. The cameras can shoot target object A from different directions, for example from the front, the left side and the right side.

In addition, a terminal device of target object A, such as a mobile phone, can log in to the karaoke system by scanning a QR code with WeChat, QQ or Weibo; or the terminal device can log in to the karaoke system by entering a verification code, thereby establishing a communication connection between the terminal device and the karaoke system.

Further, the camera is started after receiving a start instruction; alternatively, it may be started automatically upon connection to the karaoke system. Specifically, the camera is started after directly receiving a start instruction; or the karaoke system receives a start instruction and starts the camera; or the camera is started automatically after being connected to the karaoke system. Each camera needs to be started independently, and different cameras can be started as required. In addition, a shooting mode is set for the camera after it is started, and the mode can be changed when needed.

Next, before target object A formally starts singing, if target object A is not within the shooting range of any camera, the shooting angle of at least one camera is adjusted so that target object A is within the shooting range of at least one camera; or, if target object A is located outside the designated area of the preview image shot by the camera, target object A is prompted to change position. Specifically, when target object A is within the shooting range of the cameras but the shooting angle is poor and the shooting range needs to be adjusted, the shooting angle of the cameras is adjusted so that target object A is within the shooting range of at least one camera; when target object A is located outside the designated area of the preview image shot by the camera, prompt information appears on the display screen of the karaoke system to prompt target object A to change position, so that target object A is within the shooting range of at least one camera.
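The check against the designated area of the preview image can be sketched with axis-aligned boxes in image coordinates. The `Box` type, the `position_prompt` function and the wording of the prompt are hypothetical; the patent does not describe how the target object is located in the preview image, so the target box is assumed to come from some detector.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in preview-image pixels (origin at the top-left corner)."""
    left: float
    top: float
    right: float
    bottom: float


def position_prompt(target: Box, designated: Box) -> Optional[str]:
    """Return a prompt if the target extends beyond the designated area, else None."""
    sides = []
    if target.left < designated.left:
        sides.append("left")
    if target.right > designated.right:
        sides.append("right")
    if target.top < designated.top:
        sides.append("top")
    if target.bottom > designated.bottom:
        sides.append("bottom")
    if not sides:
        return None
    return ("Please change position: outside the designated area on the "
            + ", ".join(sides) + " side.")
```

The returned text would be shown on the display screen of the karaoke system; when `None` is returned, no prompt is needed.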

Adjusting the shooting angle of at least one camera includes: the karaoke system automatically adjusting the shooting angle of the camera; or adjusting the shooting angle of the camera in response to a received shooting angle adjustment instruction.

Further, adjusting the shooting angle of the camera can adopt at least one of the following modes:

the karaoke system sends the picture shot by the camera to the mobile phone of the target object A, so that the target object A can view the shooting preview on the screen of the mobile phone, and the karaoke system obtains the position of the target object A in the preview frame on the display screen of the mobile phone and automatically adjusts the shooting angle accordingly; or, after the camera receives an adjustment instruction, the camera directly adjusts the shooting angle according to the position of the target object A in the preview frame on the display screen of the mobile phone; or the karaoke system adjusts the shooting angle of the camera according to a received adjustment instruction and the position of the target object A in the preview frame on the display screen of the mobile phone. In addition, after the shooting angle of the camera is adjusted, the camera remains in its current preview shooting state, while the mobile phone of the target object A may either close the preview or keep it open.
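One possible way to turn the position of the target in the preview frame into an angle adjustment is to map the offset of the target from the frame center to pan and tilt corrections. The sketch below assumes a simple linear mapping and hypothetical field-of-view values; the source does not specify how the adjustment is computed, and the sign conventions here are likewise assumptions.

```python
from typing import Tuple


def angle_correction(center_x: float, center_y: float,
                     frame_width: float, frame_height: float,
                     horizontal_fov_deg: float = 60.0,
                     vertical_fov_deg: float = 40.0) -> Tuple[float, float]:
    """Return (pan, tilt) corrections in degrees that would re-center the target.

    (center_x, center_y) is the center of the target in preview pixels.
    Positive pan means the target is right of center; positive tilt means
    the target is below center (image coordinates grow downward).
    The field-of-view defaults are illustrative values, not from the source.
    """
    # Offset from the frame center as a fraction of the frame size (-0.5 to 0.5).
    offset_x = (center_x - frame_width / 2.0) / frame_width
    offset_y = (center_y - frame_height / 2.0) / frame_height
    # Linear approximation: a full-frame offset corresponds to the full field of view.
    return offset_x * horizontal_fov_deg, offset_y * vertical_fov_deg
```

A target already at the center of the frame yields a zero correction, so the camera can keep its current angle.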

Step 202: acquiring state information of a target object A in the singing process of the target object A;

after at least one camera is started and the shooting angle is adjusted, the target object A starts singing, the karaoke system monitors the target object A in real time, and in the singing process of the target object A, state information of the target object A is acquired.

Wherein the acquiring of the state information of the target object a includes acquiring at least one of the following information: the pitch of the target object a; volume of the target object a; facial expression features of the target object a.

Step 203: judging whether the state information meets shooting conditions or not;

here, the karaoke system determines whether the acquired state information of the target object a meets the shooting condition. Specifically, the determination that the state information meets the shooting condition may be performed in at least one of the following manners:

when the state information of the target object A includes the pitch of the target object A, the karaoke system determines that the shooting condition is met when the pitch of the target object A during singing is greater than a pitch threshold;

when the state information of the target object A includes the volume of the target object A, the karaoke system determines that the shooting condition is met when the volume of the target object A during singing is greater than a volume threshold;

when the state information of the target object A includes the facial expression features of the target object A, the karaoke system judges whether the expression library contains a preset facial expression feature whose similarity to the facial expression features of the target object A is greater than a similarity threshold, and if so, determines that the shooting condition is met. The expression library is a database that stores preset facial expression features, and a preset facial expression feature is a facial expression feature defined in advance to indicate that the state information of the target object meets the shooting condition.

Of course, whether the state information meets the shooting condition may also be determined by combining two or more of the above modes. For example, when the state information of the target object A includes both the pitch and the volume of the target object A, the karaoke system determines that the shooting condition is met when the volume of the target object A during singing is greater than the volume threshold and the pitch is greater than the pitch threshold.

Specifically, the pitch and volume of the target object A change continuously during singing, and high-pitched passages are often accompanied by vivid state information, such as rich facial expressions and varied body movements. Therefore, when the pitch of the target object A during singing is greater than a pitch threshold, it is determined that the shooting condition is met. The pitch threshold is an empirical value, and a particular pitch value may be set as the pitch threshold, such as 2000 mel. Alternatively, when the volume of the target object A changes markedly during singing and is greater than a volume threshold, it is determined that the shooting condition is met. The volume threshold is an empirical value, and a particular volume value may be set as the volume threshold, for example 40 dB. Alternatively, it is judged whether the expression library contains a preset facial expression feature whose similarity to the facial expression features of the target object A is greater than a similarity threshold; if so, it is determined that the shooting condition is met. The similarity threshold is an empirical value and may be set, for example, to 80%. The preset expressions may be anxious expressions, happy expressions, ferocious expressions, and the like.
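The threshold checks described above can be sketched as follows. This is a minimal illustration under stated assumptions: the `StateInfo` type and function names are hypothetical, facial expression features are assumed to be numeric vectors, and cosine similarity is used only as a stand-in because the source does not name a similarity measure. The threshold constants reuse the example values from the text, and the function implements the "at least one mode" variant rather than the combined pitch-and-volume variant.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence

# Example empirical thresholds taken from the text; units follow the detector in use.
PITCH_THRESHOLD = 2000.0
VOLUME_THRESHOLD = 40.0
SIMILARITY_THRESHOLD = 0.80


@dataclass
class StateInfo:
    """State information of the target object; any field may be absent."""
    pitch: Optional[float] = None
    volume: Optional[float] = None
    expression: Optional[Sequence[float]] = None  # assumed feature vector


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def meets_shooting_condition(state: StateInfo,
                             expression_library: Sequence[Sequence[float]]) -> bool:
    """Return True if any available piece of state information passes its threshold."""
    if state.pitch is not None and state.pitch > PITCH_THRESHOLD:
        return True
    if state.volume is not None and state.volume > VOLUME_THRESHOLD:
        return True
    if state.expression is not None:
        # Look for any preset expression that is similar enough.
        return any(cosine_similarity(state.expression, preset) > SIMILARITY_THRESHOLD
                   for preset in expression_library)
    return False
```

To require pitch and volume together, as in the combined example, the first two checks would be joined with `and` instead of being tested independently.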

In addition, the karaoke system plays preset multimedia; if the state information does not meet the shooting condition, the camera is controlled to shoot multimedia of the target object A when the playing progress of the preset multimedia reaches a preset time point. The preset multimedia is multimedia already present in the karaoke system, and each item of preset multimedia is configured on the basis of statistics over its historical information so that it includes at least one preset time point. A preset time point may be, for example, a time point at which a smiling expression, a sad expression, or the like appears.
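The fallback described above, shooting at a preset time point when the state information does not meet the shooting condition, can be sketched as a single check per playback update. The function name and the idea of comparing the previous and current playback positions are assumptions made for illustration; the source only states that shooting occurs when the playing progress reaches a preset time point.

```python
from typing import Iterable


def should_shoot(condition_met: bool,
                 previous_position_s: float,
                 current_position_s: float,
                 preset_time_points_s: Iterable[float]) -> bool:
    """Decide whether to trigger the camera on this playback update.

    Shoot immediately when the state information meets the shooting
    condition; otherwise shoot only if playback has just crossed one of
    the preset time points (all positions are in seconds).
    """
    if condition_met:
        return True
    # A time point counts as reached when it lies in (previous, current].
    return any(previous_position_s < point <= current_position_s
               for point in preset_time_points_s)
```

Using the half-open interval means each preset time point triggers at most one shot, even if playback updates arrive at irregular intervals.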

Step 204: and if the state information accords with the shooting condition, controlling the camera to shoot the multimedia aiming at the target object A.

If the state information meets the shooting condition, the camera is controlled to shoot multimedia of the target object A. The multimedia of the target object A comprises one of the following: a picture; a video.

Specifically, when the camera is currently in photo mode, a photo of the target object A can be taken during the singing of the target object A; or, when the camera is currently in video mode, a video of the target object A can be recorded by the camera during the singing of the target object A.

In addition, after the karaoke system controls the camera to shoot the multimedia of the target object A, the multimedia of the target object A is preprocessed and the preprocessed multimedia is stored. The preprocessing includes at least one of the following modes: generating an expression package from the acquired multimedia; performing image processing and sound effect processing on the acquired photo or video of the target object; or making the acquired multimedia into a personalized video. Specifically, the karaoke system controls the equipment to record the song currently being sung by the target object A, synthesizes the recorded song and the multimedia of the target object A into a video, and replaces the preset multimedia in the karaoke system with the synthesized video. In addition, the karaoke system can send the multimedia and/or the video of the target object A to the mobile phone, and the singing of the target object A can be broadcast live over the network through the mobile phone.

In order to implement the above shooting method, an embodiment of the present invention further provides a shooting apparatus, a schematic structural diagram of which is shown in Fig. 3. The shooting apparatus includes: an acquisition module 31, a judging module 32 and a shooting module 33; wherein,

the acquisition module 31 is configured to acquire state information of a target object while an audio signal input by the target object is being acquired;

the judging module 32 is configured to judge whether the state information meets a shooting condition;

the shooting module 33 is configured to control the camera device to shoot multimedia of the target object if the state information meets the shooting condition. The multimedia of the target object comprises one of the following: a picture; a video.

Here, the acquisition module is specifically configured to acquire at least one of the following information:

a pitch of the target object;

a volume of the target object;

facial expression features of the target object.

Here, the apparatus further includes an adjusting module, configured to adjust the shooting angle of at least one camera device so that the target object is within the shooting range of the at least one camera device; or to prompt the target object to change position if the target object is located outside the designated area of the preview image shot by the camera device. Specifically, when the target object is within the shooting range of a camera device but the shooting angle is poor, the shooting angle of the camera device is adjusted so that the target object is properly within the shooting range of at least one camera device; when the target object is located outside the designated area of the preview image shot by the camera device, prompt information appears to prompt the target object to change position, so that the target object comes within the shooting range of at least one camera device. Further, the adjusting module is specifically configured to: automatically adjust the shooting angle of the camera device; or adjust the shooting angle of the camera device in response to a received shooting angle adjustment instruction.

Further, the judging module is specifically configured to: when the state information of the target object acquired by the acquisition module includes the pitch of the target object and the pitch of the target object in the audio signal output process is greater than a pitch threshold, determine that the shooting condition is met.

Further, the apparatus also includes a preset module, specifically configured to: play preset multimedia; and, if the state information does not meet the shooting condition, control the camera device to shoot multimedia of the target object when the playing progress of the preset multimedia reaches a preset time point. The preset multimedia is existing multimedia that has at least one preset time point.

Specifically, the pitch and volume of the voice signal output by the target object may vary continuously during the audio signal output process, and vivid state information, such as rich facial expressions and varied body movements, may be present at higher pitches. Therefore, when the state information of the target object acquired by the acquisition module includes the pitch of the target object, and the pitch of the target object in the audio signal output process is greater than a pitch threshold, it is determined that the shooting condition is met. The pitch threshold is an empirical value, and a particular pitch value may be set as the pitch threshold, such as 2000 mel. Alternatively, when the state information of the target object acquired by the acquisition module includes the volume of the target object, and the volume of the target object in the audio signal output process is greater than a volume threshold, it is determined that the shooting condition is met. The volume threshold is an empirical value, and a particular volume value may be set as the volume threshold, for example 40 dB. Alternatively, when the state information of the target object acquired by the acquisition module includes the facial expression features of the target object, it is judged whether the expression library contains a preset facial expression feature whose similarity to those features is greater than a similarity threshold. The similarity threshold is an empirical value and may be set, for example, to 80%. The preset expressions may be anxious expressions, happy expressions, ferocious expressions, and the like.

Further, the apparatus also includes a preprocessing module, configured to preprocess the multimedia of the target object and store the preprocessed multimedia. The preprocessing includes at least one of the following modes: generating an expression package from the acquired multimedia; making the acquired multimedia into a personalized video.

Further, the preprocessing module is specifically configured to: if the state information accords with the shooting condition, controlling the equipment to record the audio signal currently input by the target object; and synthesizing the recorded audio signal and the multimedia aiming at the target object into a video.

Further, the apparatus further includes a replacing module, configured to replace the preset multimedia with the synthesized video.

Further, the apparatus further includes a sending module, configured to send the multimedia and/or the video for the target object to a specified terminal device.

In practical applications, the acquisition module 31, the judging module 32, the shooting module 33, the adjusting module, the preset module, the preprocessing module, the replacing module and the sending module can be implemented by a Central Processing Unit (CPU), a Microprocessor Unit (MPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), or the like located in the multimedia processing system.

It should be noted that the division into program modules described in the above embodiment is only an example of how the shooting apparatus may perform shooting. In practical applications, the processing may be distributed among different program modules as needed; that is, the internal structure of the apparatus may be divided into different program modules to complete all or part of the processing described above. In addition, the shooting apparatus and the shooting method provided by the above embodiments belong to the same concept; the specific implementation process is described in the method embodiment and is not repeated here.
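The cooperation of the acquisition, judging and shooting modules can be sketched as three interchangeable callables wired into one apparatus object. This is an illustrative structure only: the class name, method names, and the use of plain callables are assumptions, and, as noted above, an actual implementation may divide the processing among program modules differently.

```python
from typing import Callable, Generic, Optional, TypeVar

State = TypeVar("State")
Media = TypeVar("Media")


class ShootingApparatus(Generic[State, Media]):
    """Wires an acquisition module, a judging module and a shooting module together."""

    def __init__(self,
                 acquire: Callable[[], State],
                 judge: Callable[[State], bool],
                 shoot: Callable[[], Media]) -> None:
        self._acquire = acquire  # acquisition module: returns state information
        self._judge = judge      # judging module: checks the shooting condition
        self._shoot = shoot      # shooting module: controls the camera device

    def step(self) -> Optional[Media]:
        """Run one acquire/judge/shoot cycle; return the media shot, if any."""
        state = self._acquire()
        if self._judge(state):
            return self._shoot()
        return None
```

Because each module is passed in as a callable, the same apparatus object could be driven by a volume check, a pitch check, or an expression match without changing its structure.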

In order to implement the foregoing method, an embodiment of the present invention further provides another shooting apparatus, where the shooting apparatus includes a memory, a processor, and an executable program that is stored in the memory and can be executed by the processor, and when the processor executes the executable program, the processor performs the following operations:

judging whether the state information meets shooting conditions or not;

The processor is further configured to, when running the executable program, perform the following:

automatically adjusting the shooting angle of the camera device; or,

the acquiring of the state information of the target object includes acquiring at least one of the following information:

a pitch of the target object;

a volume of the target object;

facial expression features of the target object.

when the state information of the target object includes the pitch of the target object, the determining whether the state information meets a shooting condition includes: when the pitch of the target object in the audio signal output process is greater than a pitch threshold, judging that the shooting condition is met;

playing preset multimedia;

the equipment provided with the camera device is applied to a mini K song room.

the multimedia for the target object comprises one of the following: a picture; video;

the method further comprises the following steps:

and replacing the preset multimedia by using the synthesized video.

The hardware configuration of the shooting apparatus is further described below, taking as an example a shooting apparatus implemented as a server or a terminal used for shooting.

Fig. 4 is a schematic diagram of a hardware configuration of a shooting apparatus according to an embodiment of the present invention. The shooting apparatus 400 shown in Fig. 4 includes: at least one processor 401, a memory 402, a user interface 403, and at least one network interface 404. The various components in the shooting apparatus 400 are coupled together by a bus system 405. It is understood that the bus system 405 is used to enable connection and communication between these components. The bus system 405 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are all labeled as the bus system 405 in Fig. 4.

The user interface 403 may include, among other things, a display, a keyboard, a mouse, a trackball, a click wheel, a key, a button, a touch pad, or a touch screen.

It will be appreciated that the memory 402 can be either volatile memory or nonvolatile memory, and can include both volatile and nonvolatile memory.

The memory 402 in the embodiment of the present invention is used to store various types of data to support the operation of the shooting apparatus 400. Examples of such data include any computer program to be run on the shooting apparatus 400, such as the executable program 4021; a program implementing the method of an embodiment of the present invention may be included in the executable program 4021.

The method disclosed in the above embodiments of the present invention may be applied to the processor 401, or implemented by the processor 401. The processor 401 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or instructions in the form of software in the processor 401. The processor 401 described above may be a general purpose processor, a DSP, or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like. Processor 401 may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the present invention. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed by the embodiment of the invention can be directly implemented by a hardware decoding processor, or can be implemented by combining hardware and software modules in the decoding processor. The software modules may be located in a storage medium located in the memory 402, and the processor 401 reads the information in the memory 402 and performs the steps of the aforementioned methods in conjunction with its hardware.

In an exemplary embodiment, an embodiment of the present invention further provides a storage medium having an executable program stored thereon, which, when executed by the processor 401 of the shooting apparatus 400, performs the following operations:

judging whether the state information meets shooting conditions or not;

The executable program, when executed by the processor 401 of the shooting apparatus 400, further performs the following operations:

automatically adjusting the shooting angle of the camera device; or,

a pitch of the target object;

a volume of the target object;

facial expression features of the target object.

playing preset multimedia;

the equipment provided with the camera device is applied to a mini K song room.

the method further comprises the following steps:

and replacing the preset multimedia by using the synthesized video.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or executable program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of an executable program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.

The present invention has been described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and executable program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by executable program instructions. These executable program instructions may be provided to a general purpose computer, a special purpose computer, an embedded processor, or the processor of another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the computer or by the processor of the other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These executable program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These executable program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer-implemented process, such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention.

Claims

1. A photographing method applied to an apparatus provided with an image pickup device, the method comprising:

judging whether the state information meets shooting conditions or not;

if the state information accords with shooting conditions, controlling the camera device to shoot the multimedia aiming at the target object; wherein,

when the state information of the target object includes the facial expression feature of the target object, the determining whether the state information meets a shooting condition includes: judging whether preset facial expression features with similarity larger than a similarity threshold value with the facial expression features of the target object exist in an expression library or not; and if so, judging that the shooting condition is met.

2. The method of claim 1, wherein before obtaining the state information of the target object, the method further comprises:

3. The method of claim 2, wherein the adjusting the shooting angle of the at least one camera comprises:

automatically adjusting the shooting angle of the camera device; or,

4. The method of claim 1, further comprising:

playing preset multimedia;

5. The method of claim 1, wherein the expression library is a database for storing preset facial expression features; the preset facial expression features are preset facial expression features used for representing that the state information of the target object meets shooting conditions.

6. The method of claim 1, wherein the multimedia for the target object comprises one of:

a picture; video;

the method further comprises the following steps:

7. The method of claim 6, further comprising:

8. The method of claim 7, further comprising:

and replacing the preset multimedia by using the synthesized video.

9. The method of claim 8, further comprising:

10. The method according to any one of claims 1 to 9, wherein the apparatus provided with the camera is applied to a mini Karaoke room.

11. A photographing apparatus, characterized in that the apparatus is provided with an image pickup device, the apparatus comprising: an acquisition module, a judging module and a shooting module; wherein,

the shooting module is used for controlling the camera device to shoot the multimedia aiming at the target object if the state information accords with shooting conditions;

the judging module is specifically configured to:

when the state information of the target object acquired by the acquisition module comprises the pitch of the target object and the pitch of the target object in the audio signal output process is greater than a pitch threshold, judging that the shooting condition is met;

when the state information of the target object acquired by the acquisition module comprises the volume of the target object and the volume of the target object in the audio signal output process is greater than a volume threshold, judging that the shooting condition is met;

when the state information of the target object acquired by the acquisition module comprises the facial expression features of the target object, judging whether preset facial expression features with the similarity larger than a similarity threshold value exist in an expression library or not; and if so, judging that the shooting condition is met.

12. The apparatus according to claim 11, further comprising an adjusting module for adjusting a shooting angle of at least one camera such that the target object is within a shooting range of the at least one camera; or prompting the target object to change the position if the target object is located outside the designated area of the preview image shot by the camera device.

13. The device according to claim 12, wherein the adjusting module is specifically configured to:

automatically adjusting the shooting angle of the camera device; or,

14. The apparatus of claim 11, further comprising:

the preset module is used for playing preset multimedia; and if the state information does not accord with the shooting condition, controlling the camera device to shoot the multimedia aiming at the target object when the preset multimedia playing progress reaches a preset time point.

15. The apparatus of claim 11, wherein the expression library is a database for storing preset facial expression features; the preset facial expression features are preset facial expression features used for representing that the state information of the target object meets shooting conditions.

16. The apparatus according to claim 11, wherein the apparatus further comprises a preprocessing module, configured to preprocess the multimedia for the target object and store the preprocessed multimedia; wherein the preprocessing comprises at least one of the following modes: generating an expression package according to the acquired multimedia; and making the acquired multimedia into a personalized video.

17. The device according to claim 16, wherein the preprocessing module is specifically configured to: if the state information accords with the shooting condition, controlling the equipment to record the audio signal currently input by the target object; and synthesizing the recorded audio signal and the multimedia aiming at the target object into a video.

18. The apparatus of claim 16, further comprising a replacing module for replacing the preset multimedia with the synthesized video.

19. The device according to claim 18, wherein the device further comprises a sending module for sending the multimedia and/or the video for the target object to a designated terminal device.

20. A storage medium having stored thereon an executable program, the executable program when executed by a processor implementing the steps of the method of any one of claims 1 to 10.

21. A camera device comprising a memory, a processor and an executable program stored on the memory and executable by the processor, wherein the steps of the method of any one of claims 1 to 10 are performed when the executable program is executed by the processor.