CN115103125B - Guide broadcasting method and device - Google Patents

Guide broadcasting method and device

Info

Publication number
CN115103125B
CN115103125B (application number CN202210826557.5A)
Authority
CN
China
Prior art keywords
video
global
local
scene
matched
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210826557.5A
Other languages
Chinese (zh)
Other versions
CN115103125A (en)
Inventor
袁潮
(Name withheld at the inventor's request)
肖占中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zhuohe Technology Co Ltd
Original Assignee
Beijing Zhuohe Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zhuohe Technology Co Ltd filed Critical Beijing Zhuohe Technology Co Ltd
Priority to CN202210826557.5A priority Critical patent/CN115103125B/en
Publication of CN115103125A publication Critical patent/CN115103125A/en
Application granted granted Critical
Publication of CN115103125B publication Critical patent/CN115103125B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4084Scaling of whole images or parts thereof, e.g. expanding or contracting in the transform domain, e.g. fast Fourier transform [FFT] domain scaling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/262Analysis of motion using transform domain methods, e.g. Fourier domain methods
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23412Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs for generating or manipulating the scene composition of objects, e.g. MPEG-4 objects
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23424Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for inserting or substituting an advertisement
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • H04N5/265Mixing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00Reducing energy consumption in communication networks
    • Y02D30/70Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Mathematical Physics (AREA)
  • Image Processing (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Image Analysis (AREA)
  • Studio Devices (AREA)

Abstract

The application provides a guide broadcasting method and device. The guide broadcasting method comprises the following steps: acquiring a global video; searching for a scene matched with an instruction based on the instruction; acquiring a local video matched with the scene based on the scene; transforming the local video onto the corresponding block of the global video to obtain a global fusion video matched with the scene; and playing the global fusion video in a first window while playing the local video in a second window. This solves the technical problem in the prior art that guide-broadcast video is processed slowly and inefficiently.

Description

Guide broadcasting method and device
Technical Field
The present disclosure relates to the field of computer devices, and in particular, to a guide broadcasting method and apparatus.
Background
As the computing power of computers and the resolution and field of view of cameras improve, people's demands on image and video quality keep rising: the goal is a high-resolution panoramic image that offers a wide field of view without losing the detailed information of the images and videos. To this end, a global video is shot with a global camera and local videos are shot with local cameras. The global video has low resolution and cannot capture local details; a local video has high resolution but cannot show its own position within the global video. In the prior art, in order to present the position of a specific local video in the global video, all the local videos are usually fused with the global video, which results in a large processing load and low video processing efficiency.
Disclosure of Invention
The embodiment of the invention provides a guide broadcasting method and device, which aim to solve the technical problem of low guide-broadcast video processing speed and efficiency in the prior art.
The invention provides a guide broadcasting method, which comprises the following steps:
acquiring a global video;
searching a scene matched with the instruction in the global video based on the instruction;
acquiring a local video matched with the scene based on the scene;
transforming the local video to a corresponding block of the global video to obtain a global fusion video matched with the scene;
and playing the global fusion video in a first window, and playing the local video in a second window.
Optionally, the step of transforming the local video to a corresponding block of the global video to obtain a global fusion video matched with the scene specifically includes: a block matching algorithm is adopted to find out the corresponding block of the local video in the global video; and registering the local video and the global video based on the corresponding blocks to obtain a matched global fusion video of the scene.
Optionally, the method further comprises: acquiring a light field video, sampling the light field video by a set magnification factor to obtain a sampled light field video, and performing Fourier transform on the sampled light field video to obtain a first video; after the step of transforming the local video to the corresponding block of the global video to obtain the global fusion video matched with the scene, the method further includes: performing high-pass filtering on the global fusion video to obtain a second video; performing linear addition on the first video and the second video, and performing a Fourier transform to obtain a third video; and playing the third video in a third window.
Optionally, the first window and the second window are displayed simultaneously under the same interface.
Optionally, a region corresponding to the local video is visually identified in the global fusion video.
Optionally, the instruction includes at least one of a voice instruction, a touch instruction, and a gesture instruction.
The embodiment of the application also provides a guide broadcasting device, which comprises:
the first acquisition module is used for acquiring a global video;
the searching module is used for searching a scene matched with the instruction in the global video based on the instruction;
the second acquisition module acquires a local video matched with the scene based on the scene;
the transformation module is used for transforming the local video to a corresponding block of the global video to obtain a global fusion video matched with the scene; and
and the playing module is used for playing the global fusion video in a first window and playing the local video in a second window.
Optionally, the transformation module is further adapted to: a block matching algorithm is adopted to find out the corresponding block of the local video in the global video; and registering the local video and the global video based on the corresponding blocks to obtain a matched global fusion video of the scene.
Optionally, the present application also proposes a computer readable storage medium, on which a computer program is stored, characterized in that the computer program when executed implements the steps of the method as described above.
Optionally, the application further proposes a computer device comprising a processor, a memory and a computer program stored on said memory, characterized in that said processor implements the steps of the method as described above when executing said computer program.
In the method, a scene matched with an instruction is acquired in a global video through the instruction, and then a local video matched with the scene is acquired in a searching mode; and then the local video is transformed to a corresponding block in the global video to obtain the global fusion video matched with the scene. At this time, the processing amount of the local video and the global video is small, and the processing efficiency is improved. Meanwhile, the global fusion video only presents high-resolution video on the corresponding block, and the video of other areas of the global fusion video still keeps low resolution, so that a user can capture the position of the local video related to the instruction in the global video in terms of visual effect. And simultaneously, playing the global fusion video in the first window and playing the local video in the second window, so that a user can compare the global fusion video with the local video, and the user can acquire interested information.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is an application scenario schematic diagram of a guide broadcasting device provided in an embodiment of the present application;
fig. 2 is a schematic flow chart of a guide broadcasting method according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a guide broadcasting device according to an embodiment of the present application;
fig. 4 is an internal structural diagram of a computer device provided in an embodiment of the present application.
Description of the embodiments
In order to more clearly illustrate the technical solutions of the embodiments of the present specification, the drawings that are required to be used in the description of the embodiments will be briefly described below. It is apparent that the drawings in the following description are only some examples or embodiments of the present specification, and it is possible for those of ordinary skill in the art to apply the present specification to other similar situations according to the drawings without inventive effort. Unless otherwise apparent from the context of the language or otherwise specified, like reference numerals in the figures refer to like structures or operations.
It will be appreciated that "system," "apparatus," "unit" and/or "module" as used herein is one method for distinguishing between different components, elements, parts, portions or assemblies at different levels. However, if other words can achieve the same purpose, the words can be replaced by other expressions.
As used in this specification and the claims, the terms "a," "an," "the," and/or "the" are not specific to a singular, but may include a plurality, unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that the steps and elements are explicitly identified, and they do not constitute an exclusive list, as other steps or elements may be included in a method or apparatus.
A flowchart is used in this specification to describe the operations performed by the system according to embodiments of the present specification. It should be appreciated that the preceding or following operations are not necessarily performed in order precisely. Rather, the steps may be processed in reverse order or simultaneously. Also, other operations may be added to or removed from these processes.
Fig. 1 is a schematic view of an application scenario of a guide broadcasting device according to some embodiments of the present application. As shown in fig. 1, the guide broadcasting apparatus 100 may include a server 110, a network 120, a group of image capturing devices 130, and a memory 140.
Server 110 may process data and/or information acquired from at least one component of the guide broadcasting apparatus 100 (e.g., image acquisition device group 130 and memory 140) or an external data source (e.g., a cloud data center). For example, the server 110 may obtain interaction instructions from the image capture device group 130. As another example, server 110 may also retrieve historical data from memory 140.
In some embodiments, server 110 may include a processing device 112. The processing device 112 may process information and/or data related to the human-machine interaction system to perform one or more of the functions described in this specification. For example, the processing device 112 may determine the imaging control strategy based on the interaction instructions and/or historical data. In some embodiments, the processing device 112 may include at least one processing unit (e.g., a single core processing engine or a multi-core processing engine). In some embodiments, the processing device 112 may be part of the image acquisition device group 130.
The network 120 may provide a channel for information exchange. In some embodiments, network 120 may include one or more network access points. One or more components of the guide broadcasting apparatus 100 may connect to the network 120 through an access point to exchange data and/or information. In some embodiments, at least one component of the guide broadcasting apparatus 100 may access data or instructions stored in the memory 140 via the network 120.
The image capturing device group 130 may be composed of a plurality of image capturing devices, and the types of the image capturing devices are not limited, and may be, for example, a camera, a light field camera, or a mobile terminal having an image capturing function.
In some embodiments, memory 140 may store data and/or instructions that processing device 112 may execute or use to implement the exemplary methods described herein. For example, the memory 140 may store historical data. In some embodiments, memory 140 may be directly connected to server 110 as back-end memory. In some embodiments, memory 140 may be part of server 110 or of the image capture device group 130.
Fig. 2 shows a flow chart of a guide broadcasting method according to an embodiment of the application. Referring to fig. 2, the application further provides a guide broadcasting method, which includes the following steps:
acquiring a global video;
acquiring a scene matched with an instruction based on the instruction;
acquiring a local video matched with the scene based on the scene;
transforming the local video to a corresponding block of the global video to obtain a global fusion video matched with the scene;
and playing the global fusion video in a first window, and playing the local video in a second window.
In the embodiment of the application, a scene matched with the instruction is acquired in the global video through the instruction, and then a local video matched with the scene is acquired through a searching mode; and then the local video is transformed to a corresponding block in the global video to obtain the global fusion video matched with the scene. At this time, the processing amount of the local video and the global video is small, and the processing efficiency is improved. Meanwhile, the global fusion video only presents high-resolution video on the corresponding block, and the video of other areas of the global fusion video still keeps low resolution, so that a user can capture the position of the local video related to the instruction in the global video in terms of visual effect. And simultaneously, playing the global fusion video in the first window and playing the local video in the second window, so that a user can compare the global fusion video with the local video, and the user can acquire interested information.
It should be noted that the global video is obtained by shooting with a global camera, and one global video corresponds to a plurality of local videos. If the global video and the local videos are offline videos, the local videos and their corresponding scenes are marked in advance, and each mark corresponds to a different instruction. If the global video and the local videos are live videos, the association between each scene and a local camera can be preset, and each mark likewise corresponds to a different instruction. When an instruction is acquired, the corresponding scene is obtained, and the local video matched with that scene is then acquired based on the scene.
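As an illustration of this instruction-to-scene-to-local-video lookup, the following is a minimal sketch in which all names (SCENE_BY_INSTRUCTION, LOCAL_SOURCE_BY_SCENE, the example instructions and URIs) are hypothetical and not part of the patent:

```python
# Hypothetical sketch of the instruction -> scene -> local video lookup described above.

# Offline case: local videos and their scenes are marked in advance,
# and every mark corresponds to a different instruction.
SCENE_BY_INSTRUCTION = {
    "voice:show goal area": "goal_area",
    "touch:region_3": "stage_left",
}

# Each scene is associated either with a marked offline clip or,
# for live broadcasting, with a preset local camera.
LOCAL_SOURCE_BY_SCENE = {
    "goal_area": {"type": "offline", "uri": "remote://local_clips/goal_area.mp4"},
    "stage_left": {"type": "live", "camera_id": 7},
}


def resolve_local_video(instruction: str):
    """Return the scene matched with the instruction and the local video source for it."""
    scene = SCENE_BY_INSTRUCTION.get(instruction)
    if scene is None:
        raise KeyError(f"no scene is marked for instruction {instruction!r}")
    return scene, LOCAL_SOURCE_BY_SCENE[scene]


scene, source = resolve_local_video("voice:show goal area")
```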
In order to improve the streaming rate of the video and to avoid data overflow on the serving machine itself, in the embodiment of the application the global video is stored in a local memory, while the local videos are stored in a remote memory or a cloud memory. The global video is acquired; a scene matched with an instruction is acquired based on the instruction; based on the scene, the local video matched with the scene is acquired from the remote memory or the cloud memory; and the local video is transformed onto the corresponding block of the global video to obtain the global fusion video matched with the scene.
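As a simple illustration of this storage split (the file path, the URL, and the use of OpenCV for decoding are assumptions, and a network-capable OpenCV/FFmpeg build is assumed for the remote stream):

```python
import cv2

# The global (low-resolution) video is kept in local storage so it can be opened quickly,
# while a matched local (high-resolution) video is pulled from remote or cloud storage
# only when its scene is requested. Both locations below are hypothetical.
global_cap = cv2.VideoCapture("/var/broadcast/global_video.mp4")
local_cap = cv2.VideoCapture("https://cloud.example.com/local_clips/goal_area.mp4")

ok_global, global_frame = global_cap.read()
ok_local, local_frame = local_cap.read()
```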
In the embodiment of the present application, the instruction may be at least one of a voice instruction, a touch instruction, and a gesture instruction.
As an optional implementation manner of the foregoing embodiment, the step of transforming the local video to a corresponding block of the global video to obtain a global fusion video matched with the scene specifically includes:
and adopting a block matching algorithm to find out the corresponding block of the local video in the global video. In general, a zero-mean normalized cross-correlation block matching algorithm (abbreviated as "ZNCC algorithm") is adopted to perform block matching, preferably performing twice ZNCC iterations, to find a corresponding block of the local video in the global video, and obtain a pixel matching relationship between the local video and the global reference video.
The local video and the global video are then registered based on the corresponding block to obtain the matched global fusion video of the scene. In general, the local video and the global video are registered using an overall transformation, a mesh-based transformation, and a temporally and spatially smooth transformation to obtain the matched global fusion video of the scene.
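As an illustration of the block search itself (not the patent's implementation; the block size and exhaustive search strategy here are assumptions), a minimal NumPy sketch of ZNCC block matching is:

```python
import numpy as np


def zncc(a: np.ndarray, b: np.ndarray) -> float:
    """Zero-mean normalized cross-correlation between two equally sized gray blocks."""
    a = a.astype(np.float64) - a.mean()
    b = b.astype(np.float64) - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def best_matching_block(local_block, global_frame, center, search_width):
    """Search a window of +/- search_width around `center` (y, x) in the global frame
    for the block position that maximizes ZNCC with `local_block`."""
    h, w = local_block.shape
    cy, cx = center
    best_score, best_pos = -2.0, (cy, cx)
    for y in range(cy - search_width, cy + search_width + 1):
        for x in range(cx - search_width, cx + search_width + 1):
            candidate = global_frame[y:y + h, x:x + w]
            if candidate.shape != (h, w):
                continue  # skip positions that fall outside the frame
            score = zncc(local_block, candidate)
            if score > best_score:
                best_score, best_pos = score, (y, x)
    return best_pos, best_score
```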
Overall transformation: the found local video and its corresponding block are taken as a pair, and the ZNCC algorithm is used to extract and match feature points of the local video and its corresponding block, so as to extract the corresponding (matched) feature-point pairs in the local image and its corresponding block. In a preferred embodiment of the present invention, two ZNCC iterations are performed to extract the feature-point pairs and compute a homography matrix; the two-iteration process can be represented by the following iteration formula:
$$\mathcal{M}(I_l, I_r) = \Big\{ (p_l, p_r) \;\Big|\; p_r = \mathop{\arg\max}_{\lVert p - \pi(H\tilde{p}_l) \rVert \le \varepsilon} \mathrm{ZNCC}\big(I_l(p_l),\, I_r(p)\big),\; p_l \in [0, w)^2 \Big\},$$

$$H \leftarrow \mathop{\arg\min}_{H} \sum_{(p_l, p_r) \in \mathcal{M}(I_l, I_r)} \big\lVert \pi(H\tilde{p}_l) - p_r \big\rVert^2,$$

wherein $\mathcal{M}(I_l, I_r)$ denotes the matching blocks between the local video and its corresponding block (i.e., the matching relation between the two); $I_l$ and $I_r$ denote the local video and its corresponding block in the global reference video; $p_l$ and $p_r$ are corresponding feature points of the local video $I_l$ and the corresponding block $I_r$, respectively, i.e., $(p_l, p_r)$ is a feature-point pair; $\mathrm{ZNCC}(\,)$ represents the energy function that scores the local video against its corresponding block with the ZNCC algorithm; $H$ represents the homography matrix and is initialized to the identity matrix; $\tilde{p}_l$ is the homogeneous-coordinate form of $p_l$; $\pi(\,)$ represents the central-projection and de-homogenization function; $w$ is the size of the local video (the local video is square, and $w$ represents the side length of the square); and $\varepsilon$ is the search width.
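A rough sketch of such a two-pass procedure (matching under the current homography estimate, then re-estimating the homography from the matches) is given below. It is illustrative only: the block half-size, the two search widths, and the use of OpenCV's TM_CCOEFF_NORMED template matching (OpenCV's zero-mean normalized cross-correlation) and findHomography are assumptions rather than the patent's own code.

```python
import cv2
import numpy as np


def zncc_pass(local_gray, global_gray, pts_local, H, half=8, search=32):
    """One ZNCC pass: project each local feature point with H, then search a window
    around the projection in the global frame for the block that maximizes
    zero-mean normalized cross-correlation (cv2.TM_CCOEFF_NORMED).
    Border handling is omitted for brevity."""
    pts_global = []
    for (x, y) in pts_local:
        template = local_gray[int(y) - half:int(y) + half, int(x) - half:int(x) + half]
        u, v = cv2.perspectiveTransform(np.float32([[[x, y]]]), H)[0, 0]
        x0, y0 = max(int(u) - half - search, 0), max(int(v) - half - search, 0)
        window = global_gray[y0:int(v) + half + search, x0:int(u) + half + search]
        scores = cv2.matchTemplate(window, template, cv2.TM_CCOEFF_NORMED)
        _, _, _, (bx, by) = cv2.minMaxLoc(scores)            # location of the best ZNCC score
        pts_global.append((x0 + bx + half, y0 + by + half))  # back to global coordinates
    return np.float32(pts_global)


def two_pass_registration(local_gray, global_gray, pts_local):
    """Two iterations as described above: identity homography with a wide search first,
    then the re-estimated homography with a narrower search width (epsilon)."""
    H = np.eye(3, dtype=np.float64)
    for search_width in (32, 8):
        pts_global = zncc_pass(local_gray, global_gray, pts_local, H, search=search_width)
        H, _ = cv2.findHomography(np.float32(pts_local), pts_global, cv2.RANSAC)
    return H, pts_global
```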
Then, a mesh-based transformation is performed: based on the preliminary globally transformed video obtained in the previous step, the feature-point pairs extracted during the overall transformation are warped with an ASAP (as-similar-as-possible) mesh warping framework, and an optical-flow-based transformation is applied to the mesh warping result to optimize the pixel matching relationship. This yields more reliable feature-point pairs, namely the feature points in the local video that are matched more successfully at this moment together with the updated optical flow. The distortion of the optical-flow transformation is then combined with the stability of the local video, and the homography matrix is recomputed to complete the mesh-based and optical-flow-based transformations, giving the transformation result. Color calibration is performed on the local video after the transformation and registration are completed.
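The patent gives no code for this stage; as one possible illustration, dense optical flow between the warped local frame and its corresponding global block can be used to refine the per-point matches. The use of OpenCV's Farneback flow and the parameter values below are assumptions:

```python
import cv2
import numpy as np


def refine_matches_with_flow(warped_local_gray, global_block_gray, pts):
    """Refine matched points by adding the dense optical flow computed between the
    homography-warped local frame and its corresponding block in the global video."""
    flow = cv2.calcOpticalFlowFarneback(
        warped_local_gray, global_block_gray, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0,
    )
    refined = []
    for (x, y) in pts:
        dx, dy = flow[int(y), int(x)]        # flow vector at the feature point
        refined.append((x + dx, y + dy))
    return np.float32(refined)
```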
In a specific embodiment, a temporal and spatial smooth transformation is performed by introducing a temporal stability constraint, the energy function of the smooth transformation being:
$$E(V) = \lambda_r E_r(V) + \lambda_t E_t(V) + \lambda_s E_s(V),$$

where $V$ denotes the transformation defined by the mesh vertices. $E_r(V)$ is the sum of the distances of each feature-point pair between each local video in the globally transformed video and the global reference video:

$$E_r(V) = \sum_{(p_l, p_r)} \big\lVert \alpha_{p_l}^{\top} V_{p_l} - p_r \big\rVert^2,$$

where $V_{p_l}$ are the grid vertices of the cell containing $p_l$ and $\alpha_{p_l}$ are the bilinear interpolation weights of $p_l$ with respect to those vertices. $E_t(V)$ is the time stability constraint:

$$E_t(V) = \sum_{p_l} B(p_l)\, \big\lVert \alpha_{p_l}^{\top} V_{p_l} - S(\hat{p}_l) \big\rVert^2,$$

where $\hat{p}_l$ is the feature point in the temporal prior map corresponding to the feature point $p_l$ in the local video; $B$ is an indicator function for checking whether the pixel point $p_l$ lies on a static background, and $B(p_l) = 0$ denotes that $p_l$ lies on a moving background; and $S$ is the global transformation between the local video and its temporal prior map. $E_s(V)$ is a spatial smoothing term defined on the spatial deformation between adjacent vertices; $\lambda_r$, $\lambda_t$ and $\lambda_s$ are all constants greater than 0.
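A compact least-squares sketch of minimizing an energy of this form over the grid vertices follows. It is only one possible realization under the definitions above: the λ values, grid layout, and the use of SciPy's least_squares solver are assumptions, and the inputs (feature-point pairs, temporal-prior correspondences, static-background flags) are hypothetical.

```python
import numpy as np
from scipy.optimize import least_squares


def bilinear_weights(p, grid_x, grid_y):
    """Return the 4 enclosing vertex indices (row, col) and bilinear weights of point p."""
    x, y = p
    i = int(np.clip(np.searchsorted(grid_x, x) - 1, 0, len(grid_x) - 2))
    j = int(np.clip(np.searchsorted(grid_y, y) - 1, 0, len(grid_y) - 2))
    tx = (x - grid_x[i]) / (grid_x[i + 1] - grid_x[i])
    ty = (y - grid_y[j]) / (grid_y[j + 1] - grid_y[j])
    idx = [(j, i), (j, i + 1), (j + 1, i), (j + 1, i + 1)]
    w = [(1 - tx) * (1 - ty), tx * (1 - ty), (1 - tx) * ty, tx * ty]
    return idx, w


def warp_point(V, idx, w):
    """Bilinearly interpolate the warped position of a point from its 4 grid vertices."""
    return sum(wk * V[j, i] for (j, i), wk in zip(idx, w))


def solve_mesh(grid_x, grid_y, pairs, prior_pairs, static_flags,
               lam_r=1.0, lam_t=0.5, lam_s=1.0):
    """Minimize E(V) = lam_r*E_r + lam_t*E_t + lam_s*E_s over the grid vertex positions.
    `pairs` is a list of (p_l, p_r); `prior_pairs` is a list of (p_l, S(p_hat_l));
    `static_flags` holds B(p_l) for the prior pairs (1 = static background)."""
    V0 = np.stack(np.meshgrid(grid_x, grid_y), axis=-1).astype(np.float64)  # initial grid
    shape = V0.shape

    def residuals(v):
        V = v.reshape(shape)
        res = []
        for p_l, p_r in pairs:                                  # E_r: feature-point alignment
            idx, w = bilinear_weights(p_l, grid_x, grid_y)
            res.append(np.sqrt(lam_r) * (warp_point(V, idx, w) - np.asarray(p_r)))
        for (p_l, s_p), b in zip(prior_pairs, static_flags):    # E_t: temporal stability
            if b:                                               # only static-background points
                idx, w = bilinear_weights(p_l, grid_x, grid_y)
                res.append(np.sqrt(lam_t) * (warp_point(V, idx, w) - np.asarray(s_p)))
        D = V - V0                                              # E_s: deformation of neighbours
        res.append(np.sqrt(lam_s) * (D[1:, :] - D[:-1, :]).ravel())
        res.append(np.sqrt(lam_s) * (D[:, 1:] - D[:, :-1]).ravel())
        return np.concatenate([np.atleast_1d(r).ravel() for r in res])

    solution = least_squares(residuals, V0.ravel())
    return solution.x.reshape(shape)
```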
After this series of transformations and registrations, a global high-resolution video is obtained. Considering that the different colors and illumination of the local cameras would otherwise leave the local videos inconsistently colored within the global high-resolution video, each local video can be color-corrected until it is consistent with the global reference video, so that the global high-resolution video has a uniform color style as a whole. In addition, the global high-resolution video can be further optimized: a graph-cut method is used to remove the overlapping parts between the transformed local videos so as to minimize the video registration error.
As an alternative implementation of the above embodiment, the method further includes: acquiring a light field video, up-sampling the light field video by a set magnification factor to obtain a sampled light field video, and performing a Fourier transform on the sampled light field video to obtain a first video; after the step of transforming the local video onto the corresponding block of the global video to obtain the global fusion video matched with the scene, the method further includes: performing high-pass filtering on the global fusion video to obtain a second video; linearly adding the first video and the second video and performing an inverse Fourier transform to obtain a third video; and playing the third video in a third window. After the global high-resolution video is obtained, video super-resolution needs to be carried out on the global light field video to overcome its low spatial resolution. The specific method is as follows: up-sample the low-(spatial-)resolution global light field video by the set magnification factor to obtain a sampled low-resolution light field video, perform a Fourier transform on it to obtain a first spectrum video, and apply low-pass filtering to the first spectrum video; apply high-pass filtering to the global high-resolution video to obtain a second spectrum video. Then linearly add the low-pass-filtered first spectrum video to the second spectrum video and perform an inverse Fourier transform to obtain a global high-resolution light field video. The set magnification factor is f_h / f_l, where f_h and f_l are the focal lengths of the local camera and the global camera, respectively.
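A minimal NumPy/OpenCV sketch of this frequency-domain fusion for a single pair of grayscale frames follows; the cut-off frequency, the complementary ideal low-/high-pass masks, and the interpolation choices are assumptions made for illustration:

```python
import cv2
import numpy as np


def fuse_lightfield_frame(lf_lowres, global_highres, scale, cutoff=0.125):
    """Fuse one low-resolution light-field frame with the global high-resolution frame
    in the Fourier domain: upsample by `scale` (= f_h / f_l), low-pass its spectrum,
    add the complementary high-pass spectrum of the high-resolution frame, and invert.
    Grayscale (single-channel) frames are assumed."""
    up = cv2.resize(lf_lowres, None, fx=scale, fy=scale, interpolation=cv2.INTER_CUBIC)
    up = cv2.resize(up, (global_highres.shape[1], global_highres.shape[0]))  # match sizes exactly

    F_lf = np.fft.fftshift(np.fft.fft2(up.astype(np.float64)))
    F_hi = np.fft.fftshift(np.fft.fft2(global_highres.astype(np.float64)))

    h, w = global_highres.shape
    yy, xx = np.mgrid[0:h, 0:w]
    radius = np.sqrt(((yy - h / 2.0) / h) ** 2 + ((xx - w / 2.0) / w) ** 2)
    lowpass = (radius <= cutoff).astype(np.float64)     # ideal low-pass mask (illustrative)

    fused = F_lf * lowpass + F_hi * (1.0 - lowpass)     # linear addition in the transform domain
    out = np.fft.ifft2(np.fft.ifftshift(fused)).real    # inverse Fourier transform
    return np.clip(out, 0, 255).astype(np.uint8)
```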
As an alternative implementation of the foregoing embodiment, the first window and the second window are displayed simultaneously under the same interface. To make it easier for the user to capture the information of interest in the video, the first window and the second window are presented under the same interface. In general, the second window floats on the first window, close to the corresponding position of the local video in the global video, so that the user can observe it conveniently.
Further, the first window, the second window and the third window are displayed simultaneously under the same interface. To make it easier for the user to capture the information of interest in the video, the first window, the second window and the third window are presented under the same interface. In general, the second window floats on the first window and the third window, close to the corresponding position of the local video in the global video, so that the user can observe it conveniently.
As an optional implementation manner of the foregoing embodiment, the region corresponding to the local video is visually identified in the global fusion video. Typically, the region may be identified by circling it, pointing to it with an indicator, or the like.
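For instance, the corresponding block found during registration can simply be outlined on each global-fusion frame (a trivial OpenCV sketch; the color, thickness, and the assumption that the block corners are known are illustrative):

```python
import cv2


def mark_local_region(global_fusion_frame, top_left, bottom_right):
    """Visually identify the local-video region in the global fusion video
    by drawing a rectangle around its corresponding block."""
    marked = global_fusion_frame.copy()
    cv2.rectangle(marked, top_left, bottom_right, color=(0, 255, 0), thickness=3)
    return marked
```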
As shown in fig. 3, the embodiment of the present application further provides a guide broadcasting device, including:
a first obtaining module 100, configured to obtain a global video;
the searching module 200 searches a scene matched with the instruction based on the instruction;
a second obtaining module 300, based on the scene, obtaining a local video matched with the scene;
the transformation module 400 is configured to transform the local video to a corresponding block of the global video to obtain a global fusion video matched with the scene; and
and the playing module 500 is configured to play the global fusion video in a first window and play the local video in a second window.
The transformation module 400 is further adapted to: a block matching algorithm is adopted to find out the corresponding block of the local video in the global video; and registering the local video and the global video based on the corresponding blocks to obtain a matched global fusion video of the scene.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working processes of the modules/units/sub-units/components in the above-described apparatus may refer to corresponding processes in the foregoing method embodiments, which are not described herein again.
In some embodiments, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 4. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used for storing relevant data of the image acquisition device. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by a processor, implements the guide broadcasting method described above.
In some embodiments, a computer device is provided, which may be a terminal, and the internal structure of which may be as shown in fig. 4. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless mode can be realized through WIFI, an operator network, NFC (near field communication) or other technologies. The computer program, when executed by a processor, implements the guide broadcasting method described above. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, can also be keys, a track ball or a touch pad arranged on the shell of the computer equipment, and can also be an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the structures shown in FIG. 4 are block diagrams only and do not constitute a limitation of the computer device on which the present aspects apply, and that a particular computer device may include more or less components than those shown, or may combine some of the components, or have a different arrangement of components.
In some embodiments, there is also provided a computer device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method embodiments described above when the computer program is executed.
In some embodiments, a computer readable storage medium is provided, storing a computer program which, when executed by a processor, implements the steps of the method embodiments described above.
Those skilled in the art will appreciate that implementing all or part of the above-described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which, when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, or the like. Volatile memory can include random access memory (Random Access Memory, RAM) or external cache memory. By way of illustration, and not limitation, RAM can take a variety of forms, such as static random access memory (Static Random Access Memory, SRAM) or dynamic random access memory (Dynamic Random Access Memory, DRAM), and the like.
The technical features of the above embodiments may be arbitrarily combined, and all possible combinations of the technical features in the above embodiments are not described for brevity of description, however, as long as there is no contradiction between the combinations of the technical features, they should be considered as the scope of the description.
The above examples merely express several embodiments of the present application, which are described in greater detail but are not to be construed as limiting the scope of the invention. It should be noted that various modifications and improvements can be made by those skilled in the art without departing from the spirit of the present application, and such modifications and improvements fall within the protection scope of the present application. Accordingly, the scope of protection of the present application shall be subject to the appended claims.
In summary, the present application further provides a method for guiding broadcast, including:
acquiring a global video;
searching a scene matched with the instruction in the global video based on the instruction;
acquiring a local video matched with the scene based on the scene;
transforming the local video to a corresponding block of the global video to obtain a global fusion video matched with the scene;
and playing the global fusion video in a first window, and playing the local video in a second window.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The above-described apparatus embodiments are merely illustrative, for example, the division of the units is merely a logical function division, and there may be other manners of division in actual implementation, and for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some communication interface, device or unit indirect coupling or communication connection, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments provided in the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
It should be noted that: like reference numerals and letters in the following figures denote like items, and thus once an item is defined in one figure, no further definition or explanation of it is required in the following figures, and furthermore, the terms "first," "second," "third," etc. are used merely to distinguish one description from another and are not to be construed as indicating or implying relative importance.
Finally, it should be noted that the foregoing examples are merely specific embodiments of the present application, intended to illustrate rather than limit its technical solutions, and the protection scope of the present application is not limited thereto. Although the present application has been described in detail with reference to the foregoing examples, any person skilled in the art may still modify the technical solutions described in the foregoing embodiments, easily conceive of changes to them, or make equivalent substitutions for some of the technical features within the technical scope disclosed in the present application; such modifications, changes or substitutions do not depart from the spirit and scope of the corresponding technical solutions and are intended to be encompassed within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (8)

1. A guide broadcasting method, characterized by comprising the following steps:
acquiring a global video;
searching a scene matched with an instruction based on the instruction;
acquiring a local video matched with the scene based on the scene;
transforming the local video to a corresponding block of the global video to obtain a global fusion video matched with the scene;
playing the global fusion video in a first window, and playing the local video in a second window;
the step of transforming the local video to the corresponding block of the global video to obtain the global fusion video matched with the scene specifically includes:
a block matching algorithm is adopted to find out the corresponding block of the local video in the global video;
registering the local video and the global video based on the corresponding blocks to obtain a matched global fusion video of the scene;
the obtaining the matched global fusion video of the scene comprises the following steps:
transforming the local video and the global video by adopting integral transformation, grid-based transformation and time and space smoothing transformation to obtain a matched global fusion video of the scene;
wherein the step of integrally transforming comprises:
taking the searched local video and the corresponding block thereof as a pair, and adopting a ZNCC algorithm to extract and match characteristic points of the local video and the corresponding block thereof so as to extract the matched characteristic point pair in the local image and the corresponding block thereof;
performing two ZNCC iterations to extract the characteristic point pairs and calculate a homography matrix, wherein the process of the two iterations can be expressed by adopting the following iteration formula:
$$\mathcal{M}(I_l, I_r) = \Big\{ (p_l, p_r) \;\Big|\; p_r = \mathop{\arg\max}_{\lVert p - \pi(H\tilde{p}_l) \rVert \le \varepsilon} \mathrm{ZNCC}\big(I_l(p_l),\, I_r(p)\big),\; p_l \in [0, w)^2 \Big\},$$

$$H \leftarrow \mathop{\arg\min}_{H} \sum_{(p_l, p_r) \in \mathcal{M}(I_l, I_r)} \big\lVert \pi(H\tilde{p}_l) - p_r \big\rVert^2,$$

wherein $\mathcal{M}(I_l, I_r)$ represents the matching blocks between the local video and its corresponding block; $I_l$ and $I_r$ represent the local video and its corresponding block in the global reference video; $p_l$ and $p_r$ are corresponding feature points of the local video $I_l$ and the corresponding block $I_r$, respectively, i.e., $(p_l, p_r)$ is a feature-point pair; $\mathrm{ZNCC}(\,)$ represents the energy function that scores the local video against its corresponding block with the ZNCC algorithm; $H$ represents the homography matrix and is initialized to the identity matrix; $\tilde{p}_l$ is the homogeneous-coordinate form of $p_l$; $\pi(\,)$ represents the central-projection and de-homogenization function; $w$ is the size of the local video, the local video being square and $w$ being the side length of the square; and $\varepsilon$ is the search width.
2. The method of claim 1, wherein the method further comprises:
acquiring a light field video, sampling the light field video by a set magnification factor to obtain a sampled light field video, and performing Fourier transform on the sampled light field video to obtain a first video;
after the step of transforming the local video to the corresponding block of the global video to obtain the global fusion video matched with the scene, the method further includes:
performing high-pass filtering on the global fusion video to obtain a second video;
performing linear addition on the first video and the second video, and performing a Fourier transform to obtain a third video;
and playing the third video in a third window.
3. The method of claim 1, wherein the first window and the second window are presented simultaneously under the same interface.
4. The method of claim 1, wherein the region to which the local video corresponds is visually identified in the global fusion video.
5. The method of claim 1, wherein the instructions comprise at least one of voice instructions, touch instructions, and gesture instructions.
6. A guide broadcasting device, comprising:
the first acquisition module is used for acquiring a global video;
the searching module is used for searching a scene matched with the instruction based on the instruction;
the second acquisition module acquires a local video matched with the scene based on the scene;
the transformation module is used for transforming the local video to a corresponding block of the global video to obtain a global fusion video matched with the scene; and
the playing module is used for playing the global fusion video in a first window and playing the local video in a second window;
wherein the transformation module is further to:
a block matching algorithm is adopted to find out the corresponding block of the local video in the global video;
registering the local video and the global video based on the corresponding blocks to obtain a matched global fusion video of the scene;
the obtaining the matched global fusion video of the scene comprises the following steps:
transforming the local video and the global video by adopting integral transformation, grid-based transformation and time and space smoothing transformation to obtain a matched global fusion video of the scene;
wherein the step of integrally transforming comprises:
taking the searched local video and the corresponding block thereof as a pair, and adopting a ZNCC algorithm to extract and match characteristic points of the local video and the corresponding block thereof so as to extract the matched characteristic point pair in the local image and the corresponding block thereof;
performing two ZNCC iterations to extract the characteristic point pairs and calculate a homography matrix, wherein the process of the two iterations can be expressed by adopting the following iteration formula:
$$\mathcal{M}(I_l, I_r) = \Big\{ (p_l, p_r) \;\Big|\; p_r = \mathop{\arg\max}_{\lVert p - \pi(H\tilde{p}_l) \rVert \le \varepsilon} \mathrm{ZNCC}\big(I_l(p_l),\, I_r(p)\big),\; p_l \in [0, w)^2 \Big\},$$

$$H \leftarrow \mathop{\arg\min}_{H} \sum_{(p_l, p_r) \in \mathcal{M}(I_l, I_r)} \big\lVert \pi(H\tilde{p}_l) - p_r \big\rVert^2,$$

wherein $\mathcal{M}(I_l, I_r)$ represents the matching blocks between the local video and its corresponding block; $I_l$ and $I_r$ represent the local video and its corresponding block in the global reference video; $p_l$ and $p_r$ are corresponding feature points of the local video $I_l$ and the corresponding block $I_r$, respectively, i.e., $(p_l, p_r)$ is a feature-point pair; $\mathrm{ZNCC}(\,)$ represents the energy function that scores the local video against its corresponding block with the ZNCC algorithm; $H$ represents the homography matrix and is initialized to the identity matrix; $\tilde{p}_l$ is the homogeneous-coordinate form of $p_l$; $\pi(\,)$ represents the central-projection and de-homogenization function; $w$ is the size of the local video, the local video being square and $w$ being the side length of the square; and $\varepsilon$ is the search width.
7. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when executed, implements the steps of the method according to any one of claims 1-5.
8. A computer device comprising a processor, a memory and a computer program stored on the memory, characterized in that the processor implements the steps of the method according to any of claims 1-5 when the computer program is executed.
CN202210826557.5A 2022-07-13 2022-07-13 Guide broadcasting method and device Active CN115103125B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210826557.5A CN115103125B (en) 2022-07-13 2022-07-13 Guide broadcasting method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210826557.5A CN115103125B (en) 2022-07-13 2022-07-13 Guide broadcasting method and device

Publications (2)

Publication Number Publication Date
CN115103125A CN115103125A (en) 2022-09-23
CN115103125B true CN115103125B (en) 2023-05-12

Family

ID=83297324

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210826557.5A Active CN115103125B (en) 2022-07-13 2022-07-13 Guide broadcasting method and device

Country Status (1)

Country Link
CN (1) CN115103125B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110781350A (en) * 2019-09-26 2020-02-11 武汉大学 Pedestrian retrieval method and system oriented to full-picture monitoring scene

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7973834B2 (en) * 2007-09-24 2011-07-05 Jianwen Yang Electro-optical foveated imaging and tracking system
CN107959805B (en) * 2017-12-04 2019-09-13 深圳市未来媒体技术研究院 Light field video imaging system and method for processing video frequency based on Hybrid camera array
CN110086994A (en) * 2019-05-14 2019-08-02 宁夏融媒科技有限公司 A kind of integrated system of the panorama light field based on camera array
CN112367474B (en) * 2021-01-13 2021-04-20 清华大学 Self-adaptive light field imaging method, device and equipment

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110781350A (en) * 2019-09-26 2020-02-11 武汉大学 Pedestrian retrieval method and system oriented to full-picture monitoring scene

Also Published As

Publication number Publication date
CN115103125A (en) 2022-09-23

Similar Documents

Publication Publication Date Title
Chen et al. Camera lens super-resolution
CN107959805B (en) Light field video imaging system and method for processing video frequency based on Hybrid camera array
Yu et al. Towards efficient and scale-robust ultra-high-definition image demoiréing
CN111402139B (en) Image processing method, apparatus, electronic device, and computer-readable storage medium
RU2706891C1 (en) Method of generating a common loss function for training a convolutional neural network for converting an image into an image with drawn parts and a system for converting an image into an image with drawn parts
Wang et al. Dual-camera super-resolution with aligned attention modules
CN112367459B (en) Image processing method, electronic device, and non-volatile computer-readable storage medium
JP7264310B2 (en) Image processing method, apparatus, non-transitory computer readable medium
Liu et al. Exploit camera raw data for video super-resolution via hidden markov model inference
CN109035138B (en) Conference recording method, device, equipment and storage medium
Pang et al. FAN: Frequency aggregation network for real image super-resolution
CN114418853B (en) Image super-resolution optimization method, medium and equipment based on similar image retrieval
DE102018125739A1 (en) Dynamic calibration of multi-camera systems using a variety of Multi View image frames
KR20100065918A (en) A method for geo-tagging of pictures and apparatus thereof
Zhang et al. EDGAN: motion deblurring algorithm based on enhanced generative adversarial networks
CN115103125B (en) Guide broadcasting method and device
CN111818298B (en) High-definition video monitoring system and method based on light field
Schaffland et al. An interactive web application for the creation, organization, and visualization of repeat photographs
CN106558021A (en) Video enhancement method based on super-resolution technique
Peng Super-resolution reconstruction using multiconnection deep residual network combined an improved loss function for single-frame image
CN108427935B (en) Street view comparison image generation method and device
CN115222776A (en) Matching auxiliary visual target tracking method and device, electronic equipment and storage medium
Li et al. MVStylizer: An efficient edge-assisted video photorealistic style transfer system for mobile phones
CN113284127A (en) Image fusion display method and device, computer equipment and storage medium
Jiang et al. Low-resolution and low-quality face super-resolution in monitoring scene via support-driven sparse coding

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant