CN118038098A - Image processing method, device, equipment, storage medium and program product


Info

Publication number
CN118038098A
CN118038098A
Authority
CN
China
Prior art keywords
image
scene
processed
target
basal plane
Prior art date
Legal status
Pending
Application number
CN202410431365.3A
Other languages
Chinese (zh)
Inventor
林愉欢
李嘉麟
陈颖
聂强
付威福
周逸峰
陶光品
刘永
汪铖杰
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202410431365.3A
Publication of CN118038098A
Legal status: Pending


Landscapes

  • Image Processing (AREA)

Abstract

The application provides an image processing method, apparatus, device, storage medium and program product, which can be applied to various image processing scenarios such as cloud technology, artificial intelligence, robotics, virtual reality, games and automatic driving. The image processing method comprises the following steps: in response to a scene acquisition request, controlling a multi-view acquisition device to perform image acquisition on a scene to be processed to obtain a scene image to be processed, wherein the multi-view acquisition device comprises a plurality of image acquisition devices; acquiring a basal plane image of a scene basal plane, wherein the basal plane image is obtained by performing image acquisition on the scene basal plane with the multi-view acquisition device, and the scene to be processed comprises the scene basal plane; acquiring an image difference between the basal plane image and the scene image to be processed; and determining the image difference as an image processing result of the scene image to be processed, wherein the image processing result represents scene imaging information other than the scene basal plane in the scene to be processed. The application can improve image processing efficiency.

Description

Image processing method, device, equipment, storage medium and program product
Technical Field
The present application relates to image processing technology in the field of computer applications, and in particular, to an image processing method, apparatus, device, storage medium, and program product.
Background
A scene basal plane is the base surface of a scene that carries scene objects, such as the ground in an autonomous driving scene or a tabletop in a desktop cleaning scene. To identify scene information, the scene basal plane usually needs to be filtered out of the application scene; in the related art, however, the scene basal plane is typically filtered in point cloud space, which reduces image processing efficiency.
Disclosure of Invention
The embodiment of the application provides an image processing method, an image processing device, electronic equipment, a computer readable storage medium and a computer program product, which can improve the image processing efficiency.
The technical scheme of the embodiment of the application is realized as follows:
The embodiment of the application provides an image processing method, which comprises the following steps:
in response to a scene acquisition request, controlling a multi-view acquisition device to perform image acquisition on a scene to be processed to obtain a scene image to be processed, wherein the multi-view acquisition device comprises a plurality of image acquisition devices;
acquiring a basal plane image of a scene basal plane, wherein the basal plane image is obtained by performing image acquisition on the scene basal plane with the multi-view acquisition device, and the scene to be processed comprises the scene basal plane;
acquiring an image difference between the basal plane image and the scene image to be processed; and
determining the image difference as an image processing result of the scene image to be processed, wherein the image processing result represents scene imaging information other than the scene basal plane in the scene to be processed.
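For ease of understanding only, the following is a minimal Python sketch of the above steps under the assumption that both the scene image to be processed and the basal plane image are depth images of identical size; the array names and the tolerance value are hypothetical assumptions, and the sketch is not a definitive implementation of the claimed method.

```python
import numpy as np

def filter_base_plane(scene_depth: np.ndarray, base_depth: np.ndarray, eps: float = 0.01) -> np.ndarray:
    """Keep only scene imaging information other than the scene basal plane.

    scene_depth: depth image of the scene to be processed (H x W)
    base_depth:  pre-acquired depth image of the scene basal plane (H x W)
    eps:         hypothetical tolerance for depth noise
    """
    # Image difference: for depth images, basal plane image minus scene image
    # is positive wherever something stands in front of (above) the basal plane.
    diff = base_depth - scene_depth
    # Pixels whose difference exceeds the tolerance are not basal plane pixels.
    mask = diff > eps
    return np.where(mask, scene_depth, 0.0)
```

Because the basal plane image can be acquired and stored in advance, the per-request processing reduces to an image subtraction, which is the source of the efficiency gain described below.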
An embodiment of the present application provides an image processing apparatus including:
the request response module is used for responding to a scene acquisition request and controlling the multi-view acquisition equipment to acquire images of a scene to be processed to obtain images of the scene to be processed, wherein the multi-view acquisition equipment comprises a plurality of image acquisition equipment;
The image acquisition module is used for acquiring a basal plane image of a scene basal plane, the basal plane image is obtained by carrying out image acquisition on the scene basal plane through the multi-view acquisition equipment, and the scene to be processed comprises the scene basal plane;
The image processing module is used for acquiring an image difference between the basal plane image and the scene image to be processed;
And the result acquisition module is used for determining the image difference as an image processing result of the image of the scene to be processed, wherein the image processing result represents scene imaging information except for the scene basal plane in the scene to be processed.
In an embodiment of the present application, the image acquisition module is further configured to read a pre-stored basal plane image of the scene basal plane, where the basal plane image is obtained by performing image acquisition on the scene basal plane with the multi-view acquisition device before the scene acquisition request is responded to.
In an embodiment of the present application, the image processing apparatus further includes an information update module, configured to acquire a basal plane updated image in response to an image update request for the basal plane image triggered by an update event, where the update event includes at least one of the following: an update period arrives, a normal vector deviation value of the basal plane is larger than a deviation threshold value, and the position of the image acquisition device changes; acquire a next scene image to be processed in response to a next scene acquisition request; and determine the difference between the basal plane updated image and the next scene image to be processed as an image to be applied of the next scene image to be processed.
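As an illustrative sketch only, the update events listed above might be checked as follows; the parameter names are hypothetical assumptions and not terms of the embodiment.

```python
import time

def base_plane_needs_update(last_update_ts: float,
                            update_period_s: float,
                            normal_deviation: float,
                            deviation_threshold: float,
                            device_position_changed: bool) -> bool:
    """Return True when any of the update events listed above has occurred."""
    period_elapsed = (time.time() - last_update_ts) >= update_period_s
    normal_drifted = normal_deviation > deviation_threshold
    return period_elapsed or normal_drifted or device_position_changed
```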
In the embodiment of the application, the image acquisition module is further configured to sample position points of the scene basal plane to obtain a target sampling point; project the target sampling point onto a target imaging plane to obtain a target projection point, wherein the target imaging plane is the imaging plane of a target image acquisition device of the multi-view acquisition device; determine a target physical focal length of the target image acquisition device based on the target projection point; and perform light projection based on the target physical focal length to obtain the basal plane image.
In the embodiment of the application, the image processing apparatus further comprises a data calibration module, configured to control the multi-view acquisition device to perform image acquisition on a calibration object arranged on the scene basal plane to obtain a first image to be processed; extract feature points from the first image to be processed to obtain a plurality of first feature points; and determine the spatial relative relationship between the scene basal plane and the target image acquisition device based on the plurality of first feature points and the calibration object.
In the embodiment of the present application, the image acquisition module is further configured to project the target sampling point to the target imaging plane based on the spatial relative relationship, so as to obtain the target projection point.
In the embodiment of the application, the image acquisition module is further used for acquiring the equipment position point of the target projection point corresponding to the target image acquisition equipment based on the target internal parameter of the target image acquisition equipment; determining a physical imaging point of the target projection point on a physical imaging plane of the target image acquisition equipment based on the target internal parameter; and calculating the target physical focal length of the target image acquisition device by combining the device position point and the physical imaging point.
In the embodiment of the present application, the data calibration module is further configured to execute the following processing for each of a plurality of placement modes of the calibration object: controlling the multi-view acquisition device to perform image acquisition on the calibration object in the placement mode to obtain a second image to be processed; extracting feature points from the second image to be processed to obtain a plurality of second feature points; and matching the plurality of second feature points with the calibration object to obtain a feature point matching result; and to determine device parameters of the multi-view acquisition device based on the plurality of feature point matching results corresponding to the plurality of placement modes, wherein the device parameters comprise the target internal parameters.
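By way of a hedged example, if the calibration object is assumed to be a chessboard pattern and OpenCV is assumed to be available, the per-placement feature point extraction and matching and the estimation of the device parameters could look like the sketch below; this is only one common calibration routine under those assumptions, not the procedure prescribed by the embodiment, and the board size and square size are hypothetical.

```python
import cv2
import numpy as np

def calibrate_from_placements(images, board_size=(9, 6), square_size=0.025):
    """Estimate device parameters from several placement modes of a chessboard-style calibration object."""
    # 3-D corner coordinates of the calibration object in its own coordinate system.
    objp = np.zeros((board_size[0] * board_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:board_size[0], 0:board_size[1]].T.reshape(-1, 2) * square_size

    obj_points, img_points, image_size = [], [], None
    for img in images:                                  # one second image to be processed per placement mode
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        image_size = gray.shape[::-1]
        found, corners = cv2.findChessboardCorners(gray, board_size)
        if found:                                       # feature point matching result for this placement
            obj_points.append(objp)
            img_points.append(corners)

    # Device parameters: camera matrix K (contains the target internal parameters),
    # distortion coefficients and per-placement extrinsics.
    _, K, dist, rvecs, tvecs = cv2.calibrateCamera(obj_points, img_points, image_size, None, None)
    return K, dist, rvecs, tvecs
```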
In the embodiment of the application, the image acquisition module is further configured to determine a physical imaging plane of the target image acquisition device based on the target physical focal length; determine a region to be imaged corresponding to a specified image size on the physical imaging plane; perform the following processing for each pixel point to be imaged in the region to be imaged: projecting a light ray from the target optical center of the target image acquisition device to the pixel point to be imaged to obtain an intersection pixel value of the projected light ray and the scene basal plane, wherein the intersection pixel value is either a depth value or a disparity value; obtain an intersection pixel value array corresponding to the region to be imaged from the intersection pixel values of the pixel points to be imaged; and determine the intersection pixel value array as the basal plane image.
In the embodiment of the present application, the image type of the scene image to be processed and the image type of the basal plane image are respectively any one of the following: a depth image type, a parallax image type, the depth image type representing a pixel value of an image as a depth value, the parallax image type representing a pixel value of an image as a parallax value.
In the embodiment of the present application, the image processing module is further configured to determine, when an image type of the to-be-processed scene image and an image type of the basal plane image are the same, the image types of the to-be-processed scene image and the basal plane image as a target image type; and calculating the difference value between the basal plane image and the scene image to be processed based on the target image type to obtain the image difference.
In the embodiment of the present application, the image processing module is further configured to calculate a first difference value obtained by subtracting the to-be-processed scene image from the basal plane image when the target image type is a depth image type; the first difference is determined as the image difference.
In the embodiment of the present application, the image processing module is further configured to calculate a second difference value obtained by subtracting the basal plane image from the scene image to be processed when the target image type is a parallax image type; and determining the second difference value as the image difference.
In the embodiment of the present application, the image processing module is further configured to, when the image type of the scene image to be processed and the image type of the basal plane image are different, determine a reference image type from the image type of the scene image to be processed and the image type of the basal plane image; take, of the scene image to be processed and the basal plane image, the one whose image type is the same as the reference image type as a reference type image; take the one whose image type is different from the reference image type as an image to be converted; convert the image to be converted into an image of the reference image type to obtain an image to be calculated; and calculate the difference between the image to be calculated and the reference type image to obtain the image difference.
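A minimal sketch of the image difference computation of the above embodiments is given below, assuming that conversion between the depth image type and the parallax image type follows the usual stereo relation depth = focal length x baseline / parallax; the parameter values are illustrative assumptions.

```python
import numpy as np

def image_difference(scene_img, base_img, scene_type, base_type,
                     focal_px=700.0, baseline_m=0.06, ref_type="depth"):
    """Compute the image difference between the basal plane image and the scene image to be processed.

    scene_type / base_type: "depth" or "disparity".
    focal_px, baseline_m: illustrative stereo parameters, used only when a
    conversion between depth and disparity is needed (depth = f * B / disparity).
    """
    def to_type(img, src):
        if src == ref_type:
            return img
        # depth -> disparity and disparity -> depth share the same relation.
        with np.errstate(divide="ignore"):
            converted = focal_px * baseline_m / img
        return np.where(img > 0, converted, 0.0)

    scene = to_type(np.asarray(scene_img, dtype=float), scene_type)
    base = to_type(np.asarray(base_img, dtype=float), base_type)

    if ref_type == "depth":
        return base - scene        # first difference: basal plane image minus scene image
    return scene - base            # second difference: scene image minus basal plane image
```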
In an embodiment of the present application, the image processing apparatus further includes a result application module, configured to determine obstacle information of the scene to be processed based on the image processing result; determine information to be moved of the application device on which the multi-view acquisition device is mounted based on the obstacle information; and control the application device to move in the scene to be processed based on the information to be moved.
An embodiment of the present application provides an electronic device for image processing, including:
A memory for storing computer executable instructions or computer programs;
And the processor is used for realizing the image processing method provided by the embodiment of the application when executing the computer executable instructions or the computer programs stored in the memory.
The embodiment of the application provides a computer readable storage medium, which stores computer executable instructions or a computer program, wherein the computer executable instructions or the computer program are used for realizing the image processing method provided by the embodiment of the application when being executed by a processor.
The embodiment of the application provides a computer program product, which comprises computer executable instructions or a computer program, wherein the computer executable instructions or the computer program realize the image processing method provided by the embodiment of the application when being executed by a processor.
The embodiment of the application has at least the following beneficial effects: when a scene image to be processed is acquired in response to a scene acquisition request, the filtering of the scene basal plane from the scene to be processed is completed by acquiring the basal plane image of the scene basal plane and obtaining the image difference between the basal plane image and the scene image to be processed, which improves the filtering efficiency of the scene basal plane and thereby the image processing efficiency.
Drawings
FIG. 1 is a schematic architecture diagram of an image processing system according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of the terminal in FIG. 1 according to an embodiment of the present application;
FIG. 3 is a first flowchart of an image processing method according to an embodiment of the present application;
FIG. 4 is a second flowchart of an image processing method according to an embodiment of the present application;
FIG. 5 is a schematic flowchart of acquiring a target physical focal length according to an embodiment of the present application;
FIG. 6 is a third flowchart of an image processing method according to an embodiment of the present application;
FIG. 7 is a fourth flowchart of an image processing method according to an embodiment of the present application;
FIG. 8 is an imaging schematic of an exemplary binocular system according to an embodiment of the present application;
FIG. 9 is an imaging schematic of another exemplary binocular system according to an embodiment of the present application;
FIG. 10 is a flowchart of exemplary filtering of ground points according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings. The described embodiments should not be construed as limiting the present application, and all other embodiments obtained by those skilled in the art without inventive effort fall within the scope of protection of the present application.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is to be understood that "some embodiments" can be the same subset or different subsets of all possible embodiments and can be combined with one another without conflict.
In the following description, the terms "first", "second", and the like are used to distinguish between similar objects and do not represent a particular ordering of the objects, it being understood that the "first", "second", or the like may be interchanged with one another, if permitted, to enable embodiments of the application described herein to be implemented in an order other than that illustrated or described herein.
In the present embodiment, the term "module" or "unit" refers to a computer program or a part of a computer program having a predetermined function and working together with other relevant parts to achieve a predetermined object, and may be implemented in whole or in part by using software, hardware (such as a processing circuit or a memory), or a combination thereof. Also, a processor (or multiple processors or memories) may be used to implement one or more modules or units. Furthermore, each module or unit may be part of an overall module or unit that incorporates the functionality of the module or unit.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used in the embodiments of the application is for the purpose of describing embodiments of the application only and is not intended to be limiting of the application.
In the embodiments of the application, when the examples are applied, the collection and processing of relevant data should strictly comply with the requirements of relevant national laws and regulations, the informed consent or separate consent of the personal information subject should be obtained, and subsequent data use and processing should be carried out within the scope authorized by the laws, regulations and the personal information subject.
Before describing embodiments of the present application in further detail, the terms and terminology involved in the embodiments of the present application will be described, and the terms and terminology involved in the embodiments of the present application will be used in the following explanation.
1) Artificial intelligence (Artificial Intelligence, AI) is a theory, method, technique, and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend, and expand human intelligence, sense the environment, acquire knowledge, and use knowledge to obtain optimal results. That is, artificial intelligence is an integrated technology of computer science that seeks to understand the essence of intelligence and produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines so that the machines have the functions of perception, reasoning and decision-making.
It should be noted that artificial intelligence technology is a comprehensive discipline that covers a wide range of fields and involves both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include, for example, sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, pre-training model technologies, operation/interaction systems and mechatronics. The pre-training model is also called a large model or foundation model; after fine-tuning, a pre-trained model can be widely applied to downstream tasks in all directions of artificial intelligence. Artificial intelligence software technologies mainly include computer vision technology, speech processing technology, natural language processing technology and machine learning/deep learning. In the embodiment of the application, the movement of the application device in the scene to be processed can be realized by artificial intelligence technology.
2) Machine learning (Machine Learning, ML) is a multi-domain interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other disciplines. It studies how computers can simulate or implement human learning behaviors to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve their performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied throughout the various fields of artificial intelligence. Machine learning/deep learning generally comprises techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning and inductive learning; large models are the latest development of machine learning/deep learning and integrate these techniques. In the embodiment of the application, the movement of the application device in the scene to be processed can be realized in combination with machine learning/deep learning.
3) The internal parameters of an image acquisition device, referred to simply as internal parameters, are internal characteristics of the image acquisition device, including the focal length, principal point position, distortion and the like. The focal length is the distance from the optical center to the imaging plane and represents the focusing power of the image acquisition device; the principal point position is the point in the imaging plane corresponding to the optical center of the image acquisition device; distortion refers to the degree of image deformation caused by factors such as the lens of the image acquisition device. The internal parameters are used to determine the imaging geometry and distortion of the image acquisition device. When the image acquisition device is a camera, the internal parameters of the image acquisition device are the camera internal parameters.
4) The external parameters of an image acquisition device, referred to simply as external parameters, are the position and pose of the image acquisition device, where the position is the location of the image acquisition device in the world coordinate system and the pose is its orientation in the world coordinate system. The external parameters include a translation vector and a rotation matrix (or Euler angles) of the image acquisition device, which map the image acquisition device coordinate system to the world coordinate system. When the image acquisition device is a camera, the external parameters of the image acquisition device are the camera external parameters.
5) The world coordinate system refers to a real world coordinate system, and an origin is a point of the real world and is usually determined based on an application field; for example, in the field of robotic applications, the base of the robot may be the world coordinate system origin.
6) The coordinate system of the image acquisition equipment takes the optical center of the image acquisition equipment as an origin and takes the optical axis of the image acquisition equipment as a Z axis; the imaging plane of the image acquisition device is parallel to the X axis and the Y axis of the coordinate system of the image acquisition device and is perpendicular to the Z axis of the coordinate system of the image acquisition device; the image acquisition device coordinate system is used to describe geometrical relationships inside the image acquisition device, such as focal length, principal point position, etc. When the image acquisition device is a camera, the image acquisition device coordinate system is a camera coordinate system.
7) Calibrating an image acquisition device means determining the internal parameters and external parameters of the image acquisition device; after the internal and external parameters are obtained through calibration, the correspondence between the world coordinate system and the image coordinate system is obtained. When the image acquisition device is a camera, calibration of the image acquisition device is camera calibration.
8) An image coordinate system, also called UV coordinate system, is a two-dimensional coordinate system on the imaging plane for describing the pixel locations in the image; the origin of the image coordinate system is the upper left corner of the image, in which the U-axis represents the horizontal direction on the image plane and the V-axis represents the vertical direction on the image plane, wherein the upper left corner, the horizontal direction and the vertical direction are determined with reference to the image presentation angle of view. The image coordinate system may be used in image processing and computer graphics for locating and processing pixels in an image.
9) Scene basal plane calibration is the process of determining the geometric relationship between the sensor of the image acquisition device and the basal plane, and is used to establish the correspondence between image data and the real-world scene basal plane coordinate system. In scene basal plane calibration, a calibration object (e.g., a calibration plate) is typically used to capture image data, and the spatial relative relationship between the sensor of the image acquisition device and the basal plane can be calculated by analyzing the captured data. When the scene basal plane is the ground, scene basal plane calibration is ground calibration.
10) Imaging plane, a plane used for imaging in computer vision; it is a virtual plane used in a computer to represent the two-dimensional matrix of an image, and the two-dimensional matrix stores the pixel values of the image.
11) Self-calibration, the process of automatically adjusting and calibrating the internal and external parameters; self-calibration can improve the imaging accuracy of the image acquisition device.
12) Binocular system (Binocular Stereo Vision), a method that uses two image acquisition devices to capture two images of an object to be imaged from different positions and acquires three-dimensional geometric information of the object by calculating the positional deviation (parallax) between corresponding points of the two images. A binocular system is an example of the multi-view acquisition device in the embodiments of the present application.
13) Parallax (also called disparity value): in a binocular system, after epipolar rectification, corresponding points of the left-eye and right-eye images lie on the same line (called the epipolar line). When the horizontal image coordinates of the imaging points of an object point in the left and right eyes are l and r respectively, the parallax of that point is l-r.
14) Parallax image (Disparity Map), an image whose size coincides with a target view (the view of any image acquisition device in a multi-view system) and whose pixel values represent the parallax values of the corresponding pixel points. By fusing the images obtained by the image acquisition devices, the depth characteristics of the object to be imaged can be obtained, and the correspondence between features is established to obtain the parallax image of the object to be imaged.
15) Baseline, which in a binocular system refers to the distance between the optical centers of the left and right image acquisition devices.
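For illustration only, assuming an epipolar-rectified binocular system with a pinhole model, the terms 13) to 15) are related by depth = focal length x baseline / parallax; a tiny numeric sketch with hypothetical values:

```python
# Illustrative relation between parallax, baseline and depth in a rectified binocular system:
# depth = focal_length_px * baseline / parallax.
focal_length_px = 700.0   # hypothetical focal length in pixels
baseline_m = 0.06         # hypothetical baseline in meters
l, r = 412.0, 398.0       # horizontal coordinates of the same point in the left and right images
parallax = l - r          # 14.0 pixels
depth_m = focal_length_px * baseline_m / parallax   # 3.0 meters
```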
It should be noted that the scene basal plane is the base surface that carries scene objects in a scene, for example the ground in an autonomous driving scene or a tabletop in a desktop cleaning scene. To identify scene information, the scene basal plane usually needs to be filtered out of the application scene; in the related art, however, the scene basal plane is typically filtered in point cloud space, which affects image processing efficiency.
Illustratively, the filtering of the scene basal plane is described with the scene basal plane being the ground. To filter the ground, a random sample consensus (RANSAC) method, a height threshold-based method, a normal vector-based method or a clustering-based method may be employed.
The random sample consensus method is an iterative algorithm used to fit a model and identify ground points in a point cloud: a ground model is fitted by randomly selecting a group of points, the distances of the remaining points to the ground model are calculated, and non-ground points are removed according to a distance threshold to obtain the ground point cloud. Although RANSAC is applicable to various model fitting problems, including ground fitting, and can find a best-fit model through an iterative process, its computational complexity is higher than a complexity threshold and multiple iterations are needed to obtain the best-fit model. In addition, when the ground point density is lower than a density threshold or outliers exist, the accuracy of the RANSAC fitting result is affected, which in turn affects the filtering accuracy.
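For concreteness only, a minimal sketch of the related-art RANSAC ground fitting described above is given below (it is not the method of the embodiments of the present application); the point cloud format, thresholds and iteration count are assumptions.

```python
import numpy as np

def ransac_ground(points: np.ndarray, dist_thresh: float = 0.02, iters: int = 200, seed: int = 0):
    """Fit a ground plane to an (N, 3) point cloud and split ground / non-ground points."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-9:
            continue                                   # degenerate (collinear) sample
        normal /= norm
        d = -normal @ sample[0]
        dist = np.abs(points @ normal + d)             # point-to-plane distances
        inliers = dist < dist_thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return points[best_inliers], points[~best_inliers]  # ground points, non-ground points
```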
The height threshold-based method relies on the height information of ground points: a height threshold is set, and points below the threshold are regarded as ground points. Although this method is simple and intuitive and easy to implement and tune, it is only suitable for flat ground and can only handle the case where the line connecting the camera optical center and the center of the imaging plane is parallel to the ground, which limits scene applicability.
The normal vector-based method calculates the normal vector of each point in the point cloud and distinguishes ground points from non-ground points according to the characteristics of the normal vectors. When the difference between ground and non-ground is larger than a difference threshold, this method can separate them using the geometric information of the point cloud; when the difference is smaller than the threshold, however, misjudgments often occur, for example small obstacles on the ground or noise points on the ground are misjudged as non-ground points. In addition, computing the normal vectors increases the consumption of computing resources.
The clustering-based method clusters the point cloud so that adjacent ground points are grouped into the same cluster. Clustering algorithms, such as density-based spatial clustering, can be used for ground point segmentation and filtering, but clustering-based methods limit the applicability and accuracy of ground point filtering.
In addition, the random sample consensus method, the height threshold-based method, the normal vector-based method and the clustering-based method all filter ground points in point cloud space, so the image must first be converted into a point cloud and then processed, which increases the computational complexity.
Based on the above, embodiments of the present application provide an image processing method, apparatus, electronic device, computer readable storage medium and computer program product, which can improve the efficiency, accuracy and application range of scene basal plane filtering and improve image processing efficiency. An exemplary application of the electronic device for image processing (hereinafter referred to simply as an image processing device) provided by the embodiment of the present application is described below. The image processing device provided by the embodiment of the present application may be implemented as various types of terminals such as a robot, a smart phone, a smart watch, a notebook computer, a tablet computer, a desktop computer, a smart home appliance, a set top box, a smart vehicle-mounted device, a portable music player, a personal digital assistant, a dedicated messaging device, a smart voice interaction device, a portable game device or a smart sound box, or may be implemented as a server. Next, an exemplary application in which the image processing device is implemented as a terminal is described.
Referring to FIG. 1, FIG. 1 is a schematic architecture diagram of an image processing system according to an embodiment of the present application. As shown in FIG. 1, in order to support an image processing application, in the image processing system 100 a terminal 400 (terminals 400-1 and 400-2 are exemplarily shown) is connected to a server 200 through a network 300; the network 300 may be a wide area network or a local area network, or a combination of both; the server 200 is used to provide computing services to the terminal 400 through the network 300. In addition, the image processing system 100 further includes a database 500 for providing data support to the server 200; the database 500 is shown in FIG. 1 as being independent of the server 200, but it may also be integrated into the server 200, which is not limited in the embodiment of the present application.
The terminal 400 is configured to control a multi-view acquisition device to perform image acquisition on a scene to be processed in response to a scene acquisition request, so as to obtain a scene image to be processed, where the multi-view acquisition device includes a plurality of image acquisition devices; acquire a basal plane image of a scene basal plane, where the basal plane image is obtained by performing image acquisition on the scene basal plane with the multi-view acquisition device and the scene to be processed includes the scene basal plane; acquire an image difference between the basal plane image and the scene image to be processed; and determine the image difference as an image processing result of the scene image to be processed, where the image processing result represents scene imaging information other than the scene basal plane in the scene to be processed, and perform applications based on the image processing result (application examples of vehicle automatic driving and movement of a sweeping robot are exemplarily shown).
In some embodiments, the server 200 may be a stand-alone physical server, a server cluster or a distributed system formed by a plurality of physical servers, or may be a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a content delivery network (Content Delivery Network, CDN), and basic cloud computing services such as big data and artificial intelligence platforms. The terminal and the server may be directly or indirectly connected through wired or wireless communication, which is not limited in the embodiment of the present application.
Referring to fig. 2, fig. 2 is a schematic structural diagram of the terminal in fig. 1 according to an embodiment of the present application; as shown in fig. 2, the terminal 400 includes: at least one processor 410, a memory 450, at least one network interface 420, and a user interface 430. The various components in terminal 400 are coupled together by a bus system 440. It is understood that the bus system 440 is used to enable connected communication between these components. The bus system 440 includes a power bus, a control bus, and a status signal bus in addition to the data bus. But for clarity of illustration the various buses are labeled in fig. 2 as bus system 440.
The processor 410 may be an integrated circuit chip having signal processing capabilities, such as a general purpose processor, a digital signal processor (Digital Signal Processor, DSP), another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, where the general purpose processor may be a microprocessor or any conventional processor.
The user interface 430 includes one or more output devices 431, including one or more speakers and/or one or more visual displays, that enable presentation of the media content. The user interface 430 also includes one or more input devices 432, including user interface components that facilitate user input, such as a keyboard, mouse, microphone, touch screen display, camera, other input buttons and controls.
Memory 450 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard drives, optical drives, and the like. Memory 450 optionally includes one or more storage devices physically remote from processor 410.
Memory 450 includes volatile memory or nonvolatile memory, and may also include both volatile and nonvolatile memory. The non-volatile Memory may be a Read Only Memory (ROM) and the volatile Memory may be a random access Memory (Random Access Memory, RAM). The memory 450 described in embodiments of the present application is intended to comprise any suitable type of memory.
In some embodiments, memory 450 is capable of storing data to support various operations, examples of which include programs, modules and data structures, or subsets or supersets thereof, as exemplified below.
An operating system 451 including system programs, e.g., framework layer, core library layer, driver layer, etc., for handling various basic system services and performing hardware-related tasks, for implementing various basic services and handling hardware-based tasks;
A network communication module 452 for accessing other electronic devices via one or more (wired or wireless) network interfaces 420, the exemplary network interface 420 comprising: bluetooth, wireless compatibility authentication (Wi-Fi), and universal serial bus (Universal Serial Bus, USB), etc.;
A presentation module 453 for enabling presentation of information (e.g., a user interface for operating peripheral devices and displaying content and information) via one or more output devices 431 (e.g., a display screen, speakers, etc.) associated with the user interface 430;
An input processing module 454 for detecting one or more user inputs or interactions from one of the one or more input devices 432 and translating the detected inputs or interactions.
In some embodiments, the image processing apparatus provided in the embodiments of the present application may be implemented in software, and fig. 2 shows the image processing apparatus 455 stored in the memory 450, which may be software in the form of a program, a plug-in, or the like, including the following software modules: the request response module 4551, the image acquisition module 4552, the image processing module 4553, the result acquisition module 4554, the information update module 4555, the data calibration module 4556 and the result application module 4557 are logical, and thus may be arbitrarily combined or further split according to the implemented functions. The functions of the respective modules will be described hereinafter.
In some embodiments, the image processing apparatus provided in the embodiments of the present application may be implemented in hardware. As an example, the image processing apparatus provided in the embodiments of the present application may be a processor in the form of a hardware decoding processor, which is programmed to perform the image processing method provided in the embodiments of the present application; for example, the processor in the form of a hardware decoding processor may use one or more application-specific integrated circuits (ASICs), DSPs, programmable logic devices (Programmable Logic Device, PLD), complex programmable logic devices (Complex Programmable Logic Device, CPLD), field-programmable gate arrays (Field-Programmable Gate Array, FPGA) or other electronic components.
In some embodiments, the terminal or the server may implement the image processing method provided by the embodiment of the present application by running various computer executable instructions or computer programs. For example, the computer-executable instructions may be commands at the micro-program level, machine instructions or software instructions. The computer program may be a native program or a software module in an operating system; a native application (APP), i.e. a program that needs to be installed in the operating system to run, such as an autopilot APP or a sweeping APP; or an applet that can be embedded in any APP, i.e. a program that only needs to be downloaded into a browser environment to run. In general, the computer-executable instructions may be any form of instructions and the computer program may be any form of application, module or plug-in.
Next, an image processing method provided by the embodiment of the present application will be described in connection with exemplary applications and implementations of the image processing apparatus provided by the embodiment of the present application. In addition, the image processing method provided by the embodiment of the application is applied to various image processing scenes such as cloud technology, artificial intelligence, robots, virtual reality, games, automatic driving and the like.
Referring to FIG. 3, FIG. 3 is a first flowchart of an image processing method according to an embodiment of the present application, where the execution subject of each step in FIG. 3 is the image processing device; the steps shown in FIG. 3 are described below.
Step 101, in response to a scene acquisition request, control the multi-view acquisition device to perform image acquisition on the scene to be processed to obtain a scene image to be processed.
In the embodiment of the application, when the information in the scene to be processed is processed, the image processing equipment also receives a scene acquisition request; at this time, the image processing device responds to the scene acquisition request and controls the multi-view acquisition device to acquire images of the scene to be processed, and the acquired image acquisition result is the image of the scene to be processed.
It should be noted that the scene acquisition request is used to request image acquisition of the scene to be processed and is generated when processing of information in the scene to be processed is requested; for example, when identification of obstacle information in the scene to be processed is requested, or when identification of information to be augmented with virtual reality in the scene to be processed is requested, and so on. The scene to be processed is a scene whose information is to be processed, such as a scene to be cleaned, a scene to be driven automatically, or a scene to be augmented with virtual reality. The multi-view acquisition device comprises a plurality of image acquisition devices whose relative positions are fixed, and it uses the plurality of image acquisition devices to perform image acquisition on the same object to be acquired. The image acquisition device is a device for acquiring images of objects to be acquired, such as a camera or a video camera; in addition, the plurality of image acquisition devices may be the same device, different devices, a combination of both, and so on, which is not limited in the embodiment of the present application. The scene image to be processed is imaging information of the scene to be processed, and may be a depth image, a parallax image, a combination of the two, and so on, which is not limited in the embodiment of the present application; the pixel values of a depth image are depth values, and the pixel values of a parallax image are parallax values.
In the embodiment of the application, the image processing device responds to a scene acquisition request and controls the multi-view acquisition device to acquire images of a scene to be processed to obtain images of the scene to be processed, and the method comprises the following steps: the image processing equipment responds to the scene acquisition request, controls the multi-view acquisition equipment to acquire images of the scene to be processed, obtains a plurality of initial scene images corresponding to the image acquisition equipment one by one, and integrates the initial scene images into the scene image to be processed based on parameters among the image acquisition equipment. The initial scene image refers to a result obtained by each image acquisition device performing image acquisition on a scene to be processed.
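As one possible, non-prescriptive example of integrating the initial scene images of a binocular device into a parallax-type scene image to be processed, OpenCV stereo matching could be used as sketched below; the matcher parameters are illustrative assumptions and the images are assumed to be rectified grayscale images.

```python
import cv2

def disparity_from_pair(left_gray, right_gray):
    """Fuse two rectified initial scene images into a parallax-type scene image to be processed."""
    matcher = cv2.StereoSGBM_create(minDisparity=0,
                                    numDisparities=128,   # must be a multiple of 16
                                    blockSize=5)
    # OpenCV returns fixed-point disparities scaled by 16.
    return matcher.compute(left_gray, right_gray).astype("float32") / 16.0
```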
Step 102, acquiring a basal plane image of the basal plane of the scene.
In the embodiment of the application, the image processing device can acquire the basal plane image of the basal plane of the scene in real time when acquiring the basal plane image of the basal plane of the scene, and can also read the basal plane image of the basal plane of the scene acquired in advance, and the embodiment of the application is not limited to the above.
The basal plane image is obtained by performing image acquisition on the scene basal plane with the multi-view acquisition device and is the imaging information of the scene basal plane; the scene basal plane is the base surface that carries scene information in the scene to be processed, such as the ground, a tabletop or a water surface; the scene to be processed includes the scene basal plane. In addition, the basal plane image may be a depth image, a parallax image, a combination of both, or the like, which is not limited in the embodiment of the present application.
In an embodiment of the present application, when an image processing apparatus obtains a basal plane image of a scene basal plane by reading information acquired in advance, the image processing apparatus obtains the basal plane image of the scene basal plane, including: the image processing device reads a pre-stored basal plane image of the basal plane of the scene.
It should be noted that the basal plane image is obtained by performing image acquisition on the scene basal plane with the multi-view acquisition device before the scene acquisition request is responded to, and the image processing device stores the basal plane image acquired before the scene acquisition request is responded to so that the pre-stored basal plane image can be read when responding to the scene acquisition request.
It can be understood that by pre-storing the basal plane image of the scene to be processed, the information processing of the scene to be processed can be realized based on the basal plane image and the scene image to be processed by reading the pre-stored basal plane image; and the information processing efficiency of the scene to be processed is improved.
It can be further understood that, in the embodiment of the application, before responding to a scene acquisition request, the basal plane image is acquired in advance, so that when the image processing device performs basal plane filtering on the scene image to be processed in response to the scene acquisition request, the basal plane filtering can be directly realized based on the basal plane image acquired in advance; on the one hand, the method reduces the acquisition time of the basal plane image, and can be further suitable for application scenes with high real-time requirements (the real-time consumption is smaller than a specified time consumption threshold), such as a sweeping robot sweeping scene, an automatic driving scene, a virtual reality augmented scene and the like; on the other hand, the resource consumption of the image processing device is reduced, and the method is further suitable for devices with lower computing resources or computing power (the computing resources or computing power are lower than the computing index threshold value), such as household sweeping robots, automatic driving vehicles, virtual reality enhancing devices and the like; in summary, the image processing method provided by the embodiment of the application can promote the universality of the image processing application scene.
In the embodiment of the present application, when the image processing device acquires the basal plane image of the scene basal plane through real-time acquisition, reference is made to FIG. 4, which is a second flowchart of the image processing method provided in the embodiment of the present application, where the execution subject of each step in FIG. 4 is the image processing device; as shown in FIG. 4, step 102 may be implemented through steps 1021 to 1024, that is, the image processing device acquires the basal plane image of the scene basal plane through steps 1021 to 1024, which are described below.
Step 1021, sample position points of the scene basal plane to obtain a target sampling point.
In the embodiment of the application, the image processing device models the scene basal plane, and the obtained modeling result is called a basal plane model; the basal plane model is used to determine the position information of the position points of the scene basal plane in the world coordinate system. For example, when the scene basal plane is the ground, the basal plane model is a ground equation (referred to as a ground model). After the basal plane model is obtained, the image processing device can sample position points on the scene basal plane through the basal plane model, and the sampled position points on the scene basal plane are called target sampling points.
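A minimal sketch of such sampling is given below, assuming the basal plane model is a plane equation n·X + d = 0 in the world coordinate system; the sampling ranges and step size are hypothetical.

```python
import numpy as np

def sample_plane_points(normal, d, x_range=(-2.0, 2.0), y_range=(0.0, 4.0), step=0.05):
    """Sample target points on the basal plane model n . X + d = 0 (world coordinates).

    Assumes the plane is not vertical (normal[2] != 0), so z can be solved from
    the sampled (x, y) grid.
    """
    n = np.asarray(normal, dtype=float)
    xs = np.arange(x_range[0], x_range[1], step)
    ys = np.arange(y_range[0], y_range[1], step)
    gx, gy = np.meshgrid(xs, ys)
    gz = -(n[0] * gx + n[1] * gy + d) / n[2]
    return np.stack([gx, gy, gz], axis=-1).reshape(-1, 3)   # target sampling points
```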
Step 1022, projecting the target sampling point to the target imaging plane to obtain a target projection point.
In the embodiment of the application, the image processing device projects the target sampling point onto the imaging plane of the target image acquisition device of the multi-view acquisition device, that is, projects the target sampling point onto the target imaging plane; the position point of the target sampling point on the target imaging plane obtained by the image processing device is the target projection point.
The target imaging plane is an imaging plane of a target image capturing device of the multi-view capturing device, and the target image capturing device is any one of a plurality of image capturing devices of the multi-view capturing device.
In the embodiment of the application, the image processing device calibrates the scene basal plane and the multi-view acquisition device in advance, thereby obtaining the spatial relative relationship between the scene basal plane and the target image acquisition device of the multi-view acquisition device; therefore, when projecting the target sampling point onto the target imaging plane, the image processing device can project the target sampling point onto the target imaging plane based on the spatial relative relationship to obtain the target projection point.
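For illustration, assuming the spatial relative relationship is expressed as a rotation R and translation t from the world coordinate system to the target image acquisition device coordinate system, and assuming a pinhole model without distortion, the projection could be sketched as follows; the names are illustrative.

```python
import numpy as np

def project_to_target_plane(points_world, R, t, K):
    """Project target sampling points (N, 3, world frame) onto the target imaging plane."""
    # World coordinate system -> target image acquisition device coordinate system.
    pts_cam = points_world @ R.T + t
    pts_cam = pts_cam[pts_cam[:, 2] > 0]       # keep points in front of the device
    # Pinhole projection with the target internal parameters K (3 x 3 matrix).
    uv = pts_cam @ K.T
    return uv[:, :2] / uv[:, 2:3]              # target projection points in the image (UV) coordinate system
```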
Step 1023, determining the target physical focal length of the target image acquisition device based on the target projection point.
It should be noted that the target projection point is a position point in the image coordinate system corresponding to the target imaging plane of the target image acquisition device; the image processing device can calculate the physical focal length of the target image acquisition device based on the target internal parameters and the target projection point. Here, the physical focal length of the target image acquisition device is referred to as the target physical focal length. The target internal parameters are the internal parameters of the target image acquisition device and are a calibration result between the target image acquisition device and the target imaging plane.
Referring to FIG. 5, FIG. 5 is a schematic flowchart of acquiring a target physical focal length according to an embodiment of the present application, where the execution subject of each step in FIG. 5 is the image processing device; as shown in FIG. 5, in the embodiment of the present application, step 1023 may be implemented through steps 10231 to 10233, that is, the image processing device determines the target physical focal length of the target image acquisition device based on the target projection point through steps 10231 to 10233, which are described below.
Step 10231, acquiring device position points of the target projection points corresponding to the target image acquisition device based on the target internal parameters of the target image acquisition device.
In the embodiment of the application, the image processing device can obtain the target internal parameter, and the target internal parameter represents the calibration relation between the target image acquisition device and the target imaging plane, so that the image processing device can determine the position point corresponding to the target projection point in the coordinate system taking the target image acquisition device as a reference based on the target internal parameter, and the device position point is obtained.
Step 10232, determining a physical imaging point of the target projection point on a physical imaging plane of the target image acquisition device based on the target internal parameter.
In the embodiment of the application, the image processing device can calculate the position point of the target projection point on the physical imaging plane of the target image acquisition device based on the target internal reference, and the calculated position point is called a physical imaging point.
When the target internal parameter comprises a first dimension focal length, a second dimension focal length, first dimension rotation information and second dimension rotation information, the image processing device obtains a difference value between the first dimension position information of the target projection point and the first dimension rotation information, and takes the ratio of the difference value to the first dimension focal length as the first dimension position information of the physical imaging point; the image processing device obtains the difference value between the second-dimension position information and the second-dimension rotation information of the target projection point, and takes the ratio of the difference value to the second-dimension focal length as the second-dimension position information of the physical imaging point. Illustratively, the first dimension is, for example, the X-axis and the second dimension is, for example, the Y-axis.
Step 10233, calculating the target physical focal length of the target image acquisition device by combining the device position point and the physical imaging point.
In the embodiment of the application, the image processing device can acquire a first ratio of the first dimension position information of the device position point to the first dimension position information of the physical imaging point, acquire a second ratio of the second dimension position information of the device position point to the second dimension position information of the physical imaging point, and then take the average value of the first ratio and the second ratio as the initial physical focal length of the target image acquisition device. Here, the image processing device may take the initial physical focal length directly as the target physical focal length, or may obtain a plurality of initial physical focal lengths and combine them to obtain the target physical focal length; the embodiment of the application is not limited in this regard. In addition, when the image processing device determines the target physical focal length by combining a plurality of initial physical focal lengths, it may process the plurality of initial physical focal lengths with a least square method, take the average value of the plurality of initial physical focal lengths, and so on, which is also not limited by the embodiment of the present application.
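As a minimal illustration of steps 10231 to 10233, the following Python/NumPy sketch computes one initial physical focal length per projection point from the device position points and the physical imaging points, and then combines them either by averaging or by a least-squares fit; the function and variable names are assumptions made for this sketch and do not come from the application itself.

import numpy as np

def estimate_physical_focal_length(device_points, imaging_points):
    # device_points : (N, 2) first/second dimension position information of the
    #                 device position points (camera coordinate system).
    # imaging_points: (N, 2) first/second dimension position information of the
    #                 physical imaging points (physical imaging plane).
    device_points = np.asarray(device_points, dtype=float)
    imaging_points = np.asarray(imaging_points, dtype=float)

    # First ratio and second ratio for every projection point, averaged into
    # one initial physical focal length per point.
    initial_focals = (device_points / imaging_points).mean(axis=1)

    # Option 1: take the mean of the initial physical focal lengths.
    f_mean = float(initial_focals.mean())

    # Option 2: a 1-D least-squares fit of f * imaging_point close to device_point.
    x = imaging_points.reshape(-1)
    y = device_points.reshape(-1)
    f_lstsq = float(np.linalg.lstsq(x[:, None], y, rcond=None)[0][0])

    return f_mean, f_lstsq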
Step 1024, performing light projection based on the target physical focal length to obtain a basal plane image.
It should be noted that, after the image processing device obtains the target physical focal length, the physical imaging plane of the target image acquisition device is obtained; thus, the image processing apparatus obtains imaging information of the basal plane of the scene by performing light projection on the physical imaging plane, that is, obtains a basal plane image.
In an embodiment of the present application, an image processing apparatus performs light projection based on a target physical focal length to obtain a basal plane image, including: the image processing device firstly determines a physical imaging plane of the target image acquisition device based on the target physical focal length; determining a region to be imaged corresponding to the specified image size on a physical imaging plane; next, the following processing is performed for each pixel to be imaged in the region to be imaged: projecting light rays from a target optical center of target image acquisition equipment to a pixel point to be imaged to obtain an intersection point pixel value of the projected light rays and a scene basal plane; obtaining an intersection pixel value array corresponding to the region to be imaged according to the intersection pixel value of each pixel point to be imaged; finally, the intersection pixel value array is determined as a basal plane image.
The intersection pixel value refers to a pixel value of an intersection of the projected light ray and the scene basal plane, and the image processing device can determine the position information of the intersection under the scene basal plane coordinate system based on the scene basal plane; based on the space relative relation, the depth value corresponding to the position information of the intersection point in the scene basal surface coordinate system in the coordinate system corresponding to the target imaging plane can be determined; based on the depth value and the baseline and the first-dimension focal length in the equipment parameters, a parallax value positively correlated with the baseline and the first-dimension focal length and negatively correlated with the depth value can be obtained; thus, the image processing apparatus takes at least one of the obtained depth value and disparity value as the intersection pixel value, and the intersection pixel value is one or two of the following: depth value, disparity value.
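The ray-casting step can be sketched as follows, assuming for illustration that the scene basal plane is available as a plane n·X + d0 = 0 in the coordinate system of the target image acquisition device (in the application this comes from the spatial relative relation); every pixel point to be imaged is handled independently, and all names below are illustrative.

import numpy as np

def intersection_pixel_value(pixel_xy, f_phys, plane_n, plane_d, baseline, fx):
    # pixel_xy : (x, y) position of the pixel to be imaged on the physical imaging
    #            plane, relative to the principal point.
    # f_phys   : target physical focal length of the target image acquisition device.
    # plane_n, plane_d : scene basal plane n·X + d = 0 in camera coordinates.
    # baseline, fx     : baseline and first-dimension focal length from the device parameters.
    ray = np.array([pixel_xy[0], pixel_xy[1], f_phys], dtype=float)
    denom = float(np.dot(plane_n, ray))
    if abs(denom) < 1e-9:
        return None, None                 # ray parallel to the basal plane, no intersection
    t = -plane_d / denom
    if t <= 0:
        return None, None                 # intersection behind the optical center
    depth = t * ray[2]                    # depth value of the intersection point
    disparity = fx * baseline / depth     # positively correlated with fx and baseline
    return depth, disparity

Collecting the returned values for every pixel point of the region to be imaged yields the intersection pixel value array, that is, the basal plane image.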
In an embodiment of the present application, before the image processing device reads the pre-stored basal plane image of the scene basal plane, the image processing method further includes: performing position point sampling on the scene basal plane to obtain a to-be-processed sampling point (corresponding to the target sampling point); projecting the to-be-processed sampling point onto the target imaging plane to obtain a to-be-processed projection point (corresponding to the target projection point); determining the target physical focal length of the target image acquisition device based on the to-be-processed projection point; performing light projection based on the target physical focal length to obtain the basal plane image; and pre-storing the basal plane image. Since the process of acquiring the basal plane image in advance by the image processing device is similar to the process of acquiring the basal plane image in real time, it is not repeated here.
Step 103, acquiring an image difference between the basal plane image and the scene image to be processed.
In the embodiment of the application, the image processing equipment filters the imaging information of the scene basal plane from the scene image to be processed by carrying out subtraction operation on the basal plane image and the scene image to be processed; the processing result of the subtraction operation is the image difference between the basal plane image and the scene image to be processed. Here, the image processing apparatus may further perform absolute value processing on a difference result between the ground plane image and the scene image to be processed, and determine the absolute value processing result as an image difference.
It should be noted that, the image type of the scene image to be processed and the image type of the basal plane image are respectively any one of the following: a depth image type, a parallax image type; the depth image type indicates that the pixel value of the image is a depth value, and the parallax image type indicates that the pixel value of the image is a parallax value.
Referring to fig. 6, fig. 6 is a flowchart illustrating a third embodiment of an image processing method according to the present application, where an execution subject of each step in fig. 6 is an image processing apparatus; as shown in fig. 6, in the embodiment of the present application, step 103 may be implemented through step 1031A and step 1032A, that is, the image processing apparatus acquires an image difference between the image of the substrate surface and the image of the scene to be processed, including step 1031A and step 1032A, which are described below separately.
Step 1031A, when the image type of the scene image to be processed and the image type of the ground plane image are the same, determining the image types of the scene image to be processed and the ground plane image as the target image type.
When the scene image to be processed and the basal plane image are both depth images, the image type of the scene image to be processed and the image type of the basal plane image are both the depth image type; at this time, the two image types are the same, and the target image type is the depth image type. When the scene image to be processed and the basal plane image are both parallax images, the image type of the scene image to be processed and the image type of the basal plane image are both the parallax image type; at this time, the two image types are the same, and the target image type is the parallax image type.
Step 1032A, calculating a difference between the image of the substrate surface and the image of the scene to be processed based on the type of the target image, thereby obtaining an image difference.
It should be noted that the image processing device uses different subtraction operations to calculate the difference between the basal plane image and the scene image to be processed, depending on the target image type. Thus, in an embodiment of the present application, the image processing device calculates the difference between the basal plane image and the scene image to be processed based on the target image type to obtain the image difference, including: when the target image type is the depth image type, the image processing device calculates a first difference value of the basal plane image minus the scene image to be processed, and determines the first difference value as the image difference; or, when the target image type is the parallax image type, the image processing device calculates a second difference value of the scene image to be processed minus the basal plane image, and determines the second difference value as the image difference. The basal plane image and the scene image to be processed are both of the specified image size, their pixel points correspond one to one, and when a difference value is calculated, the difference between each pair of corresponding pixel points is obtained; thus, the first difference value and the second difference value each comprise the respective pixel differences.
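A compact sketch of the two subtraction conventions is given below; the array and type names are assumptions for illustration only.

import numpy as np

def image_difference(base_img, scene_img, target_image_type):
    # Per-pixel difference between the basal plane image and the scene image to be processed.
    base_img = np.asarray(base_img, dtype=float)
    scene_img = np.asarray(scene_img, dtype=float)
    if target_image_type == "depth":
        return base_img - scene_img       # first difference: basal plane image minus scene image
    if target_image_type == "disparity":
        return scene_img - base_img       # second difference: scene image minus basal plane image
    raise ValueError("unsupported target image type")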
With continued reference to fig. 6, in the embodiment of the present application, step 103 may be further implemented through steps 1031B to 1035B, that is, the image processing apparatus acquires an image difference between the image of the substrate surface and the image of the scene to be processed, including steps 1031B to 1035B, and each step will be described separately.
Step 1031B, when the image type of the scene image to be processed and the image type of the ground plane image are different, determining a reference image type from the image type of the scene image to be processed and the image type of the ground plane image.
In the embodiment of the application, when the image type of the scene image to be processed and the image type of the basal plane image are different, the image processing device takes the image type of the scene image to be processed or the image type of the basal plane image as a reference image type. It is easy to know that the reference image type may be a depth image type or a parallax image type.
Step 1032B, selecting one of the scene image to be processed and the basal plane image, which has the same image type as the reference image type, to obtain the reference type image.
It should be noted that the reference type image refers to one of the image types in the scene image and the basal plane image to be processed, which is the reference image type; that is, when the image type of the scene image to be processed is the reference image type, the scene image to be processed is the reference type image; when the image type of the ground plane image is the reference image type, the ground plane image is the reference type image.
Step 1033B, selecting one of the scene image to be processed and the basal plane image, which is different from the reference image in image type, to obtain the image to be converted.
It should be noted that the image to be converted refers to one of the scene image to be processed and the basal plane image except the reference type image; thus, when the reference type image is a scene image to be processed, the image to be converted is a basal plane image; and when the reference type image is a basal plane image, the image to be converted is a scene image to be processed.
Step 1034B, converting the image to be converted into an image of a reference image type to obtain an image to be calculated.
In the embodiment of the application, in order to obtain the difference value between the scene image to be processed and the basal plane image, the scene image to be processed and the basal plane image are unified into the same image type, and then the difference value calculation is carried out; thus, the image processing apparatus converts the image to be converted into an image of the reference image type after determining the reference image type, so as to unify the scene image to be processed and the ground plane image to the same image type. The converted image to be converted is the image to be calculated.
Step 1035B, calculating a difference between the image to be calculated and the reference type image to obtain an image difference.
Since the image to be calculated and the reference type image are both images of the reference image type, the image processing apparatus can calculate the difference between the image to be calculated and the reference type image, and take the calculated difference as the image difference between the base surface image and the scene image to be processed.
In the embodiment of the application, the image processing device calculates the difference between the image to be calculated and the reference type image based on the reference image type to obtain the image difference; that is, when the reference image type is the depth image type, the image processing device subtracts the image corresponding to the scene image to be processed (the reference type image or the image to be calculated) from the image corresponding to the basal plane image (the image to be calculated or the reference type image); and when the reference image type is the parallax image type, the image processing device subtracts the image corresponding to the basal plane image (the image to be calculated or the reference type image) from the image corresponding to the scene image to be processed (the reference type image or the image to be calculated).
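A sketch of this unifying step is given below; it assumes that the depth/parallax relation used for the basal plane image (parallax = first-dimension focal length x baseline / depth) is also used for the conversion, and the names are illustrative.

import numpy as np

def unify_and_subtract(scene_img, scene_type, base_img, base_type, fx, baseline,
                       reference_type="depth"):
    # Convert whichever image differs from the reference image type, then subtract.
    def to_type(img, src_type, dst_type):
        img = np.asarray(img, dtype=float)
        if src_type == dst_type:
            return img
        # depth <-> disparity conversion: d = fx * b / Z and Z = fx * b / d
        safe = np.where(img > 0, img, np.nan)
        return fx * baseline / safe

    scene_ref = to_type(scene_img, scene_type, reference_type)
    base_ref = to_type(base_img, base_type, reference_type)
    if reference_type == "depth":
        return base_ref - scene_ref       # basal plane image minus scene image for depth
    return scene_ref - base_ref           # scene image minus basal plane image for parallax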
It can be understood that when the basal plane image and the scene image to be processed differ in image type, the image difference between them is calculated after image type conversion, so that the restriction on the image type of the scene image to be processed is relaxed, and the flexibility and the application range of image processing can be improved.
Step 104, determining the image difference as an image processing result of the scene image to be processed.
It should be noted that, the image processing result of the to-be-processed scene image may be an image difference, so that the image processing result represents the scene imaging information except the scene basal plane in the to-be-processed scene; further, the image processing apparatus can realize various applications of the scene to be processed, such as motion control in the scene to be processed, virtual reality processing of the scene to be processed, and the like, based on the image processing result. It is also possible that the image processing apparatus determines each pixel in the image difference, whose pixel value is greater than the pixel threshold value, as an image processing result of the image of the scene to be processed.
In the embodiment of the application, the image processing device can also determine each pixel with the pixel value smaller than or equal to the pixel threshold value in the image difference as scene basal plane information in the to-be-processed scene image.
It can be understood that when the scene image to be processed is acquired in response to the scene acquisition request, the processing of filtering the scene substrate surface from the scene to be processed is completed by acquiring the substrate surface image of the scene substrate surface so as to acquire the image difference between the substrate surface image and the scene image to be processed, thereby improving the filtering efficiency of the scene substrate surface; and further, the image processing efficiency can be improved. In addition, the scene basal plane in the scene to be processed is filtered by acquiring the image difference between the basal plane image and the scene image to be processed, so that the limitation on the scene basal plane is reduced, and the application range of image processing can be further improved.
In the embodiment of the present application, when the basal plane image obtained by the image processing device is the pre-stored imaging information of the scene basal plane, step 104 is further followed by a process of dynamically updating the basal plane image; that is, after the image processing device determines the image difference as the image processing result of the scene image to be processed, the image processing method further includes: the image processing device acquires a basal plane update image in response to an image update request for the basal plane image triggered by an update event; acquires a next scene image to be processed in response to a next scene acquisition request; and finally determines the difference value between the basal plane update image and the next scene image to be processed as the image to be applied of the next scene image to be processed. The image to be applied is used in the subsequent determination of motion information or of information to be enhanced.
It should be noted that the update event triggering the dynamic update of the basal plane image includes at least one of the following: the update period arrives, the basal plane normal vector deviation value is greater than the deviation threshold, and the position of the image acquisition device changes; the update period may be determined by an update frequency, and the update frequency is positively correlated with the performance of the image processing device. The basal plane normal vector deviation value refers to the deviation value of the normal vector of the scene basal plane; when the basal plane normal vector deviation value is greater than the deviation threshold, it indicates that the scene basal plane has changed considerably, so the update of the basal plane image is triggered. The position change of the image acquisition device may be a change in the positional relationship among the plurality of image acquisition devices, a change in the positional relationship between the image acquisition device and the scene basal plane, or a combination thereof; the embodiment of the present application is not limited in this respect. In addition, the process of acquiring the basal plane update image by the image processing device is similar to the process of acquiring the basal plane image, and the process of acquiring the difference between the basal plane update image and the next scene image to be processed is similar to the process of acquiring the difference between the basal plane image and the scene image to be processed, which are not repeated here.
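For illustration only, the three update events could be checked as in the following sketch; the thresholds, field names and pose representation are assumptions and not values from the application.

import time
import numpy as np

def basal_plane_update_needed(state, normal_now, pose_now,
                              period_s=60.0, deviation_threshold=0.05,
                              pose_threshold=1e-3):
    # Event 1: the update period has arrived.
    period_reached = (time.time() - state["last_update_time"]) >= period_s

    # Event 2: the basal plane normal vector deviation value exceeds the deviation threshold.
    n_old = state["normal"] / np.linalg.norm(state["normal"])
    n_new = normal_now / np.linalg.norm(normal_now)
    normal_deviation = 1.0 - float(np.dot(n_old, n_new))

    # Event 3: the position of the image acquisition device has changed.
    pose_changed = np.linalg.norm(np.asarray(pose_now) - np.asarray(state["pose"])) > pose_threshold

    return period_reached or (normal_deviation > deviation_threshold) or pose_changed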
It can be appreciated that for the pre-stored basal plane image, dynamic update of the basal plane image is triggered based on an update event, so that the accuracy of the basal plane image is improved, and the accuracy of image processing can be improved.
In the embodiment of the application, before step 1022, the method further includes a process of obtaining the spatial relative relation through calibration against the scene basal plane; that is, before the image processing device projects the target sampling point onto the target imaging plane to obtain the target projection point, the image processing method further includes: the image processing device first controls the multi-view acquisition device to acquire an image of the calibration object when the application device carrying the multi-view acquisition device and the calibration object are placed on the scene basal plane, so as to obtain a first image to be processed; then extracts feature points from the first image to be processed to obtain a plurality of first feature points; and finally determines the spatial relative relation between the scene basal plane and the target image acquisition device based on the plurality of first feature points and the calibration object.
It should be noted that, the calibration object refers to an object for calibrating the multi-mesh collection device, and is an object with a fixed-pitch pattern array; such as a checkerboard, circular plate, etc.
In the embodiment of the application, the calibration process of the multi-mesh acquisition device is further included before the step 10231; that is, the image processing apparatus acquires, based on the target internal parameter of the target image capturing apparatus, before the target projection point corresponds to the apparatus position point of the target image capturing apparatus, the image processing method further includes: the image processing apparatus performs, among a plurality of placement modes of the calibration object, the following processing for each placement mode: firstly, controlling a multi-view acquisition device to acquire images of the calibration objects in a placement mode to obtain a second image to be processed; extracting feature points of the second image to be processed to obtain a plurality of second feature points; then, matching the plurality of second characteristic points with the calibration object to obtain a characteristic point matching result; and finally, determining the equipment parameters of the multi-purpose acquisition equipment based on a plurality of characteristic point matching results corresponding to a plurality of placement modes.
The device parameters include respective internal parameters of the plurality of image acquisition devices, respective external parameters of the plurality of image acquisition devices, and parameters among the plurality of image acquisition devices; thus, the device parameters include target internal parameters of the target image acquisition device.
Referring to fig. 7, fig. 7 is a flowchart illustrating a fourth flowchart of an image processing method according to an embodiment of the present application, where an execution subject of each step in fig. 7 is an image processing apparatus; as shown in fig. 7, in the embodiment of the present application, step 104 further includes steps 105 to 107; that is, after the image processing apparatus determines the image difference as the image processing result of the scene image to be processed, the image processing method further includes steps 105 to 107, each of which will be described below.
Step 105, determining obstacle information of the scene to be processed based on the image processing result.
The image processing device recognizes information in the image processing result, and uses the recognized information as obstacle information of the scene to be processed.
Step 106, determining information to be moved of the application device equipped with the multi-view acquisition device based on the obstacle information.
It should be noted that, after the image processing device obtains the obstacle information of the scene to be processed, it can determine the motion information of the application device equipped with the multi-view acquisition device in the scene to be processed, that is, the information to be moved. The application device may be integrated into the image processing device or may be independent of the image processing device, which is not limited in the embodiment of the present application.
By way of example, application devices such as sweeping robots, autonomous vehicles, virtual reality devices, and the like.
Step 107, controlling the application device to move in the scene to be processed based on the information to be moved.
It should be noted that when the application device is integrated in the image processing device, the image processing device controls the application device to move in the scene to be processed based on the information to be moved, which means that the image processing device controls itself to move in the scene to be processed based on the information to be moved. When the application device is independent of the image processing device, the image processing device controls the application device to move in the scene to be processed based on the information to be moved, namely the image processing device sends a movement instruction to the application device based on the information to be moved, and the application device executes the received movement instruction to move in the scene to be processed.
It can be understood that in an application scene that controls the application device to move in the scene to be processed, the filtering of the scene base surface can be realized by acquiring the image difference between the base surface image and the scene image to be processed, so that the resource consumption is reduced, and the motion control efficiency is improved.
In an embodiment of the present application, after the image processing apparatus determines the image difference as an image processing result of the scene image to be processed, the image processing method further includes: determining to-be-enhanced information of a to-be-processed scene based on an image processing result; and carrying out virtual reality augmentation on the information to be augmented.
It can be understood that in the virtual reality augmentation application scene, the filtering of the scene base surface can be realized by acquiring the image difference between the base surface image and the scene image to be processed, so that the resource consumption is reduced, and the efficiency of virtual reality augmentation is improved.
In the following, an exemplary application of the embodiment of the present application in an actual application scenario is described, namely a process of acquiring a ground depth map (corresponding to the basal plane image) in advance in an automatic driving scenario (corresponding to the scene to be processed). It is easy to understand that the image processing method provided by the embodiment of the application is applicable to any scenario in which the scene basal plane is filtered; the automatic driving scenario is merely taken as an example for illustration.
Referring to fig. 8, fig. 8 is an imaging schematic of an exemplary binocular system provided by an embodiment of the present application; as shown in fig. 8, in the binocular system 8-1, the position of the real-world object 8-21 on the imaging plane 8-31 of the first camera is the position 8-22 (denoted p_1), and its position on the imaging plane 8-41 of the second camera is the position 8-23 (denoted p_2); the optical center 8-32 of the first camera is denoted O_1, and the optical center 8-42 of the second camera is denoted O_2; the position of the optical center 8-42 of the second camera on the imaging plane 8-31 of the first camera is the epipole 8-33, denoted e_1, and the position of the optical center 8-32 of the first camera on the imaging plane 8-41 of the second camera is the epipole 8-43, denoted e_2; in addition, for the sequence of real-world positions lying on the ray emitted from the optical center 8-32 through the position 8-22 (that is, the ray O_1 p_1), the corresponding sequence of imaging points on the imaging plane 8-41 lies on the epipolar line 8-5.
The epipolar line correction is used to make the epipolar lines of corresponding points in the image of the first camera and the image of the second camera coincide; after epipolar correction, referring to fig. 9, fig. 9 is an imaging schematic of another exemplary binocular system provided by an embodiment of the present application; as shown in fig. 9, the distance between the optical center 9-11 of the first camera 9-1 and the optical center 9-21 of the second camera 9-2 is denoted as the baseline 9-3, b; the focal lengths of the first camera 9-1 and the second camera 9-2 are the same, denoted f; thus, when the horizontal (corresponding to the X-axis) distance 9-12 of the real-world position point P in the imaging plane of the first camera 9-1 is x_l, and the horizontal distance 9-22 of the position point P in the imaging plane of the second camera 9-2 is x_r, the relationships shown in formulas (1) and (2) can be obtained, where formulas (1) and (2) are as follows.
x_l / f = X / Z   (1);
x_r / f = (X - b) / Z   (2);
Wherein X represents the horizontal distance between the position point P and the optical center 9-11 of the first camera 9-1, and Z represents the depth value (corresponding to the Z-axis) between the position point P and the optical center 9-11 of the first camera 9-1.
The depth value Z can be obtained from formulas (1) and (2), as shown in formula (3).
Z = f · b / (x_l - x_r) = f · b / d   (3);
Wherein d = x_l - x_r represents the parallax value of the position point P.
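For instance, with formula (3), a rectified pair with baseline b = 0.06 m, focal length f = 700 pixels and a measured parallax of 14 pixels gives Z = 700 · 0.06 / 14 = 3 m. A one-line sketch (names assumed):

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    # Formula (3): Z = f * b / d for a rectified binocular pair.
    return focal_px * baseline_m / disparity_px

# depth_from_disparity(14.0, 700.0, 0.06) -> 3.0 metres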
The process of pre-acquiring a ground depth map based on a binocular system to achieve ground point filtering is described below.
Referring to FIG. 10, FIG. 10 is a flow chart of an exemplary filtering of ground points provided by an embodiment of the present application; as shown in fig. 10, the exemplary process of filtering ground points includes steps 201 to 205, and each step is described below.
Step 201, calibrating the binocular camera (corresponding to the multi-view acquisition device).
When the relative positions of the two cameras of the binocular camera are fixed, the calibration plate (corresponding to the calibration object) is shot at different positions and different angles with the binocular camera to obtain a plurality of left-and-right matched image pairs (corresponding to the second images to be processed); feature points are extracted from each left-and-right matched image pair, and the extracted feature points (corresponding to the plurality of second feature points) are matched to obtain matched feature point pairs; then, the camera intrinsic parameters, the camera extrinsic parameters, and the inter-camera parameters between the two cameras are calibrated based on the matched feature point pairs.
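Step 201 can be prototyped, for example, with OpenCV's calibration routines as in the sketch below; this assumes a chessboard calibration plate and an assumed list of image file pairs, and it is only one possible implementation, not the one required by this application.

import cv2
import numpy as np

pattern = (9, 6)                  # inner-corner layout of the assumed chessboard
square = 0.02                     # square size in metres (assumption)
obj = np.zeros((pattern[0] * pattern[1], 3), np.float32)
obj[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square

image_pairs = [("left_00.png", "right_00.png")]   # assumed left/right file pairs
obj_pts, left_pts, right_pts = [], [], []
for left_path, right_path in image_pairs:
    gl = cv2.imread(left_path, cv2.IMREAD_GRAYSCALE)
    gr = cv2.imread(right_path, cv2.IMREAD_GRAYSCALE)
    okl, cl = cv2.findChessboardCorners(gl, pattern)
    okr, cr = cv2.findChessboardCorners(gr, pattern)
    if okl and okr:                                # keep only matched feature point pairs
        obj_pts.append(obj)
        left_pts.append(cl)
        right_pts.append(cr)

# Per-camera intrinsic parameters.
_, K1, D1, _, _ = cv2.calibrateCamera(obj_pts, left_pts, gl.shape[::-1], None, None)
_, K2, D2, _, _ = cv2.calibrateCamera(obj_pts, right_pts, gr.shape[::-1], None, None)

# Inter-camera parameters: rotation R and translation T between the two cameras.
ret, K1, D1, K2, D2, R, T, E, F = cv2.stereoCalibrate(
    obj_pts, left_pts, right_pts, K1, D1, K2, D2, gl.shape[::-1],
    flags=cv2.CALIB_FIX_INTRINSIC)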
Step 202, calibrating the binocular camera against the ground.
When the equipment (such as a sweeping robot) carrying the calibrated binocular camera is placed on the ground, the calibration plate lying flat on the ground is shot to obtain a plurality of calibration plate images; feature points are extracted from each calibration plate image, and the positional relationship (corresponding to the spatial relative relation) between the target camera and the ground is calculated based on the extracted feature points (corresponding to the plurality of first feature points). The target camera is either one of the two cameras of the binocular camera.
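Step 202 could be prototyped as in the following sketch; it assumes the same chessboard lies flat on the ground and uses a PnP solver, which is one way, though not necessarily the application's way, of obtaining the camera-to-ground relation.

import cv2
import numpy as np

def calibrate_ground(gray_img, K, dist, pattern=(9, 6), square=0.02):
    # Estimate the target-camera-to-ground transform from a board lying on the ground.
    ok, corners = cv2.findChessboardCorners(gray_img, pattern)
    if not ok:
        raise RuntimeError("calibration plate not found")
    obj = np.zeros((pattern[0] * pattern[1], 3), np.float32)
    obj[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square

    # The board frame coincides with the ground frame because the board lies flat on the ground.
    _, rvec, tvec = cv2.solvePnP(obj, corners, K, dist)
    R, _ = cv2.Rodrigues(rvec)

    # Ground plane in camera coordinates: the normal is the board's Z axis and the
    # board origin lies on the plane, giving n · X + d = 0.
    n = R[:, 2]
    d = -float(n @ tvec.ravel())
    return R, tvec, n, d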
Step 203, calculate the equivalent physical focal length of the target camera (referred to as the target physical focal length).
It should be noted that a ground equation is established and ground points are sampled on the ground; the sampled ground points (corresponding to the target sampling points) are projected onto the imaging plane of the target camera based on the positional relationship between the target camera and the ground, and the ground points projected outside the imaging plane of the target camera are removed to obtain the effective imaging points (corresponding to the target projection points) on the imaging plane of the target camera; the equivalent physical focal length of the target camera is then calculated based on the sampled ground points and the effective imaging points.
Illustratively, the effective imaging point is located at the coordinate (u, v) on the imaging plane of the target camera, with depth value Z; the effective imaging point is converted into the camera coordinate system of the target camera based on the internal reference transformation matrix K corresponding to the camera internal parameters of the target camera, as shown in formula (4); thereby, the coordinates (X_c, Y_c, Z_c) in the camera coordinate system corresponding to the target camera are obtained, as shown in formula (5).
[X_c, Y_c, Z_c]^T = Z · K^{-1} · [u, v, 1]^T   (4);
X_c = (u - c_x) · Z / f_x,  Y_c = (v - c_y) · Z / f_y,  Z_c = Z   (5);
Wherein K is the internal reference transformation matrix, and f_x, f_y, c_x and c_y are the camera internal parameters of the target camera (referred to as the first dimension focal length, the second dimension focal length, the first dimension rotation information and the second dimension rotation information).
Here, based on the camera internal parameters of the target camera, the coordinates (x_p, y_p) of the effective imaging point on the equivalent imaging plane of the target camera (called the physical imaging point) can be obtained, as shown in formula (6).
x_p = (u - c_x) / f_x,  y_p = (v - c_y) / f_y   (6);
The equivalent physical focal length f_e of the target camera is determined based on formula (5) and formula (6), as shown in formula (7).
f_e = (X_c / x_p + Y_c / y_p) / 2   (7);
It should be noted that, since an equivalent physical focal length can be calculated based on each effective imaging point, finally, a plurality of equivalent physical focal lengths can be processed by the least square method to obtain a final equivalent physical focal length.
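The sampling and projection at the start of step 203 could look as in the following sketch; R and t are taken to be the ground-to-camera rotation and translation from step 202, K the intrinsic matrix from step 201, and the grid extent is an assumption.

import numpy as np

def project_ground_samples(R, t, K, image_size, grid=np.linspace(-2.0, 2.0, 41)):
    # Sample ground points, project them, and keep those that land inside the image.
    xs, ys = np.meshgrid(grid, grid)
    ground_pts = np.stack([xs.ravel(), ys.ravel(), np.zeros(xs.size)], axis=1)

    # Ground frame -> camera frame, then pinhole projection with the intrinsics K.
    cam_pts = ground_pts @ R.T + np.reshape(t, (1, 3))
    cam_pts = cam_pts[cam_pts[:, 2] > 0]              # keep points in front of the camera
    uv = cam_pts @ K.T
    uv = uv[:, :2] / uv[:, 2:3]

    w, h = image_size
    inside = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    return cam_pts[inside], uv[inside]                # effective imaging points and their pixels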
Step 204, acquiring a ground depth map based on the equivalent physical focal length.
It should be noted that, based on the equivalent physical focal length, a ray is projected from the optical center of the target camera through each pixel point of the equivalent imaging plane; the intersection of the ray passing through each pixel point with the ground equation established in step 203 is then calculated; if an intersection exists, the depth value of the intersection is used as the pixel value of that pixel point, so that the ground depth map is obtained. Here, when a ground disparity map is to be acquired, the focal length f_x and the baseline b obtained in step 201 are used to acquire the ground disparity map (corresponding to the basal plane image) from the ground depth map; the disparity value d of each pixel point in the ground disparity map is as shown in formula (8).
d = f_x · b / Z_gd   (8);
Wherein Z_gd is the depth value of the pixel point in the ground depth map.
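A vectorised sketch of step 204 is given below; it assumes the ground plane n · X + d0 = 0 in camera coordinates from step 202 and the intrinsic matrix K, baseline b and focal length f_x from step 201, and all names are illustrative.

import numpy as np

def ground_depth_and_disparity(K, plane_n, plane_d, width, height, fx, baseline):
    # Cast a ray through every pixel and intersect it with the ground plane.
    u, v = np.meshgrid(np.arange(width), np.arange(height))
    pix = np.stack([u.ravel(), v.ravel(), np.ones(u.size)], axis=1)

    rays = pix @ np.linalg.inv(K).T                   # ray directions in camera coordinates
    denom = rays @ plane_n                            # n · ray for every pixel
    with np.errstate(divide="ignore", invalid="ignore"):
        t = np.where(np.abs(denom) > 1e-9, -plane_d / denom, np.nan)
        t = np.where(t > 0, t, np.nan)                # keep intersections in front of the camera
        depth = (t * rays[:, 2]).reshape(height, width)   # ground depth map
        disparity = fx * baseline / depth             # formula (8): d = fx * b / Z
    return depth, disparity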
Step 205, performing ground point filtering on the scene depth map based on the ground depth map.
It should be noted that, after the ground depth map D_gd with respect to the target camera is obtained, and after the actual scene depth map D_scene with respect to the target camera is obtained through the binocular camera, the ground information is filtered by formula (9) to obtain the filtering result F (referred to as the image difference); formula (9) is as follows.
F = (D_gd - D_scene) > ε   (9);
Wherein the comparison is performed pixel by pixel, the pixels satisfying the comparison are retained as non-ground imaging information, and ε is an optional correction value (called the pixel threshold) that can be selected according to the practical situation.
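Formula (9) then reduces to a single array comparison, as in this sketch (array names and the correction value are assumptions):

import numpy as np

def filter_ground(scene_depth, ground_depth, epsilon=0.02):
    # Formula (9): keep the pixels that rise above the ground by more than epsilon.
    mask = (ground_depth - scene_depth) > epsilon
    filtered = np.where(mask, scene_depth, 0.0)       # ground pixels suppressed
    return mask, filtered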
When the relative position between the binocular camera and the ground drifts, the normal vector and a ground point of the ground equation are recorded so that they can be corrected during subsequent use, the ground depth map is updated accordingly, and self-calibration is thus realized; here, the filtering efficiency can also be improved by controlling the update frequency.
It can be appreciated that the image processing method provided by the embodiment of the application can be applied to binocular vision tasks in embedded systems, for example in the fields of sweeping robots, automatic driving and the like; the image processing method trades storage for computation and can thus improve the real-time performance of image processing. In addition, the embodiment of the application realizes the filtering of ground points by a difference operation, reduces the resource consumption of converting data into the point cloud space, improves the image processing efficiency, and further improves the image rendering frame rate. The image processing method provided by the embodiment of the application can acquire the ground depth map in advance or acquire it dynamically in real time, can improve the image processing efficiency while ensuring the image processing accuracy, is applicable to various grounds, and can improve the scene applicability of image processing.
Continuing with the description below of an exemplary architecture of the image processing device 455 implemented as a software module provided by embodiments of the present application, in some embodiments, as shown in fig. 2, the software module stored in the image processing device 455 of the memory 450 may include:
The request response module 4551 is configured to control a multi-view acquisition device to acquire an image of a scene to be processed in response to a scene acquisition request, so as to obtain an image of the scene to be processed, where the multi-view acquisition device includes a plurality of image acquisition devices;
The image acquisition module 4552 is configured to acquire a basal plane image of a scene basal plane, where the basal plane image is obtained by performing image acquisition on the scene basal plane by using the multi-view acquisition device, and the scene to be processed includes the scene basal plane;
an image processing module 4553, configured to obtain an image difference between the basal plane image and the scene image to be processed;
the result obtaining module 4554 is configured to determine the image difference as an image processing result of the image of the scene to be processed, where the image processing result represents scene imaging information of the scene to be processed except for the scene basal plane.
In the embodiment of the present application, the image obtaining module 4552 is further configured to read a pre-stored basal plane image of the scene basal plane, where the basal plane image is obtained by performing image acquisition on the scene basal plane by the multi-view acquisition device before responding to the scene acquisition request.
In an embodiment of the present application, the image processing apparatus 455 further includes an information update module 4555 configured to acquire a footprint update image in response to an image update request for the footprint image triggered by an update event, where the update event includes at least one of: when the updating period arrives, the normal vector deviation value of the basal plane is larger than the deviation threshold value, and the position of the image acquisition equipment is changed; responding to a next scene acquisition request, and acquiring a next scene image to be processed; and determining the difference value between the basal plane updated image and the next scene image to be processed as an image to be applied of the next scene image to be processed.
In the embodiment of the present application, the image acquisition module 4552 is further configured to sample a location point of the scene substrate surface to obtain a target sampling point; projecting the target sampling point to a target imaging plane to obtain a target projection point, wherein the target imaging plane is an imaging plane of a target image acquisition device of the multi-view acquisition device; determining a target physical focal length of the target image acquisition device based on the target projection point; and performing light projection based on the target physical focal length to obtain the basal plane image.
In this embodiment of the present application, the image processing apparatus 455 further includes a data calibration module 4556, configured to control the multi-mesh acquisition device to perform image acquisition on a calibration object when the calibration object is placed on the scene substrate surface, so as to obtain a first image to be processed; extracting feature points of the first image to be processed to obtain a plurality of first feature points; and determining the spatial relative relation between the scene substrate surface and the target image acquisition equipment based on the plurality of first characteristic points and the calibration object.
In this embodiment of the present application, the image obtaining module 4552 is further configured to project the target sampling point to the target imaging plane based on the spatial relative relationship, so as to obtain the target projection point.
In the embodiment of the present application, the image acquisition module 4552 is further configured to acquire, based on a target internal parameter of the target image acquisition device, a device location point of the target projection point corresponding to the target image acquisition device; determining a physical imaging point of the target projection point on a physical imaging plane of the target image acquisition equipment based on the target internal parameter; and calculating the target physical focal length of the target image acquisition device by combining the device position point and the physical imaging point.
In this embodiment of the present application, the data calibration module 4556 is further configured to, in a plurality of placement modes of the calibration object, perform, for each of the placement modes, the following processing: controlling the multi-view acquisition equipment to acquire images of the calibration objects in the placement mode to obtain a second image to be processed; extracting feature points of the second image to be processed to obtain a plurality of second feature points; matching the plurality of second characteristic points with the calibration object to obtain a characteristic point matching result; and determining equipment parameters of the multi-purpose acquisition equipment based on a plurality of characteristic point matching results corresponding to a plurality of placement modes, wherein the equipment parameters comprise the target internal parameters.
In an embodiment of the present application, the image obtaining module 4552 is further configured to determine a physical imaging plane of the target image capturing device based on the target physical focal length; determining a region to be imaged corresponding to a specified image size on the physical imaging plane; the following processing is performed for each pixel point to be imaged in the area to be imaged: projecting light rays from a target optical center of the target image acquisition equipment to the pixel points to be imaged to obtain intersection point pixel values of the projected light rays and the scene substrate surface, wherein the intersection point pixel values are any one of the following: depth value, disparity value; obtaining an intersection point pixel value array corresponding to the region to be imaged according to the intersection point pixel value of each pixel point to be imaged; and determining the intersection pixel value array as the basal plane image.
In the embodiment of the present application, the image type of the scene image to be processed and the image type of the basal plane image are respectively any one of the following: a depth image type, a parallax image type, the depth image type representing a pixel value of an image as a depth value, the parallax image type representing a pixel value of an image as a parallax value.
In this embodiment of the present application, the image processing module 4553 is further configured to determine, when an image type of the to-be-processed scene image and an image type of the basal plane image are the same, the image types of the to-be-processed scene image and the basal plane image as a target image type; and calculating the difference value between the basal plane image and the scene image to be processed based on the target image type to obtain the image difference.
In this embodiment of the present application, the image processing module 4553 is further configured to calculate a first difference value obtained by subtracting the to-be-processed scene image from the ground plane image when the target image type is a depth image type; the first difference is determined as the image difference.
In this embodiment of the present application, the image processing module 4553 is further configured to calculate a second difference value obtained by subtracting the basal plane image from the scene image to be processed when the target image type is a parallax image type; and determining the second difference value as the image difference.
In the embodiment of the present application, the image processing module 4553 is further configured to determine a reference image type from the image type of the to-be-processed scene image and the image type of the base surface image when the image type of the to-be-processed scene image and the image type of the base surface image are different; selecting one of the scene images to be processed and the basal plane images, wherein the image type of the one of the scene images to be processed and the basal plane images is the same as the reference image type, so as to obtain a reference type image; selecting one of the scene images to be processed and the basal plane images, wherein the image type of the one of the scene images to be processed and the basal plane images is different from the reference image type, so as to obtain an image to be converted; converting the image to be converted into an image of the reference image type to obtain an image to be calculated; and calculating the difference value between the image to be calculated and the reference type image to obtain the image difference.
In an embodiment of the present application, the image processing apparatus 455 further includes a result application module 4557 configured to determine, based on the image processing result, obstacle information of the scene to be processed; determine, based on the obstacle information, information to be moved of the application device equipped with the multi-view acquisition device; and control the application device to move in the scene to be processed based on the information to be moved.
Embodiments of the present application provide a computer program product comprising computer-executable instructions or a computer program stored in a computer-readable storage medium. The processor of the image processing apparatus reads the computer-executable instructions or the computer program from the computer-readable storage medium, and executes the computer-executable instructions or the computer program to cause the image processing apparatus to execute the image processing method described above according to the embodiment of the present application.
The embodiment of the present application provides a computer-readable storage medium in which computer-executable instructions or a computer program are stored, which when executed by a processor, cause the processor to perform an image processing method provided by the embodiment of the present application, for example, an image processing method as shown in fig. 3.
In some embodiments, the computer readable storage medium may be FRAM, ROM, flash memory, magnetic surface memory, optical disk, or CD-ROM; but may be a variety of devices including one or any combination of the above memories.
In some embodiments, computer-executable instructions may be written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages, in the form of programs, software modules, scripts, or code, and they may be deployed in any form, including as stand-alone programs or as modules, components, subroutines, or other units suitable for use in a computing environment.
As an example, computer-executable instructions may, but need not, correspond to files in a file system, may be stored in a portion of a file that holds other programs or data, such as in one or more scripts in a hypertext markup language (Hyper Text Markup Language, HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).
As an example, the computer-executable instructions may be deployed to be executed on one electronic device (in this case, the one electronic device is an image processing device), or executed on a plurality of electronic devices located at one place (in this case, a plurality of electronic devices located at one place is an image processing device), or executed on a plurality of electronic devices distributed at a plurality of places and interconnected by a communication network (in this case, a plurality of electronic devices distributed at a plurality of places and interconnected by a communication network is an image processing device).
It will be appreciated that the embodiments of the present application involve related data such as images of the scene to be processed; when the embodiments of the present application are applied to specific products or technologies, user permission or consent needs to be obtained, and the collection, use and processing of the related data need to comply with the relevant laws, regulations and standards of the relevant countries and regions. In addition, when the embodiments of the application are applied, the collection and processing of the related data should strictly comply with the requirements of the relevant national laws and regulations, the informed consent or separate consent of the personal information subject should be obtained, and subsequent data use and processing should be carried out within the scope authorized by laws and regulations and by the personal information subject.
In summary, when the scene image to be processed is acquired in response to the scene acquisition request, the embodiments of the application acquire the basal plane image of the scene basal plane and then acquire the image difference between the basal plane image and the scene image to be processed, thereby completing the filtering of the scene basal plane from the scene to be processed and improving the filtering efficiency of the scene basal plane; the image processing efficiency can thus be improved. Moreover, because the scene basal plane in the scene to be processed is filtered by acquiring the image difference between the basal plane image and the scene image to be processed, the limitation on the scene basal plane is reduced, and the application range of image processing can be further improved. In addition, acquiring the basal plane image in advance, before the scene acquisition request is responded to, can improve the efficiency of acquiring the basal plane image and thus the filtering efficiency of the scene basal plane; dynamically updating the basal plane image can improve the accuracy of image processing.
The foregoing is merely exemplary embodiments of the present application and is not intended to limit the scope of the present application. Any modification, equivalent replacement, improvement, etc. made within the spirit and scope of the present application are included in the protection scope of the present application.

Claims (18)

1. An image processing method, the method comprising:
responding to a scene acquisition request, controlling a multi-view acquisition device to acquire images of a scene to be processed, and obtaining images of the scene to be processed, wherein the multi-view acquisition device comprises a plurality of image acquisition devices;
Acquiring a basal plane image of a scene basal plane, wherein the basal plane image is obtained by carrying out image acquisition on the scene basal plane through the multi-view acquisition equipment, and the scene to be processed comprises the scene basal plane;
Acquiring an image difference between the basal plane image and the scene image to be processed;
And determining the image difference as an image processing result of the image of the scene to be processed, wherein the image processing result represents scene imaging information except for the scene basal plane in the scene to be processed.
2. The method of claim 1, wherein the acquiring a basal plane image of a basal plane of a scene comprises:
And reading the pre-stored basal plane image of the scene basal plane, wherein the basal plane image is obtained by image acquisition of the scene basal plane through the multi-mesh acquisition equipment before responding to the scene acquisition request.
3. The method according to claim 2, wherein after the determining the image difference as the image processing result of the image of the scene to be processed, the method further comprises:
Acquiring a footprint update image in response to an update event triggered image update request for the footprint image, the update event comprising at least one of: when the updating period arrives, the normal vector deviation value of the basal plane is larger than the deviation threshold value, and the position of the image acquisition equipment is changed;
responding to a next scene acquisition request, and acquiring a next scene image to be processed;
and determining the difference value between the basal plane updated image and the next scene image to be processed as an image to be applied of the next scene image to be processed.
4. The method of claim 1, wherein the acquiring a basal plane image of a basal plane of a scene comprises:
sampling the position points of the scene substrate surface to obtain target sampling points;
Projecting the target sampling point to a target imaging plane to obtain a target projection point, wherein the target imaging plane is an imaging plane of a target image acquisition device of the multi-view acquisition device;
Determining a target physical focal length of the target image acquisition device based on the target projection point;
and performing light projection based on the target physical focal length to obtain the basal plane image.
5. The method of claim 4, wherein the projecting the target sampling point onto a target imaging plane results in a target projection point, the method further comprising:
When a calibration object is placed on the scene substrate surface, controlling the multi-view acquisition equipment to acquire an image of the calibration object to obtain a first image to be processed;
extracting feature points of the first image to be processed to obtain a plurality of first feature points;
determining a spatial relative relationship between the scene substrate surface and the target image acquisition equipment based on a plurality of first characteristic points and the calibration objects;
The projecting the target sampling point to a target imaging plane to obtain a target projection point includes:
and based on the space relative relation, projecting the target sampling point to the target imaging plane to obtain the target projection point.
6. The method of claim 4 or 5, wherein the determining a target physical focal length of the target image capturing device based on the target projection points comprises:
Acquiring equipment position points of the target projection points corresponding to the target image acquisition equipment based on the target internal parameters of the target image acquisition equipment;
Determining a physical imaging point of the target projection point on a physical imaging plane of the target image acquisition equipment based on the target internal parameter;
And calculating the target physical focal length of the target image acquisition device by combining the device position point and the physical imaging point.
7. The method of claim 6, wherein before the acquiring a device position point of the target projection point corresponding to the target image acquisition device based on the target internal parameter of the target image acquisition device, the method further comprises:
performing the following processing for each placement mode among a plurality of placement modes of the calibration object:
controlling the multi-view acquisition device to perform image acquisition on the calibration object in the placement mode to obtain a second image to be processed;
extracting feature points of the second image to be processed to obtain a plurality of second feature points;
matching the plurality of second feature points with the calibration object to obtain a feature point matching result;
and determining device parameters of the multi-view acquisition device based on a plurality of feature point matching results corresponding to the plurality of placement modes, wherein the device parameters comprise the target internal parameter.
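Claim 7 is essentially a standard intrinsic calibration: one image of the calibration object per placement mode, feature points matched against the known object geometry, and the intrinsics solved from all placements jointly. A sketch using OpenCV's chessboard calibration is shown below; the chessboard assumption, board dimensions, and function names are illustrative, and the patent does not specify which calibration object or solver is used.

```python
import cv2
import numpy as np

def calibrate_intrinsics(gray_images, board_size=(9, 6), square_m=0.025):
    """Estimate the target internal parameter (intrinsic matrix) from several calibration-object placements."""
    objp = np.zeros((board_size[0] * board_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:board_size[0], 0:board_size[1]].T.reshape(-1, 2) * square_m
    obj_points, img_points = [], []
    for gray in gray_images:                       # one second image to be processed per placement mode
        found, corners = cv2.findChessboardCorners(gray, board_size)
        if found:
            obj_points.append(objp)                # known board geometry plays the role of the matching result
            img_points.append(corners)
    rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        obj_points, img_points, gray_images[0].shape[::-1], None, None)
    return K, dist
```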
8. The method of claim 4 or 5, wherein the performing light projection based on the target physical focal length to obtain the basal plane image comprises:
determining a physical imaging plane of the target image acquisition device based on the target physical focal length;
determining a region to be imaged corresponding to a specified image size on the physical imaging plane;
performing the following processing for each pixel point to be imaged in the region to be imaged:
projecting a light ray from a target optical center of the target image acquisition device to the pixel point to be imaged to obtain an intersection point pixel value of the projected light ray and the scene basal plane, wherein the intersection point pixel value is any one of the following: a depth value, a parallax value;
obtaining an intersection point pixel value array corresponding to the region to be imaged according to the intersection point pixel value of each pixel point to be imaged;
and determining the intersection point pixel value array as the basal plane image.
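Claim 8 amounts to rendering a synthetic depth image of the bare basal plane: every pixel's viewing ray is intersected with the plane and the depth of that intersection becomes the pixel value. A numpy sketch is given below, assuming a pinhole camera with intrinsic matrix K and the basal plane expressed in camera coordinates as n·X = d; these conventions and the function name are assumptions for illustration.

```python
import numpy as np

def render_base_plane_depth(K, plane_normal, plane_d, width, height):
    """Depth of the basal plane at every pixel, i.e. the z of the ray/plane intersection."""
    u, v = np.meshgrid(np.arange(width), np.arange(height))
    pixels = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T   # homogeneous pixel coordinates
    rays = np.linalg.inv(K) @ pixels        # viewing rays from the optical centre, z-component normalized to 1
    denom = plane_normal @ rays             # n·r for every ray
    t = np.full(denom.shape, -1.0)
    valid = np.abs(denom) > 1e-9
    t[valid] = plane_d / denom[valid]       # ray parameter of the intersection with n·X = d
    depth = np.where(t > 0, t * rays[2], 0.0)   # rays that never hit the plane in front get 0
    return depth.reshape(height, width).astype(np.float32)
```

The same array can be converted to parallax values (claim 9) with the usual rectified-stereo relation if that is the chosen image type.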
9. The method according to claim 1, wherein the image type of the scene image to be processed and the image type of the basal plane image are each any one of the following: a depth image type and a parallax image type, wherein the depth image type represents that a pixel value of an image is a depth value, and the parallax image type represents that a pixel value of an image is a parallax value.
10. The method according to any one of claims 1 to 5, 9, wherein said acquiring an image difference between the basal plane image and the image of the scene to be processed comprises:
when the image type of the scene image to be processed is the same as the image type of the basal plane image, determining the image type of the scene image to be processed and the basal plane image as a target image type;
and calculating the difference value between the basal plane image and the scene image to be processed based on the target image type to obtain the image difference.
11. The method of claim 10, wherein the calculating a difference value between the basal plane image and the scene image to be processed based on the target image type to obtain the image difference comprises:
when the target image type is a depth image type, calculating a first difference value of the basal plane image minus the scene image to be processed;
and determining the first difference value as the image difference.
12. The method of claim 10, wherein the calculating a difference value between the basal plane image and the scene image to be processed based on the target image type to obtain the image difference comprises:
when the target image type is a parallax image type, calculating a second difference value of subtracting the basal plane image from the scene image to be processed;
and determining the second difference value as the image difference.
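Claims 11 and 12 differ only in the sign of the subtraction, because an object standing on the basal plane is closer to the camera than the plane itself: its depth is smaller and its parallax is larger. A small sketch, assuming both images are float arrays of equal shape, is shown below; the type strings are illustrative labels.

```python
import numpy as np

def image_difference(base_plane_img, scene_img, image_type):
    """Positive values mark pixels where the scene deviates from the bare basal plane."""
    if image_type == "depth":
        return base_plane_img - scene_img    # claim 11: basal plane image minus scene image
    if image_type == "parallax":
        return scene_img - base_plane_img    # claim 12: scene image minus basal plane image
    raise ValueError(f"unsupported image type: {image_type}")
```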
13. The method according to any one of claims 1 to 5, 9, wherein said acquiring an image difference between the basal plane image and the image of the scene to be processed comprises:
when the image type of the scene image to be processed is different from the image type of the basal plane image, determining a reference image type from the image type of the scene image to be processed and the image type of the basal plane image;
selecting, from the scene image to be processed and the basal plane image, the image whose image type is the same as the reference image type to obtain a reference type image;
selecting, from the scene image to be processed and the basal plane image, the image whose image type is different from the reference image type to obtain an image to be converted;
converting the image to be converted into an image of the reference image type to obtain an image to be calculated;
and calculating the difference value between the image to be calculated and the reference type image to obtain the image difference.
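For claim 13, the conversion between the two image types in a rectified stereo setup is the usual relation parallax = f·B / depth, with f the focal length in pixels and B the baseline in metres; because the relation is its own inverse, the same mapping converts either type into the other before the subtraction. The stereo parameters and function name below are assumptions.

```python
import numpy as np

def to_reference_type(img, img_type, reference_type, focal_px, baseline_m):
    """Convert the image to be converted into the reference image type (depth <-> parallax)."""
    if img_type == reference_type:
        return img
    out = np.zeros_like(img, dtype=np.float32)
    nonzero = img > 0                                       # ignore invalid (zero) pixels
    out[nonzero] = (focal_px * baseline_m) / img[nonzero]   # d = f*B/Z and Z = f*B/d
    return out
```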
14. The method according to any one of claims 1 to 5, 9, wherein after the determining the image difference as an image processing result of the image of the scene to be processed, the method further comprises:
determining obstacle information of the scene to be processed based on the image processing result;
determining movement information of an application device on which the multi-view acquisition device is mounted, based on the obstacle information;
and controlling the application device to move in the scene to be processed based on the movement information.
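Claim 14 turns the image processing result into obstacle information for the carrier device (for example a cleaning robot or an autonomous vehicle). A toy post-processing step might threshold the difference image into an obstacle mask and halt forward motion when obstacles appear in a central region of interest; the threshold, the region size, and the stop decision below are purely illustrative and not taken from the patent.

```python
import numpy as np

def obstacle_mask(diff_img, min_height_m=0.02):
    """Pixels whose deviation from the basal plane exceeds a height threshold."""
    return diff_img > min_height_m

def should_stop(mask, roi_fraction=0.3):
    """Stop if any obstacle pixel falls inside a central region of interest of the image."""
    h, w = mask.shape
    cw = int(w * roi_fraction)
    roi = mask[:, (w - cw) // 2:(w + cw) // 2]
    return bool(roi.any())
```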
15. An image processing apparatus, characterized in that the image processing apparatus comprises:
a request response module, which is used for responding to a scene acquisition request and controlling a multi-view acquisition device to perform image acquisition on a scene to be processed to obtain a scene image to be processed, wherein the multi-view acquisition device comprises a plurality of image acquisition devices;
an image acquisition module, which is used for acquiring a basal plane image of a scene basal plane, wherein the basal plane image is obtained by performing image acquisition on the scene basal plane through the multi-view acquisition device, and the scene to be processed comprises the scene basal plane;
an image processing module, which is used for acquiring an image difference between the basal plane image and the scene image to be processed;
and a result acquisition module, which is used for determining the image difference as an image processing result of the scene image to be processed, wherein the image processing result represents scene imaging information except the scene basal plane in the scene to be processed.
16. An electronic device for image processing, the electronic device comprising:
a memory for storing computer-executable instructions or a computer program;
a processor for implementing the image processing method of any one of claims 1 to 14 when executing the computer-executable instructions or the computer program stored in the memory.
17. A computer-readable storage medium storing computer-executable instructions or a computer program, which, when executed by a processor, implements the image processing method according to any one of claims 1 to 14.
18. A computer program product comprising computer-executable instructions or a computer program, which, when executed by a processor, implements the image processing method of any of claims 1 to 14.
CN202410431365.3A 2024-04-11 2024-04-11 Image processing method, device, equipment, storage medium and program product Pending CN118038098A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410431365.3A CN118038098A (en) 2024-04-11 2024-04-11 Image processing method, device, equipment, storage medium and program product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410431365.3A CN118038098A (en) 2024-04-11 2024-04-11 Image processing method, device, equipment, storage medium and program product

Publications (1)

Publication Number Publication Date
CN118038098A true CN118038098A (en) 2024-05-14

Family

ID=90989726

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410431365.3A Pending CN118038098A (en) 2024-04-11 2024-04-11 Image processing method, device, equipment, storage medium and program product

Country Status (1)

Country Link
CN (1) CN118038098A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101527046A (en) * 2009-04-28 2009-09-09 青岛海信数字多媒体技术国家重点实验室有限公司 Motion detection method, device and system
CN113947724A (en) * 2021-10-25 2022-01-18 国网天津市电力公司电力科学研究院 Automatic line icing thickness measuring method based on binocular vision
CN114862964A (en) * 2022-04-27 2022-08-05 武汉智行者科技有限公司 Automatic calibration method for sensor, electronic device and storage medium
CN116563377A (en) * 2023-05-26 2023-08-08 北京邮电大学 Mars rock measurement method based on hemispherical projection model
EP4346199A1 (en) * 2022-09-30 2024-04-03 Samsung Electronics Co., Ltd. Imaging method and device for autofocusing


Similar Documents

Publication Publication Date Title
CN109564690B (en) Estimating the size of an enclosed space using a multi-directional camera
CN108520536B (en) Disparity map generation method and device and terminal
Liang et al. Image based localization in indoor environments
CN109446892B (en) Human eye attention positioning method and system based on deep neural network
CN109918977B (en) Method, device and equipment for determining idle parking space
CN109683699A (en) The method, device and mobile terminal of augmented reality are realized based on deep learning
CA2786436A1 (en) Depth camera compatibility
CN103443582A (en) Image processing apparatus, image processing method, and program
CN110619660A (en) Object positioning method and device, computer readable storage medium and robot
CN113052907B (en) Positioning method of mobile robot in dynamic environment
CN112083403B (en) Positioning tracking error correction method and system for virtual scene
TW201516957A (en) Image processing method and system using the same
US11282180B1 (en) Object detection with position, pose, and shape estimation
JP6803570B2 (en) Camera parameter set calculation device, camera parameter set calculation method, and program
CN111295667A (en) Image stereo matching method and driving assisting device
CN113470112A (en) Image processing method, image processing device, storage medium and terminal
CN111383264A (en) Positioning method, positioning device, terminal and computer storage medium
CN116012445A (en) Method and system for guiding robot to perceive three-dimensional space information of pedestrians based on depth camera
CN114179788A (en) Automatic parking method, system, computer readable storage medium and vehicle terminal
WO2019091115A1 (en) Method and system for scanning space using point cloud structure data
CN115328153A (en) Sensor data processing method, system and readable storage medium
CN110673607A (en) Feature point extraction method and device in dynamic scene and terminal equipment
CN114004935A (en) Method and device for three-dimensional modeling through three-dimensional modeling system
CN118038098A (en) Image processing method, device, equipment, storage medium and program product
Faisal et al. Multi-sensors multi-baseline mapping system for mobile robot using stereovision camera and laser-range device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination