WO2021026855A1 - Image processing method and device based on machine vision - Google Patents

Image processing method and device based on machine vision

Info

Publication number
WO2021026855A1
WO2021026855A1 (PCT/CN2019/100710; CN2019100710W)
Authority
WO
WIPO (PCT)
Prior art keywords
scene
image
detection model
information
environment
Prior art date
Application number
PCT/CN2019/100710
Other languages
English (en)
French (fr)
Inventor
夏志强
封旭阳
张李亮
Original Assignee
深圳市大疆创新科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司
Priority to CN201980033604.7A (CN112204566A)
Priority to PCT/CN2019/100710
Publication of WO2021026855A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/20: Image preprocessing
    • G06V10/24: Aligning, centring, orientation detection or correction of the image
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/20: Image preprocessing
    • G06V10/25: Determination of region of interest [ROI] or a volume of interest [VOI]

Definitions

  • the embodiments of the present disclosure relate to the technical field of intelligent control and perception, and in particular, to an image processing method and device based on machine vision.
  • Target detection algorithms are among the key technologies of autonomous driving and intelligent drones; they can detect and recognize the position, category, and confidence of objects of interest in visual images and provide the necessary observation information for subsequent intelligent functions.
  • in the related art, the target detection algorithm usually uses only one general model for all scenes, such as a trained neural network model or a perception algorithm model based on feature point recognition.
  • to guarantee highly reliable recognition results in different scenes, a neural network model needs to learn a large amount of data from different scenes; and to obtain high-performance detection results in the different scenarios, the model design is often rather complicated, which greatly increases the amount of computation.
  • the present disclosure provides an image processing method and device based on machine vision, which improves image processing efficiency.
  • in a first aspect, the present disclosure provides an image processing method based on machine vision, applied to a movable platform equipped with an image acquisition device; the method includes: acquiring an environment image; using a preloaded environment detection model to determine the current scene from the environment image; loading a scene detection model that matches the current scene; and processing the environment image based on the scene detection model.
  • in a second aspect, the present disclosure provides a vehicle equipped with a camera device, a memory, and a processor; the memory is used to store instructions, and the instructions are executed by the processor to implement the method described in any one of the first aspect.
  • in a third aspect, the present disclosure provides a drone equipped with a camera device, a memory, and a processor; the memory is used to store instructions that are executed by the processor to implement the method described in any one of the first aspect.
  • in a fourth aspect, the present disclosure provides an electronic device that is communicatively connected to a camera device; the electronic device includes a memory and a processor, the memory is used to store instructions, and the instructions are executed by the processor to implement the method described in any one of the first aspect.
  • in a fifth aspect, the present disclosure provides a handheld gimbal that includes a camera device, a memory, and a processor; the memory is used to store instructions that are executed by the processor to implement the method described in any one of the first aspect.
  • in a sixth aspect, the present disclosure provides a mobile terminal that includes a camera device, a memory, and a processor; the memory is used to store instructions, and the instructions are executed by the processor to implement the method described in any one of the first aspect.
  • the present disclosure provides an image processing method and device based on machine vision: an environment image is acquired; a preloaded environment detection model is used to determine the current scene from the environment image; a scene detection model matching the current scene is loaded; and the environment image is processed based on the scene detection model. When computing power is constrained, a lightweight scene detection model corresponding to the current scene is selected, which improves image processing efficiency and performance in each individual scene.
  • FIG. 1 is a schematic diagram of a drone provided by an embodiment of the present disclosure.
  • FIG. 2 is a schematic diagram of a handheld gimbal provided by an embodiment of the present disclosure.
  • FIG. 3 is a schematic diagram of an application provided by an embodiment of the present disclosure.
  • FIG. 4 is a schematic flowchart of an embodiment of the image processing method based on machine vision provided by the present disclosure.
  • FIG. 5 is a schematic diagram of a scene provided by an embodiment of the present disclosure.
  • FIG. 6 is a schematic diagram of a scene provided by another embodiment of the present disclosure.
  • FIG. 7 is a schematic comparison of network models according to an embodiment of the present disclosure.
  • FIG. 8 is a schematic flowchart of another embodiment of the image processing method provided by the present disclosure.
  • FIG. 9 is a schematic flowchart of yet another embodiment of the image processing method of the present disclosure.
  • FIG. 10 is a schematic structural diagram of a vehicle provided by an embodiment of the present disclosure.
  • FIG. 11 is a schematic structural diagram of a drone provided by an embodiment of the present disclosure.
  • FIG. 12 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
  • FIG. 13 is a schematic structural diagram of a handheld gimbal provided by an embodiment of the present disclosure.
  • FIG. 14 is a schematic structural diagram of a mobile terminal provided by an embodiment of the present disclosure.
  • FIG. 15 is a schematic diagram of memory loading disclosed in an embodiment of this specification.
  • the machine vision-based image processing method provided by the embodiments of the present disclosure is applied to scenarios such as autonomous driving and intelligent drones; it can detect and recognize the position, category, and confidence of objects of interest in an image, providing the necessary observation information for other subsequent functions.
  • in an optional embodiment, the method may be executed by a drone 10.
  • as shown in FIG. 1, the drone 10 may be equipped with a camera device 1; for example, the method may be implemented by the processor of the drone executing the corresponding software code, or by the drone executing the corresponding software code while exchanging data with a server, where the server performs some of the operations to control the drone to execute the image processing method.
  • in an optional embodiment, the method may be executed by a handheld gimbal.
  • as shown in FIG. 2, the handheld gimbal 20 may include a camera device 2; for example, the method may be implemented by the processor of the handheld gimbal executing the corresponding software code, or by the handheld gimbal executing the corresponding software code while exchanging data with a server, where the server performs some of the operations to control the handheld gimbal to execute the image processing method.
  • the camera device is used to obtain environment images, for example images of the environment around the drone or the handheld gimbal.
  • in an optional embodiment, the method may be executed by an electronic device such as a mobile terminal; as shown in Figure 3, the electronic device may be set on a vehicle or a drone, or the method may be executed by a vehicle-mounted control device communicating with the electronic device.
  • the above-mentioned vehicle may be a self-driving vehicle or an ordinary vehicle.
  • for example, the method may be implemented by the electronic device, such as the processor of the electronic device executing the corresponding software code, or by the electronic device executing the corresponding software code while exchanging data with a server, where the server performs some of the operations to control the electronic device to execute the image processing method.
  • FIG. 4 is a schematic flowchart of an embodiment of an image processing method based on machine vision provided by the present disclosure. As shown in FIG. 4, the method provided in this embodiment is applied to a movable platform equipped with an image acquisition device, and the method includes:
  • Step 101 Acquire environmental images.
  • the environment image may be image information collected by an image acquisition device.
  • the image acquisition device is usually mounted on a movable body, which may be a vehicle, an unmanned aerial vehicle, a ground mobile robot, etc.
  • the image acquisition device may be a monocular camera, a binocular camera, a multi-camera device, a fish-eye lens, a compound-eye lens, and so on.
  • the imaging device acquires environmental image information around the movable body, for example, image information of the front, back, or side of the movable body.
  • the camera device can also obtain wide-format or panoramic information around the movable body; multiple images, parts of images, or combinations of images can be obtained.
  • the acquired environment image may be an original image output by the image sensor, or an image that has undergone image processing but retains the original image brightness information, for example, an image in RGB format or HSV format.
  • the above-mentioned environment image may be the environment image information collected by the image acquisition device during the driving process of the vehicle or during the flight of the drone.
  • a movable platform refers to a platform such as a drone, a vehicle, or an electronic device.
  • Step 102 Use the preloaded environment detection model to determine the current scene according to the environment image.
  • determining the current scene information includes extracting the possible scene where the movable body is located according to the environment image obtained in step 101.
  • This step can be implemented according to a judgment function, for example, reading the RGB or HSV distribution information of the environment image obtained in step 101, and judging the current scene according to the distribution.
  • This step can also be a process of statistical comparison, for example, reading the histogram information in the HSV, and then judging the scene based on the histogram information.
  • This step can also use an environment detection model, which may be implemented based on a neural network: a neural network is constructed that outputs the current scene from the input environment image.
  • the scene may include scenes at different times, such as day and night; scenes with different weather, such as sunny, rainy, foggy, and snowy; and scenes with different road conditions, such as highways, urban roads, and country roads.
  • the current scene may include at least two scenes divided according to image brightness.
  • the current scene divided according to image brightness may include a high-brightness scene and a low-brightness scene.
  • the current scene divided according to image brightness may include a high-brightness scene, a medium-brightness scene, and a low-brightness scene.
  • the current scene may include at least two scenes divided according to image visibility.
  • the current scene divided according to image visibility may include a high visibility scene and a low visibility scene.
  • the current scene divided according to the visibility of the image may include a scene with high visibility, a scene with medium visibility, and a scene with low visibility.
  • the at least two scenes classified according to the visibility of the image may include a haze scene, a sand dust scene, a snow scene, a rain scene, and the like.
  • the current scene may include at least two scenes divided according to image texture information.
  • the scene divided according to the image texture information includes weather information.
  • the weather information includes weather information such as rain, snow, fog, and blowing sand.
  • taking a neural network as an example, the network used for scene recognition only needs to output a small number of classification results.
  • to achieve accurate output, its network layers do not need many parameters; that is, the neural network used for this judgment step consumes only a small amount of system computing power, and loading the model consumes only a small amount of system bandwidth, as the sketch below illustrates.
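As an illustration of how small such a scene classifier can be, the following is a minimal sketch assuming PyTorch; the layer sizes, the number of scene classes, and the class meanings are illustrative assumptions, not parameters taken from the disclosure.

```python
# Hypothetical lightweight environment detection model: a few convolution layers and a
# small classification head, so that it can stay loaded at all times at little cost.
import torch
import torch.nn as nn

class EnvironmentDetectionModel(nn.Module):
    def __init__(self, num_scenes: int = 3):  # e.g. high / medium / low brightness
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(16, num_scenes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, 3, H, W) compressed environment image, e.g. 360 x 640
        return self.classifier(self.features(x).flatten(1))  # scene logits

model = EnvironmentDetectionModel()
print(sum(p.numel() for p in model.parameters()))  # well under two thousand parameters
```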
  • the environment detection model can be preloaded before the current scene is determined, and no loading operation is required during use, which can improve processing efficiency.
  • in an optional embodiment, the preloaded environment detection model is always in a loaded state during environment image acquisition.
  • to guarantee processing efficiency, the preloaded environment detection model remains loaded throughout environment image acquisition, so it can be used to determine the current scene at any time.
  • Step 103 Load a scene detection model matching the current scene.
  • this step loads a scene detection model matching the current scene based on the current scene determined in step 102.
  • the scene detection model can be built on neural network models such as a CNN, VGG, or GoogLeNet, and trained on training data from different scenes to obtain scene detection models matching the different scenes.
  • the scenes may include scenes at different times, such as day and night; scenes of different weather, such as sunny, rainy, foggy, snowy, etc.; scenes of different road conditions, such as highways, urban roads, and rural roads.
  • the scenes where the vehicle is located in Figure 5 and Figure 6 are a sunny scene and a cloudy scene, or a high-brightness scene and a low-brightness scene, respectively.
  • the scene detection model corresponding to each scene does not need many parameters and consumes only a small amount of system computing power; several small scene detection models corresponding to multiple scenes replace one large general detection model, so that the device can work normally when computing power is limited.
  • for example, suppose the computing power of the device is 500M. If a 2.7G network model (such as part a on the left of FIG. 7) had to be loaded to realize the image processing function, this would obviously be impossible. In the solution of the embodiments of the present disclosure, the large network model is split into several small network models each smaller than 500M (i.e., the scene detection models, such as part b on the right of FIG. 7), so that the device can work normally even though its computing power is limited.
  • the scene detection model may also be established based on other network models, which is not limited in the present disclosure.
  • the scene detection model matching the current scene is switched and loaded as the current scene changes.
  • the scene detection model matching the current scene does not exit the memory due to switching loading.
  • a scene detection model that matches the current scene is loaded based on the current scene, and if the current scene changes, the scene detection model that matches the changed scene is switched to load.
  • the scene detection model may not exit the memory, and the loading speed can be increased for the next use.
  • in an optional embodiment, the preloaded environment detection model and the scene detection model are in different threads.
  • specifically, while the scene detection model matching the previously determined scene is processing the environment image, the environment detection model can simultaneously determine the current scene; the scene may have changed by then and no longer match the loaded scene detection model.
  • after the environment image has been processed with that scene detection model, the scene detection model matching the changed scene can be switched in and loaded to process subsequent environment images.
  • in an optional embodiment, the preloaded environment detection model communicates between threads through a callback function.
  • for example, the information of the current scene determined by the environment detection model may be notified to the scene detection model through the callback function, or the environment image obtained by the image acquisition device may be obtained via the callback function; one way this loading and switching could be organized is sketched below.
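The following sketch shows one way the loading behavior described above could be organized, assuming Python threading; the manager class, the loader functions, and the scene names are hypothetical, and caching every loaded model is only one possible reading of "does not exit the memory".

```python
# Hypothetical scene-model manager: the environment-detection thread reports scene
# changes through a callback, the matching scene detection model is switched in, and
# previously loaded models stay cached so switching back does not reload them.
import threading

class SceneModelManager:
    def __init__(self, loaders):
        self.loaders = loaders      # scene name -> function that builds/loads the model
        self.cache = {}             # loaded models remain resident after a switch
        self.active_scene = None
        self.lock = threading.Lock()

    def on_scene_detected(self, scene):
        """Callback invoked from the environment-detection thread."""
        with self.lock:
            if scene != self.active_scene:
                if scene not in self.cache:
                    self.cache[scene] = self.loaders[scene]()
                self.active_scene = scene

    def process(self, image):
        with self.lock:
            model = self.cache.get(self.active_scene)
        if model is None:
            return None             # no scene has been determined yet
        return model(image)         # run the active scene detection model
```

In such an arrangement, the environment detection model would run in its own thread and call on_scene_detected whenever the classified scene changes, while the main thread keeps calling process on incoming environment images.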
  • Step 104 Process the environment image based on the scene detection model.
  • in an optional embodiment, the environment image is processed based on the scene detection model corresponding to the identified current scene, for example to identify the position of a target object in the environment image, the category to which the target object belongs, and the confidence in that category.
  • processing the environment image based on the scene detection model includes: acquiring object information in the environment image.
  • the object information includes: location information of the target object in the environment image, category information of the target object, and confidence of the target object in the corresponding category.
  • a non-maximum value suppression method is used to filter the object information to obtain the target detection result.
  • specifically, the object information output by the scene detection model contains a very large number of target object entries, many of which are repeated; for example, there are many position candidates, and some of them overlap.
  • the object information can be filtered by methods such as non-maximum suppression to obtain the final target detection result (a standard routine is sketched below).
  • that is, the position, category, and confidence of the objects of interest in the image are finally obtained; this output can be provided as external observation information to downstream modules, such as state estimation and navigation control, to complete more complex automatic driving functions.
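The filtering step mentioned above can use a standard non-maximum suppression routine; the sketch below is one common formulation and is not specific to the disclosure. Boxes are (x1, y1, x2, y2) corners and scores are the per-box confidences output by the scene detection model.

```python
import numpy as np

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Keep the highest-scoring boxes and drop boxes that overlap them too much."""
    order = scores.argsort()[::-1]          # indices sorted by descending confidence
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # Intersection-over-union of the top box with the remaining boxes
        x1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        y1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        x2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        y2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + areas - inter + 1e-9)
        order = order[1:][iou <= iou_threshold]   # discard overlapping, lower-scoring boxes
    return keep
```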
  • in an optional embodiment, the information of the environment image is input into the loaded scene detection model corresponding to the current scene, and the target detection result is output after several network layers of the scene detection model; it includes, for example, the position of the target object, the category to which it belongs, and the confidence in that category.
  • the target object may be, for example, a dynamic target and/or a static target.
  • the dynamic target may include a moving vehicle, a drone, etc.
  • the static target may include, for example, surrounding trees, road signs, telephone poles, and so on.
  • exemplarily, as shown in FIG. 5, the image acquisition device mounted on the vehicle acquires the environment image around the vehicle.
  • the vehicle uses the preloaded environment detection model to determine the current scene from the environment image, for example determines that the current scene is a high-brightness scene, loads the scene detection model corresponding to the high-brightness scene, and processes the environment image acquired by the image acquisition device based on that scene detection model.
  • exemplarily, as shown in FIG. 6, the image acquisition device mounted on the vehicle acquires the environment image around the vehicle.
  • the vehicle uses the preloaded environment detection model to determine the current scene from the environment image, for example determines that the current scene is a low-brightness scene, loads the scene detection model corresponding to the low-brightness scene, and processes the environment image acquired by the image acquisition device based on that scene detection model.
  • with the method of this embodiment, an environment image is acquired; a preloaded environment detection model is used to determine the current scene from the environment image; a scene detection model matching the current scene is loaded; and the environment image is processed based on the scene detection model.
  • when computing power is constrained, selecting the lightweight scene detection model corresponding to the current scene improves image processing efficiency and performance in each individual scene.
  • the environmental image may also be compressed.
  • specifically, the acquired environment image is generally color RGB image information with a relatively large resolution, such as 1280×720.
  • when the environment image is processed, it can be compressed, for example by reducing the resolution to 640×360, which improves processing efficiency when computing power is constrained (a sketch is given below).
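A minimal sketch of that compression step, assuming OpenCV is available; the target resolution follows the 640×360 example above.

```python
import cv2

def compress_environment_image(image):
    # Downscale e.g. a 1280x720 frame to 640x360 before scene determination/detection.
    return cv2.resize(image, (640, 360), interpolation=cv2.INTER_AREA)
```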
  • the pre-loaded environment detection model is used to extract brightness information in the environment image to determine the current scene.
  • the RGB or HSV information of the environmental image can be obtained, thereby extracting the brightness information in the environmental image, and then determining the current scene, such as a high-brightness scene, a medium-brightness scene, and a low-brightness scene divided by image brightness.
  • high visibility scenes, medium visibility scenes, and low visibility scenes are classified according to image visibility.
  • in an optional embodiment, the preloaded environment detection model is used to extract both the brightness information in the environment image and the image content to determine the current scene.
  • further, in addition to extracting the brightness information of the environment image, the aforementioned preloaded environment detection model can also extract the image content and determine the current scene by combining the image content with the brightness information.
  • further, one possible implementation of step 102 is as follows: obtain the distribution information of the environment image and use the distribution information to determine the current scene.
  • in an optional embodiment, the RGB or HSV distribution information of the environment image obtained in step 101 is read, and the current scene is determined from this distribution information.
  • for the RGB distribution information, the R, G, and B channel values of the pixels in the environment image can each be averaged to obtain the average pixel value of each channel, or the proportion of pixels whose brightness value is greater than a preset brightness value can be obtained, and so on, in order to determine the current scene. For example, if the proportion of pixels whose brightness value is greater than the preset brightness value exceeds a certain value, the scene can be determined to be a high-brightness scene, such as a daytime scene.
  • for the HSV distribution information, HSV is a way of representing points of the RGB color space in an inverted cone. HSV stands for Hue, Saturation, and Value: hue is the basic attribute of a color, i.e. the usual color name such as red or yellow; saturation refers to the purity of the color, where a higher value means a purer color and a lower value means it gradually turns gray, taking values of 0-100%; value refers to the brightness of the color, also taking values of 0-100%.
  • in an optional embodiment, after the HSV distribution information is obtained, the H, S, and V channel values of the pixels in the environment image can each be averaged to obtain the average pixel value of each channel, or the proportion of pixels whose brightness value is greater than a preset brightness value can be obtained, or the proportion of red and yellow light can be obtained, in order to determine the current scene; a brightness-based sketch is given below.
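One possible judgment function of this kind is sketched below, assuming OpenCV; the brightness threshold and ratio are illustrative values, not thresholds taken from the disclosure.

```python
import cv2
import numpy as np

def determine_scene_by_brightness(image_bgr, bright_value=170, bright_ratio=0.25):
    """Classify the image as a high-brightness or low-brightness scene."""
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    v = hsv[:, :, 2].astype(np.float32)              # V channel = brightness
    mean_v = float(v.mean())                         # average brightness
    ratio_bright = float((v > bright_value).mean())  # proportion of bright pixels
    if ratio_bright > bright_ratio or mean_v > 128:
        return "high_brightness"                     # e.g. a daytime scene
    return "low_brightness"                          # e.g. a night scene
```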
  • further, another possible implementation of step 102 is as follows: compute the histogram information of the environment image and use the histogram information to determine the current scene.
  • in an optional embodiment, the RGB or HSV histogram information of the environment image obtained in step 101 is read, and the current scene is determined from the RGB or HSV histogram.
  • for the RGB histogram information, in an optional embodiment, after the environment image is obtained, statistics are computed over the R, G, and B channels of its pixels to obtain histogram information, and the current scene is determined from the histogram information of the R, G, and B channels.
  • for the HSV histogram information, in an optional embodiment, after the environment image is obtained, statistics are computed over the H, S, and V channels of its pixels to obtain histogram information, and the current scene is determined from the histogram information of the H, S, and V channels.
  • further, the distribution information or histogram information obtained above may also be input into a pre-trained environment detection model, which outputs information about the current scene, thereby determining the current scene.
  • further, another possible implementation of step 102 is as follows: the current scene is determined from the environment image using a pre-trained environment detection model.
  • the environment image can be directly input into the environment detection model, and the corresponding current scene information is output.
  • the environment detection model can be established based on a neural network model such as CNN, and trained based on training data to obtain better parameters of the environment detection model.
  • the environment detection model needs to output only a small number of classification results; to achieve accurate output, its network layers do not require many parameters. That is, the neural network used for this judgment step consumes only a small amount of system computing power, and loading the model consumes only a small amount of system bandwidth.
  • the environment detection model may also be established based on other network models, which is not limited in the embodiments of the present disclosure.
  • further, another possible implementation of step 102 is as follows: obtain the road-sign information in the environment image and determine the current scene from the road-sign information, for example an urban road scene or a highway scene.
  • for example, the road-sign information in the environment image can be obtained through a recognition algorithm; a hypothetical mapping from recognized signs to scenes is sketched below.
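A hypothetical mapping from recognized road-sign categories to a road-type scene is sketched below; the sign labels and the recognition step that produces them are assumptions, not part of the disclosure.

```python
HIGHWAY_SIGNS = {"motorway_entry", "minimum_speed_limit", "toll_station"}
URBAN_SIGNS = {"pedestrian_crossing", "bus_lane", "traffic_signal"}

def determine_scene_by_landmarks(sign_labels):
    """Map road-sign labels returned by some recognition algorithm to a scene."""
    labels = set(sign_labels)
    if labels & HIGHWAY_SIGNS:
        return "highway"
    if labels & URBAN_SIGNS:
        return "urban_road"
    return "unknown"
```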
  • step 104 may be specifically implemented in the following manner:
  • if the determined current scene includes multiple scenes, such as a daytime scene, a snow scene, and a highway scene (for example, multiple scenes can be determined simultaneously from one environment image, e.g. it is at once a daytime scene, a snow scene, and a highway scene), the scene detection models corresponding to these scenes can be loaded in sequence, and the environment image can be processed based on the scene detection models corresponding to the multiple scenes.
  • in an optional embodiment, suppose that first the scene detection model matching the daytime scene is loaded and the environment image is processed with it to obtain a first detection result; then the scene detection model matching the snow scene is loaded, the first detection result and the environment image information are input into it and processed with it, with the first detection result serving as prior information so that the resulting second detection result is more accurate; then the scene detection model matching the highway scene is loaded, the first detection result, the second detection result, and the environment image information are input into it and processed with it, with the first and second detection results serving as prior information so that the resulting third detection result is more accurate. Finally, the target detection result is obtained from the third detection result, or from the first, second, and third detection results together. A sketch of this cascade is given below.
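A sketch of that cascade is given below; the model interface (a callable that takes the environment image plus earlier results) and the scene names are hypothetical simplifications.

```python
def detect_with_multiple_scenes(environment_image, scenes, load_model):
    """Run the per-scene detection models in sequence, feeding earlier results as priors."""
    priors = []                                     # detection results from earlier models
    for scene in scenes:                            # e.g. ["daytime", "snow", "highway"]
        model = load_model(scene)                   # load the scene detection model for this scene
        result = model(environment_image, priors)   # earlier results serve as prior information
        priors.append(result)
    return priors[-1]                               # or combine all intermediate results
```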
  • obtaining the target detection result can be specifically implemented in the following manner:
  • the third detection result (or at least one of the first, second, and third detection results) is filtered using a non-maximum suppression method to obtain the target detection result; the target detection result includes at least one of the following: the position information of a target object in the environment image information, the category information of the target object, and the confidence of the target object in the corresponding category.
  • specifically, the detection result output by the scene detection model contains a very large number of target object entries, many of which are repeated; for example, there are many position candidates, and some of them overlap. Methods such as non-maximum suppression can be used to filter the detection results to obtain the final target detection result.
  • the output can be used as external observation information to provide downstream modules, such as state estimation, navigation control, etc., to complete more complex automatic driving functions.
  • on the basis of the above embodiments, the following operations may further be performed before step 103: obtain training data corresponding to the scene detection model that matches the current scene, where the training data includes environment image data from different scenes annotated with the position information and category information of the target objects; and train the scene detection model with this training data.
  • specifically, the scene detection model corresponding to each scene needs to be pre-trained to obtain good parameters for that scene detection model.
  • to obtain scene detection models that perform well for different scenes such as the daytime environment and the nighttime environment, the models need to be trained separately on the training data corresponding to the different scenes, such as daytime data and nighttime data.
  • specifically, a batch of training data is collected in advance for each of the different scenes, such as day and night; each training sample contains an environment image and the position and category labels of the objects of interest in that image. Models are then designed and trained on the training data corresponding to each scene, so as to obtain good scene detection models for the different scenes.
  • during model training, a corresponding training set is used for each scene to train that scene's detection model; in actual use, the current scene is first determined from the environment image, and the scene detection model corresponding to the current scene is then loaded for target detection, which improves detection performance and, when computing power is constrained, detection efficiency. A per-scene training sketch is given below.
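The per-scene training procedure can be sketched as below, assuming PyTorch; the model builder, its loss computation, and the optimizer settings are placeholders rather than the training setup of the disclosure.

```python
import torch

def train_scene_models(scene_datasets, build_model, epochs=10):
    """Train one detection model per scene on that scene's annotated environment images."""
    models = {}
    for scene, loader in scene_datasets.items():    # e.g. {"daytime": DataLoader(...), "night": ...}
        model = build_model()
        optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
        for _ in range(epochs):
            for images, targets in loader:          # targets: object positions and categories
                loss = model.compute_loss(images, targets)  # assumed detection loss interface
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
        models[scene] = model
    return models
```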
  • FIG. 8 is a schematic flowchart of another embodiment of the target detection method provided by the present disclosure. As shown in Figure 8, the method provided in this embodiment includes:
  • Step 201 Acquire environmental images.
  • the environment image may be image information collected by the image acquisition device, such as an environment image around the vehicle.
  • the environment image may include multiple images, such as an image that triggers the loading of a corresponding scene detection model, or an image used to determine the current scene.
  • Step 202 Extract feature information in the environmental image.
  • further, before step 202, the environment image may also be compressed.
  • Step 203 Determine the current scene according to the feature information in the environment image.
  • the current scene can be determined based on the environmental image information, such as a scene at a different time, such as a daytime scene or a night scene.
  • the acquired environment image is generally color RGB image information, and the image resolution is generally large, such as 1280 ⁇ 720.
  • the environment image information can be compressed, such as compressing the resolution to 640 ⁇ 360, which can improve processing efficiency when computing power is restricted.
  • the current scene can be determined using the environment detection model based on the feature information extracted from the environment image, for example, a daytime scene or a night scene.
  • the feature information includes at least one of the following: average pixel value, proportion of high brightness value, proportion of red and yellow light, and HSV three-channel statistical histogram of hue, saturation and brightness.
  • the color image can be stacked by the three channels of R, G, and B, and the histogram of each channel can be extracted separately.
  • the average pixel value can be the average of the three channels.
  • the proportion of high brightness value refers to the proportion of pixels whose brightness value is greater than the preset high brightness value.
  • HSV is a way of representing points in the RGB color space in an inverted cone.
  • HSV stands for Hue, Saturation, and Value.
  • hue is the basic attribute of a color, i.e. the usual color name such as red or yellow; saturation refers to the purity of the color, where a higher value means a purer color and a lower value means it gradually turns gray, taking values of 0-100%; value refers to the brightness of the color, also taking values of 0-100%.
  • the HSV color space feature extraction method is similar to RGB.
  • the key point is to convert the original image into an HSV color space image, and then perform histogram drawing operations on the three channels separately.
  • the proportion of red and yellow light can also be obtained.
  • the HSV three-channel statistical histogram may contribute 3×20=60 feature values; in an embodiment the above four kinds of features may be spliced together to form feature information with a length of 63, as in the sketch below.
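A sketch of assembling that 63-dimensional feature vector is given below, assuming OpenCV: the average pixel value (1), the proportion of high-brightness pixels (1), the proportion of red/yellow pixels (1), and a 20-bin histogram for each of the H, S, and V channels (3×20=60). The brightness threshold and the hue range treated as red/yellow are illustrative assumptions.

```python
import cv2
import numpy as np

def extract_scene_features(image_bgr, bright_value=200):
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    h, v = hsv[:, :, 0], hsv[:, :, 2]
    mean_pixel = float(image_bgr.mean()) / 255.0               # average pixel value
    high_brightness_ratio = float((v > bright_value).mean())   # proportion of bright pixels
    red_yellow_ratio = float((h < 35).mean())                  # OpenCV hue 0-179; ~0-35 covers red/yellow
    ranges = ([0, 180], [0, 256], [0, 256])                    # H, S, V value ranges
    hist = np.concatenate([cv2.calcHist([hsv], [c], None, [20], ranges[c]).ravel()
                           for c in range(3)])
    hist = hist / (hist.sum() + 1e-9)                          # normalized 60-bin histogram
    return np.concatenate([[mean_pixel, high_brightness_ratio, red_yellow_ratio], hist])  # length 63
```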
  • a pre-trained environment detection model can be used, the extracted feature information is input into the environment detection model, and the corresponding current scene information is output;
  • further, for the different time scenes such as daytime and night, or weather scenes such as snowy, foggy, rainy, and sunny, step 203 can be specifically implemented in the following manner: determine the ambient light intensity of the current scene from the feature information in the environment image, and determine the current scene from that ambient light intensity.
  • in an optional embodiment, a pre-trained environment detection model can be used: the extracted feature information is input into the environment detection model, which outputs the ambient light intensity of the current scene, and the current scene is determined from this ambient light intensity. Since different time scenes, such as a daytime scene and a night scene, have different ambient light intensities, the current scene can be determined from the ambient light intensity.
  • the environment detection model can also be trained in advance, which can be specifically implemented in the following ways:
  • the training data includes feature information of multiple environmental images and scene information corresponding to each environmental image, or multiple environmental images and scene information corresponding to each environmental image;
  • the pre-established environment detection model is trained through the training data to obtain a trained environment detection model.
  • specifically, the environment detection model can be built with deep learning algorithms, for example a convolutional neural network (CNN) model, a VGG model, or a GoogLeNet model.
  • to obtain an environment detection model with good recognition performance for different scenes such as daytime scenes and night scenes, the environment detection model is trained on the training data corresponding to those different scenes, so as to obtain good parameters for the environment detection model.
  • Step 204 Load a scene detection model matching the current scene.
  • this step loads the corresponding scene detection model in the memory of the device based on the current scene determined in step 203.
  • Step 205 Process the environment image based on the scene detection model to obtain the first detection result.
  • the environment image is processed based on the scene detection model corresponding to the current scene, such as identifying the position of the target object in the environment image, the category to which the target object belongs, and the confidence in the category, etc.
  • the scene detection model may be a machine learning model obtained by pre-training, such as a convolutional neural network model.
  • the corresponding training data set is used to train the scene detection model for each scene.
  • the information of the environment image is input into the scene detection model corresponding to the current scene, and the first detection result is output after processing by several convolutional layers and pooling layers.
  • Step 206 Use a non-maximum suppression method to filter the first detection result to obtain the target detection result; the target detection result includes at least one of the following: the position information of the target object in the environment image, the category information of the target object, and the confidence of the target object in the corresponding category.
  • specifically, the detection result output by the scene detection model contains a very large number of target object entries, many of which are repeated; for example, there are many position candidates, and some of them overlap.
  • Methods such as non-maximum suppression can be used to filter the detection results to obtain the final target detection results.
  • the output can be used as external observation information to provide downstream modules, such as state estimation, navigation control, etc., to complete more complex automatic driving functions.
  • further, in an embodiment of the present disclosure, if the current scene includes a first scene and a second scene, step 205 can be implemented in the following manner:
  • Step 2051 Process the environment image based on the scene detection model matched by the first scene to obtain the first detection result
  • Step 2052 process the first detection result based on the scene detection model matched by the second scene, and obtain the second detection result
  • Step 2053 Obtain a target detection result according to the second detection result.
  • the scene can be determined based on the environmental image.
  • for example, the current scene may be one of different time scenes such as daytime and night, weather scenes such as snowy, foggy, rainy, and sunny, or road-condition scenes such as highways, rural roads, and urban roads.
  • the current scene includes at least two scenes, for example, the first scene and the second scene.
  • suppose the first scene is a daytime scene among the time scenes: the environment image is processed based on the scene detection model matched to the first scene to obtain a first detection result; further, the first detection result is input into the scene detection model matched to the second scene, for example a snow scene among the weather scenes, which processes the first detection result to obtain a second detection result, and the target detection result is finally obtained from the second detection result.
  • since, when the detection model matched to the second scene is used for target detection, the environment image has already been processed by the scene detection model matched to the first scene, prior information is available, so the final target detection result is more accurate.
  • the first scene and the second scene may be a high brightness scene and a low brightness scene, respectively.
  • with the method of this embodiment, an environment image is acquired; the current scene is determined from the environment image; a scene detection model matching the current scene is loaded; and the environment image is processed based on the scene detection model. When computing power is constrained, selecting the lightweight scene detection model corresponding to the current scene improves image processing efficiency and detection performance in each individual scene.
  • an embodiment of the present disclosure also provides a vehicle.
  • the vehicle is equipped with a camera device 11, a memory 12, and a processor 13; the memory 12 is used to store instructions, and the instructions are executed by the processor 13 to implement the method described in any one of the foregoing method embodiments.
  • the vehicle provided in this embodiment is used to execute the image processing method provided in any of the foregoing embodiments, and the technical principles and technical effects are similar, and will not be repeated here.
  • an embodiment of the present disclosure also provides a drone.
  • the drone is equipped with a camera device 21, a memory 22, and a processor 23; the memory 22 is used to store instructions, and the instructions are executed by the processor 23 to implement the method described in any one of the foregoing method embodiments.
  • the drone provided in this embodiment is used to execute the image processing method provided in any of the foregoing embodiments.
  • the technical principles and technical effects are similar, and details are not repeated here.
  • an embodiment of the present disclosure also provides an electronic device, which is communicatively connected to the camera device.
  • the electronic device includes a memory 32 and a processor 31; the memory 32 is used to store instructions, and the instructions are executed by the processor 31 to implement the method described in any one of the foregoing method embodiments.
  • the electronic device provided in this embodiment is used to execute the image processing method provided in any one of the foregoing embodiments.
  • the technical principles and technical effects are similar, and will not be repeated here.
  • an embodiment of the present disclosure also provides a handheld gimbal.
  • the handheld gimbal includes: a camera device 41, a memory 42, and a processor 43; the memory 42 is used to store instructions, and the instructions are executed by the processor 43 to implement the method described in any one of the foregoing method embodiments.
  • the handheld gimbal provided in this embodiment is used to execute the image processing method provided in any one of the foregoing embodiments.
  • the technical principles and technical effects are similar, and will not be repeated here.
  • an embodiment of the present disclosure also provides a mobile terminal.
  • the mobile terminal includes a camera device 51, a memory 52, and a processor 53; the memory 52 is used to store instructions, and the instructions are executed by the processor 53 to implement the method described in any one of the foregoing method embodiments.
  • the mobile terminal provided in this embodiment is used to execute the image processing method provided in any of the foregoing embodiments.
  • the technical principles and technical effects are similar, and details are not repeated here.
  • the embodiments of the present disclosure also provide a computer-readable storage medium on which a computer program is stored.
  • when the computer program is executed by a processor, the corresponding method in the foregoing method embodiments is implemented; for the specific implementation process, reference may be made to the foregoing method embodiments.
  • the implementation principles and technical effects are similar and are not repeated here.
  • the embodiment of the present disclosure also provides a program product.
  • the program product includes a computer program (that is, an execution instruction), and the computer program is stored in a readable storage medium.
  • the processor can read the computer program from a readable storage medium, and the processor executes the computer program to execute the target detection method provided by any one of the foregoing method embodiments.
  • An embodiment of the present disclosure also provides a vehicle, including:
  • the electronic device is installed on the vehicle body.
  • the implementation principle and technical effect are similar to the method embodiment, and will not be repeated here.
  • An embodiment of the present disclosure also provides a drone, including:
  • the electronic device is installed on the body of the drone.
  • the implementation principle and technical effect are similar to the method embodiment, and will not be repeated here.
  • FIG. 15 is a schematic diagram of a ratio of memory occupation during model loading according to an embodiment of this specification.
  • the environment detection model is always loaded, for example, it can always be loaded in the processor memory during the operation of the mobile platform. It only needs to judge the current environment, and the required system resources are small.
  • the environment detection model only needs to identify and output the category information of the current environment, which is used to load the scene detection model.
  • the scene detection model is used to detect objects around the movable platform. On the one hand, splitting the work between the environment detection model and the scene detection models greatly reduces the resources occupied by loaded models at any given time; on the other hand, a scene detection model occupies more resources than the environment detection model.
  • the environment detection model may be a trained neural network model, which can output the recognized classification results according to the input image information, such as day, night, rain, snow, and fog.
  • the environment detection model may be a trained neural network model, which can output recognized two-dimensional classification results according to the input image information, such as day-rain, night-rain, and day-fog.
  • the environment detection model can be a trained neural network model that outputs a recognized three-dimensional classification result from the input image information.
  • the dimensions include, but are not limited to, time, weather, and brightness, such as day-rain-dim, night-rain-dark, and day-sunny-bright.
  • the environment detection model can be a trained neural network model that outputs a recognized four-dimensional or even higher-dimensional classification result from the input image information.
  • the dimensions include, but are not limited to, time, weather, brightness, and road type, such as day-rain-dark-road, night-rain-dark-road, and day-clear-bright-tunnel; a multi-head sketch is given below.
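One way such a multi-dimensional output could be realized is a shared backbone with one classification head per dimension, as in the hypothetical sketch below (assuming PyTorch); the dimension names and class counts are illustrative.

```python
import torch
import torch.nn as nn

class MultiDimensionalEnvironmentModel(nn.Module):
    """Outputs one classification per dimension, e.g. time, weather, brightness, road type."""
    def __init__(self, channels=16, heads=None):
        super().__init__()
        heads = heads or {"time": 2, "weather": 4, "brightness": 3, "road": 3}
        self.backbone = nn.Sequential(
            nn.Conv2d(3, channels, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.heads = nn.ModuleDict({name: nn.Linear(channels, n) for name, n in heads.items()})

    def forward(self, x):
        feats = self.backbone(x)
        return {name: head(feats) for name, head in self.heads.items()}  # per-dimension logits
```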
  • the environment detection model may be a judgment function based on the output parameters of the image sensor, for example, judging day or night according to the brightness information of the image.
  • a person of ordinary skill in the art can understand that all or part of the steps in the foregoing method embodiments can be implemented by a program instructing relevant hardware.
  • the aforementioned program can be stored in a computer-readable storage medium.
  • when the program is executed, the steps of the foregoing method embodiments are performed; the foregoing storage medium includes media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

A machine vision-based image processing method and device, applied to a movable platform equipped with an image acquisition device. The method includes: acquiring an environment image (101); using a preloaded environment detection model to determine the current scene from the environment image (102); loading a scene detection model matching the current scene (103); and processing the environment image based on the scene detection model (104). When computing power is constrained, a lightweight scene detection model corresponding to the current scene is selected, which improves processing efficiency and performance in each individual scene.

Description

Image processing method and device based on machine vision
Technical field
The embodiments of the present disclosure relate to the technical field of intelligent control and perception, and in particular to an image processing method and device based on machine vision.
Background
The target detection algorithm is one of the key technologies of autonomous driving and intelligent drones: it can detect and recognize the position, category, and confidence of objects of interest in visual images, providing the necessary observation information for subsequent intelligent functions.
In the related art, the target detection algorithm usually uses only one general model for all scenes, such as a trained neural network model or a perception algorithm model based on feature point recognition. To guarantee highly reliable recognition results in different scenes, a neural network model needs to learn a large amount of data from different scenes; and to obtain high-performance detection results in the different scenarios, the model design is often rather complicated, which greatly increases the amount of computation.
Summary
The present disclosure provides an image processing method and device based on machine vision, which improve image processing efficiency.
In a first aspect, the present disclosure provides an image processing method based on machine vision, applied to a movable platform equipped with an image acquisition device, the method including:
acquiring an environment image;
using a preloaded environment detection model to determine the current scene from the environment image;
loading a scene detection model matching the current scene; and
processing the environment image based on the scene detection model.
In a second aspect, the present disclosure provides a vehicle equipped with a camera device, a memory, and a processor, where the memory is used to store instructions, and the instructions are executed by the processor to implement the method described in any one of the first aspect.
In a third aspect, the present disclosure provides a drone equipped with a camera device, a memory, and a processor, where the memory is used to store instructions, and the instructions are executed by the processor to implement the method described in any one of the first aspect.
In a fourth aspect, the present disclosure provides an electronic device communicatively connected to a camera device, where the electronic device includes a memory and a processor, the memory is used to store instructions, and the instructions are executed by the processor to implement the method described in any one of the first aspect.
In a fifth aspect, the present disclosure provides a handheld gimbal including a camera device, a memory, and a processor, where the memory is used to store instructions, and the instructions are executed by the processor to implement the method described in any one of the first aspect.
In a sixth aspect, the present disclosure provides a mobile terminal including a camera device, a memory, and a processor, where the memory is used to store instructions, and the instructions are executed by the processor to implement the method described in any one of the first aspect.
The present disclosure provides an image processing method and device based on machine vision: an environment image is acquired; a preloaded environment detection model is used to determine the current scene from the environment image; a scene detection model matching the current scene is loaded; and the environment image is processed based on the scene detection model. When computing power is constrained, a lightweight scene detection model corresponding to the current scene is selected, which improves image processing efficiency and performance in each individual scene.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present disclosure or of the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present disclosure, and a person of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic diagram of a drone provided by an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a handheld gimbal provided by an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of an application provided by an embodiment of the present disclosure;
FIG. 4 is a schematic flowchart of an embodiment of the image processing method based on machine vision provided by the present disclosure;
FIG. 5 is a schematic diagram of a scene provided by an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of a scene provided by another embodiment of the present disclosure;
FIG. 7 is a schematic comparison of network models according to an embodiment of the present disclosure;
FIG. 8 is a schematic flowchart of another embodiment of the image processing method provided by the present disclosure;
FIG. 9 is a schematic flowchart of yet another embodiment of the image processing method of the present disclosure;
FIG. 10 is a schematic structural diagram of a vehicle provided by an embodiment of the present disclosure;
FIG. 11 is a schematic structural diagram of a drone provided by an embodiment of the present disclosure;
FIG. 12 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure;
FIG. 13 is a schematic structural diagram of a handheld gimbal provided by an embodiment of the present disclosure;
FIG. 14 is a schematic structural diagram of a mobile terminal provided by an embodiment of the present disclosure;
FIG. 15 is a schematic diagram of memory loading disclosed in an embodiment of this specification.
Detailed description
To make the objectives, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present disclosure without creative effort fall within the protection scope of the present disclosure.
The application scenarios involved in the present disclosure are introduced first:
The machine vision-based image processing method provided by the embodiments of the present disclosure is applied to scenarios such as autonomous driving and intelligent drones; it can detect and recognize the position, category, and in-category confidence of objects of interest in an image, providing the necessary observation information for subsequent functions.
In an optional embodiment, the method may be executed by a drone 10. As shown in FIG. 1, the drone 10 may be equipped with a camera device 1. For example, the method may be implemented by the processor of the drone executing the corresponding software code, or by the drone executing the corresponding software code while exchanging data with a server, where the server performs some of the operations to control the drone to execute the image processing method.
In an optional embodiment, the method may be executed by a handheld gimbal. As shown in FIG. 2, the handheld gimbal 20 may include a camera device 2. For example, the method may be implemented by the processor of the handheld gimbal executing the corresponding software code, or by the handheld gimbal executing the corresponding software code while exchanging data with a server, where the server performs some of the operations to control the handheld gimbal to execute the image processing method.
The camera device is used to obtain environment images, for example images of the environment around the drone or the handheld gimbal.
In an optional embodiment, the method may be executed by an electronic device such as a mobile terminal. As shown in FIG. 3, the electronic device may be mounted on a vehicle or a drone, or the method may be executed by a vehicle-mounted control device communicating with the electronic device. The vehicle may be a self-driving vehicle or an ordinary vehicle. For example, the method may be implemented by the electronic device, such as the processor of the electronic device executing the corresponding software code, or by the electronic device executing the corresponding software code while exchanging data with a server, where the server performs some of the operations to control the electronic device to execute the image processing method.
In the consumer electronics market, electronic devices face computing-power and bandwidth bottlenecks that differ with the processor model they carry.
The technical solutions of the present disclosure are described in detail below with specific embodiments. The following specific embodiments may be combined with each other, and the same or similar concepts or processes may not be repeated in some embodiments.
FIG. 4 is a schematic flowchart of an embodiment of the image processing method based on machine vision provided by the present disclosure. As shown in FIG. 4, the method provided in this embodiment is applied to a movable platform equipped with an image acquisition device and includes:
Step 101: acquire an environment image.
In an optional embodiment, the environment image may be image information collected by an image acquisition device. The image acquisition device is usually mounted on a movable body, which may be a vehicle, an unmanned aerial vehicle, a ground mobile robot, or the like. The image acquisition device may be a monocular camera, a binocular camera, a multi-camera device, a fish-eye lens, a compound-eye lens, and so on. The camera device acquires environment image information around the movable body, for example image information in front of, behind, or to the side of the movable body. In an optional embodiment, the camera device can also obtain wide-format or panoramic information around the movable body; multiple images, parts of images, or combinations of images can be obtained. The acquired environment image may be the original image output by the image sensor, or an image that has undergone image processing but retains the brightness information of the original image, for example an image in RGB format or HSV format. The environment image may be environment image information collected by the image acquisition device while the vehicle is driving or the drone is flying.
A movable platform refers to a platform such as a drone, a vehicle, or an electronic device, for example.
Step 102: use the preloaded environment detection model to determine the current scene from the environment image.
In an optional embodiment, determining the current scene information includes extracting the possible scene in which the movable body is located from the environment image obtained in step 101.
This step can be implemented with a judgment function, for example by reading the RGB or HSV distribution information of the environment image obtained in step 101 and judging the current scene from that distribution.
This step can also be a process of statistical comparison, for example reading the histogram information of the HSV channels and then judging the scene from the histogram information.
This step can also use an environment detection model, which may be implemented based on a neural network: a neural network is constructed that outputs the current scene from the input environment image.
In an optional embodiment, the scene may include scenes at different times, such as day and night; scenes with different weather, such as sunny, rainy, foggy, and snowy; and scenes with different road conditions, such as highways, urban roads, and country roads.
In an optional embodiment, the current scene may include at least two scenes divided according to image brightness.
In an optional embodiment, the current scene divided according to image brightness may include a high-brightness scene and a low-brightness scene.
In an optional embodiment, the current scene divided according to image brightness may include a high-brightness scene, a medium-brightness scene, and a low-brightness scene.
In an optional embodiment, the current scene may include at least two scenes divided according to image visibility.
In an optional embodiment, the current scene divided according to image visibility may include a high-visibility scene and a low-visibility scene.
In an optional embodiment, the current scene divided according to image visibility may include a high-visibility scene, a medium-visibility scene, and a low-visibility scene.
In an optional embodiment, the at least two scenes divided according to image visibility may include a haze scene, a dust scene, a snow scene, a rain scene, and the like.
In an optional embodiment, the current scene may include at least two scenes divided according to image texture information.
In an optional embodiment, the scenes divided according to image texture information include weather information. In an optional embodiment, the weather information includes weather such as rain, snow, fog, and blowing sand.
Taking a neural network as an example, the network used for scene recognition only needs to output a small number of classification results; to achieve accurate output, its network layers do not need many parameters. That is, the neural network used for this judgment step consumes only a small amount of system computing power, and loading the model consumes only a small amount of system bandwidth.
In an optional embodiment, the environment detection model can be preloaded before the current scene is determined, so no loading operation is required at the time of use, which improves processing efficiency.
In an optional embodiment, the preloaded environment detection model is always in a loaded state during environment image acquisition.
To guarantee processing efficiency, the preloaded environment detection model is always in a loaded state during environment image acquisition, so the environment detection model can be used to determine the current scene at any time.
Step 103: load a scene detection model matching the current scene.
In an optional embodiment, this step loads the scene detection model matching the current scene based on the current scene determined in step 102.
The scene detection model can be built on neural network models such as a CNN, VGG, or GoogLeNet, and trained on training data from different scenes to obtain scene detection models matching the different scenes.
The scenes may include scenes at different times, such as day and night; scenes with different weather, such as sunny, rainy, foggy, and snowy; and scenes with different road conditions, such as highways, urban roads, and rural roads.
For example, the scenes in which the vehicle is located in FIG. 5 and FIG. 6 are a sunny scene and a cloudy scene, or a high-brightness scene and a low-brightness scene, respectively.
The scene detection model corresponding to each scene does not need many parameters and consumes only a small amount of system computing power; several small scene detection models corresponding to multiple scenes replace one large general detection model, so that the device can work normally when computing power is limited.
For example, suppose the computing power of the device is 500M. If a 2.7G network model (such as part a on the left of FIG. 7) had to be loaded to realize the image processing function, this would obviously be impossible. In the solution of the embodiments of the present disclosure, the large network model is split into several small network models each smaller than 500M (i.e., the scene detection models, such as part b on the right of FIG. 7), so that the device can work normally even though its computing power is limited.
In an optional embodiment, the scene detection model may also be built on other network models, which is not limited in the present disclosure.
In an optional embodiment, the scene detection model matching the current scene is switched and loaded as the current scene changes.
In an optional embodiment, a scene detection model matching the current scene does not exit the memory because of switching and loading.
Specifically, the scene detection model matching the current scene is loaded based on the current scene; if the current scene changes, the scene detection model matching the changed scene is switched in and loaded.
Further, during switching and loading, the scene detection model may stay in memory, which increases loading speed the next time it is used.
In an optional embodiment, the preloaded environment detection model and the scene detection model are in different threads.
Specifically, the preloaded environment detection model and the scene detection model can be in different threads. For example, while the scene detection model matching the previously determined scene is processing the environment image, the environment detection model can simultaneously determine the current scene; the scene may have changed by then and no longer match the loaded scene detection model. After the environment image has been processed with that scene detection model, the scene detection model matching the changed scene can be switched in and loaded to process subsequent environment images.
In an optional embodiment, the preloaded environment detection model communicates between threads through a callback function.
For example, the information of the current scene determined by the environment detection model may be notified to the scene detection model through the callback function, or the environment image obtained by the image acquisition device may be obtained via the callback function.
Step 104: process the environment image based on the scene detection model.
In an optional embodiment, the environment image is processed based on the scene detection model corresponding to the identified current scene, for example to identify the position of a target object in the environment image, the category to which the target object belongs, and the confidence in that category.
In an optional embodiment, processing the environment image based on the scene detection model includes: acquiring object information in the environment image.
In an optional embodiment, the object information includes: the position information of the target object in the environment image, the category information of the target object, and the confidence of the target object in the corresponding category.
In an optional embodiment, a non-maximum suppression method is used to filter the object information to obtain the target detection result.
Specifically, the object information output by the scene detection model contains a very large number of target object entries, many of which are repeated; for example, there are many position candidates, and some of them overlap. The object information can be filtered by methods such as non-maximum suppression to obtain the final target detection result.
That is, the position, category, and confidence of the objects of interest in the image are finally obtained. This output can be provided as external observation information to downstream modules, such as state estimation and navigation control, to complete more complex automatic driving functions.
In an optional embodiment, the information of the environment image is input into the loaded scene detection model corresponding to the current scene, and the target detection result is output after several network layers of the scene detection model; it includes, for example, the position of the target object, the category to which it belongs, and the confidence in that category. The target object may be, for example, a dynamic target and/or a static target; a dynamic target may include, for example, a moving vehicle or a drone, and a static target may include, for example, surrounding trees, road signs, telephone poles, and so on.
Exemplarily, as shown in FIG. 5, the image acquisition device mounted on the vehicle acquires the environment image around the vehicle; the vehicle uses the preloaded environment detection model to determine the current scene from the environment image, for example determines that the current scene is a high-brightness scene, loads the scene detection model corresponding to the high-brightness scene, and processes the environment image acquired by the image acquisition device based on that scene detection model.
Exemplarily, as shown in FIG. 6, the image acquisition device mounted on the vehicle acquires the environment image around the vehicle; the vehicle uses the preloaded environment detection model to determine the current scene from the environment image, for example determines that the current scene is a low-brightness scene, loads the scene detection model corresponding to the low-brightness scene, and processes the environment image acquired by the image acquisition device based on that scene detection model.
With the method of this embodiment, an environment image is acquired; a preloaded environment detection model is used to determine the current scene from the environment image; a scene detection model matching the current scene is loaded; and the environment image is processed based on the scene detection model. When computing power is constrained, selecting the lightweight scene detection model corresponding to the current scene improves image processing efficiency and performance in each individual scene.
On the basis of the above embodiment, further, before the environment image is processed or the scene is determined from the environment image, the environment image may also be compressed.
Specifically, the acquired environment image is generally color RGB image information with a relatively large resolution, for example 1280×720. When the environment image is processed, it can be compressed, for example by reducing the resolution to 640×360, which improves processing efficiency when computing power is constrained.
In an optional embodiment, the preloaded environment detection model is used to extract brightness information from the environment image to determine the current scene.
For example, the RGB or HSV information of the environment image can be obtained, the brightness information in the environment image is extracted from it, and the current scene is then determined, for example a high-brightness scene, a medium-brightness scene, or a low-brightness scene divided according to image brightness, or a high-visibility scene, a medium-visibility scene, or a low-visibility scene divided according to image visibility.
In an optional embodiment, the preloaded environment detection model is used to extract both the brightness information in the environment image and the image content to determine the current scene.
Further, in addition to extracting the brightness information of the environment image, the aforementioned preloaded environment detection model can also extract the image content and determine the current scene by combining the image content with the brightness information.
Further, one possible implementation of step 102 is as follows:
obtain the distribution information of the environment image, and determine the current scene using the distribution information.
In an optional embodiment, the RGB or HSV distribution information of the environment image obtained in step 101 is read, and the current scene is determined from the distribution information.
For the RGB distribution information, in an optional embodiment, after the RGB distribution information of the environment image is obtained, the R, G, and B channel values of the pixels in the environment image can each be averaged to obtain the average pixel value of each channel, or the proportion of pixels whose brightness value is greater than a preset brightness value can be obtained, and so on, in order to determine the current scene. For example, if the proportion of pixels whose brightness value is greater than the preset brightness value exceeds a certain value, the scene can be determined to be a high-brightness scene, such as a daytime scene.
For the HSV distribution information, HSV is a way of representing points of the RGB color space in an inverted cone. HSV stands for hue, saturation, and value: hue is the basic attribute of a color, i.e. the usual color name such as red or yellow; saturation refers to the purity of the color, where a higher value means a purer color and a lower value means it gradually turns gray, taking values of 0-100%; value refers to the brightness of the color, taking values of 0-100%.
In an optional embodiment, after the HSV distribution information of the environment image information is obtained, the H, S, and V channel values of the pixels in the environment image can each be averaged to obtain the average pixel value of each channel, or the proportion of pixels whose brightness value is greater than a preset brightness value can be obtained, or the proportion of red and yellow light can be obtained, in order to determine the current scene.
Further, another possible implementation of step 102 is as follows:
compute the histogram information of the environment image, and determine the current scene using the histogram information.
In an optional embodiment, the RGB or HSV histogram information of the environment image obtained in step 101 is read, and the current scene is determined from the RGB or HSV histogram.
In an optional embodiment, for the RGB histogram information, after the environment image is obtained, statistics are computed over the R, G, and B channels of its pixels to obtain histogram information, and the current scene is determined from the histogram information of the R, G, and B channels.
In an optional embodiment, for the HSV histogram information, after the environment image is obtained, statistics are computed over the H, S, and V channels of its pixels to obtain histogram information, and the current scene is determined from the histogram information of the H, S, and V channels.
Further, the current scene may also be determined from the distribution information or histogram information obtained in the preceding steps using a pre-trained environment detection model.
In an optional embodiment, the distribution information or histogram information obtained above may also be input into a pre-trained environment detection model, which outputs information about the current scene, thereby determining the current scene.
Further, another possible implementation of step 102 is as follows:
determine the current scene from the environment image using a pre-trained environment detection model.
In an optional embodiment, the environment image can be input directly into the environment detection model, which outputs the corresponding information of the current scene.
The environment detection model can be built on a neural network model such as a CNN and trained on training data to obtain good parameters for the environment detection model.
The environment detection model only needs to output a small number of classification results; to achieve accurate output, its network layers do not need many parameters. That is, the neural network used for this judgment step consumes only a small amount of system computing power, and loading the model consumes only a small amount of system bandwidth.
In other embodiments of the present disclosure, the environment detection model may also be built on other network models, which is not limited in the embodiments of the present disclosure.
Further, another possible implementation of step 102 is as follows:
obtain the road-sign information in the environment image;
determine the current scene from the road-sign information.
Specifically, the road-sign information in the environment image is obtained, and the current scene, for example an urban road scene or a highway scene, is determined from the road-sign information. For example, the road-sign information in the environment image information can be obtained through a recognition algorithm.
在上述实施例的基础上,进一步的,步骤104具体可以采用如下方式实现:
若确定出的当前场景包括多个场景,例如白天场景、雪天场景、高速公路场景(例如依据一个环境图像可以同时确定出多个场景,例如既是白天场景也是雪天场景也是高速公路场景),则可以依次加载上述多个场景对应的场景检测模型,基于多个场景对应的场景检测模型处理环境图像。
在一个可选的实施例中,假设,首先,加载白天场景匹配的场景检测模型,基于白天场景匹配的场景检测模型处理该环境图像,获取第一检测结果;进一步,加载雪天场景匹配的场景检测模型,将该第一检测结果和环境图像的信息输入雪天场景匹配的场景检测模型,基于该雪天场景匹配的场景检测模型处理该第一检测结果和环境图像的信息,第一检测结果可以作为先验信息,使得获取到的第二检测结果更为准确;进一步,加载高速公路场景匹配的场景检测模型,将该第一检测出结果、第二检测结果和环境图像的信息输入高速公路场景匹配的场景检测模型,基于该高速公路场景匹配的场景检测模型处理该第一检测结果、第二检测结果和环境图像的信息,第一检测结果和第二检测结果可以作为先验信息,使得获取到的第三检测结果更为准确,最终根据第三检测结果获取目标检测结果,或者根据第一检测结果、第二检测结果和第三检测结果获取目标检测结果。
在一个可选的实施例中,获取目标检测结果具体可以通过如下方式实现:
采用非极大值抑制方法对所述第三检测结果(或第一检测结果、第二检测结果和第三检测结果中的至少一项)进行过滤,获取所述目标检测结果;所述目标检测结果包括以下至少一项:所述环境图像信息中目标物体的位置信息、所述目标物体的类别信息和所述目标物体在对应类别中的置信度。
具体的,场景检测模型输出的检测结果中包括的目标物体的信息数量非常多,其中会有很多重复的信息,例如位置信息有很多,其中有些内容有重叠。可以采用非极大值抑制等方法对检测结果进行过滤,得到最终的目标检测结果。
即最终可以获得图像上感兴趣物体的位置、类别和置信度。该输出可以作为外界的观测信息提供给下游模块,比如状态估计、导航控制等,用于完成更加复杂的自动驾驶功能。
On the basis of the above embodiments, further, the following operations may also be performed before step 103:
Obtain training data corresponding to the scene detection model matching the current scene; the training data includes environment image data of different scenes that contains position information and category information of target objects;
Train the scene detection model with the training data.
Specifically, the scene detection model corresponding to each scene needs to be trained in advance to obtain good parameters for that scene detection model.
To obtain scene detection models with better performance for different scenes such as daytime and nighttime environments, the models need to be trained separately on the training data corresponding to the different scenes, such as daytime data and nighttime data. Specifically, a batch of training data is collected in advance for each scene such as daytime and nighttime; each training sample contains an environment image and annotations of the positions and categories of the objects of interest in that image. Models are then designed and trained separately based on the training data of the different scenes, so that a well-performing scene detection model is obtained for each scene.
In the above specific implementation, during model training, the scene detection model for each scene is trained with the corresponding training set. In actual use, the current scene of the environment is first judged from the environment image, and then the scene detection model corresponding to the current scene is loaded to perform target detection, which improves detection performance and, when computing power is constrained, improves detection efficiency.
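The per-scene training procedure can be summarized with the following sketch. The dataset layout, the detector factory and the training function are assumptions for illustration, since the disclosure only requires that each scene have its own annotated data and its own trained model.

```python
def train_scene_detection_models(datasets, build_detector, train_one_model):
    """Train one detector per scene on that scene's annotated images.

    datasets:        dict mapping scene name -> list of (image, annotations) pairs,
                     where annotations hold object positions and categories.
    build_detector:  factory returning a fresh (possibly lightweight) detector for a scene.
    train_one_model: function(model, data) -> trained model.
    """
    scene_models = {}
    for scene_name, data in datasets.items():
        model = build_detector(scene_name)         # a smaller, scene-specific design is possible
        scene_models[scene_name] = train_one_model(model, data)
    return scene_models

# Example (names are hypothetical):
# models = train_scene_detection_models({"day": day_data, "night": night_data},
#                                       build_detector, train_one_model)
```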
FIG. 8 is a schematic flowchart of another embodiment of the target detection method provided by the present disclosure. As shown in FIG. 8, the method provided by this embodiment includes:
Step 201: acquire an environment image.
The environment image may be image information collected by an image acquisition device, for example an environment image around the vehicle; the environment image may include multiple images, for example an image that triggers the loading of the corresponding scene detection model, or an image used to determine the current scene.
Step 202: extract feature information from the environment image.
Further, the environment image may also be compressed before step 202.
Step 203: determine the current scene according to the feature information in the environment image.
Specifically, the current scene can be judged from the environment image information, for example different time-of-day scenes such as a daytime scene or a nighttime scene.
The acquired environment image is generally a color RGB image with a relatively large resolution, for example 1280×720. When processing the environment image information, it may be compressed, for example by reducing the resolution to 640×360, which improves processing efficiency when computing power is constrained.
In an optional embodiment, the current scene, for example a daytime scene or a nighttime scene, can be determined by the environment detection model from the feature information extracted from the environment image.
The feature information includes at least one of the following: an average pixel value, a high-brightness pixel proportion, a red/yellow light proportion, and a three-channel hue-saturation-value (HSV) statistical histogram.
The process of extracting the feature information is described below:
A color image can be formed by stacking the R, G and B channels, and a histogram can be extracted from each channel. The average pixel value can be obtained by averaging the three channels. The high-brightness pixel proportion refers to the proportion of pixels whose brightness value is greater than a preset high-brightness value.
HSV is a representation of the points of the RGB color space in an inverted cone. HSV stands for Hue, Saturation and Value. Hue is the basic attribute of a color, i.e. the common color name such as red or yellow; saturation refers to the purity of a color, where a higher value means a purer color and a lower value means the color gradually turns gray, taking values from 0 to 100%; value refers to the lightness of a color, also taking values from 0 to 100%.
The extraction method for HSV color-space features is similar to that for RGB; the key point is to convert the original image into an HSV color-space image and then compute the histogram of each of the three channels.
After conversion to the HSV color-space image information, the red/yellow light proportion can also be obtained.
The number of feature values in the three-channel HSV statistical histogram may be 3×20=60. In one embodiment, the above four features may be concatenated into a feature vector of length 63.
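A minimal sketch of assembling such a feature vector follows. It reads the length-63 layout as one scalar each for the average pixel value, the high-brightness proportion and the red/yellow proportion, plus the 60 HSV histogram values, which is one plausible interpretation; the thresholds and hue ranges are illustrative assumptions.

```python
import cv2
import numpy as np

def scene_feature_vector(bgr_image, bins=20, bright_thresh=180):
    """Concatenate 1 + 1 + 1 + 3*20 = 63 scene features from one environment image."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    h, s, v = cv2.split(hsv)

    avg_pixel = float(bgr_image.mean())                         # overall average pixel value
    high_bright_ratio = float((v > bright_thresh).mean())       # high-brightness pixel proportion
    red_yellow_ratio = float(((h <= 35) & (s > 40)).mean())     # rough red/yellow hue proportion

    hist_feats = []
    for channel, upper in ((h, 180), (s, 256), (v, 256)):       # OpenCV hue spans 0-179
        hist = cv2.calcHist([channel], [0], None, [bins], [0, upper]).flatten()
        hist_feats.append(hist / max(hist.sum(), 1.0))
    return np.concatenate([[avg_pixel, high_bright_ratio, red_yellow_ratio], *hist_feats])  # length 63
```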
Further, a pre-trained environment detection model may be used: the extracted feature information is input into the environment detection model, which outputs the corresponding current-scene information.
In other embodiments of the present disclosure, the environment image may also be input directly into the environment detection model, which outputs the corresponding current-scene information.
Further, for different time scenes such as daytime and nighttime, or weather scenes such as snowy, foggy, rainy and sunny, step 203 may specifically be implemented as follows:
Determine the ambient light intensity of the current scene according to the feature information in the environment image.
Determine the current scene according to the ambient light intensity of the current scene.
In an optional embodiment, a pre-trained environment detection model may be used: the extracted feature information is input into the environment detection model, which outputs the ambient light intensity of the current scene, and the current scene is determined according to this ambient light intensity. Since different time scenes, for example daytime and nighttime scenes, have different ambient light intensities, the current scene can be determined from the ambient light intensity.
In an embodiment of the present disclosure, the environment detection model may also be trained in advance, which may specifically be implemented as follows:
Obtain training data; the training data includes feature information of multiple environment images and the scene information corresponding to each environment image, or multiple environment images and the scene information corresponding to each environment image;
Train the pre-established environment detection model with the training data to obtain the trained environment detection model.
Specifically, the environment detection model may be established with a deep learning algorithm, for example a convolutional neural network (CNN) model, a VGG model or a GoogleNet model. To obtain an environment detection model with better recognition performance for different scenes such as daytime and nighttime scenes, the environment detection model needs to be trained on the training data corresponding to these scenes to obtain good parameters for the environment detection model.
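As one possible realization of training such a classifier on the feature vectors described above, the following PyTorch sketch trains a small fully-connected network. The 63-dimensional input, the class count, the optimizer settings and the data format are illustrative assumptions; a CNN over raw images (as also mentioned above) would be an equally valid choice.

```python
import torch
import torch.nn as nn

def train_environment_detector(features, labels, num_classes=2, epochs=50, lr=1e-3):
    """Train a small classifier on (N, 63) scene feature vectors with integer scene labels."""
    model = nn.Sequential(nn.Linear(63, 32), nn.ReLU(), nn.Linear(32, num_classes))
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()

    x = torch.as_tensor(features, dtype=torch.float32)
    y = torch.as_tensor(labels, dtype=torch.long)
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)               # full-batch training is enough for a tiny model
        loss.backward()
        optimizer.step()
    return model
```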
Step 204: load the scene detection model matching the current scene.
Specifically, based on the current scene determined in step 203, the corresponding scene detection model is loaded into the memory of the device.
Step 205: process the environment image based on the scene detection model to obtain a first detection result.
Specifically, the environment image is processed based on the scene detection model corresponding to the current scene, for example by identifying the position of a target object in the environment image, the category the target object belongs to, and the confidence within that category.
The scene detection model may be a pre-trained machine learning model, for example a convolutional neural network model. During model training, the scene detection model for each scene is trained with the corresponding training data set. During detection, the information of the environment image is input into the scene detection model corresponding to the current scene and processed by several convolutional layers, pooling layers and so on, and the first detection result is output.
Step 206: filter the first detection result with the non-maximum suppression method to obtain a target detection result; the target detection result includes at least one of the following: position information of a target object in the environment image, category information of the target object, and a confidence of the target object in the corresponding category.
Specifically, the detection result output by the scene detection model contains a very large number of target-object entries, many of which are redundant; for example, there are many position boxes, some of which overlap. A method such as non-maximum suppression can be used to filter the detection result and obtain the final target detection result.
That is, the position, category and confidence of the objects of interest in the image can finally be obtained. This output can be provided as external observation information to downstream modules such as state estimation and navigation control, so as to implement more complex autonomous driving functions.
Further, in an embodiment of the present disclosure, as shown in FIG. 9, if the current scene includes a first scene and a second scene, step 205 may be implemented as follows:
Step 2051: process the environment image based on the scene detection model matching the first scene to obtain a first detection result;
Step 2052: process the first detection result based on the scene detection model matching the second scene to obtain a second detection result;
Step 2053: obtain the target detection result according to the second detection result.
Specifically, the scene can be determined from the environment image; for example, the current scene may include different time scenes such as daytime and nighttime, weather scenes such as snowy, foggy, rainy and sunny, or road-condition scenes such as highway, country road and urban road.
Suppose that the current scene determined from the environment image includes at least two scenes, for example a first scene and a second scene.
Suppose the first scene is the daytime scene among the time scenes; the environment image is processed based on the scene detection model matching this first scene to obtain a first detection result. Further, the first detection result is input into the model of the second scene, for example the snowy scene among the weather scenes, and is processed based on the scene detection model matching the second scene to obtain a second detection result; finally, the target detection result is obtained from the second detection result. Because the environment image has already been processed by the scene detection model matching the first scene before target detection is performed with the model matching the second scene, prior information is available, which makes the final target detection result more accurate.
In an optional embodiment, the first scene and the second scene may be a high-brightness scene and a low-brightness scene, respectively.
In other embodiments of the present disclosure, the processing may also be performed first based on the scene detection model matching the second scene and then based on the scene detection model matching the first scene, which is not limited by the embodiments of the present disclosure.
For the remaining steps in FIG. 9, refer to the description of FIG. 8, which is not repeated here.
With the method of this embodiment, an environment image is acquired; the current scene is determined from the environment image; a scene detection model matching the current scene is loaded; and the environment image is processed based on the scene detection model. When computing power is constrained, selecting a lightweight scene detection model corresponding to the current scene improves image processing efficiency as well as the detection performance in each individual scene.
As shown in FIG. 10, an embodiment of the present disclosure further provides a vehicle. The vehicle carries a camera device 11, a memory 12 and a processor 13; the memory 12 is configured to store instructions, and the instructions are executed by the processor 13 to implement the method of any one of the foregoing method embodiments.
The vehicle provided by this embodiment is configured to execute the image processing method provided by any one of the foregoing embodiments; the technical principles and technical effects are similar and are not repeated here.
As shown in FIG. 11, an embodiment of the present disclosure further provides an unmanned aerial vehicle. The unmanned aerial vehicle carries a camera device 21, a memory 22 and a processor 23; the memory 22 is configured to store instructions, and the instructions are executed by the processor 23 to implement the method of any one of the foregoing method embodiments.
The unmanned aerial vehicle provided by this embodiment is configured to execute the image processing method provided by any one of the foregoing embodiments; the technical principles and technical effects are similar and are not repeated here.
As shown in FIG. 12, an embodiment of the present disclosure further provides an electronic device communicably connectable with a camera device. The electronic device includes a memory 32 and a processor 31; the memory 32 is configured to store instructions, and the instructions are executed by the processor 31 to implement the method of any one of the foregoing method embodiments.
The electronic device provided by this embodiment is configured to execute the image processing method provided by any one of the foregoing embodiments; the technical principles and technical effects are similar and are not repeated here.
As shown in FIG. 13, an embodiment of the present disclosure further provides a handheld gimbal. The handheld gimbal includes: a camera device 41, a memory 42 and a processor 43; the memory 42 is configured to store instructions, and the instructions are executed by the processor 43 to implement the method of any one of the foregoing method embodiments.
The handheld gimbal provided by this embodiment is configured to execute the image processing method provided by any one of the foregoing embodiments; the technical principles and technical effects are similar and are not repeated here.
As shown in FIG. 14, an embodiment of the present disclosure further provides a mobile terminal. The mobile terminal includes: a camera device 51, a memory 52 and a processor 53; the memory 52 is configured to store instructions, and the instructions are executed by the processor 53 to implement the method of any one of the foregoing method embodiments.
The mobile terminal provided by this embodiment is configured to execute the image processing method provided by any one of the foregoing embodiments; the technical principles and technical effects are similar and are not repeated here.
An embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the corresponding method in the foregoing method embodiments is implemented. For the specific implementation process, reference may be made to the foregoing method embodiments; the implementation principles and technical effects are similar and are not repeated here.
An embodiment of the present disclosure further provides a program product. The program product includes a computer program (i.e., execution instructions), and the computer program is stored in a readable storage medium. A processor can read the computer program from the readable storage medium and execute it to perform the target detection method provided by any implementation of the foregoing method embodiments.
An embodiment of the present disclosure further provides a vehicle, including:
a vehicle body; and
the electronic device of any one of the foregoing embodiments, where the electronic device is mounted on the vehicle body. The implementation principles and technical effects are similar to those of the method embodiments and are not repeated here.
An embodiment of the present disclosure further provides an unmanned aerial vehicle, including:
a fuselage; and
the electronic device of any one of the foregoing embodiments, where the electronic device is mounted on the fuselage. The implementation principles and technical effects are similar to those of the method embodiments and are not repeated here.
FIG. 15 is a schematic diagram of memory occupation proportions during model loading provided by an embodiment of this specification. The environment detection model is always loaded; for example, it may remain loaded in the processor memory throughout the operation of the movable platform. Because it only needs to judge the current environment, it occupies few system resources: the environment detection model only needs to recognize and output the category information of the current environment, and this category information is used for loading the scene detection model. The scene detection model is used to detect objects around the movable platform. On the one hand, splitting the environment detection model and the scene model can greatly reduce the resources occupied by the loaded models; on the other hand, the scene model occupies more resources than the environment detection model. As an optional embodiment, the environment detection model may be a trained neural network model that outputs a recognized classification result from the input image information, for example daytime, nighttime, rain, snow or fog. As an optional embodiment, the environment detection model may be a trained neural network model that outputs a recognized two-dimensional classification result from the input image information, for example daytime-rain, nighttime-rain or daytime-fog. As an optional embodiment, the environment detection model may be a trained neural network model that outputs a recognized three-dimensional classification result from the input image information, where the dimensions include but are not limited to weather, climate and brightness, for example daytime-rain-dim, nighttime-rain-dark or daytime-sunny-bright. As an optional embodiment, the environment detection model may be a trained neural network model that outputs a recognized four-dimensional or even higher-dimensional classification result from the input image information, where the dimensions include but are not limited to weather, climate and brightness, for example daytime-rain-dim-road, nighttime-rain-dark-road or daytime-sunny-bright-tunnel. As an optional embodiment, the environment detection model may be a judgment function based on image sensor output parameters, for example judging daytime or nighttime from the brightness information of the image.
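The last optional embodiment above, a judgment function on brightness combined with swapping scene detection models in memory, can be sketched as follows. The brightness threshold, the model table and the loading interface are illustrative assumptions.

```python
import numpy as np

SCENE_MODELS = {"day": "detector_day.bin", "night": "detector_night.bin"}   # hypothetical files

def judge_environment(gray_image, day_thresh=90):
    """Judgment-function style environment detection: mean brightness decides day vs. night."""
    return "day" if float(np.mean(gray_image)) > day_thresh else "night"

class SceneModelManager:
    """Keeps the lightweight environment check always available and swaps heavy scene models."""

    def __init__(self, load_fn):
        self._load_fn = load_fn                    # e.g. a framework-specific model loader
        self._current_scene = None
        self._current_model = None

    def model_for(self, gray_image):
        scene = judge_environment(gray_image)
        if scene != self._current_scene:           # only reload when the scene actually changes
            self._current_model = self._load_fn(SCENE_MODELS[scene])
            self._current_scene = scene
        return self._current_model
```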
Those of ordinary skill in the art can understand that all or part of the steps of the above method embodiments can be completed by hardware related to program instructions. The aforementioned program may be stored in a computer-readable storage medium. When the program is executed, the steps of the above method embodiments are performed; the aforementioned storage medium includes various media capable of storing program code, such as a ROM, a RAM, a magnetic disk or an optical disc.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the embodiments of the present disclosure, rather than to limit them. Although the embodiments of the present disclosure have been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they can still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements for some or all of the technical features; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present disclosure.

Claims (32)

  1. A machine-vision-based image processing method, applied to a movable platform carrying an image acquisition device, wherein the method comprises:
    acquiring an environment image;
    determining a current scene from the environment image using a preloaded environment detection model;
    loading a scene detection model matching the current scene;
    processing the environment image based on the scene detection model.
  2. The method according to claim 1, wherein the current scene comprises at least two scenes divided according to image brightness.
  3. The method according to claim 2, wherein the current scene comprises a high-brightness scene and a low-brightness scene.
  4. The method according to claim 2, wherein the current scene comprises a high-brightness scene, a medium-brightness scene and a low-brightness scene.
  5. The method according to claim 1, wherein the current scene comprises at least two scenes divided according to image visibility.
  6. The method according to claim 5, wherein the current scene comprises a high-visibility scene and a low-visibility scene.
  7. The method according to claim 5, wherein the current scene comprises a high-visibility scene, a medium-visibility scene and a low-visibility scene.
  8. The method according to claim 5, wherein the at least two scenes divided according to image visibility comprise a haze scene and a dust scene.
  9. The method according to claim 1, wherein the current scene comprises at least two scenes divided according to image texture information.
  10. The method according to claim 9, wherein the scenes divided according to image texture information comprise weather information.
  11. The method according to claim 10, wherein the weather information comprises rain, snow, fog and blowing-sand weather information.
  12. The method according to claim 1, wherein the preloaded environment detection model is used to extract brightness information from the environment image to determine the current scene.
  13. The method according to claim 1, wherein the preloaded environment detection model is used to extract the brightness information and the image from the environment image to determine the current scene.
  14. The method according to claim 1, wherein the preloaded environment detection model remains loaded throughout the image acquisition process.
  15. The method according to claim 14, wherein the scene detection model matching the current scene is switched and loaded as the current scene changes.
  16. The method according to claim 15, wherein the scene detection model matching the current scene does not exit the memory due to the switched loading.
  17. The method according to claim 1, wherein the preloaded environment detection model and the scene detection model are in different threads.
  18. The method according to claim 17, wherein the preloaded environment detection model performs inter-thread communication through a callback function.
  19. The method according to claim 1, wherein processing the environment image based on the scene detection model comprises: obtaining object information in the environment image.
  20. The method according to claim 19, wherein
    the obtained object information is filtered with a non-maximum suppression method to obtain a target detection result.
  21. The method according to claim 19, wherein the object information comprises: position information of a target object in the environment image, category information of the target object, and a confidence of the target object in the corresponding category.
  22. The method according to claim 21, wherein determining the current scene from the environment image comprises:
    extracting feature information from the environment image;
    determining the current scene according to the feature information in the environment image.
  23. The method according to claim 21, wherein determining the current scene according to the feature information in the environment image comprises:
    determining an ambient light intensity of the current scene according to the feature information in the environment image;
    determining the current scene according to the ambient light intensity of the current scene.
  24. The method according to claim 22, wherein before extracting the feature information from the environment image, the method further comprises:
    compressing the environment image.
  25. The method according to claim 22, wherein the feature information comprises at least one of the following: an average pixel value, a high-brightness pixel proportion, a red/yellow light proportion, and a three-channel hue-saturation-value (HSV) statistical histogram.
  26. The method according to claim 22, wherein determining the current scene from the environment image comprises:
    obtaining road sign information in the environment image;
    determining the current scene according to the road sign information.
  27. The method according to claim 1, wherein before processing the environment image based on the scene detection model, the method further comprises:
    obtaining training data corresponding to the scene detection model matching the current scene, the training data comprising environment image data of different scenes that contains position information and category information of target objects;
    training the scene detection model with the training data.
  28. A vehicle, wherein the vehicle carries a camera device, a memory and a processor, the memory is configured to store instructions, and the instructions are executed by the processor to implement the method according to any one of claims 1-27.
  29. An unmanned aerial vehicle, wherein the unmanned aerial vehicle carries a camera device, a memory and a processor, the memory is configured to store instructions, and the instructions are executed by the processor to implement the method according to any one of claims 1-27.
  30. An electronic device communicably connectable with a camera device, wherein the electronic device comprises a memory and a processor, the memory is configured to store instructions, and the instructions are executed by the processor to implement the method according to any one of claims 1-27.
  31. A handheld gimbal, wherein the handheld gimbal comprises: a camera device, a memory and a processor, the memory is configured to store instructions, and the instructions are executed by the processor to implement the method according to any one of claims 1-27.
  32. A mobile terminal, wherein the mobile terminal comprises: a camera device, a memory and a processor, the memory is configured to store instructions, and the instructions are executed by the processor to implement the method according to any one of claims 1-27.
PCT/CN2019/100710 2019-08-15 2019-08-15 基于机器视觉的图像处理方法和设备 WO2021026855A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201980033604.7A CN112204566A (zh) 2019-08-15 2019-08-15 基于机器视觉的图像处理方法和设备
PCT/CN2019/100710 WO2021026855A1 (zh) 2019-08-15 2019-08-15 基于机器视觉的图像处理方法和设备

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/100710 WO2021026855A1 (zh) 2019-08-15 2019-08-15 基于机器视觉的图像处理方法和设备

Publications (1)

Publication Number Publication Date
WO2021026855A1 true WO2021026855A1 (zh) 2021-02-18

Family

ID=74004737

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/100710 WO2021026855A1 (zh) 2019-08-15 2019-08-15 基于机器视觉的图像处理方法和设备

Country Status (2)

Country Link
CN (1) CN112204566A (zh)
WO (1) WO2021026855A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114666501A (zh) * 2022-03-17 2022-06-24 深圳市百泰实业股份有限公司 一种可穿戴设备的摄像头智能控制方法
CN115859158A (zh) * 2023-02-16 2023-03-28 荣耀终端有限公司 场景识别方法、***及终端设备

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114532919B (zh) * 2022-01-26 2023-07-21 深圳市杉川机器人有限公司 多模态目标检测方法、装置、扫地机及存储介质

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110150328A1 (en) * 2009-12-21 2011-06-23 Electronics And Telecommunications Research Institute Apparatus and method for blockiing objectionable image on basis of multimodal and multiscale features
CN105812746A (zh) * 2016-04-21 2016-07-27 北京格灵深瞳信息技术有限公司 一种目标检测方法及***
JP2016218760A (ja) * 2015-05-20 2016-12-22 株式会社日立製作所 物体検出システム、物体検出方法、poi情報作成システム、警告システム、及び誘導システム
CN107465855A (zh) * 2017-08-22 2017-12-12 上海歌尔泰克机器人有限公司 图像的拍摄方法及装置、无人机
CN107609502A (zh) * 2017-09-05 2018-01-19 百度在线网络技术(北京)有限公司 用于控制无人驾驶车辆的方法和装置
CN107622273A (zh) * 2016-07-13 2018-01-23 深圳雷柏科技股份有限公司 一种目标检测和辨识的方法和装置
CN108701214A (zh) * 2017-12-25 2018-10-23 深圳市大疆创新科技有限公司 图像数据处理方法、装置及设备
CN109218619A (zh) * 2018-10-12 2019-01-15 北京旷视科技有限公司 图像获取方法、装置和***
CN109815844A (zh) * 2018-12-29 2019-05-28 西安天和防务技术股份有限公司 目标检测方法及装置、电子设备和存储介质
CN109871730A (zh) * 2017-12-05 2019-06-11 杭州海康威视数字技术股份有限公司 一种目标识别方法、装置及监控设备

Also Published As

Publication number Publication date
CN112204566A (zh) 2021-01-08

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 19941360; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 19941360; Country of ref document: EP; Kind code of ref document: A1)