CN114677719A

CN114677719A - Method, apparatus and computer-readable storage medium for image signal processing

Info

Publication number: CN114677719A
Application number: CN202110039921.9A
Authority: CN
Inventors: 朱力于
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2020-12-09
Filing date: 2021-01-13
Publication date: 2022-06-28

Abstract

Disclosed are a method, a device and a computer readable storage medium for image signal processing, which belong to the technical field of image processing. The method comprises the following steps: acquiring a shot first raw image; inputting the first raw image into a trained control network to obtain first control information; selecting a value corresponding to the first control information from the multiple values of each ISP parameter; and carrying out ISP on the first raw image based on the value selected for each ISP parameter.

Description

Method, apparatus and computer-readable storage medium for image signal processing

The present application claims priority to Chinese patent application No. 202011428755.3, entitled "An imaging scheme for achieving automatic scene optimization" and filed on December 9, 2020, which is incorporated herein by reference in its entirety.

Technical Field

The present application relates to the field of image processing technologies, and in particular, to a method and an apparatus for processing an image signal, and a computer-readable storage medium.

Background

With the continuous development of image acquisition processing technology and network technology, video monitoring equipment is widely applied to various places such as streets, buildings, shops, parks and the like. Video surveillance equipment is typically provided with one or more task processes (also referred to as back-end task processes), such as face recognition and the like. In order to better complete the back-end task processing, the video monitoring apparatus generally performs Image Signal Processing (ISP) on a raw image output by the image sensor, so that the processed image is more favorable for the back-end task processing.

Image signal processing involves a large number of ISP parameters, such as the filtering strength, the filtering mode and the gains at different frequency points of the sharpening filter, as well as the defogging strength, the defogging mode, and the like. In use, video monitoring equipment may face various shooting scenes, such as rain and fog scenes, low-light scenes and wide dynamic range scenes (in which the brightness difference between bright and dark parts is large). For any shooting scene, the value of each ISP parameter needs to be set, so each shooting scene corresponds to its own set of values, which is used for image signal processing in that scene.

In the course of implementing the present application, the inventors found that the related art has at least the following problems:

In order to improve the accuracy of the result of the back-end task processing, shooting scenes need to be divided as finely as possible, and thus a corresponding set of values needs to be set for each of a large number of divided shooting scenes. However, in practice, configuring and debugging the set of values for each shooting scene requires several technicians to work for tens of days, hundreds of days, or even longer, which wastes a great deal of time.

Disclosure of Invention

The embodiment of the application provides a method and a device for processing image signals and a computer readable storage medium. The technical scheme is as follows:

In a first aspect, a method of image signal processing is provided, the method comprising: acquiring a first raw image; inputting the first raw image into a trained control network to obtain first control information; selecting, from the multiple values of each ISP parameter, a value corresponding to the first control information; and performing ISP on the first raw image based on the value selected for each ISP parameter.

The raw image according to the embodiment of the present application includes a first raw image, a second raw image, a sample raw image, and the like, and may be a single raw image frame or a video segment composed of a plurality of consecutive raw image frames, and the raw image frame is described as an example below. The control network may be a machine learning model, and may specifically be a convolutional neural network. The control information according to the embodiment of the present application is information for selecting values of ISP parameters.

According to the scheme shown in the embodiment of the application, after the processor receives the raw image frame, it calls the trained control network from the memory and inputs the raw image frame into the control network, and the control network outputs the first control information. In the present apparatus, multiple values for selection are stored for each ISP parameter; for example, a total of 100 ISP parameters are stored, and 5 values for selection are stored for each ISP parameter. The control information is used to select among these values, with one value selected for each ISP parameter. After the control network outputs the first control information, a corresponding value is selected for each ISP parameter based on the first control information.

After a value has been selected for each ISP parameter, the corresponding ISP algorithm can be called, the selected values are substituted into the algorithm, and ISP is performed on the first raw image frame based on the ISP algorithm to obtain an ISP output image. The ISP output image may be a YUV (a color encoding) image.

In the processing, the control network can select a proper value for the ISP parameter based on the condition of the currently shot image without manually setting the ISP parameter, so that the time waste can be reduced.

In a possible implementation manner, the first control information includes an identifier of a value to be selected corresponding to each ISP parameter.

In the solution shown in the embodiment of the present application, each value corresponding to any ISP parameter may have a different identifier. The first control information may include an identifier of a value to be selected corresponding to each ISP parameter. Thus, the first control information may be a vector, each element bit of the vector corresponds to an ISP parameter, and an element value in the element bit is an identifier of a value to be selected.
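The selection by identifier vector described above can be illustrated with a minimal sketch. It assumes that identifiers are zero-based indices into each parameter's stored values; the function name `select_isp_values`, the parameter labels in the comments and all numeric values are hypothetical and are not taken from the application.

```python
from typing import List, Sequence


def select_isp_values(candidates: Sequence[Sequence[float]],
                      control_info: Sequence[int]) -> List[float]:
    """Pick one stored value per ISP parameter using the control vector."""
    if len(candidates) != len(control_info):
        raise ValueError("control vector must have one element per ISP parameter")
    selected = []
    for values, identifier in zip(candidates, control_info):
        # Assumption: an identifier is a zero-based index into the stored values.
        if not 0 <= identifier < len(values):
            raise ValueError(f"identifier {identifier} is out of range")
        selected.append(values[identifier])
    return selected


# Hypothetical table: 3 ISP parameters, each with 5 stored values for selection.
candidates = [
    [0.0, 0.25, 0.5, 0.75, 1.0],  # e.g. defogging strength
    [0.1, 0.2, 0.4, 0.8, 1.6],    # e.g. sharpening filter strength
    [1.8, 2.0, 2.2, 2.4, 2.6],    # e.g. gamma curve parameter
]
control_info = [4, 0, 2]  # one identifier per element bit of the vector
selected = select_isp_values(candidates, control_info)
print(selected)  # [1.0, 0.1, 2.2]
```

In this sketch the control vector carries only identifiers, so the stored values themselves can later be changed without changing the form of the control network's output.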

In the above processing, using the control network is equivalent to dividing shooting scenes more finely. The control information is a vector with a very large number of possible values, and each distinct value of the vector corresponds to a different shooting scene, so a large number of shooting scenes are effectively distinguished. If ISP parameters were configured manually for each such shooting scene, the workload would be unimaginable and the configuration could not be completed in practice. Therefore, the method of the embodiment of the application facilitates fine division of shooting scenes, which makes the selection of ISP parameter values more fine-grained, so that image signal processing better serves the back-end task processing and the accuracy of the back-end task processing is improved.
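To give a sense of scale, the number of distinct control vectors can be counted directly. The sketch below reuses the illustrative figures given earlier (100 ISP parameters with 5 stored values each) and assumes that every parameter is selected independently; the function name is hypothetical.

```python
def num_control_vectors(num_params: int, values_per_param: int) -> int:
    """Count distinct control vectors when each parameter is chosen independently."""
    return values_per_param ** num_params


total = num_control_vectors(100, 5)
digits = len(str(total))  # number of decimal digits in the count
print(digits)  # 70
```

A count with 70 decimal digits is far beyond what could be enumerated and tuned by hand, which is the point the paragraph above makes.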

In one possible implementation, the method is applied to a camera, and the camera captures a first raw image through a sensor before the first raw image is acquired.

In the scheme shown in the embodiment of the present application, the raw image on which ISP is performed is a raw image captured by the camera in real time through a sensor (also referred to as an image sensor), and the ISP and subsequent processing are performed by a processor.

In a possible implementation manner, after ISP is performed on the first raw image, a first ISP output image is obtained, and task processing of a target task type is performed on the first ISP output image to obtain a processing result, where the target task type includes one or more of motor vehicle, non-motor vehicle and pedestrian detection, face recognition, situation monitoring, and image index detection.

Motor vehicle, non-motor vehicle and pedestrian detection is to detect the positions of motor vehicles, non-motor vehicles and pedestrians in an image. Face recognition is to detect the position of a face in an image; optionally, face recognition may further determine whether the face matches face information pre-stored in a database. Situation monitoring is to monitor the progress of an event, for example, monitoring a product processing line to see whether there is a problem with the product or with the operation of the line. Image index detection is to detect objective index parameters of an image, and the index parameters can include sharpness, white balance and the like.

According to the scheme shown in the embodiment of the application, most of the back-end task processing is tasks executed by a machine learning model. The pre-stored machine learning model of the back-end task processing can be called, and the first ISP output image is input into the machine learning model to obtain a corresponding processing result. For example, if the target task type is face recognition, the face recognition network is called, the first ISP output image is input to the face recognition network, and the face position is output.

In a possible implementation manner, after the task processing of the target task type is performed on the first ISP output image and the processing result is obtained, the first ISP output image and the processing result are sent to the management device.

According to the scheme shown in the embodiment of the application, data communication can be performed between the camera and the management equipment in a wired mode, data communication can also be performed in a wireless mode, data communication can also be performed in a wired and wireless combined mode, and the camera sends the first ISP output image and the processing result to the background management equipment for displaying through the corresponding data communication mode.

In one possible implementation manner, the captured first raw image is acquired once every preset time length. After ISP is performed on the first raw image, ISP is performed on the raw images captured within the preset time length after the first raw image, based on the value of each ISP parameter selected for the first raw image.

According to the scheme shown in the embodiment of the application, the value of the ISP parameter can be selected once based on the method of the first aspect every preset time, and the value of the ISP parameter selected this time is adopted for processing the image signal of the raw image shot in the period of time after the selection until the next selection, so that the processing resource can be saved.
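The periodic reselection described above can be sketched as a simple loop. This is an illustration under stated assumptions only: frames are modelled as `(timestamp_ms, raw)` pairs, and `process_stream` together with the `fake_*` stand-ins are hypothetical names that replace the real control network, value table and ISP.

```python
from typing import Any, Callable, Iterable, List, Optional, Tuple


def process_stream(
    frames: Iterable[Tuple[int, Any]],
    run_control_network: Callable[[Any], List[int]],
    select_values: Callable[[List[int]], List[float]],
    run_isp: Callable[[Any, List[float]], Any],
    interval_ms: int,
) -> List[Any]:
    """Reselect ISP values every interval_ms and reuse them for frames in between."""
    outputs = []
    last_selection_ms: Optional[int] = None
    values: List[float] = []
    for timestamp_ms, raw in frames:
        due = last_selection_ms is None or timestamp_ms - last_selection_ms >= interval_ms
        if due:
            # Only here is the (comparatively expensive) control network invoked.
            values = select_values(run_control_network(raw))
            last_selection_ms = timestamp_ms
        outputs.append(run_isp(raw, values))
    return outputs


# Stand-in components so the sketch runs; none of them is a real ISP.
calls = {"control": 0}


def fake_control_network(raw: int) -> List[int]:
    calls["control"] += 1
    return [raw % 3]  # one identifier for a single ISP parameter


def fake_select(control_info: List[int]) -> List[float]:
    table = [[1.0, 2.0, 3.0]]
    return [table[i][c] for i, c in enumerate(control_info)]


def fake_isp(raw: int, values: List[float]) -> float:
    return raw * values[0]


frames = [(t, t // 500) for t in range(0, 3000, 500)]  # (timestamp_ms, raw)
outputs = process_stream(frames, fake_control_network, fake_select, fake_isp, 1000)
print(calls["control"], outputs)
```

With six frames spaced 500 ms apart and a 1000 ms interval, the stand-in control network runs on three frames only, and the frames in between reuse the most recent selection.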

In one possible implementation, the method for processing an image signal further includes: and performing numerical adjustment on the network parameters of the control network to be trained based on the sample raw image, and performing numerical adjustment on the values of the ISP parameters for selection.

In a possible implementation manner, a second ISP output image corresponding to a sample raw image is determined based on the sample raw image, the control network to be trained, and the values for selection of the ISP parameters. ISP parameter value evaluation processing is performed based on the second ISP output image to obtain an evaluation result. The network parameters of the control network to be trained are adjusted based on the evaluation result, and the values for selection of the ISP parameters are adjusted based on the evaluation result.

In one possible implementation, the process of determining the second ISP output image includes: inputting the sample raw image into the control network to be trained to obtain second control information, wherein the second control information comprises an identifier of a value to be selected corresponding to each ISP parameter; selecting, from the multiple values of each ISP parameter, a value corresponding to the second control information; and performing ISP on the sample raw image based on the value selected for each ISP parameter to obtain the second ISP output image.

The control network to be trained may be an initial control network that has just been established, or may be a control network that has been trained several times. The sample raw image may be captured or acquired through other channels.

In a possible implementation manner, task processing of a target task type is performed on the second ISP output image to obtain a processing result, wherein the target task type includes one or more of motor vehicle, non-motor vehicle and pedestrian detection, face recognition, situation monitoring, and image index detection. The processing result and the true value result of the target task type corresponding to the sample raw image are input into a loss function, and the output value of the loss function is used as the evaluation result.
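As a small illustration of using a loss value as the evaluation result, the sketch below compares a predicted face position with its true value result. The application does not specify the loss function; the mean absolute error over box coordinates used here, the name `l1_box_loss` and the box coordinates are assumptions chosen only for illustration.

```python
from typing import Sequence


def l1_box_loss(predicted: Sequence[float], truth: Sequence[float]) -> float:
    """Mean absolute difference between predicted and true box coordinates."""
    if len(predicted) != len(truth):
        raise ValueError("boxes must have the same number of coordinates")
    return sum(abs(p - t) for p, t in zip(predicted, truth)) / len(truth)


# Hypothetical boxes in (x_min, y_min, x_max, y_max) form.
processing_result = (10.0, 20.0, 110.0, 120.0)  # from the back-end task
true_value_result = (12.0, 18.0, 108.0, 124.0)  # from calibration
evaluation_result = l1_box_loss(processing_result, true_value_result)
print(evaluation_result)  # 2.5
```

A lower value indicates that the back-end task result on the second ISP output image is closer to the true value result, so the value can drive both the control network update and the adjustment of the stored ISP values.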

In the processing, the control network and the ISP parameters are trained based on the processing result and the true value result of the back-end task processing, so that the selection of the ISP parameter value in the monitoring process can better adapt to different shooting scenes and can better adapt to the requirements of the back-end task processing, and the accuracy of the back-end task processing is improved.

In a possible implementation manner, based on the evaluation result, a value corresponding to the second control information is subjected to numerical adjustment.

In a possible implementation manner, the evaluation result is input into the trained parameter adjusting network to obtain an adjustment value sequence, and the value corresponding to the second control information is adjusted based on the adjustment value sequence.

The parameter adjusting network is used for adjusting the value of the ISP parameter, and can be a machine learning model, specifically a convolutional neural network. The adjustment value sequence is a vector, any element bit of the vector corresponds to an ISP parameter, and an element value in any element bit is an adjustment value used for adjusting the value selected by the second control information among various stored values of the ISP parameter corresponding to the element bit.

According to the scheme shown in the embodiment of the application, the parameter adjusting network is called, and the output value of the loss function is input into the parameter adjusting network to obtain the adjusting value sequence. And determining the value corresponding to the second control information in the stored multiple values of each ISP parameter to obtain a value to be adjusted corresponding to each ISP parameter. Then, the value to be adjusted of each ISP parameter is adjusted based on the adjustment value corresponding to each ISP parameter in the adjustment value sequence, and specifically, the value to be adjusted may be added to the corresponding adjustment value to obtain the adjusted value.
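The additive adjustment described above can be sketched as follows. The sketch assumes zero-based identifiers and an adjustment value sequence with one entry per ISP parameter; `apply_adjustments` and the numbers are hypothetical, and the parameter adjusting network that would produce the sequence is not modelled.

```python
from typing import List, Sequence


def apply_adjustments(candidates: Sequence[Sequence[float]],
                      control_info: Sequence[int],
                      adjustments: Sequence[float]) -> List[List[float]]:
    """Add each adjustment to the stored value selected by the control information."""
    if not len(candidates) == len(control_info) == len(adjustments):
        raise ValueError("expected one identifier and one adjustment per ISP parameter")
    updated = [list(values) for values in candidates]  # leave the input untouched
    for param_index, (identifier, delta) in enumerate(zip(control_info, adjustments)):
        # Only the value that was selected for this sample is adjusted.
        updated[param_index][identifier] += delta
    return updated


stored = [[0.0, 0.5, 1.0], [1.0, 2.0, 3.0]]  # hypothetical values for selection
second_control_info = [1, 2]                 # identifiers chosen for the sample
adjustment_sequence = [0.25, -0.5]           # assumed output of the adjusting network
updated = apply_adjustments(stored, second_control_info, adjustment_sequence)
print(updated)  # [[0.0, 0.75, 1.0], [1.0, 2.0, 2.5]]
```

Values that were not selected for the current sample are left unchanged, which matches the description that only the value to be adjusted for each ISP parameter is modified.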

In a possible implementation manner, the evaluation result is sent to the management device, and the adjustment value sequence which is sent by the management device and obtained after the evaluation result is input into the trained parameter adjusting network is received.

According to the scheme shown in the embodiment of the application, the output value of the loss function is sent to the management equipment, the management equipment calls the stored parameter adjusting network, the output value of the loss function is input into the parameter adjusting network to obtain an adjusting value sequence, and then the adjusting value sequence is sent to the terminal equipment.

In the above processing, the parameter adjusting network is used to train and adjust the multiple values of the ISP parameters: training selects, from a larger value space, a smaller and more accurate value space, and this smaller value space is then provided to the control network for further selection during actual use to obtain the most accurate value. Compared with the control network, the parameter adjusting network may be a larger network, that is, a network with higher computational complexity. Therefore, with the method of the embodiment of the application, the larger and more complex part of a complex parameter selection problem (training of the ISP parameter values by the parameter adjusting network) is placed in the training process, and only a small, simple part (value selection by the control network) remains in actual use after training. In addition, the processing of the parameter adjusting network during training can be carried out on computer equipment other than the terminal equipment that has stronger computing power, such as the management equipment, so that the training efficiency can be ensured.

Therefore, the more complex value selection problem can be processed by the terminal equipment with smaller calculation force, and the more complex the value selection problem is, the more the accuracy of the back-end task processing is improved. Therefore, the method of the embodiment of the application is beneficial to improving the accuracy of the back-end task processing under the condition that the processing capacity of the terminal equipment is limited.

In a possible implementation manner, before the second ISP output image corresponding to the sample raw image is determined based on the sample raw image, the control network to be trained, and the values for selection of the ISP parameters, the sample raw image and the true value result of the target task type corresponding to the sample raw image, both sent by the management device, are received, where the true value result is obtained by calibrating the sample raw image.

According to the scheme shown in the embodiment of the application, a technician can collect a large number of sample raw images. According to a possible scheme, sample raw images are collected for each terminal device for training, and therefore when a certain terminal device needs to be trained, a large number of raw images shot by the terminal device under various conditions such as various times and various weathers can be obtained and used as the sample raw images. According to another possible scheme, a plurality of terminal devices with similar environments and the same model are trained by using a unified sample raw image, so that a large number of raw images shot by different terminal devices with similar environments and the same model under various conditions such as various times and various weathers can be obtained and serve as the sample raw image.

Furthermore, technicians can manually calibrate the sample raw images according to the target task type to obtain a true value result of the target task type corresponding to each sample raw image.

The sample raw image and the corresponding true value result may then be stored in the management device. When the training process is needed, the management equipment sends the sample raw image and the true value result to the terminal equipment for training.

In the processing, the management equipment acquires and stores the sample and the true value result in advance, so that a mode of manually calibrating the true value result can be adopted, the accuracy of the true value result can be effectively improved, the convergence can be quicker, the overall training efficiency is improved, and the higher accuracy of the rear-end task processing of the terminal equipment after training is ensured.

In a possible implementation manner, before the second ISP output image corresponding to the sample raw image is determined based on the sample raw image, the control network to be trained, and the values for selection of the ISP parameters, a captured second raw image is acquired, and a processing result is acquired that is obtained by performing ISP and task processing of the target task type on the second raw image; the acquired second raw image is used as the sample raw image, and the acquired processing result is used as the true value result of the target task type corresponding to the sample raw image.

According to the scheme shown in the embodiment of the application, a locally captured second raw image is acquired and used as a sample raw image. At the same time, normal local processing is performed on the second raw image: the second raw image is input into the control network to obtain control information, values of the ISP parameters are selected based on the control information, and ISP is performed on the second raw image based on the selected values to obtain a third ISP output image. Back-end task processing of the target task type is then performed on the third ISP output image to obtain a processing result, and the processing result is used as the true value result of the target task type corresponding to the sample raw image.
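The way a locally processed frame becomes a training sample can be sketched as below. All callables are hypothetical stand-ins passed in by the caller; the sketch only shows the data flow from the second raw image to the pair of sample raw image and true value result, not any real network or ISP.

```python
from typing import Any, Callable, List, Tuple


def make_noisy_sample(
    raw: Any,
    control_network: Callable[[Any], List[int]],
    select_values: Callable[[List[int]], List[float]],
    run_isp: Callable[[Any, List[float]], Any],
    run_task: Callable[[Any], Any],
) -> Tuple[Any, Any]:
    """Run the normal local pipeline and reuse its result as the true value result."""
    control_info = control_network(raw)
    values = select_values(control_info)
    third_isp_output = run_isp(raw, values)
    processing_result = run_task(third_isp_output)
    # The raw image is the sample; the task result serves as its (noisy) label.
    return raw, processing_result


# Toy stand-ins: a 3-pixel "image", one ISP parameter with two stored gains.
sample, truth = make_noisy_sample(
    raw=[10, 20, 30],
    control_network=lambda raw: [1],
    select_values=lambda info: [[0.5, 2.0][info[0]]],
    run_isp=lambda raw, values: [pixel * values[0] for pixel in raw],
    run_task=lambda image: max(image),
)
print(sample, truth)  # [10, 20, 30] 60.0
```

Because the label comes from the device's own current output rather than from manual calibration, it may contain errors, which is consistent with the description calling this training noisy.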

The above processing may be performed after a noisy training instruction sent by the management device is received, and may be applied to all or part of the raw images captured within a preset time after the noisy training instruction. The noisy training instruction may be sent by the management device during normal operation of the terminal device, or may be sent when the terminal device is in an idle state. During normal operation of the terminal device, the noisy training process can be performed in parallel.

In the above processing, after the terminal device is put into use, supplementary noisy training can be performed on the terminal device, so that the accuracy of the back-end task processing is improved. For example, if the processing result of the back-end task processing performed by the terminal device for a certain shooting scene has poor accuracy, monitoring personnel can trigger noisy training at any time after noticing this.

In a second aspect, an apparatus for image signal processing is provided, the apparatus comprising one or more modules for implementing the method of the first aspect and its possible implementations.

In a third aspect, there is provided an image pickup apparatus comprising: an image acquisition component, which comprises a lens and a sensor and is used for acquiring raw images; and a processor, which is used for carrying out the method of the first aspect and possible implementations thereof as described above.

In a fourth aspect, a computer-readable storage medium is provided, which stores computer program code, which, when executed by a computing device, performs the method of the first aspect and its possible implementations.

In a fifth aspect, a computer program product is provided, the computer program product comprising computer program code, which when executed by a computing device, causes the computing device to perform the method of the first aspect and possible implementations thereof.

The technical scheme provided by the embodiment of the application has the following beneficial effects:

In the embodiment of the application, multiple values are stored for each ISP parameter, the raw image is input into the control network to obtain the control information, one value to be used can be selected from the multiple values of each ISP parameter based on the control information, and then the raw image is subjected to image signal processing based on the value selected for each ISP parameter. Therefore, the control network can select a proper value for each ISP parameter based on the condition of the currently captured image without manual setting of the ISP parameters, and thus the time waste can be reduced.

Drawings

FIG. 1 is a system framework diagram provided by an embodiment of the present application;

FIG. 2 is a flowchart of a method for processing an image signal according to an embodiment of the present disclosure;

FIG. 3 is a schematic diagram of an operating process of a video monitoring apparatus according to an embodiment of the present application;

FIG. 4 is a flowchart of a method of training processing provided by an embodiment of the present application;

FIG. 5 is a schematic diagram of a training process of a video surveillance device according to an embodiment of the present application;

FIG. 6 is a flowchart of a method for training processing according to an embodiment of the present application;

FIG. 7 is a schematic diagram of a training process of a video surveillance device according to an embodiment of the present application;

FIG. 8 is a schematic structural diagram of an apparatus for processing an image signal according to an embodiment of the present application.

Detailed Description

The embodiment of the application provides a method for processing image signals, which can be realized by terminal equipment. The terminal device may be a device having an image capturing function, such as a mobile phone, a camera, a video camera, or the like, and the video camera may be a video monitoring device or a handheld video camera, or the terminal device may also be a combination of the video monitoring device and other devices, where the video monitoring device is only responsible for capturing images, and the other devices are responsible for various processing based on the images. In the embodiment of the present application, a single video monitoring device is taken as an execution subject for example to perform detailed description of the scheme, and other situations are similar to the above, which are not described in detail in the embodiment of the present application.

An implementation environment is described below, taking a video monitoring device as an example. An embodiment of the present application provides a video monitoring system, which, as shown in FIG. 1, may include a video monitoring device and a management device. The video monitoring device is configured to capture a raw image, perform ISP on the raw image to obtain an ISP output image, and perform task processing (generally also referred to as back-end task processing) based on the ISP output image to obtain a processing result, for example, a face position obtained by face recognition. The processed image and the processing result are then transmitted to the management device. The management device may display the processed image and the processing result, for example, by displaying a rectangular frame in the processed image based on the face position.

The video surveillance device may include an image acquisition component, a processor, a memory, and a communication component.

The image acquisition component may include a lens, an image sensor, and an analog-to-digital converter. The lens may be composed of one lens or a plurality of coaxial lenses, and the image sensor (sensor) may be a Complementary Metal Oxide Semiconductor (CMOS) or a Charge Coupled Device (CCD), etc. The processor may be a system on chip (SoC) and may include an ISP unit and an Artificial Intelligence (AI) unit. The memory, or storage unit, may be disposed at one or more of the inside of the AI unit, the inside of the ISP unit, the inside of the processor, the outside of the AI unit and the ISP unit, and the outside of the processor. The communication means may be a network connection port or a wireless communication module.

The lens is arranged on a shell of the video monitoring equipment, the image sensor, the analog-to-digital converter and the processor are arranged in the shell, and the communication component can be arranged on the shell and also can be arranged in the shell. The image sensor is located on the main optical axis of the lens, and the light sensing surface of the image sensor is perpendicular to the main optical axis. The analog-to-digital converter is electrically connected with the image sensor. The processor is electrically connected with the analog-to-digital converter and the communication part respectively.

The lens converges light to the image sensor, the image sensor performs photoelectric conversion, converts light signals into analog electrical signals, and inputs the analog electrical signals into the analog-to-digital converter, and the analog-to-digital converter converts the analog electrical signals into digital electrical signals, so as to obtain raw images. The analog-to-digital converter inputs the raw image into the processor, an ISP unit in the processor performs ISP on the raw image to obtain an ISP output image, then an AI unit in the processor encodes the ISP output image and performs back-end task processing to obtain a processing result. Further, the processor transmits the encoded image and the processing result to the communication section, and the communication section transmits the encoded image and the processing result to the management apparatus.

Raw images can be divided into various types, such as RGGB, RYYB, RCCC, RCCB, RGBW and CMYW. File formats of RAW images include, for example, .3fr, .ari, .arw, .bay, .braw, .crw, .cap, .dcs, .dcr, .drf, .eip and .erf. Besides video cameras, raw images can also come from image scanners and motion picture film scanners. Using an ISP, RAW images of various formats can be converted into RGB format. The ISP may also convert RAW format to YUV format, HSV format, Lab format, CMY format or YCbCr format. The image processed by the ISP may be referred to as a 3-channel image.

The camera may also include an encoder (e.g., integrated in the SoC chip or implemented as a separate encoding chip) that reduces the amount of data in the image and makes the image easier for the user to play. Encoded image formats include, for example, jpeg format, bmp format, tga format, png format, and gif format. Encoded video media formats include, for example, MPEG format, AVI format, nAVI format, ASF format, MOV format, WMV format, 3GP format, RM format, RMVB format, FLV/F4V format, H.264 format, and H.265 format.

The raw image according to the embodiment of the present application includes a first raw image, a second raw image, a sample raw image, and the like, and may be a single raw image or a video segment composed of a plurality of continuous raw images (frames).

The embodiment of the present application provides a method for processing an image signal in a video shooting process, as shown in fig. 2, the method includes the following steps:

201, acquiring a first raw image.

In implementation, the video monitoring device may be installed in various application scenarios, such as streets, buildings, shops, parks, etc., and after power-on, the video monitoring device starts to perform video shooting. In the process of video shooting of the video monitoring equipment, after the image sensor inputs the analog electric signal of the image data into the analog-to-digital converter, the analog-to-digital converter outputs a raw image and sends the raw image to the processor. The processor receives the raw images one by one.

202, inputting the first raw image into the trained control network to obtain first control information.

The control information related to the embodiment of the present application is information for indicating value selection of an ISP parameter, and the corresponding control information may include first control information, second control information, and the like.

The control network may be a machine learning model, and may specifically be a convolutional neural network. The training process of the control network will be explained in detail later.

In implementation, after the processor receives the raw image, the AI unit retrieves the trained control network from the storage unit and inputs the raw image into it, and the control network outputs the first control information. Once trained, the control network can make full use of the information contained in the raw image to select the most appropriate values of the ISP parameters for that image. After obtaining the first control information, the AI unit transmits it to the ISP unit.

203, selecting a value corresponding to the first control information from the multiple values of each ISP parameter.

There are many ISP parameters involved in image signal processing, which can be classified into a plurality of categories, such as defogging parameter, denoising parameter, sharpness parameter, exposure brightness control parameter, dynamic range compression parameter, white balance parameter, etc. Each class, in turn, may include one or more parameters. For example, the defogging parameters may include a defogging intensity parameter, a defogging mode parameter, etc., the noise reduction parameters may include a noise reduction filtering intensity, a noise reduction filtering mode, gains of different frequency points of a noise reduction filter, etc., the sharpness parameters may include a sharpening filtering intensity, a sharpening filtering mode, gains of different frequency points of a sharpening filter, etc., the exposure brightness control parameters may include a gamma curve parameter, a digital gain parameter, etc., and the dynamic range compression parameters may include a tone mapping (tone mapping) curve parameter, a local contrast intensity parameter, etc.

In implementation, multiple selectable values are stored in the storage unit of the processor for each ISP parameter. For example, there may be 100 ISP parameters in total, with 3 selectable values stored for each ISP parameter. The control information is used to select one of these values for each ISP parameter. After the ISP unit receives the first control information sent by the AI unit, it selects the corresponding value for each ISP parameter based on the first control information.

The values of any ISP parameter may each have a different identifier, and the identifiers may be set arbitrarily; for example, the identifiers of 3 values may be identifier 1, identifier 2 and identifier 3. The control information may include, for each ISP parameter, the identifier of the value to be selected. Thus, the first control information may be a vector in which each element bit corresponds to one ISP parameter and the element value in that element bit is the identifier of the value to be selected. For example, for ease of understanding, assume that there are two ISP parameters, parameter A and parameter B, each with 3 values: the 3 values of parameter A are A1, A2 and A3, and the 3 values of parameter B are B1, B2 and B3. If the control information is the vector (identifier 1, identifier 3), the correspondingly selected values are A1 and B3.
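The selection described above can be sketched as a table lookup. The following is a minimal illustration only: the table name `ISP_VALUES`, the ordering list `PARAM_ORDER`, the function `select_values` and all numeric values are assumptions made for the example and are not taken from the application.

```python
# Hypothetical stored value table: each ISP parameter keeps 3 selectable values.
# Parameter names and numbers are illustrative, not taken from the application.
ISP_VALUES = {
    "A": [0.2, 0.5, 0.8],   # A1, A2, A3
    "B": [1.0, 2.0, 4.0],   # B1, B2, B3
}
PARAM_ORDER = ["A", "B"]    # element bit i of the vector maps to PARAM_ORDER[i]


def select_values(control_info, table=ISP_VALUES, order=PARAM_ORDER):
    """Return {parameter: selected value} for a vector of 1-based identifiers."""
    if len(control_info) != len(order):
        raise ValueError("control vector length must equal the number of ISP parameters")
    selected = {}
    for name, identifier in zip(order, control_info):
        candidates = table[name]
        if not 1 <= identifier <= len(candidates):
            raise ValueError(f"identifier {identifier} is out of range for parameter {name}")
        selected[name] = candidates[identifier - 1]
    return selected


# The vector (identifier 1, identifier 3) selects A1 and B3.
chosen = select_values((1, 3))
```

Here the identifiers are assumed to be the integers 1 to 3, so that they can double as positions in the stored list; since the identifiers may be set arbitrarily, a real implementation could equally use a mapping from identifier to value.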

204, performing ISP on the first raw image based on the value selected for each ISP parameter to obtain a first ISP output image.

In implementation, after the ISP unit selects one value for each ISP parameter, the corresponding ISP algorithm may be called in the storage unit of the processor, the selected values are brought into the algorithm, and the ISP is performed on the first raw image based on the ISP algorithm to obtain the first ISP output image. The first ISP output image may be a YUV image.
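As a rough illustration of bringing selected values into an ISP algorithm, the sketch below applies two of the parameter kinds mentioned earlier, a digital gain and a gamma curve parameter, to pixel values normalized to [0, 1]. The function `apply_isp` and the power-law form of the gamma curve are assumptions for this example; a real ISP algorithm contains many more stages (for example demosaicing and conversion to YUV), which are omitted here.

```python
def apply_isp(raw_pixels, digital_gain, gamma):
    """Apply a gain and then a power-law gamma curve to pixels normalized to [0, 1]."""
    output = []
    for pixel in raw_pixels:
        gained = min(max(pixel * digital_gain, 0.0), 1.0)  # gain with clipping to [0, 1]
        output.append(gained ** (1.0 / gamma))              # gamma encoding
    return output


# Values that a control vector might have selected (illustrative numbers only).
selected = {"digital_gain": 2.0, "gamma": 2.0}
isp_output = apply_isp([0.0, 0.125, 0.5, 0.9], **selected)
```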

205, performing task processing of the target task type on the first ISP output image to obtain a processing result.

Generally, the type of back-end task processing is preset for a video monitoring device; one type or multiple types of back-end task processing may be preset. The target task type may include one or more of motor vehicle, non-motor vehicle and pedestrian detection, face recognition, situation monitoring, image index detection, and the like.

Motor vehicle, non-motor vehicle and pedestrian detection is to detect the positions of motor vehicles, non-motor vehicles and pedestrians in the image.

Face recognition is the detection of the position of a face in an image. Optionally, the face recognition may further recognize whether the face matches face information pre-stored in the database.

Situation monitoring is to monitor the execution of a certain event, for example, to monitor the processing line of a certain product to see if there is a problem in the product or if there is a problem in the operation of the line.

Image index detection is to detect objective index parameters of an image, and the index parameters may include sharpness, white balance and the like.

In implementation, back-end task processing is mostly performed by a machine learning model, which may be run by the AI unit. After the ISP unit outputs the first ISP output image, the first ISP output image may be transmitted to the AI unit. The AI unit retrieves the machine learning model for the back-end task processing from the storage unit and inputs the first ISP output image into the machine learning model to obtain the corresponding processing result. For example, if the target task type is face recognition, the AI unit calls a face recognition network and inputs the first ISP output image into it, and the face recognition network outputs a face position. It should be noted that, in the embodiment of the present application, the machine learning model used in the back-end task processing is already trained, and its training process is not described in the embodiment of the present application.

The AI unit may encode the first ISP output image. After obtaining the processing result of the back-end task, the AI unit may transmit the processing result and the encoded image to the communication component, and the communication component may transmit them to the management terminal. The management terminal decodes the encoded image to obtain the first ISP output image, and then displays the first ISP output image together with the processing result of the back-end task. For example, if the processing result is a face position, the management terminal may display the first ISP output image and draw a corresponding rectangular frame based on the face position.

The operation of the video monitoring device described above is illustrated in simplified form in fig. 3.

In the embodiment of the present application, the above flow may be executed for every raw image shot. Alternatively, the values of the ISP parameters may be selected once every preset time length, and the values selected each time are used for image signal processing of all raw images shot until the next selection, thereby saving processing resources.
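The second option, selecting once per preset time length and reusing the selection in between, can be sketched as a small cache around the control network. The class `PeriodicSelector`, its method names and the stand-in control network below are hypothetical and only illustrate the idea.

```python
class PeriodicSelector:
    """Re-run the control network at most once per `interval_s` seconds."""

    def __init__(self, control_network, interval_s):
        self.control_network = control_network   # callable: raw image -> control vector
        self.interval_s = interval_s
        self._last_time = None                    # time of the most recent selection
        self._cached = None                       # control vector from that selection

    def control_info(self, raw_image, now_s):
        """Return the cached control information unless the interval has elapsed."""
        if self._last_time is None or now_s - self._last_time >= self.interval_s:
            self._cached = self.control_network(raw_image)
            self._last_time = now_s
        return self._cached


calls = []


def fake_control_network(raw_image):
    """Stand-in for the trained control network; records each invocation."""
    calls.append(raw_image)
    return (1, 3)


selector = PeriodicSelector(fake_control_network, interval_s=10.0)
results = [selector.control_info(f"frame{t}", now_s=float(t)) for t in (0, 3, 9, 10, 12)]
```

With a 10-second interval, only the frames at 0 s and 10 s reach the control network in this example; the other frames reuse the cached control vector.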

Moreover, using the control network is equivalent to dividing shooting scenes much more finely. The control information is a vector, the number of possible values of the vector is very large, and each distinct value of the vector effectively corresponds to a different shooting scene, so a very large number of shooting scenes are distinguished. Manually configuring ISP parameters for each such shooting scene would involve an impractical workload and is essentially infeasible. The method of the embodiment of the present application therefore facilitates fine division of shooting scenes, which refines the selection of ISP parameter values, so that image signal processing better serves the back-end task processing and the accuracy of the back-end task processing is improved.
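To give a sense of scale, the number of distinct control vectors is the product of the numbers of selectable values of the individual ISP parameters. The short sketch below computes this for the earlier example of 100 ISP parameters with 3 values each; the function name `num_control_vectors` is an assumption for illustration.

```python
def num_control_vectors(values_per_parameter):
    """Number of distinct control vectors: product of the per-parameter choice counts."""
    total = 1
    for count in values_per_parameter:
        total *= count
    return total


# 100 ISP parameters with 3 selectable values each gives 3**100 combinations.
count = num_control_vectors([3] * 100)
```

For this example the count is 3 to the power 100, a 48-digit number (roughly 5 × 10^47), which illustrates why per-scene manual configuration is not practical.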

The embodiment of the present application also provides a method for training the control network and the ISP parameters.

The technician first needs to perform some initial settings, including setting an initial value space of ISP parameters, establishing a control network, collecting a sample, and calibrating a truth result for the sample. The following description is made separately.

First, when setting the initial value space of the ISP parameters, the number of selectable values for each ISP parameter may be set first; this number may be determined by jointly considering factors such as the processing accuracy and the processing capability of the processor. Then, multiple initial values may be set for each ISP parameter. These initial values may be set within the normal value range of the ISP parameter, and may be chosen empirically according to the characteristics of the usage environment of the video monitoring device and the characteristics of the task type.

Second, when establishing the control network, a technician may set the mathematical form of the control network, its related specifications, and its initial network parameters. The mathematical form may be a convolutional neural network or the like. The related specifications may include the convolution kernel size, the length of the output vector, the number of possible element values of each element bit of the output vector, and the like. The initial network parameters may be the initial element value of each element bit of the convolution kernel, and the like.

The convolution kernel size may be set based on a combination of processing accuracy and processing power of the processor.

The length of the output vector is equal to the number of ISP parameters, which is determined by the ISP algorithm. For a video monitoring device of a specific model, the ISP algorithm is preset, so the number of ISP parameters is fixed.

Each element bit of the output vector corresponds to one ISP parameter. Accordingly, the number of possible element values of each element bit is equal to the number of selectable values of the corresponding ISP parameter.
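One way such an output vector could be produced is sketched below, under the assumption (not stated in the application) that the last layer of the control network emits one score per selectable value of each ISP parameter and that the highest score determines the identifier. The function name and the scores are hypothetical.

```python
def scores_to_control_info(scores):
    """Map per-parameter score lists to 1-based identifiers by taking the arg-max.

    `scores` has one inner list per ISP parameter; the inner list has one score
    per selectable value of that parameter.
    """
    control_info = []
    for per_param in scores:
        best_index = max(range(len(per_param)), key=per_param.__getitem__)
        control_info.append(best_index + 1)   # identifiers are assumed to start at 1
    return tuple(control_info)


# Two ISP parameters with 3 selectable values each (illustrative scores).
head_scores = [[0.1, 0.7, 0.2], [0.3, 0.3, 0.9]]
control_info = scores_to_control_info(head_scores)
```

The result has one element per ISP parameter, and each element can take as many distinct values as the parameter has selectable values, matching the two constraints stated above.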

The initial element value of each element bit of the convolution kernel may be set randomly or based on experience.

Third, when collecting samples, the technician may collect a large number of sample raw images. In one possible scheme, sample raw images are collected separately for each video monitoring device; when a certain video monitoring device needs to be trained, a large number of raw images shot by that device under various conditions, such as different times and different weather, are obtained and used as the sample raw images. In another possible scheme, multiple video monitoring devices of the same model in similar environments are trained with a common set of sample raw images; in this case, a large number of raw images shot by these different devices under various conditions, such as different times and different weather, are obtained and used as the sample raw images. For example, the multiple video monitoring devices of the same model in similar environments may be devices of the same model installed on streets in the same city, or devices of the same model in the same building, and so on.

Fourth, when calibrating the true value results of the samples, the technician may manually calibrate the sample raw images according to the target task type to obtain the true value result of the target task type corresponding to each sample raw image. For example, if the target task type is face recognition, the technician may determine the minimum bounding rectangle of the face in the sample raw image and record the position information of the rectangle (for example, the coordinates of two non-adjacent vertices of the rectangle) as the true value result corresponding to the sample raw image.

Based on the above initial settings, the process of training the control network and the ISP parameters may then be performed. As shown in fig. 4, the process includes the following steps:

401, receiving a sample raw image sent by the management device and a true result of a target task type corresponding to the sample raw image.

The target task type is the task type of the back-end task processing in the video monitoring device to be trained. The target task types have been described in detail in the above embodiment. The management device in the training process may be a computer device used by technicians during production or development of the product, or may be a monitoring background device connected after the product is put into service.

In implementation, after a large number of sample raw images and the true value result corresponding to each sample raw image are obtained, they may be stored in a database on the network side. When training is needed, the management device may be operated to retrieve the stored sample raw images and true value results and send them to the video monitoring device.

402, inputting the sample raw image into a control network to be trained to obtain second control information.

The control network to be trained may be an initial control network that has just been established, or may be a control network that has been trained several times. The second control information includes an identifier of a value to be selected corresponding to each ISP parameter.

403, selecting a value corresponding to the second control information from the multiple values of each ISP parameter.

404, performing ISP on the sample raw image based on the value selected for each ISP parameter to obtain a second ISP output image.

405, performing task processing of the target task type on the second ISP output image to obtain a processing result.

The processing of step 402-.

406, inputting the processing result and the true value result into the loss function to obtain an output value of the loss function.

The type of loss function may be selected based on actual demand. When only one target task type exists, the processing result and the true value result of the target task type can be input into the loss function to obtain an output value. When there are multiple target task types, the processing result corresponding to each target task type and the true value result can be input into the loss function together to obtain an output value.
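As one possible choice for a detection-type task, the sketch below uses one minus the intersection over union (IoU) between a predicted rectangle and the calibrated rectangle, averaged over all result and true value pairs so that one or several target task types can be fed in together. The application does not specify the loss function; this form, and the names `iou` and `loss_function`, are assumptions for illustration. Rectangles are given by two non-adjacent vertices as (x1, y1, x2, y2).

```python
def iou(box_a, box_b):
    """Intersection over union of two rectangles given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def loss_function(results, truths):
    """Average (1 - IoU) over all (processing result, true value result) pairs."""
    losses = [1.0 - iou(r, t) for r, t in zip(results, truths)]
    return sum(losses) / len(losses)


# One target task type: the predicted box overlaps half of the true box.
single_loss = loss_function([(0, 0, 2, 2)], [(1, 0, 3, 2)])
# A perfect prediction gives zero loss.
perfect_loss = loss_function([(0, 0, 2, 2)], [(0, 0, 2, 2)])
```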

407, performing numerical adjustment on the network parameters of the control network to be trained based on the output value of the loss function.

Steps 406 and 407 may be performed by the AI unit.

408, inputting the output value of the loss function into the trained parameter adjusting network to obtain an adjustment value sequence.

The parameter adjusting network (also referred to herein as the tuning network) is used for adjusting the values of the ISP parameters, and may be a machine learning model, specifically a convolutional neural network. The training process of the parameter adjusting network will be described in detail later. The adjustment value sequence is a vector in which each element bit corresponds to one ISP parameter, and the element value of each element bit is an adjustment value used for adjusting, among the stored values of the corresponding ISP parameter, the value selected by the second control information.

The AI unit, after obtaining the output value of the loss function, may send the output value to the communication component, which sends the output value to the management device. And the management equipment calls the parameter adjusting network, inputs the output value of the loss function into the parameter adjusting network to obtain an adjusting value sequence, and then sends the adjusting value sequence to the video monitoring equipment. And the communication component receives the adjustment value sequence and then sends the adjustment value sequence to the processor.

Here the processing of the parameter adjusting network is performed by a computer device other than the video monitoring device. The processing performance of the video monitoring device is generally limited, whereas that of the computer device can be stronger, which helps to increase the processing speed of the parameter adjusting network and improve training efficiency.

Of course, the processing of this step may also be performed on the video surveillance device.

409, performing numerical adjustment on the selected value corresponding to the second control information based on the adjustment value sequence.

The ISP unit of the processor determines, from the stored multiple values of each ISP parameter, the value selected by the second control information, to obtain a value to be adjusted for each ISP parameter. Then, the value to be adjusted of each ISP parameter is adjusted based on the corresponding adjustment value in the adjustment value sequence; specifically, the adjustment value may be added to the value to be adjusted to obtain the adjusted value. For example, for ease of understanding, assume that there are two ISP parameters, parameter A and parameter B, each with 3 values: the 3 values of parameter A are A1, A2 and A3, and the 3 values of parameter B are B1, B2 and B3. If the second control information is the vector (identifier 2, identifier 1), the correspondingly selected values are A2 and B1. If the adjustment value sequence is (m, n), the stored values of parameter A and parameter B are updated to A1, A2', A3, B1', B2 and B3, where A2' = A2 + m and B1' = B1 + n.
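The update in this example can be sketched as follows. The function `apply_adjustments`, the table layout and the numeric values are assumptions for illustration; the only behavior taken from the text is that each adjustment value is added to the stored value picked by the second control information, while the other stored values remain unchanged.

```python
def apply_adjustments(table, order, control_info, adjustments):
    """Add each adjustment value to the stored value picked by the control vector.

    Returns a new table; the input table is left unmodified.
    """
    updated = {name: list(values) for name, values in table.items()}
    for name, identifier, delta in zip(order, control_info, adjustments):
        updated[name][identifier - 1] += delta   # identifiers are assumed to be 1-based
    return updated


# Stored values A1..A3 and B1..B3 (illustrative numbers).
table = {"A": [0.25, 0.5, 0.75], "B": [1.0, 2.0, 4.0]}
# Second control information (identifier 2, identifier 1), adjustment sequence (m, n).
updated = apply_adjustments(table, ["A", "B"], (2, 1), (0.125, -0.5))
```

In this run only A2 and B1 change, which mirrors A2' = A2 + m and B1' = B1 + n.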

It should be noted that there is no required execution order between step 407 and steps 408 and 409: step 407 may be executed first, steps 408 and 409 may be executed first, or they may be executed synchronously.

The above process of training the control network and the ISP parameters is illustrated in simplified form in fig. 5.

It should be noted that the training method may be executed before the actual monitoring operation starts, or may be synchronously executed during the monitoring operation.

The above process is a noise-free training method, in which training is based on samples and manually calibrated true value results. However, the collected samples, or the manpower available for calibrating them, may be limited. After training on the existing samples and true value results, the accuracy of the back-end task processing may still be low, particularly in shooting scenes not covered by the samples. For this situation, the embodiment of the present application further provides a noisy training method which, as shown in fig. 6, includes the following steps:

601, receiving a noisy training instruction sent by a management device.

The management device in this training process may be a monitoring background device connected after the product is put into service.

While working, the video monitoring device continuously sends the monitoring video (that is, the ISP output images) and the processing results of the back-end task to the management device. The management device displays the monitoring video and displays the processing results in it. Monitoring personnel can watch the monitoring video and evaluate whether the processing results are accurate. When the monitoring personnel determine that a processing result is inaccurate, they can operate the management device to trigger it to send a noisy training instruction to the video monitoring device.

For example, when the back-end task processing is face detection, a rectangular frame for a face may be displayed. The monitoring personnel can see the face and the rectangular frame in the monitoring video and judge whether the position of the rectangular frame matches that of the face. If they do not match, the processing result is inaccurate, and the monitoring personnel can click an "optimization training" control in the interface to trigger the management device to send a noisy training instruction to the video monitoring device.

602, acquiring a shot second raw image, and acquiring the processing result obtained by performing ISP on the second raw image and then performing task processing of the target task type; taking the second raw image as a sample raw image, and taking the acquired processing result as the true value result of the target task type corresponding to the sample raw image.

The second raw image is a raw image shot within a preset time length after the noisy training instruction is received.

After receiving the noisy training instruction, the video monitoring device may start timing and acquire shot raw images within the preset time length. The raw images may be acquired continuously, at a fixed interval, or at random intervals, and each acquired raw image is used as a sample raw image. Each time a sample raw image is acquired, the processor may store it in the memory.

While sample raw images are being acquired, the video monitoring device remains in its working state: it continuously shoots raw images, inputs each shot raw image into the control network to obtain control information, selects the values of the ISP parameters based on the control information, performs ISP on the raw image based on those values to obtain a third ISP output image, and performs back-end task processing of the target task type on the third ISP output image to obtain a processing result. For each sample raw image, the processor may obtain the corresponding processing result and use it as the true value result of the target task type corresponding to that sample raw image. The sample raw images and corresponding true value results obtained in this way can be used for the subsequent noisy training, and may be stored in a storage unit of the video monitoring device.
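The sample collection for noisy training can be sketched as below for the fixed-interval case. The function `collect_noisy_samples`, the representation of frames as (timestamp, raw image) pairs sorted by time, and the stand-in `pipeline` callable (control network, ISP and back-end task combined) are all assumptions for illustration.

```python
def collect_noisy_samples(frames, start_s, duration_s, interval_s, pipeline):
    """Collect (sample raw image, noisy true value result) pairs.

    frames:   iterable of (timestamp_s, raw_image), assumed sorted by timestamp
    start_s:  time at which the noisy training instruction was received
    pipeline: callable mapping a raw image to its back-end processing result
    """
    samples = []
    next_pick = start_s
    for timestamp, raw in frames:
        if timestamp < start_s:
            continue                      # shot before the instruction arrived
        if timestamp > start_s + duration_s:
            break                         # preset time length has elapsed
        if timestamp >= next_pick:
            # The device's own processing result doubles as the true value result.
            samples.append((raw, pipeline(raw)))
            next_pick = timestamp + interval_s
    return samples


frames = [(t, f"raw{t}") for t in range(20)]
samples = collect_noisy_samples(
    frames, start_s=5, duration_s=6, interval_s=2,
    pipeline=lambda raw: f"result-of-{raw}",
)
```

In this example the frames at 5 s, 7 s, 9 s and 11 s become samples, each paired with the processing result the device itself produced for it.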

Because noisy training has no accurate, manually calibrated true value results, its training converges more slowly; its advantage is that it saves a large amount of manpower.

603, inputting the sample raw image into the control network to be trained to obtain second control information.

604, a value corresponding to the second control information is selected from the multiple values of each ISP parameter.

605, performing ISP on the sample raw image based on the value selected for each ISP parameter to obtain a second ISP output image.

606, performing task processing of the target task type on the second ISP output image to obtain a processing result.

607, the processing result and the true value result are inputted to the loss function to obtain the output value of the loss function.

608, performing numerical adjustment on the network parameters of the control network to be trained based on the output value of the loss function.

609, inputting the output value of the loss function into the trained parameter adjusting network to obtain an adjusting value sequence.

610, performing numerical adjustment on the selected value corresponding to the second control information based on the adjustment value sequence.

The processing of step 603-.

It should be noted that the processing of the step 608 and the processing of the step 609-.

For the training process, training may be performed once every time a sample raw image and a corresponding true value result are obtained, or may be performed once every time a certain number of sample raw images and corresponding true value results are obtained, or may be performed uniformly after all sample raw images and corresponding true value results within a preset time period are obtained. The training process and the normal processing process of the video monitoring equipment can be executed in parallel.

After the noisy training is finished, the video monitoring device may send a noisy training completion notification to the management device. The management device may display a completion prompt after receiving the noisy training completion notification. The monitoring personnel can continue to observe whether the subsequent processing results of the video monitoring device are accurate and, if they are not, trigger noisy training again.

The above process of noisy training of the control network and the ISP parameters is illustrated in simplified form in fig. 7.

The above describes noisy training performed after some noise-free training. Of course, noisy training may also be performed directly, through the above procedure, on the initial control network and the initial values of the ISP parameters.

The embodiment of the present application also provides a method for training the parameter adjusting network, which includes the following steps:

The control network of the video monitoring device is disabled, and only one value is kept for each ISP parameter, that is, no selection of ISP parameter values is performed. The value of each ISP parameter is initialized; the initialized value may be a random value. A sample raw image is obtained and input into the ISP unit, ISP is performed on the sample raw image based on the initial values of the ISP parameters to obtain a fourth ISP output image, and back-end task processing is performed on the fourth ISP output image to obtain a first processing result. The first processing result and the true value result are input into a first loss function to obtain a first output value, and the first output value is input into the initial parameter adjusting network to obtain an adjustment value sequence. The initial values of the ISP parameters are adjusted based on the adjustment value sequence to obtain the adjusted values of the ISP parameters.

The sample raw image is input into the ISP unit again, ISP is performed on the sample raw image based on the adjusted values of the ISP parameters to obtain a fifth ISP output image, and back-end task processing is performed on the fifth ISP output image to obtain a second processing result.

A first matching degree of the first processing result relative to the true value result and a second matching degree of the second processing result relative to the true value result are determined. The first matching degree and the second matching degree are input into a second loss function to obtain a second output value, and numerical adjustment is performed on the network parameters of the parameter adjusting network based on the second output value.
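The data flow of one such training iteration can be sketched with stand-in callables. Everything below is hypothetical: the application does not give the form of the ISP pipeline, the matching degree, either loss function, or the parameter adjusting network, so toy functions are used purely to show the order of operations. The final update of the network parameters of the parameter adjusting network is not shown.

```python
def tuning_iteration(raw, truth, values, run_pipeline, match, first_loss, tuner, second_loss):
    """One iteration: pipeline -> first loss -> tuner -> adjust -> pipeline -> second loss."""
    first_result = run_pipeline(raw, values)              # ISP + back-end task, initial values
    first_output = first_loss(first_result, truth)        # first output value
    adjustments = tuner(first_output)                     # adjustment value sequence
    adjusted = [v + d for v, d in zip(values, adjustments)]
    second_result = run_pipeline(raw, adjusted)           # ISP + back-end task, adjusted values
    second_output = second_loss(match(first_result, truth), match(second_result, truth))
    return adjusted, second_output


# Toy stand-ins (assumptions): the "processing result" is just the sum of the values.
adjusted, second_output = tuning_iteration(
    raw=None,
    truth=2.0,
    values=[0.5, 0.5],
    run_pipeline=lambda raw, values: sum(values),
    match=lambda result, truth: 1.0 / (1.0 + abs(result - truth)),
    first_loss=lambda result, truth: abs(result - truth),
    tuner=lambda loss: [0.25 * loss, 0.25 * loss],
    second_loss=lambda first_match, second_match: first_match - second_match,
)
```

With this toy second loss, a negative second output value means the adjustment improved the matching degree, which is the signal that would then drive the numerical adjustment of the parameter adjusting network.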

In the above-mentioned tuning network training process, the process of inputting the first output value into the initial tuning network may be performed by a computer device other than the video monitoring device, such as the above-mentioned management device, and other processes in the training process may be performed by the video monitoring device.

In the embodiment of the application, the control network and the ISP parameters are trained based on the processing result and the true value result of the back-end task processing, so that the selection of the ISP parameter value in the monitoring process can better adapt to different shooting scenes and can better adapt to the requirements of the back-end task processing, and the accuracy of the back-end task processing is improved.

In addition, the parameter adjusting network is used to train and adjust the multiple values of the ISP parameters, which amounts to performing training-based selection in a larger value space to obtain a smaller, more accurate value space. This smaller value space can then be provided to the control network for further selection during actual application to obtain the most accurate value. The parameter adjusting network may be a larger network than the control network, that is, a network with higher computational complexity. Therefore, with the method of the embodiment of the present application, the larger and more complex part of a complex parameter selection problem (training of the ISP parameter values by the parameter adjusting network) is placed in the training process, and only the smaller and simpler part (value selection by the control network) remains in actual application after training. Furthermore, the processing of the parameter adjusting network during training may be carried out on a computer device with stronger computing power other than the video monitoring device, such as the management device, so that training efficiency can be ensured.

Therefore, a more complex value selection problem can be handled by a video monitoring device with relatively little computing power, and the more complex the value selection problem that can be handled, the greater the improvement in the accuracy of the back-end task processing. The method of the embodiment of the present application thus helps to improve the accuracy of the back-end task processing when the processing capability of the video monitoring device is limited.

Based on the same technical concept, an embodiment of the present application further provides an apparatus for processing an image signal, which can be applied to the terminal device provided in the foregoing embodiment, as shown in fig. 8, where the apparatus includes:

an obtaining module 810, configured to obtain a first raw image. The obtaining function in step 201 and other implicit steps may be implemented specifically.

A control module 820, configured to input the first raw image into the trained control network to obtain first control information. The control function in step 202, as well as other implicit steps, may be implemented in particular.

A selecting module 830, configured to select, from the multiple values of each ISP parameter, a value corresponding to the first control information. The selection function in step 203, as well as other implicit steps, may be implemented.

The image signal processing module 840 is configured to perform ISP on the first raw image based on the value selected for each ISP parameter. The image signal processing function in step 204, and other implicit steps may be implemented specifically.
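The cooperation of the four modules can be sketched as a single class whose method runs them in order. The class name, the constructor arguments and the toy callables are assumptions for illustration; the module numbers in the comments refer to fig. 8 as described above.

```python
class ImageSignalProcessingApparatus:
    """Hypothetical composition of modules 810 to 840 from simple callables."""

    def __init__(self, obtain, control_network, value_table, isp):
        self.obtain = obtain                    # returns a raw image
        self.control_network = control_network  # raw image -> vector of 1-based identifiers
        self.value_table = value_table          # one list of selectable values per ISP parameter
        self.isp = isp                          # (raw image, selected values) -> ISP output

    def run_once(self):
        raw = self.obtain()                                      # obtaining module 810
        control_info = self.control_network(raw)                 # control module 820
        selected = [candidates[identifier - 1]                   # selecting module 830
                    for candidates, identifier in zip(self.value_table, control_info)]
        return self.isp(raw, selected)                           # image signal processing module 840


# Toy stand-ins: a 3-pixel "raw image" and an ISP that scales and offsets each pixel.
apparatus = ImageSignalProcessingApparatus(
    obtain=lambda: [1, 2, 3],
    control_network=lambda raw: (2, 1),
    value_table=[[10, 20, 30], [1, 2, 3]],
    isp=lambda raw, selected: [p * selected[0] + selected[1] for p in raw],
)
output = apparatus.run_once()
```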

In a possible implementation manner, the apparatus is applied to a camera, and the apparatus further includes a shooting module configured to capture the first raw image through a sensor.

In one possible implementation manner, the apparatus further includes a task processing module, configured to: acquire a first ISP output image obtained by performing ISP on the first raw image, and perform task processing of a target task type on the first ISP output image to obtain a processing result, wherein the target task type comprises one or more of man-machine and non-man detection, face recognition, situation monitoring and image index detection. Specifically, the task processing module may implement the task processing function in step 205 as well as other implicit steps.

In a possible implementation manner, the apparatus further includes a sending module, configured to transmit the first ISP output image and the processing result to a management device.

In one possible implementation, the apparatus further includes a training module configured to: adjust the network parameters of the control network to be trained based on a sample raw image, and adjust the selectable values of the ISP parameters.

In one possible implementation, the training module is configured to: determine a second ISP output image corresponding to a sample raw image based on the sample raw image, the control network to be trained and the selectable values of the ISP parameters; perform ISP parameter value evaluation processing based on the second ISP output image to obtain an evaluation result; adjust the network parameters of the control network to be trained based on the evaluation result; and adjust the selectable values of the ISP parameters based on the evaluation result.

In one possible implementation, the training module is configured to: input the sample raw image into the control network to be trained to obtain second control information, wherein the second control information comprises an identifier of the value to be selected for each ISP parameter; select, from the multiple values of each ISP parameter, a value corresponding to the second control information; and perform ISP on the sample raw image based on the value selected for each ISP parameter to obtain the second ISP output image.

In one possible implementation, the training module is configured to: adjust the value corresponding to the second control information based on the evaluation result.

In one possible implementation, the training module is configured to: input the evaluation result into a trained parameter adjusting network to obtain an adjustment value sequence; and adjust the value corresponding to the second control information based on the adjustment value sequence.
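By way of a hedged example, adjusting only the values that correspond to the second control information may be sketched as follows. The sketch assumes, purely for illustration, that the adjustment value sequence carries one additive offset per ISP parameter; the embodiments do not fix this form, and the names `apply_adjustments`, `gain` and `denoise` are hypothetical.

```python
from typing import Dict, List


def apply_adjustments(candidates: Dict[str, List[float]],
                      control_info: Dict[str, int],
                      adjustments: Dict[str, float]) -> Dict[str, List[float]]:
    """Return a copy of the stored values in which only the selected value of
    each ISP parameter is shifted by its adjustment (assumed to be additive)."""
    updated = {name: list(values) for name, values in candidates.items()}
    for name, index in control_info.items():
        updated[name][index] += adjustments[name]
    return updated


# Hypothetical stored values, second control information and adjustment values.
candidates = {"gain": [1.0, 2.0], "denoise": [0.25, 0.5, 0.75]}
second_control_info = {"gain": 1, "denoise": 0}
adjustment_sequence = {"gain": -0.25, "denoise": 0.125}

adjusted = apply_adjustments(candidates, second_control_info, adjustment_sequence)
```

Values that were not selected for the sample raw image are left unchanged, which matches the idea that the evaluation result only reflects the values actually used.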

In one possible implementation, the training module is configured to: send the evaluation result to a management device, and receive an adjustment value sequence that is sent by the management device and obtained by inputting the evaluation result into a trained parameter adjusting network.

In one possible implementation, the training module is configured to: perform task processing of a target task type on the second ISP output image to obtain a processing result, wherein the target task type comprises one or more of man-machine detection, face recognition, situation monitoring and image index detection; and input the processing result and the true value result of the target task type corresponding to the sample raw image into a loss function, and take the output value of the loss function as the evaluation result.
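For illustration only, obtaining the evaluation result from a loss function may be sketched as follows. The loss used here (the fraction of task outputs that disagree with the true value result) is an assumption chosen for simplicity; the embodiments do not specify a particular loss function, and the name `evaluation_result` and the label strings are hypothetical.

```python
from typing import Sequence


def evaluation_result(processing_result: Sequence[str],
                      true_value_result: Sequence[str]) -> float:
    """Assumed loss: fraction of task outputs that differ from the true values."""
    if len(processing_result) != len(true_value_result):
        raise ValueError("processing result and true value result differ in length")
    if not true_value_result:
        raise ValueError("at least one true value is required")
    mismatches = sum(p != t for p, t in zip(processing_result, true_value_result))
    return mismatches / len(true_value_result)


# Hypothetical task outputs for a second ISP output image and their true values.
loss = evaluation_result(["person", "car", "person", "bike"],
                         ["person", "car", "car", "bike"])
```

A lower output value indicates that the values selected for the ISP parameters produced an image better suited to the target task.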

In one possible implementation manner, the training module is further configured to: receive a sample raw image sent by a management device and a true value result of the target task type corresponding to the sample raw image, wherein the true value result is obtained by calibrating the sample raw image.

In one possible implementation manner, the training module is further configured to: acquire a captured second raw image; acquire a processing result obtained by performing ISP on the second raw image and then performing task processing of the target task type; take the acquired second raw image as the sample raw image; and take the acquired processing result as the true value result of the target task type corresponding to the sample raw image.

The training module can implement the training functions of the steps 401-.

In a possible implementation manner, the obtaining module 810 is configured to obtain a first raw image captured every preset time period; the image signal processing module 840 is further configured to perform ISP on the raw image shot within the preset time period after the first raw image is shot based on the value of each ISP parameter selected for the first raw image.
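The periodic behaviour described above may be sketched as follows, under the assumption that each raw image carries a timestamp and that value selection is repeated once the preset time period has elapsed. The names `process_stream`, `choose_params` and `isp`, and the string-based stand-ins for images and parameters, are hypothetical.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def process_stream(frames: Iterable[Tuple[int, str]],
                   period: int,
                   choose_params: Callable[[str], str],
                   isp: Callable[[str, str], str]) -> List[str]:
    """Select ISP parameter values once per period and reuse them for the
    raw images captured before the next selection."""
    outputs: List[str] = []
    last_selected_at: Optional[int] = None
    params: Optional[str] = None
    for timestamp, raw in frames:
        if last_selected_at is None or timestamp - last_selected_at >= period:
            params = choose_params(raw)   # control network + value selection
            last_selected_at = timestamp
        outputs.append(isp(raw, params))  # reuse the most recent selection
    return outputs


# Toy stand-ins: a "raw image" is a letter, its "parameters" the upper-case letter.
frames = [(0, "a"), (1, "b"), (2, "c"), (3, "d"), (4, "e")]
outputs = process_stream(frames, period=2,
                         choose_params=lambda raw: raw.upper(),
                         isp=lambda raw, params: raw + params)
```

Running the control network only once per period reduces the computation required per frame, at the cost of reacting to scene changes with a delay of at most one period.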

In the embodiment of the application, a large number of values are stored for each ISP parameter. The raw image is input into the control network to obtain control information, one value to be used is selected from the stored values of each ISP parameter based on the control information, and ISP is then performed on the raw image based on the value selected for each ISP parameter. The control network can therefore select a suitable value for each ISP parameter according to the currently captured image, without the ISP parameters having to be set manually, which saves time.

It should be noted that the obtaining module 810, the control module 820, the selecting module 830, the image signal processing module 840, the task processing module and the training module may be implemented by a processor, or implemented by the processor in cooperation with at least one of a memory, a communication component, and an image capturing component.

It should be noted that: in the image signal processing apparatus provided in the above embodiment, when performing image signal processing, only the division of the above functional modules is taken as an example, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the apparatus is divided into different functional modules to complete all or part of the above described functions. In addition, the image signal processing apparatus and the image signal processing method provided by the above embodiments belong to the same concept, and specific implementation processes thereof are described in the method embodiments, and are not described herein again.

In the above embodiments, all or part of the implementation may be realized by software, hardware, firmware or any combination thereof, and when the implementation is realized by software, all or part of the implementation may be realized in the form of a computer program product. The computer program product comprises one or more computer program instructions which, when loaded and executed on a device, cause a process or function according to an embodiment of the application to be performed, in whole or in part. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, fiber optics, digital subscriber line) or wirelessly (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium may be any available medium that can be accessed by the device, or a data storage device, such as a server or a data center, that integrates one or more available media. The usable medium may be a magnetic medium (such as a floppy disk, a hard disk, a magnetic tape, etc.), an optical medium (such as a Digital Video Disk (DVD), etc.), or a semiconductor medium (such as a solid state disk, etc.).

It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.

The above description is only one embodiment of the present application and should not be taken as limiting the present application, and any modifications, equivalents, improvements, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims

1. A method of image signal processing, the method comprising:

acquiring a first unprocessed raw image;

inputting the first raw image into a trained control network to obtain first control information;

selecting a value corresponding to the first control information from a plurality of values of each image signal processing (ISP) parameter;

and carrying out ISP on the first raw image based on the value selected for each ISP parameter.

2. The method according to claim 1, wherein the first control information includes an identifier of a value to be selected corresponding to each ISP parameter.

3. The method according to claim 1, applied to a camera, further comprising, before said acquiring a first raw image:

the camera captures the first raw image through a sensor.

4. The method of claim 1, wherein after performing ISP on the first raw image based on the value selected for each ISP parameter, further comprising:

and acquiring a first ISP output image obtained by performing ISP on the first raw image, and performing task processing of a target task type on the first ISP output image to obtain a processing result, wherein the target task type comprises one or more of man-machine and non-man detection, face recognition, situation monitoring and image index detection.

5. The method according to claim 4, wherein after the task processing of the target task type is performed on the first ISP output image to obtain a processing result, the method further comprises:

and sending the first ISP output image and the processing result to a management device.

6. The method according to any one of claims 1-5, further comprising:

and performing numerical adjustment on the network parameters of the control network to be trained based on the sample raw image, and performing numerical adjustment on the values of the ISP parameters for selection.

7. The method according to claim 6, wherein the performing numerical adjustment on the network parameter of the control network to be trained based on the sample raw image and the performing numerical adjustment on the value of the ISP parameter for selection comprises:

determining a second ISP output image corresponding to a sample raw image based on the sample raw image, a control network to be trained and values for selection of ISP parameters;

performing ISP parameter value evaluation processing based on the second ISP output image to obtain an evaluation result;

based on the evaluation result, carrying out numerical value adjustment on the network parameters of the control network to be trained;

and adjusting the value of the ISP parameters for selection based on the evaluation result.

8. The method according to claim 7, wherein the determining a second ISP output image corresponding to the sample raw image based on the sample raw image, the control network to be trained, and the value for selection of the ISP parameter comprises:

inputting a sample raw image into a control network to be trained to obtain second control information, wherein the second control information comprises an identifier of a value to be selected corresponding to each ISP parameter;

selecting a value corresponding to the second control information from the multiple values of each ISP parameter;

and carrying out ISP on the sample raw image based on the value selected for each ISP parameter to obtain a second ISP output image.

9. The method of claim 8, wherein numerically adjusting the value of the ISP parameter for selection based on the evaluation result comprises:

and carrying out numerical value adjustment on the value corresponding to the second control information based on the evaluation result.

10. The method according to claim 9, wherein the performing a numerical adjustment on the value corresponding to the second control information based on the evaluation result includes:

inputting the evaluation result into a trained parameter adjusting network to obtain an adjusting value sequence;

and carrying out numerical value adjustment on the value corresponding to the second control information based on the adjustment value sequence.

11. The method according to claim 10, wherein the inputting the evaluation result into the trained parameter adjusting network to obtain the adjustment value sequence comprises:

and sending the evaluation result to a management device, and receiving an adjustment value sequence which is sent by the management device and obtained by inputting the evaluation result into a trained parameter adjusting network.

12. The method according to claim 7, wherein the performing ISP parameter value evaluation processing based on the second ISP output image to obtain an evaluation result comprises:

performing task processing of a target task type on the second ISP output image to obtain a processing result, wherein the target task type comprises one or more of man-machine detection, face recognition, situation monitoring and image index detection;

and inputting the processing result and the true value result of the target task type corresponding to the sample raw image into a loss function to obtain an output value of the loss function as an evaluation result.

13. The method according to claim 12, wherein before determining the second ISP output image corresponding to the sample raw image based on the sample raw image, the control network to be trained, and the values for selection of the ISP parameters, further comprising:

receiving a sample raw image sent by a management device and a true value result of the target task type corresponding to the sample raw image, wherein the true value result is obtained by calibrating the sample raw image.

14. The method according to claim 12, wherein before determining the second ISP output image corresponding to the sample raw image based on the sample raw image, the control network to be trained, and the values for selection of the ISP parameters, further comprising:

acquiring a shot second raw image;

acquiring a processing result obtained by performing ISP on the second raw image and performing task processing on the target task type;

taking the obtained second raw image as the sample raw image;

and taking the obtained processing result as a true value result of the target task type corresponding to the sample raw image.

15. The method of any of claims 1-14, wherein the acquiring the first raw image comprises:

acquiring a captured first raw image at intervals of a preset duration;

after the ISP is performed on the first raw image based on the value selected for each ISP parameter, the method further includes:

and carrying out ISP, based on the value of each ISP parameter selected for the first raw image, on the raw images captured within the preset duration after the first raw image is captured.

16. An apparatus for image signal processing, the apparatus comprising:

an obtaining module, configured to acquire a first unprocessed raw image;

a control module, configured to input the first raw image into a trained control network to obtain first control information;

a selecting module, configured to select, from the multiple values of each ISP parameter, a value corresponding to the first control information;

and an image signal processing module, configured to carry out ISP on the first raw image based on the value selected for each ISP parameter.

17. The apparatus according to claim 16, wherein the first control information includes an identifier of a value to be selected corresponding to each ISP parameter.

18. The apparatus of claim 16, wherein the apparatus is applied to a camera, the apparatus further comprising a photographing module for:

capturing the first raw image through a sensor.

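19. The apparatus of claim 16, further comprising a task processing module configured to:

and acquiring a first ISP output image obtained by performing ISP on the first raw image, and performing task processing of a target task type on the first ISP output image to obtain a processing result, wherein the target task type comprises one or more of man-machine and non-man detection, face recognition, situation monitoring and image index detection.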

20. The apparatus of claim 19, further comprising a sending module configured to:

and transmitting the first ISP output image and the processing result to a management device.

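21. The apparatus of any one of claims 16-20, further comprising a training module to:

and performing numerical adjustment on the network parameters of the control network to be trained based on the sample raw image, and performing numerical adjustment on the values of the ISP parameters for selection.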

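22. The apparatus of claim 21, wherein the training module is configured to:

determining a second ISP output image corresponding to a sample raw image based on the sample raw image, a control network to be trained and values for selection of ISP parameters;

performing ISP parameter value evaluation processing based on the second ISP output image to obtain an evaluation result;

based on the evaluation result, carrying out numerical value adjustment on the network parameters of the control network to be trained;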

and adjusting the value of the ISP parameter for selection based on the evaluation result.

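23. The apparatus of claim 22, wherein the training module is configured to:

inputting a sample raw image into a control network to be trained to obtain second control information, wherein the second control information comprises an identifier of a value to be selected corresponding to each ISP parameter;

selecting a value corresponding to the second control information from the multiple values of each ISP parameter;

and carrying out ISP on the sample raw image based on the value selected for each ISP parameter to obtain a second ISP output image.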

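24. The apparatus of claim 23, wherein the training module is configured to:

and carrying out numerical value adjustment on the value corresponding to the second control information based on the evaluation result.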

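25. The apparatus of claim 24, wherein the training module is configured to:

inputting the evaluation result into a trained parameter adjusting network to obtain an adjusting value sequence;

and carrying out numerical value adjustment on the value corresponding to the second control information based on the adjustment value sequence.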

26. The apparatus of claim 25, wherein the training module is configured to:

and sending the evaluation result to a management device, and receiving an adjustment value sequence which is sent by the management device and obtained after the evaluation result is input into the trained parameter adjusting network.

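27. The apparatus of claim 22, wherein the training module is configured to:

performing task processing of a target task type on the second ISP output image to obtain a processing result, wherein the target task type comprises one or more of man-machine detection, face recognition, situation monitoring and image index detection;

and inputting the processing result and the true value result of the target task type corresponding to the sample raw image into a loss function to obtain an output value of the loss function as an evaluation result.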

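28. The apparatus of claim 27, wherein the training module is further configured to:

receiving a sample raw image sent by a management device and a true value result of the target task type corresponding to the sample raw image, wherein the true value result is obtained by calibrating the sample raw image.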

29. The apparatus of claim 27, wherein the training module is further configured to:

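acquiring a shot second raw image;

acquiring a processing result obtained by performing ISP on the second raw image and performing task processing on the target task type;

taking the obtained second raw image as the sample raw image;

and taking the obtained processing result as a true value result of the target task type corresponding to the sample raw image.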

30. The apparatus according to any one of claims 16-29, wherein the obtaining module is configured to:

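acquiring a captured first raw image at intervals of a preset duration;

the image signal processing module is further configured to:

and carrying out ISP, based on the value of each ISP parameter selected for the first raw image, on the raw images captured within the preset duration after the first raw image is captured.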

31. An image pickup apparatus characterized by comprising:

an image acquisition component, comprising a lens and a sensor and configured to acquire raw images;

and a processor, configured to carry out the method of any one of the preceding claims 1 to 15.

32. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer program code which, when executed by a computing device, performs the method of any of the preceding claims 1 to 15.