CN116824511A - Tool identification method and device based on deep learning and color space - Google Patents

Tool identification method and device based on deep learning and color space

Info

Publication number
CN116824511A
Authority
CN
China
Prior art keywords
tool
data
image
network
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310975350.9A
Other languages
Chinese (zh)
Inventor
陆彬
孟思宏
李琳
姜德田
范以云
龙如兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xingwei Technology Beijing Co ltd
Original Assignee
Xingwei Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xingwei Technology Beijing Co ltd filed Critical Xingwei Technology Beijing Co ltd
Priority to CN202310975350.9A priority Critical patent/CN116824511A/en
Publication of CN116824511A publication Critical patent/CN116824511A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/53 Recognition of crowd images, e.g. recognition of crowd congestion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to the field of information processing and discloses a tool identification method and device based on deep learning and color space. Through an improved HSV color space, the trajectory of the brightness value V is adjusted according to the distinctive illumination characteristics of outdoor tool images, changing V from a fixed value to a dynamic value, so that the original characteristics of the data can still be expressed and the accuracy of feature extraction is ensured. Meanwhile, a ResNet50 network is used to avoid algorithm overfitting, and human-shape detection is combined with deep learning target detection to ensure the accuracy of tool identification.

Description

Tool identification method and device based on deep learning and color space
Technical Field
The application relates to the field of information processing, in particular to a tool identification method and device based on deep learning and color space.
Background
At present, with the continuous development of artificial intelligence technology, more and more new business demands are emerging. The application of artificial intelligence in the security industry is becoming increasingly popular, and the former practice of providing security by manual work and video monitoring alone is changing. Workers in scenes such as factories, shops and work parks are required to wear uniform dress to facilitate management. Installing cameras and having staff watch the surveillance video to check whether workers are dressed correctly requires, in parks with more complex scenes and more workers, more monitoring points and more manpower and other resources, consuming time, labor and money. For ordinary merchants, placing cameras requires a large amount of wiring and energy consumption, which has become a headache in factories, parks, shops and other places. How to improve recognition accuracy and image-processing efficiency has therefore become a problem demanding attention in the security field.
Disclosure of Invention
In order to solve one of the above problems, the application provides a recognition method and device based on deep learning and color space, applied to tool recognition. The method comprises the following steps:
Step 101, acquiring a tool monitoring image dataset;
Step 102, performing data preprocessing on the images in the tool image dataset: screening the images in the tool image dataset and then unifying their resolution;
Step 103, performing data enhancement on the tool images in the preprocessed tool image dataset, where enhancing the color space in the data comprises: when different illumination causes different visual perception of a scene, calculating with the Value channel and extracting object edges by computing gradients; when the saturation of the foreground is high and the background uses a low-saturation color to set off the foreground, extracting the information of the Saturation channel; when the scene is indoors and its style is single, increasing the weight of the Hue channel; and changing the value of the brightness V from static to dynamic;
Step 104, constructing a tool identification network, using a residual network to avoid overfitting based on the characteristic that tools differ little from one another except in color; sending the data-enhanced tool image dataset into a target-detection human-shape network model for detection and extracting the tool-image human-shape dataset;
Step 105, training the tool classification network with the data-enhanced tool image dataset to obtain a tool identification model;
Step 106, performing tool classification detection on the tool monitoring image to be detected with the tool identification model.
Preferably, the data preprocessing in step 102 further includes: organizing the dataset and dividing the total dataset into a test set and a training set; the training set is used to train the model parameters of the tool classification network.
Preferably, changing the value of the brightness V from static to dynamic in step 103 includes: selecting a sigmoid function to describe the dynamic change trajectory of the V value, the sigmoid function being S(x) = 1 / (1 + e^(-x)).
The sigmoid function has a value range of (0, 1) and maps the real-valued V to the interval (0, 1).
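As a minimal illustration of this mapping (added for clarity, not code from the application), the sigmoid can be evaluated and rescaled to an 8-bit V value as follows:

```python
import numpy as np

def sigmoid(x: np.ndarray) -> np.ndarray:
    """S(x) = 1 / (1 + e^-x), with values in the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

# Map a real-valued brightness control variable into (0, 1);
# rescaling by 255 yields a dynamic 8-bit V value.
v = np.linspace(-6, 6, 5)
print(np.round(sigmoid(v) * 255).astype(int))  # [  1  12 128 243 254]
```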
Preferably, the step 105 specifically includes:
training a human-shape detection model with the data-enhanced tool image dataset, so as to obtain the human-shape data in the images with the human-shape detection model, and setting the maximum number of data-enhanced pictures to 50000.
Preferably, using the residual network to avoid overfitting in step 104 specifically includes: configuring the tool identification network to include a ResNet50.
Preferably, the image in the tooling image data set is a video image shot by a monitoring camera.
Preferably, step 105 specifically further includes: training the tool image classification network with the human-shape dataset to obtain a tool image classification model; the training set is used to train the tool image classification network, the learning rate follows a cosine schedule with initial learning rate r = 0.00001, and the gradient model uses mini-batch gradient descent; after 300 epochs of training, a judgment is made as to whether the error and accuracy meet the requirements; if so, training stops, otherwise training continues until the requirements are met.
Preferably, in step 104 the data-enhanced tool image dataset is sent into the target-detection human-shape network model for detection and the tool-image human-shape dataset is extracted, wherein the target detection algorithm is the YOLOv3 or YOLOv5-s algorithm.
Preferably, the method is applied to tool recognition of staff of a gas station.
The application also provides a tool identification device based on deep learning and color space for implementing the above method, the device comprising: a data set acquisition module for acquiring a tool image dataset;
the data preprocessing module is used for preprocessing the data of the images in the tooling image dataset; the data enhancement module is used for enhancing the data of the tooling images in the tooling image dataset after the data preprocessing;
the tool image classification network construction module is used for constructing a tool identification network, and the tool identification network comprises a ResNet50 network;
the tool image classification network training module is used for training a tool identification network by adopting the tool image data set with enhanced data to obtain a tool identification model;
and the tool identification module is used for identifying the tool of the tool image to be detected by adopting the tool identification model.
The application changes the former practice of performing tool identification by manual work alone; an artificial intelligence algorithm replaces manual tool identification, reducing the investment of labor and saving funds. Meanwhile, the improved HSV color-space data enhancement method solves the problems of algorithm overfitting and low recognition accuracy in conventional tool recognition. Preferably there is also provided an apparatus comprising a processor and a memory, the memory storing a computer program, the processor executing the computer program on the memory to implement the above method.
The application avoids algorithm overfitting by using a ResNet50 network, which is more conducive to engineering deployment. The improved HSV color space adjusts the trajectory of the brightness V value according to the distinctive illumination characteristics of outdoor tools, changing it from a fixed value to a dynamic value, so that the original characteristics of the data can be expressed and the accuracy of feature extraction is ensured. Human-shape detection is combined with deep learning target detection, ensuring the accuracy of tool identification.
Drawings
The features and advantages of the present application will be more clearly understood by reference to the accompanying drawings, which are schematic and should not be interpreted as limiting the application in any way.
FIG. 1 is a flow chart of an identification method provided by the application.
FIG. 2 is a flow chart of a preferred refinement of the identification method provided by the application.
FIG. 3 is a schematic diagram of the dynamic change track of V value in the present application.
Fig. 4 is a schematic view of the structure of the device of the present application.
Detailed Description
These and other features and characteristics of the present application, as well as the methods of operation and functions of the related elements of structure, the combination of parts and economies of manufacture, may be better understood with reference to the following description and the accompanying drawings, all of which form a part of this specification. It is to be expressly understood, however, that the drawings are for the purpose of illustration and description only and are not intended as a definition of the limits of the application. It will be understood that the figures are not drawn to scale. Various block diagrams are used in the description of the various embodiments according to the present application.
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
In this context "/" means "or" for example, a/B may mean a or B; "and/or" herein is merely an association relationship describing an association object, and means that three relationships may exist, for example, a and/or B may mean: a exists alone, A and B exist together, and B exists alone.
It should be noted that, in order to clearly describe the technical solution of the embodiments of the present application, in the embodiments of the present application, the terms "first", "second", and the like are used to distinguish the same item or similar items having substantially the same function or effect, and those skilled in the art will understand that the terms "first", "second", and the like do not limit the number and execution order. For example, the first information and the second information are used to distinguish between different information, and not to describe a particular order of information.
It should be noted that, in the embodiments of the present application, words such as "exemplary" or "such as" are used to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" or "e.g." in an embodiment should not be taken as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present related concepts in a concrete fashion.
Example 1
The application provides a tool identification method based on deep learning and color space, comprising: acquiring a tool identification image dataset; performing data preprocessing on the tool identification image dataset; and performing data enhancement on the preprocessed image data, including but not limited to flipping, rotation, cropping, noise addition, blurring, masking, Cutout, random erasing, MixUp, color transformation and the like. The color transformation is an improved HSV color-space transformation that uses a varying V value to determine the maximum value of RGB. A tool image classification network is then constructed, comprising a human-shape detection network and a tool classification network.
The human-shape data captured by the human-shape detection network is fed into the tool classification network, which uses an improved ResNet50 as its backbone network and is trained on the human-shape data extracted by the human-shape detection network to obtain a tool image classification model; the tool image classification model is then used to classify the images to be detected.
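A minimal sketch of how such a classifier could be assembled, assuming a standard torchvision ResNet50 backbone; the class count and the replaced head are illustrative assumptions, since the patent does not publish its network code:

```python
import torch.nn as nn
from torchvision import models

def build_tool_classifier(num_classes: int = 5) -> nn.Module:
    """ResNet50 backbone with its fully connected head replaced for the
    tool classes; the residual connections help avoid overfitting on
    data whose classes differ mainly in color."""
    net = models.resnet50(weights=None)  # pretrained weights could be used instead
    net.fc = nn.Linear(net.fc.in_features, num_classes)
    return net
```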
Preferably: in the step of acquiring the tool identification image data set, the tool video is manufactured into the image data set, data screening is carried out, and the image size resolution is processed in a unified mode.
Preferably: in the step of data enhancement of the image data after the data preprocessing, the data enhancement of the tooling image data set after the data preprocessing is performed by adopting a data enhancement mode of Mosaic, mixUp, randomErasing, hideAndSeek and GridMask, overturning, rotating, cutting, noise adding, blurring, masking, cutout, random, wearing, mixup, color conversion and other methods.
Preferably: in the step of constructing the tooling image classification network, the image classification network comprises a human shape detection network tooling classification network. The human shape data captured by the human shape detection network is sent into the tool classification network, the tool classification network comprises an improved resnet50 serving as a main network, human shape data extracted by the human shape detection network is received for training, and a tool image classification model is obtained.
Preferably: in the tool identification step, the tool image classification model is used to detect and identify the tool image to be detected.
The beneficial technical effects of the scheme provided by the application are as follows: the collected data undergoes data preprocessing, and the preprocessed data then undergoes data enhancement including but not limited to flipping, rotation, cropping, noise addition, blurring, masking, Cutout, random erasing, MixUp, color conversion and the like, where the color conversion uses an improved HSV color-space method to enhance the data; a tool image classification network is constructed; the tool classification network is trained with the data-enhanced tool image dataset to obtain a tool classification model; and the tool classification model performs tool identification on the tool image to be detected.
In a specific embodiment, as shown in FIGS. 1-2, a tool identification method and device based on deep learning and color space are provided, the method comprising:
and 101, acquiring a tool monitoring image data set.
Step 102, performing data preprocessing on the images in the tool image dataset. The images in the tool image dataset are video images shot by monitoring cameras.
The step 102 specifically includes:
and after the images in the tool image dataset are screened, resolution unification processing is carried out. Because intelligent monitoring equipment is more in variety, the resolution of video images shot by the monitoring cameras is different, the network regression effect can be influenced, the resolution is unified, and the resolution of the video images is unified to be the same fixed size.
The data preprocessing also includes dataset organization: the total dataset is divided into a test set and a training set. The training set is mainly used to train the parameters of the tool classification network model for the tool identification method based on deep learning and color space.
The test set is mainly used for verifying the accuracy of the tool classification network model and ensuring that the tool classification network model can be used for actual engineering.
Step 103, performing data enhancement on the tool images in the preprocessed tool image dataset. Step 103 specifically comprises enhancing the preprocessed tool image dataset using Mosaic, MixUp, RandomErasing, HideAndSeek and GridMask, together with flipping, rotation, cropping, noise addition, blurring, masking, Cutout, random erasing, color conversion and other methods, as sketched below.
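A sketch of such an augmentation pipeline using standard torchvision transforms; only the listed methods that torchvision provides directly are shown (Mosaic, HideAndSeek and GridMask are typically custom implementations), and all probabilities and magnitudes are illustrative assumptions:

```python
from torchvision import transforms

# Subset of the listed augmentations available directly in torchvision;
# probabilities and magnitudes are illustrative assumptions.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),           # flipping
    transforms.RandomRotation(degrees=15),            # rotation
    transforms.RandomResizedCrop(size=224),           # cropping
    transforms.GaussianBlur(kernel_size=5),           # blurring
    transforms.ColorJitter(brightness=0.4, hue=0.1),  # color conversion
    transforms.ToTensor(),
    transforms.RandomErasing(p=0.25),                 # random erasing / cutout-style masking
])
```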
Data enhancement plays a vital role in deep learning network training. Because the parameters of the network change according to changes in the characteristics of the data, the quality of the data directly determines the quality of model training. Obviously, the more distinct the features and the more easily a convolutional network can extract the color changes, the more accurate the expression of the trained model.
Experiments found that since surveillance video is continuous, the resulting video images are also continuous, and in most cases the variation between consecutive frames of image data is minimal.
Experiments also show that, regardless of the order of the features across images, the features extracted by the final network tend toward a common feature; this follows from the basic principle of neural network regression. Based on this principle, features that are more distinct should be supplied to the neural network. Experiments found color to be an extremely important feature for tool classification: in real scenes the gap between tool shapes is almost negligible, and the only feature that can distinguish the tools is color. Based on this, the application focuses on improving the color-space part of data enhancement.
In image processing, RGB images are generally not processed directly, mainly because RGB is far from human visual perception, whereas HSV is a commonly used color space. HSV refers to Hue, Saturation and Value (brightness), and different channels are usually used for different problems. The most common method in the prior art is to convert an image to gray scale. Experiments show that in scenes where different illumination causes different visual perception, the Value channel is used for calculation, and object edges can be conveniently extracted by computing gradients. Meanwhile, experiments show that when the foreground saturation is high and the background uses a low-saturation color to set off the foreground, the information of the Saturation channel is very useful. In some indoor scenes with a single style, i.e., where in general an object has only one color, the weight of the Hue channel is increased. By selecting a suitable channel according to the HSV color-space settings for different situations, most image preprocessing work can be completed.
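A sketch of this per-scene channel selection, assuming OpenCV's HSV conversion; the scene labels are illustrative assumptions:

```python
import cv2

def select_hsv_channel(bgr_image, scene: str):
    """Pick the HSV channel suggested for each scene type:
    V for illumination-dominated scenes (edges via gradients),
    S for high-saturation foregrounds, H for single-style indoor scenes."""
    h, s, v = cv2.split(cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV))
    if scene == "varying_illumination":
        # gradient magnitude on the Value channel exposes object edges
        gx = cv2.Sobel(v, cv2.CV_32F, 1, 0)
        gy = cv2.Sobel(v, cv2.CV_32F, 0, 1)
        return cv2.magnitude(gx, gy)
    if scene == "saturated_foreground":
        return s
    return h  # single-style indoor scene: weight the Hue channel
```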
According to a large number of observations and experiments, the tool scenario involves both outdoor and indoor situations; outdoor tools in particular differ greatly under different illumination conditions, which makes data feature extraction difficult and model training inaccurate. The control point is the brightness V.
The standard RGB-to-HSV relations, with R, G and B taking values from 0 to 255 and max = max(R, G, B), min = min(R, G, B), are:
V = max
S = (max - min) / max x 255 (S = 0 when max = 0)
H = 60 x (G - B) / (max - min) when max = R; H = 60 x (2 + (B - R) / (max - min)) when max = G; H = 60 x (4 + (R - G) / (max - min)) when max = B.
When both S and V take full values, at red the corresponding R is 255 and B is 0 while G varies from 0 to 255 toward yellow; between yellow and green, R varies from 255 down to 0 while B stays 0 and G stays 255; the other hue segments can be deduced by analogy.
The RGB-to-V formula is V = max: the value of V determines the maximum value of RGB, so when S is at its full value of 255, adjusting V directly adjusts the maximum RGB component.
For the saturation S: when max is fixed, the effect of S is to determine the value of min. Once V and S are determined, sliding the hue H leaves two of the RGB channels at min and max respectively while the third varies between min and max.
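These relations can be checked with a small pure-Python conversion; this is a sketch of the standard formulas above, not code from the application:

```python
def rgb_to_hsv(r: int, g: int, b: int):
    """Standard RGB -> HSV with R, G, B in [0, 255]; V equals max(R, G, B)."""
    mx, mn = max(r, g, b), min(r, g, b)
    v = mx
    s = 0 if mx == 0 else round((mx - mn) / mx * 255)
    if mx == mn:
        h = 0.0
    elif mx == r:
        h = (60 * (g - b) / (mx - mn)) % 360
    elif mx == g:
        h = 60 * (2 + (b - r) / (mx - mn))
    else:
        h = 60 * (4 + (r - g) / (mx - mn))
    return h, s, v

print(rgb_to_hsv(255, 128, 0))  # orange: H is about 30.1, S = 255, V = 255
```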
In general, the brightness V is processed by setting a fixed value in advance, thereby controlling the maximum value of RGB to realize the color transformation. Obviously, this method cannot follow the real-time changes of the data and cannot well reflect the true color characteristics of the image data. The value of the brightness V is therefore changed from static to dynamic, so that the brightness can restore the true color of the image data.
From experimental observation, the continuity of the video makes the image data continuous, so the image features are continuous; such continuous features cannot be allowed to accumulate indefinitely, because when all picture data tends toward one feature, the network's predictions overfit. The sigmoid function is therefore chosen to describe the dynamic change trajectory of the V value.
The trajectory is shown in FIG. 3.
The sigmoid function is S(x) = 1 / (1 + e^(-x)). Its value range is (0, 1), and it maps a real-valued V to the interval (0, 1); the effect is better when the differences between features are complex or not particularly large.
It can be seen from FIG. 3 that as the controlled variable V is pushed toward either end, the RGB color values keep increasing; a given feature of the image data keeps growing as the amount of data increases, but stops growing once it reaches a certain value. This prevents the model's expressiveness from collapsing onto a single feature, which would overfit the model and reduce its recognition accuracy. Therefore, the brightness V value is controlled through the sigmoid function, thereby controlling the RGB values, so that the HSV color-space data enhancement better reflects the color characteristics of the image data and the tool classification network extracts more accurate data features.
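A minimal sketch of this dynamic-V adjustment, assuming OpenCV; the bounded gain construction and its range are illustrative assumptions about one way to let a sigmoid of a control variable t drive the V channel:

```python
import cv2
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def enhance_v_dynamic(bgr_image, t: float):
    """Scale the V channel by a sigmoid-bounded gain so the brightness
    adjustment saturates instead of growing without limit (anti-overfit).
    t is the control variable sweeping the sigmoid's input axis."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV).astype(np.float32)
    gain = 0.5 + sigmoid(t)  # bounded in (0.5, 1.5) by construction
    hsv[..., 2] = np.clip(hsv[..., 2] * gain, 0, 255)
    return cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2BGR)
```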
Step 104, constructing a tool identification network. The tool identification network includes a ResNet50: because the tools differ little from one another except in color, a residual network is needed to avoid overfitting, and ResNet50 exactly fits this requirement. Step 104 specifically includes:
and sending the tooling image dataset with the enhanced data into a target detection humanoid network model for detection, and extracting the tooling image humanoid dataset, wherein the humanoid dataset is also called a tooling dataset because the humanoid dataset is extracted from the tooling dataset. The target detection algorithm is a yolov 3 algorithm or a yolov5-s algorithm.
Step 105, training the tool classification network with the data-enhanced tool image dataset to obtain a tool classification model. Step 105 specifically includes:
and training a human shape detection model by adopting the tool image dataset after data enhancement to human shape position information in the tool image, extracting human shape data in the image by using the human shape detection model to manufacture a dataset, setting a maximum number=50000 of a data enhancement picture, and training the tool image classification network by adopting the human shape dataset if the data quantity is more, so as to obtain a tool image classification model.
The training set is used to train the network model. The learning rate follows a cosine schedule with initial learning rate r = 0.00001, and the gradient model uses mini-batch gradient descent. After 300 epochs of training, a judgment is made as to whether the error and accuracy meet the requirements: if they do, training stops, otherwise training continues until they are met. After the accuracy of the improved tool identification algorithm is verified on the test set and meets the requirement, the parameters of the established algorithm model are selected as the parameters of the final model, and the tool identification algorithm with these parameters is used in actual engineering. A sketch of this training regime follows.
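The sketch below assumes PyTorch with SGD, a cosine learning-rate schedule, the stated initial rate r = 0.00001, and a check every 300 epochs; the accuracy threshold and the evaluate callable are illustrative assumptions:

```python
import torch
import torch.nn as nn
from torch.optim.lr_scheduler import CosineAnnealingLR

def train(model, train_loader, evaluate, required_acc=0.95, device="cpu"):
    """Mini-batch gradient descent with a cosine learning-rate schedule;
    every 300 epochs the error/accuracy requirement is checked."""
    model.to(device)
    criterion = nn.CrossEntropyLoss()
    opt = torch.optim.SGD(model.parameters(), lr=1e-5)  # initial rate r = 0.00001
    sched = CosineAnnealingLR(opt, T_max=300)
    epoch = 0
    while True:
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            opt.step()
        sched.step()
        epoch += 1
        if epoch % 300 == 0 and evaluate(model) >= required_acc:
            break  # requirements met; otherwise keep training
    return model
```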
Step 106, performing tool classification detection on the tool monitoring image to be detected with the tool identification model.
FIG. 4 is a schematic structural diagram of the tool identification device based on deep learning and color space. As shown in FIG. 4, the device includes:
a data set acquisition module 201, configured to acquire a tool image data set.
The data preprocessing module 202 is configured to perform data preprocessing on the images in the tooling image dataset.
And the data enhancement module 203 is configured to perform data enhancement on the tooling images in the tooling image dataset after data preprocessing.
The tool image classification network construction module 204 is configured to construct a tool identification network, where the tool identification network includes a ResNet50 network; since the tools differ little from one another except in color, a residual network is required to avoid overfitting, and ResNet50 exactly fits this requirement.
The tool image classification network training module 205 is configured to train the tool identification network by using the tool image data set after the data enhancement, and obtain a tool identification model.
The tool identification module 206 is configured to identify a tool for the to-be-detected tool image by using the tool identification model.
The data preprocessing module 202 specifically includes a data preprocessing unit, configured to perform resolution unification processing on images in the tool image dataset.
The data enhancement module 203 specifically includes a data enhancement unit that enhances the preprocessed tool image dataset using Mosaic, MixUp, RandomErasing, HideAndSeek and GridMask, together with flipping, rotation, cropping, noise addition, blurring, masking, Cutout, random erasing, color conversion and other methods.
The tool image classification network training module 205 specifically includes a tool image classification network training unit configured to take the data-enhanced tool image dataset, send it into the human-shape detection network to capture human-shape data, and then send the human shapes into the tool classification network, which uses an improved ResNet50 as its backbone network and is trained on the human-shape data extracted by the human-shape detection network to obtain the tool image classification model.
the application solves the problems that the manpower monitoring cost is high and the personnel cannot monitor the working negligence in place in the mode of monitoring by the manpower alone, provides a more practical HSV color space data enhancement method and improves the accuracy and efficiency of tool identification.
It will be appreciated by those skilled in the art that all or part of the flows of the above embodiment methods may be implemented by a computer program instructing related hardware; the program may be stored in a computer-readable storage medium and, when executed, may include the flows of the above method embodiments. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Flash Memory, a Hard Disk Drive (HDD) or a Solid State Drive (SSD); the storage medium may also comprise a combination of the above kinds of memory.
As used in this disclosure, the terms "component," "module," "apparatus," and the like are intended to refer to a computer-related entity, either hardware, firmware, a combination of hardware and software, or software in execution. For example, the components may be, but are not limited to: a process running on a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of example, both an application running on a computing device and the computing device can be a component. One or more components may reside within a process and/or thread of execution and a component may be localized on one computer and/or distributed between two or more computers. Furthermore, these components can execute from various computer readable media having various data structures thereon. The components may communicate by way of local and/or remote processes such as in accordance with a signal having one or more data packets (e.g., data from one component interacting with another component in a local device, distributed device, and/or across a network such as the internet with other devices by way of the signal).
It should be noted that the above embodiments are only for illustrating the technical solution of the present application and not for limiting the same, and although the present application has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that the technical solution of the present application may be modified or substituted without departing from the spirit and scope of the technical solution of the present application, which is intended to be covered in the scope of the claims of the present application.

Claims (10)

1. A tool identification method based on deep learning and color space, the method comprising:
step 101, acquiring a tool monitoring image data set;
step 102, performing data preprocessing on the images in the tool image dataset: screening the images in the tool image dataset and then unifying their resolution;
step 103, performing data enhancement on the tool images in the preprocessed tool image dataset, wherein enhancing the HSV color space in the data comprises: when different illumination causes different visual perception of a scene, calculating with the Value channel and extracting object edges by computing gradients; when the saturation of the foreground is high and the background uses a low-saturation color to set off the foreground, extracting the information of the Saturation channel; when the scene is indoors and its style is single, increasing the weight of the Hue channel; and changing the value of the brightness V from static to dynamic;
step 104, constructing a tool identification network, using a residual network to avoid overfitting based on the characteristic that tools differ little from one another except in color; sending the data-enhanced tool image dataset into a target-detection human-shape network model for detection and extracting the tool-image human-shape dataset;
step 105, training a tool classification network by adopting a tool image data set with enhanced data to obtain a tool identification model;
step 106, performing tool classification detection on the tool monitoring image to be detected with the tool identification model.
2. The method of claim 1, wherein the data preprocessing in step 102 further comprises: organizing the dataset and dividing the total dataset into a test set and a training set; the training set is used to train the model parameters of the tool classification network.
3. The method of claim 2, wherein changing the value of the brightness V from static to dynamic in step 103 comprises: selecting a sigmoid function to describe the dynamic change trajectory of the V value, the sigmoid function being S(x) = 1 / (1 + e^(-x));
the sigmoid function has a value range of (0, 1) and maps the real-valued V to the interval (0, 1).
4. A method as claimed in claim 3, wherein: the step 105 specifically includes:
training a human-shape detection model with the data-enhanced tool image dataset, so as to obtain the human-shape data in the images with the human-shape detection model, and setting the maximum number of data-enhanced pictures to 50000.
5. The method of claim 4, wherein using the residual network to avoid overfitting in step 104 specifically comprises: configuring the tool identification network to include a ResNet50.
6. The method of claim 5, wherein: the images in the tool image data set are video images shot through the monitoring camera.
7. The method of claim 6, wherein step 105 specifically further comprises: training the tool image classification network with the human-shape dataset to obtain a tool image classification model; the training set is used to train the tool image classification network, the learning rate follows a cosine schedule with initial learning rate r = 0.00001, and the gradient model uses mini-batch gradient descent; after 300 epochs of training, a judgment is made as to whether the error and accuracy meet the requirements; if so, training stops, otherwise training continues until the requirements are met.
8. The method of claim 7, wherein the data-enhanced tool image dataset in step 104 is sent into the target-detection human-shape network model for detection and the tool-image human-shape dataset is extracted, wherein the target detection algorithm is the YOLOv5-s algorithm.
9. The method as recited in claim 8, wherein: the method is applied to tool identification of gas station staff.
10. A tool recognition device based on deep learning and color space for implementing the method of any one of claims 1-9, characterized in that the device comprises: a data set acquisition module for acquiring a tool image dataset;
the data preprocessing module is used for preprocessing the data of the images in the tooling image dataset; the data enhancement module is used for enhancing the data of the tooling images in the tooling image dataset after the data preprocessing;
the tool image classification network construction module is used for constructing a tool identification network, and the tool identification network comprises a ResNet50 network;
the tool image classification network training module is used for training a tool identification network by adopting the tool image data set with enhanced data to obtain a tool identification model;
and the tool identification module is used for identifying the tool of the tool image to be detected by adopting the tool identification model.
CN202310975350.9A 2023-08-03 2023-08-03 Tool identification method and device based on deep learning and color space Pending CN116824511A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310975350.9A CN116824511A (en) 2023-08-03 2023-08-03 Tool identification method and device based on deep learning and color space

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310975350.9A CN116824511A (en) 2023-08-03 2023-08-03 Tool identification method and device based on deep learning and color space

Publications (1)

Publication Number Publication Date
CN116824511A 2023-09-29

Family

ID=88122295

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310975350.9A Pending CN116824511A (en) 2023-08-03 2023-08-03 Tool identification method and device based on deep learning and color space

Country Status (1)

Country Link
CN (1) CN116824511A (en)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020164270A1 (en) * 2019-02-15 2020-08-20 平安科技(深圳)有限公司 Deep-learning-based pedestrian detection method, system and apparatus, and storage medium
CN113052194A (en) * 2019-12-27 2021-06-29 杭州深绘智能科技有限公司 Garment color cognition system based on deep learning and cognition method thereof
CN111368727A (en) * 2020-03-04 2020-07-03 西安咏圣达电子科技有限公司 Dressing detection method, storage medium, system and device for power distribution room inspection personnel
CN111401314A (en) * 2020-04-10 2020-07-10 上海东普信息科技有限公司 Dressing information detection method, device, equipment and storage medium
CN113129236A (en) * 2021-04-25 2021-07-16 中国石油大学(华东) Single low-light image enhancement method and system based on Retinex and convolutional neural network
CN113269161A (en) * 2021-07-16 2021-08-17 四川九通智路科技有限公司 Traffic signboard detection method based on deep learning
CN114782268A (en) * 2022-04-19 2022-07-22 南京航空航天大学 Low-illumination image enhancement method for improving SURF algorithm
CN114708618A (en) * 2022-04-21 2022-07-05 河南众诚信息科技股份有限公司 Intelligent work clothes identification method and system for intelligent park based on classification

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
徐鹏程; 刘本永: "Interactive behavior recognition based on image enhancement and deep CNN learning", 通信技术 (Communications Technology), no. 03, 10 March 2019 (2019-03-10) *

Similar Documents

Publication Publication Date Title
US20230289979A1 (en) A method for video moving object detection based on relative statistical characteristics of image pixels
CN107256225B (en) Method and device for generating heat map based on video analysis
US9483709B2 (en) Visual saliency estimation for images and video
CN113139521B (en) Pedestrian boundary crossing monitoring method for electric power monitoring
CN110059694A (en) The intelligent identification Method of lteral data under power industry complex scene
CN101828201B (en) Image processing device and method, and learning device, method
US8687887B2 (en) Image processing method, image processing apparatus, and image processing program
CN109918971B (en) Method and device for detecting number of people in monitoring video
CN108596102B (en) RGB-D-based indoor scene object segmentation classifier construction method
CN102915446A (en) Plant disease and pest detection method based on SVM (support vector machine) learning
CN110717896A (en) Plate strip steel surface defect detection method based on saliency label information propagation model
CN111353452A (en) Behavior recognition method, behavior recognition device, behavior recognition medium and behavior recognition equipment based on RGB (red, green and blue) images
CN109685045A (en) A kind of Moving Targets Based on Video Streams tracking and system
CN111127360B (en) Gray image transfer learning method based on automatic encoder
CN104281839A (en) Body posture identification method and device
CN113111878B (en) Infrared weak and small target detection method under complex background
CN102340620B (en) Mahalanobis-distance-based video image background detection method
CN113409355A (en) Moving target identification system and method based on FPGA
CN113435452A (en) Electrical equipment nameplate text detection method based on improved CTPN algorithm
CN108876672A (en) A kind of long-distance education teacher automatic identification image optimization tracking and system
CN102510437B (en) Method for detecting background of video image based on distribution of red, green and blue (RGB) components
CN108833776A (en) A kind of long-distance education teacher automatic identification optimization tracking and system
CN111611866A (en) Flame detection and identification method and system based on YCrCb and LAB color spaces
CN116824511A (en) Tool identification method and device based on deep learning and color space
CN107341456B (en) Weather sunny and cloudy classification method based on single outdoor color image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination