AU2017101803A4 - Deep learning based image classification of dangerous goods of gun type - Google Patents

Deep learning based image classification of dangerous goods of gun type

Info

Publication number
AU2017101803A4
Authority
AU
Australia
Prior art keywords
deep learning
self
model
training
dangerous goods
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU2017101803A
Inventor
Mufei Chen
Quansen Wang
Hang Yin
Renyuan ZHANG
Hongbo Zhu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chen Mufei Ms
Original Assignee
Chen Mufei Ms
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chen Mufei Ms filed Critical Chen Mufei Ms
Priority to AU2017101803A priority Critical patent/AU2017101803A4/en
Application granted granted Critical
Publication of AU2017101803A4 publication Critical patent/AU2017101803A4/en
Ceased legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

This invention lies in the field of digital image processing and is, in particular, a deep learning enabled technique for image classification of dangerous goods. The invention consists of the following steps: firearm image preprocessing and splitting of the post-processed data set into a training set and a test set, wherein the training set is cross-validated for hyper-parameter searching and model parameter learning. Each training image is treated as an NDArray and is then fed into a deep Convolutional Neural Network (CNN) for forward evaluation. The training process learns a set of optimal parameters across all layers of the neural net with the goal of minimizing the empirical loss. The set of learned parameters, together with our model, is then stored and used to predict labels for the test set. The closeness between the true labels and the predicted labels is the measure of model performance that we look for. The final validated deep learning model is able to classify new sensitive images and thus achieve the goal of dangerous goods identification with favorable accuracy. This invention does not require manual feature selection. It is a state-of-the-art image classification technique with high performance and reliability, based on deep learning, for identifying dangerous goods of gun type.

Description

DESCRIPTION
TITLE
Deep learning based image classification of dangerous goods of gun type
FIELD OF THE INVENTION
This invention is in the field of digital image processing and serves as a method of dangerous goods identification powered by deep learning.
BACKGROUND OF THE INVENTION
With the rapid popularization of the Internet, especially the mobile Internet, the amount of multimedia data stored in the form of images and video is increasing at a rate we could not previously imagine. Fast and accurate object detection in images from a variety of sources has therefore become a hotspot of scientific research. In this invention, we focus on identifying dangerous goods of gun type, including but not limited to handguns, submachine guns, and ammunition. Most traditional monitoring methods rely on human labor. However, given the scale of multimedia content, it is nearly impossible to satisfy the actual needs of Internet monitoring by manual inspection alone. Moreover, traditional visual recognition methods, such as SIFT and SURF, have limitations in extracting features from images. Other solutions that utilize deep learning are usually not end-to-end, resulting in massive storage cost and separate training stages. We must also deal with the problem of overfitting, because the network architecture consists of millions of parameters.
Deep learning is a subarea of machine learning. It has become the most intriguing research field since 2012, the year in which the Rectified Linear Unit (ReLU) was introduced to resolve the issue of gradient vanishing during back propagation. It overtakes other machine learning algorithms, such as SVM and logistic regression, in model performance. Since then, many emerging real-world applications and systems based on deep learning have caught up with human abilities in fields such as image classification, natural language processing, and decision making in complex systems.
The idea of artificial neural networks (ANN) was introduced in the 1950s to mathematically model the function of biological neurons. Early research on artificial neural networks reached a plateau several times. When back propagation is used to update the parameters of an ANN, the gradient of the loss function with respect to each weight starts to diminish as the network gets deeper. This is known as gradient vanishing, and performance drops if more layers are added to the network. Because of this, the deepest networks of the previous decade contained only about three layers, as opposed to the roughly 100 layers of the recently introduced Inception Net. After the invention of the SVM, artificial neural networks received little attention and research interest, because the former offered superior accuracy and is intrinsically a convex optimization problem that can be solved via its dual. Only in the last few years, with the development of high-performance computing platforms and big-data processing techniques, together with ReLU, has deep learning begun to attract attention, and it has so far been the most powerful component of Artificial Intelligence. The classical examples of deep learning are hand-written digit classification and house price prediction, which can also be handled by traditional machine learning models, but with lower scalability and performance. In this invention, we focus on real-world image classification and use TensorFlow as the deep learning framework to implement our model. We choose TensorFlow because it is developed and maintained by Google and has the largest deep learning community and number of GitHub contributors.
SUMMARY
In order to overcome the shortcomings and deficiencies of existing technology, this invention proposes an image classification method for firearms based on deep learning. By combining deep learning with image recognition for firearms and adopting a layer-by-layer initialization training mode, the proposed method significantly reduces training time and overcomes some of the technical difficulties of the training process. The method gives full play to the advantages of deep learning and automatic feature learning.
The technical solution of this invention is implemented as follows:
Our deep learning image classification method for firearms comprises a firearm image database, a convolutional neural network, and fully connected layers. The images of the firearm image database are imported into the model, and representative characteristics are output after forward evaluation. Finally, the classifier is able to identify images containing sensitive objects, and the result is presented.
In summary, the deep neural network contains an input layer, intermediate hidden layers, and an output layer. The following steps are included:
Step (1), the input layer preprocesses the original images, including cropping, scaling and segmentation (an illustrative preprocessing sketch follows this list).
Step (2), the hidden layers combine bottom-up unsupervised learning and top-down supervised learning.
Step (3), the output layer creates CNN feature maps, including combinations of peripheral features, morphological features, and color features of the original image; these are used for the classification task processed in the fully connected (FC) layers.
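As an illustration of Step (1), a minimal preprocessing sketch is given below. It is not part of the claimed method: the use of OpenCV and NumPy, the central-crop strategy, and the helper name preprocess are assumptions made for illustration, while the 32x32 target size is taken from the Procedure section. Segmentation and object localization are not sketched here.

import numpy as np
import cv2   # assumption: OpenCV for the basic image operations

def preprocess(image, target_size=(32, 32)):
    # Crop the central square region, then scale to a uniform size.
    h, w = image.shape[:2]
    side = min(h, w)
    top, left = (h - side) // 2, (w - side) // 2
    cropped = image[top:top + side, left:left + side]
    scaled = cv2.resize(cropped, target_size, interpolation=cv2.INTER_AREA)
    return scaled.astype(np.float32) / 255.0   # normalize pixel values to [0, 1]

# Example: a dummy 480x640 RGB array stands in for a downloaded firearm photo.
dummy = np.random.randint(0, 256, size=(480, 640, 3), dtype=np.uint8)
print(preprocess(dummy).shape)   # (32, 32, 3)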
Optionally, we use an autoencoder to initialize the model parameters. It is a bottom-up unsupervised learning method with a symmetrical architecture: the input is encoded and compressed down to the middle layer and is then decompressed back to an approximation of the original, with some information lost. The encoding and decoding parameters are optimized to minimize the reconstruction error (measured as the distance between the output and the input).
Optionally, the specific process of top-down supervised learning works as follows: during the training phase, the error is transmitted from top to bottom, and the parameters of each layer are fine-tuned accordingly.
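To make these two optional stages concrete, a minimal sketch of autoencoder pretraining followed by supervised fine-tuning is given below, written in the same TensorFlow 1.x style as the code in the Procedure section. The layer sizes, learning rates, and variable names are illustrative assumptions, not values prescribed by the invention.

import tensorflow as tf

# Illustrative sizes: flattened 32x32x3 input, 256-unit bottleneck, 2 output classes.
n_input, n_hidden, n_classes = 32 * 32 * 3, 256, 2

x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])

# Symmetrical encoder/decoder used for bottom-up unsupervised pretraining.
enc_w = tf.Variable(tf.truncated_normal([n_input, n_hidden], stddev=0.1))
enc_b = tf.Variable(tf.zeros([n_hidden]))
dec_w = tf.Variable(tf.truncated_normal([n_hidden, n_input], stddev=0.1))
dec_b = tf.Variable(tf.zeros([n_input]))

code = tf.nn.relu(tf.matmul(x, enc_w) + enc_b)    # encode / compress to the middle layer
reconstruction = tf.matmul(code, dec_w) + dec_b   # decode / decompress back to the input size

# Reconstruction error: distance between the decoder output and the input.
pretrain_loss = tf.reduce_mean(tf.square(reconstruction - x))
pretrain_step = tf.train.AdamOptimizer(1e-3).minimize(pretrain_loss)

# Top-down supervised fine-tuning: a classifier head on the pretrained encoder;
# the error is propagated from top to bottom and all parameters are fine-tuned.
out_w = tf.Variable(tf.truncated_normal([n_hidden, n_classes], stddev=0.1))
out_b = tf.Variable(tf.zeros([n_classes]))
logits = tf.matmul(code, out_w) + out_b
finetune_loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))
finetune_step = tf.train.AdamOptimizer(1e-4).minimize(finetune_loss)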
DESCRIPTION OF DRAWING
Figure 1 is the flow block diagram of model training in our invention.
Figure 2 is the flow block diagram of trained model making predictions.
Figure 3 shows the generic structure of the deep neural network.
DESCRIPTION OF PREFERRED EMBODIMENT
Network Design
We implement this model as a convolutional neural network and evaluate it on dataset from ImageNet and other sources. The initial convolutional layers of the network extract features from the image while the fully connected layers predict the class probabilities.
Our network architecture is inspired by the GoogLeNet model for image classification and the YOLO model for real-time object detection. Figure 1 displays the architecture of our detection network. Our network has four convolutional layers followed by two fully connected layers. Each convolutional layer consists of three components: convolution, batch normalization, and pooling. The convolution itself is a 3x3 kernel (whose entries are learned during training) that is slid across the entire image. The stride is set to one and zero padding is added to preserve the peripheral information.
Right after the convolution, a pooling operation down-samples the previous output, followed by a non-linear activation. These components are grouped together in this order and repeated four times before the 2D array is flattened into a 1D vector. This vector is then fed into two consecutive fully connected layers to produce the final prediction. L2 regularization and random dropout are included in the training stage to fight against overfitting and optimize the performance of the final classifier.
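For orientation, the network shape described above can be written compactly with the Keras Sequential API. This is only an illustrative reconstruction: the 32x32 input size, 3x3 kernels with stride 1 and SAME padding, the convolution/batch-normalization/pooling/activation ordering, the two fully connected layers, dropout, and L2 regularization come from the text, while the filter counts, the dense-layer width, and the dropout rate are assumed; the low-level implementation listed in the Procedure section differs in detail.

import tensorflow as tf

l2 = tf.keras.regularizers.l2(5e-4)   # L2 strength reused from the training code in the Procedure

def build_classifier(num_classes=2):
    layers = [tf.keras.layers.InputLayer(input_shape=(32, 32, 3))]
    for filters in (16, 32, 64, 128):          # four convolutional blocks (filter counts assumed)
        layers += [
            tf.keras.layers.Conv2D(filters, 3, strides=1, padding='same',
                                   kernel_regularizer=l2),
            tf.keras.layers.BatchNormalization(),
            tf.keras.layers.MaxPooling2D(2),   # down-sample right after convolution
            tf.keras.layers.ReLU(),            # non-linear activation
        ]
    layers += [
        tf.keras.layers.Flatten(),             # 2D feature maps -> 1D vector
        tf.keras.layers.Dense(128, activation='relu', kernel_regularizer=l2),
        tf.keras.layers.Dropout(0.5),          # random dropout against overfitting
        tf.keras.layers.Dense(num_classes),    # class logits
    ]
    return tf.keras.Sequential(layers)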
Procedure
The procedure of this invention is implemented as follows:
1. Download a firearm-related image dataset from reliable sources.
2. Data set cleaning: we extract the useful information from the data set, localize the object (gun/knife) shown in each image, and delete images of poor quality.
3. Image preprocessing: we scale the images to a uniform 32x32 shape and create a path for the image dataset.
4. Train/test splitting: we use stratified sampling so that roughly the same proportion of positive instances appears in both the training and the test set. The ratio of training set to test set is approximately 10:1 (an illustrative scikit-learn sketch of this split is given after this procedure).
5. This invention uses a deep learning model that contains four convolutional layers and two fully connected layers. The model definition is as follows (TensorFlow 1.x style; the function is defined inside the network class, so `self` refers to the enclosing instance):

# The code fragments below assume: import tensorflow as tf, import numpy as np
def model(data_flow, train=True):
    # @data_flow: original inputs
    # @return: logits
    # Define Convolutional Layers
    for i, (weights, biases, config) in enumerate(
            zip(self.conv_weights, self.conv_biases, self.conv_config)):
        with tf.name_scope(config['name'] + '_model'):
            with tf.name_scope('convolution'):
                # default [1, 1, 1, 1] stride and SAME padding
                data_flow = tf.nn.conv2d(data_flow, filter=weights,
                                         strides=[1, 1, 1, 1], padding='SAME')
                data_flow = data_flow + biases
                if not train:
                    self.visualize_filter_map(data_flow, how_many=config['out_depth'],
                                              display_size=32 // (i // 2 + 1),
                                              name=config['name'] + '_conv')
            if config['activation'] == 'relu':
                data_flow = tf.nn.relu(data_flow)
                if not train:
                    self.visualize_filter_map(data_flow, how_many=config['out_depth'],
                                              display_size=32 // (i // 2 + 1),
                                              name=config['name'] + '_relu')
            else:
                raise Exception('Activation Func can only be Relu right now. You passed',
                                config['activation'])
            if config['pooling']:
                data_flow = tf.nn.max_pool(data_flow,
                                           ksize=[1, self.pooling_scale, self.pooling_scale, 1],
                                           strides=[1, self.pooling_stride, self.pooling_stride, 1],
                                           padding='SAME')
                if not train:
                    self.visualize_filter_map(data_flow, how_many=config['out_depth'],
                                              display_size=32 // (i // 2 + 1) // 2,
                                              name=config['name'] + '_pooling')
    # Define Fully Connected Layers
    for i, (weights, biases, config) in enumerate(
            zip(self.fc_weights, self.fc_biases, self.fc_config)):
        if i == 0:
            # Flatten the 2D feature maps into a 1D vector before the first FC layer
            shape = data_flow.get_shape().as_list()
            data_flow = tf.reshape(data_flow, [shape[0], shape[1] * shape[2] * shape[3]])
        with tf.name_scope(config['name'] + '_model'):
            # Dropout on the last fully connected layer, applied only during training
            if train and i == len(self.fc_weights) - 1:
                data_flow = tf.nn.dropout(data_flow, self.dropout_rate, seed=4926)
            data_flow = tf.matmul(data_flow, weights) + biases
            if config['activation'] == 'relu':
                data_flow = tf.nn.relu(data_flow)
            elif config['activation'] is None:
                pass
            else:
                raise Exception('Activation Func can only be Relu or None right now. You passed',
                                config['activation'])
    return data_flow

6. The model is trained with the post-processed training data using back propagation. Both L2 regularization and dropout are used to fight against overfitting and to increase the robustness of the features extracted by the convolutional layers. Model parameters are updated through stochastic gradient descent, with the loss function computed on each mini-batch.

# Training computation
logits = model(self.tf_train_samples)
with tf.name_scope('loss'):
    self.loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=self.tf_train_labels))
    self.loss += self.apply_regularization(_lambda=5e-4)
    self.train_summaries.append(tf.summary.scalar('Loss', self.loss))

# Learning rate decay
global_step = tf.Variable(0)
learning_rate = tf.train.exponential_decay(
    learning_rate=self.base_learning_rate,
    global_step=global_step * self.train_batch_size,
    decay_steps=100,
    decay_rate=self.decay_rate,
    staircase=True
)

# Optimizer
with tf.name_scope('optimizer'):
    if self.optimizeMethod == 'gradient':
        self.optimizer = tf.train \
            .GradientDescentOptimizer(learning_rate) \
            .minimize(self.loss)
    elif self.optimizeMethod == 'momentum':
        self.optimizer = tf.train \
            .MomentumOptimizer(learning_rate, 0.5) \
            .minimize(self.loss)
    elif self.optimizeMethod == 'adam':
        self.optimizer = tf.train \
            .AdamOptimizer(learning_rate) \
            .minimize(self.loss)

7. Testing: we take unlabeled images (from the test set) as input and evaluate all the nodes in the neural network to output the final prediction.

def test(self, test_samples, test_labels, *, data_iterator):
    if self.saver is None:
        self.define_model()
    if self.writer is None:
        self.writer = tf.summary.FileWriter('./board', tf.get_default_graph())
    print('Before session')
    with tf.Session(graph=tf.get_default_graph()) as session:
        self.saver.restore(session, self.save_path)
        accuracies = []
        confusionMatrices = []
        for i, samples, labels in data_iterator(test_samples, test_labels,
                                                chunkSize=self.test_batch_size):
            result = session.run(
                self.test_prediction,
                feed_dict={self.tf_test_samples: samples}
            )
            # self.writer.add_summary(summary, i)
            accuracy, cm = self.accuracy(result, labels, need_confusion_matrix=True)
            accuracies.append(accuracy)
            confusionMatrices.append(cm)
            print('Test Accuracy: %.1f%%' % accuracy)
        print('Average Accuracy:', np.average(accuracies))
        print('Standard Deviation:', np.std(accuracies))
        self.print_confusion_matrix(np.add.reduce(confusionMatrices))
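The stratified 10:1 train/test split described in step 4 can be reproduced, for example, with scikit-learn; the sketch below is illustrative only. The placeholder data, the use of scikit-learn, and the random seed are assumptions, while the stratification and the roughly 10:1 ratio come from the procedure above.

import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays standing in for the preprocessed 32x32 firearm images and their labels.
images = np.random.rand(1100, 32, 32, 3).astype(np.float32)
labels = np.random.randint(0, 2, size=1100)

x_train, x_test, y_train, y_test = train_test_split(
    images, labels,
    test_size=1 / 11,     # training set : test set is approximately 10 : 1
    stratify=labels,      # keep roughly the same proportion of positive instances in both sets
    random_state=0)

print(x_train.shape, x_test.shape)   # (1000, 32, 32, 3) (100, 32, 32, 3)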

Claims (2)

  CLAIM
  1. A method for image identification based on deep learning, comprising: getting the representative characteristics by importing the image database to the model; and using the classifier to identify images containing sensitive objects.
AU2017101803A 2017-12-24 2017-12-24 Deep learning based image classification of dangerous goods of gun type Ceased AU2017101803A4 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2017101803A AU2017101803A4 (en) 2017-12-24 2017-12-24 Deep learning based image classification of dangerous goods of gun type

Publications (1)

Publication Number Publication Date
AU2017101803A4 true AU2017101803A4 (en) 2018-02-15

Family

ID=61167718

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2017101803A Ceased AU2017101803A4 (en) 2017-12-24 2017-12-24 Deep learning based image classification of dangerous goods of gun type

Country Status (1)

Country Link
AU (1) AU2017101803A4 (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109325144A (en) * 2018-08-27 2019-02-12 章云娟 Content complexity detection method
CN110210413A (en) * 2019-06-04 2019-09-06 哈尔滨工业大学 A kind of multidisciplinary paper content detection based on deep learning and identifying system and method
CN110210413B (en) * 2019-06-04 2022-12-16 哈尔滨工业大学 Multidisciplinary test paper content detection and identification system and method based on deep learning
CN110909784A (en) * 2019-11-15 2020-03-24 北京奇艺世纪科技有限公司 Training method and device of image recognition model and electronic equipment
CN110909784B (en) * 2019-11-15 2022-09-02 北京奇艺世纪科技有限公司 Training method and device of image recognition model and electronic equipment
CN111738290A (en) * 2020-05-14 2020-10-02 北京沃东天骏信息技术有限公司 Image detection method, model construction and training method, device, equipment and medium
CN111738290B (en) * 2020-05-14 2024-04-09 北京沃东天骏信息技术有限公司 Image detection method, model construction and training method, device, equipment and medium
CN112750117A (en) * 2021-01-15 2021-05-04 重庆邮电大学 Blood cell image detection and counting method based on convolutional neural network
CN112750117B (en) * 2021-01-15 2024-01-26 河南中抗医学检验有限公司 Blood cell image detection and counting method based on convolutional neural network
WO2022235241A1 (en) * 2021-05-03 2022-11-10 Bahcesehir Universitesi A ballistic solution system and a method thereof
CN113954072A (en) * 2021-11-05 2022-01-21 中国矿业大学 Vision-guided wooden door workpiece intelligent identification and positioning system and method
CN113954072B (en) * 2021-11-05 2024-05-28 中国矿业大学 Visual-guided intelligent wood door workpiece recognition and positioning system and method
CN114494891B (en) * 2022-04-15 2022-07-22 中国科学院微电子研究所 Hazardous article identification device and method based on multi-scale parallel detection
CN114494891A (en) * 2022-04-15 2022-05-13 中国科学院微电子研究所 Dangerous article identification device and method based on multi-scale parallel detection

Legal Events

Date Code Title Description
FGI Letters patent sealed or granted (innovation patent)
MK22 Patent ceased section 143a(d), or expired - non payment of renewal fee or expiry