Convolutional Neural Network

Hui Lin @Google

Ming Li @Amazon

Types of Neural Network

Computer Vision

Image Data

Convolutions

Edge Detection

Parameters

Padding

Strided Convolutions

Summary of Convolutions
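
The effect of filter size, padding, and stride on the output can be summarised in one formula: for an \(n \times n\) input, an \(f \times f\) filter, padding \(p\), and stride \(s\), the output side length is \(\lfloor (n + 2p - f)/s \rfloor + 1\). The sketch below is a minimal illustration of that formula; the function name `conv_output_size` is hypothetical and not part of any library.

```r
# Output side length of a convolution on an n x n input with an f x f filter,
# padding p, and stride s: floor((n + 2p - f) / s) + 1.
conv_output_size <- function(n, f, p = 0, s = 1) {
  stopifnot(n + 2 * p >= f, s >= 1)
  floor((n + 2 * p - f) / s) + 1
}

conv_output_size(6, 3)                # no padding, stride 1: 4
conv_output_size(6, 3, p = 1)         # padding 1 keeps the size for f = 3: 6
conv_output_size(7, 3, p = 0, s = 2)  # stride 2: 3
```

With \(p = (f - 1)/2\) and \(s = 1\) the output has the same size as the input, which is why odd filter sizes are convenient.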

Convolutions Over Volume

Your Turn: Number of Parameters in One Layer

Question: If you have 10 filters that are \(3 \times 3 \times 3\) in one layer of a neural network, how many parameters does that layer have?
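
As a worked check, assuming each filter carries one bias term in addition to its weights, the count is:

\[
(3 \times 3 \times 3 + 1) \times 10 = 28 \times 10 = 280
\]

Note that the count does not depend on the height or width of the input image, only on the filter dimensions and the number of filters.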

Summary of Notation

If layer \(l\) is a convolution layer:
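
One common convention, assumed here for illustration, uses a superscript \([l]\) for quantities belonging to layer \(l\):

\[
\begin{aligned}
f^{[l]} &= \text{filter size}, \quad p^{[l]} = \text{padding}, \quad s^{[l]} = \text{stride}, \quad n_c^{[l]} = \text{number of filters} \\
\text{Input:} &\quad n_H^{[l-1]} \times n_W^{[l-1]} \times n_c^{[l-1]} \\
\text{Output:} &\quad n_H^{[l]} \times n_W^{[l]} \times n_c^{[l]}, \quad
n_H^{[l]} = \left\lfloor \frac{n_H^{[l-1]} + 2p^{[l]} - f^{[l]}}{s^{[l]}} \right\rfloor + 1 \\
\text{Each filter:} &\quad f^{[l]} \times f^{[l]} \times n_c^{[l-1]} \\
\text{Weights:} &\quad f^{[l]} \times f^{[l]} \times n_c^{[l-1]} \times n_c^{[l]}, \quad \text{Biases: } n_c^{[l]}
\end{aligned}
\]

The width \(n_W^{[l]}\) follows the same formula as \(n_H^{[l]}\) with \(n_W^{[l-1]}\) in place of \(n_H^{[l-1]}\).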

Pooling Layers

Example: LeNet-5

LeCun et al., 1998. Gradient-based learning applied to document recognition

| Layer | Activation Shape | Activation Size | # Parameters |
|---|---|---|---|
| Input | (32, 32, 1) | 1024 | 0 |
| CONV1 (f=5, s=1) | (28, 28, 6) | 4704 | \(5 \times 5 \times 1 \times 6 + 6 = 156\) |
| POOL1 (f=2, s=2) | (14, 14, 6) | 1176 | 0 |
| CONV2 (f=5, s=1) | (10, 10, 16) | 1600 | \(5 \times 5 \times 6 \times 16 + 16 = 2416\) |
| POOL2 (f=2, s=2) | (5, 5, 16) | 400 | 0 |
| FC3 | (120, 1) | 120 | \(400 \times 120 + 120 = 48120\) |
| FC4 | (84, 1) | 84 | \(120 \times 84 + 84 = 10164\) |
| Softmax | (10, 1) | 10 | \(84 \times 10 + 10 = 850\) |
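
The activation sizes and parameter counts of a LeNet-5 style network can be recomputed from the layer shapes alone. The sketch below uses base R only and assumes no padding, one bias per filter in convolution layers, and one bias per output unit in fully connected layers; the helper `out_size` is a hypothetical name introduced for illustration.

```r
# Side length after a convolution or pooling window of size f with stride s
# (no padding).
out_size <- function(n, f, s) floor((n - f) / s) + 1

n0 <- 32; c0 <- 1          # 32 x 32 input with one channel
n1 <- out_size(n0, 5, 1)   # CONV1: 28
p1 <- out_size(n1, 2, 2)   # POOL1: 14
n2 <- out_size(p1, 5, 1)   # CONV2: 10
p2 <- out_size(n2, 2, 2)   # POOL2: 5

# Number of values in each layer's output.
activation_size <- c(
  Input = n0^2 * c0, CONV1 = n1^2 * 6, POOL1 = p1^2 * 6,
  CONV2 = n2^2 * 16, POOL2 = p2^2 * 16,
  FC3 = 120, FC4 = 84, Softmax = 10
)

# Weights plus biases; pooling layers have nothing to learn.
n_params <- c(
  Input = 0,
  CONV1 = 5 * 5 * c0 * 6 + 6,
  POOL1 = 0,
  CONV2 = 5 * 5 * 6 * 16 + 16,
  POOL2 = 0,
  FC3 = p2^2 * 16 * 120 + 120,
  FC4 = 120 * 84 + 84,
  Softmax = 84 * 10 + 10
)

print(data.frame(activation_size, n_params))
sum(n_params)
```

Most of the parameters sit in the first fully connected layer, while the activation size shrinks steadily with depth.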

Types of Layers in a Convolutional Network

Using Keras to Build a CNN

Typical Keras workflow:

  1. Define your training data: input tensors and target tensors
  2. Define a network of layers (a model) that maps your inputs to your targets
  3. Configure the learning process by choosing a loss function, an optimizer, and some metrics to monitor
  4. Iterate on your training data by calling the fit() method of your model
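
Step 1 of the workflow can be sketched in base R on toy data. The array below is a random stand-in, not a real dataset, and the shapes (28 x 28 grayscale images, 10 classes) are assumptions chosen to resemble digit images. The names `x_raw` and `y_raw` are hypothetical.

```r
# Toy stand-in for real image data: 6 grayscale images of 28 x 28 pixels
# with integer intensities in 0..255 (assumed shapes, random values).
set.seed(1)
n <- 6
x_raw <- array(sample(0:255, n * 28 * 28, replace = TRUE), dim = c(n, 28, 28))
y_raw <- c(0, 3, 9, 3, 1, 0)   # integer class labels in 0..9

num_classes <- 10
input_shape <- c(28, 28, 1)    # height, width, channels

# Input tensor: add a trailing channel axis and rescale to [0, 1].
x_train <- array(x_raw / 255, dim = c(n, input_shape))

# Target tensor: one-hot encode the labels
# (row i has a 1 in column y_raw[i] + 1, zeros elsewhere).
y_train <- matrix(0, nrow = n, ncol = num_classes)
y_train[cbind(seq_len(n), y_raw + 1)] <- 1

dim(x_train)
dim(y_train)
```

With the keras package loaded, `array_reshape()` and `to_categorical()` are the usual helpers for the reshaping and one-hot encoding steps.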

# Load the R interface to Keras
library(keras)

# Define model structure
# (input_shape and num_classes are assumed to be defined beforehand,
# e.g. c(28, 28, 1) and 10 for 28 x 28 grayscale digit images)
cnn_model <- keras_model_sequential() %>%
  layer_conv_2d(filters = 32, kernel_size = c(3, 3),
                activation = "relu", input_shape = input_shape) %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_conv_2d(filters = 64, kernel_size = c(3, 3), activation = "relu") %>%
  layer_dropout(rate = 0.25) %>%  # drop 25% of activations during training
  layer_flatten() %>%
  layer_dense(units = 128, activation = "relu") %>%
  layer_dropout(rate = 0.5) %>%
  layer_dense(units = num_classes, activation = "softmax")

# Compile model
cnn_model %>% compile(
  loss = loss_categorical_crossentropy,
  optimizer = optimizer_adadelta(),
  metrics = c('accuracy')
)

# Train model
cnn_history <- cnn_model %>%
  fit(
    x_train, y_train,
    batch_size = batch_size,
    epochs = epochs,
    validation_split = 0.2
  )
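# Model prediction: predict() returns one row of class probabilities per
# test sample; pick the most probable column and shift to 0-based labels
# (predict_classes() is no longer available in recent keras releases)
cnn_prob <- cnn_model %>% predict(x_test)
cnn_pred <- apply(cnn_prob, 1, which.max) - 1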

Size of the Model
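
The size of the two-convolution Keras model defined earlier can be estimated by hand before training. The sketch below assumes `input_shape = c(28, 28, 1)` and `num_classes = 10` (neither is fixed by the slides) and relies on the Keras defaults of no padding for convolutions and a pooling stride equal to the pool size. The helper `conv_out` is a hypothetical name.

```r
# Assumed values; the slides leave input_shape and num_classes undefined.
input_shape <- c(28, 28, 1)
num_classes <- 10

# Side length after a window of size f with stride s and no padding.
conv_out <- function(n, f, s = 1) floor((n - f) / s) + 1

h1  <- conv_out(input_shape[1], 3)  # 26 after the first 3 x 3 convolution
h1p <- conv_out(h1, 2, 2)           # 13 after 2 x 2 max pooling
h2  <- conv_out(h1p, 3)             # 11 after the second 3 x 3 convolution
flat <- h2 * h2 * 64                # 7744 values after flattening

# Weights plus biases per layer; pooling, dropout, and flatten add none.
params <- c(
  conv1  = 3 * 3 * input_shape[3] * 32 + 32,
  conv2  = 3 * 3 * 32 * 64 + 64,
  dense1 = flat * 128 + 128,
  dense2 = 128 * num_classes + num_classes
)
params
sum(params)
```

Under these assumptions almost all of the roughly one million parameters come from the first dense layer. Calling `summary(cnn_model)` on the built model reports the per-layer counts for comparison.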

Effective CNNs

Different Architecture Search Algorithms:

Understanding Neural Networks