Computer Vision Quiz-5

Q1. For a multi-channel input feature map, we apply Max-pooling independently on each channel and then concatenate the results along the channel axis?

  1. True
  2. False

Answer: 1
Explanation: Max-pooling operation is applied independently on each channel and then the results are concatenated along the channel axis to form the final output. Refer to this beautiful explanation by Andrew Ng to understand more.

Q2. A fully convolutional network can take as input the image of any size?

  1. True
  2. False

Answer: 1
Explanation: Because a fully convolutional network doesnot contain any fully connected or Dense layer, this can take as input the image of any size.

Q3. In R-CNN, the bounding box loss is only calculated for positive samples (samples that contains classes present in the dataset)?

  1. True
  2. False

Answer: 1
Explanation: In R-CNN, the bounding box loss is only calculated for positive samples (samples that contains classes present in the dataset) as it makes no sense to fine-tune a bounding box that doesn’t contain object.

Q4. In the VGG16 model, we have all the Conv layers with same padding and filter size and the downsampling is done by MaxPooling only?

  1. True
  2. False

Answer: 1
Explanation: Earlier models like AlexNet use large filter size in the beginning and downsampling was done either by max-pooling or by convolution. But in the VGG16 model, we have all the Conv layers with same padding and filter size and the downsampling is done by MaxPooling only. So what have we gained by using, for instance, a stack of three 3×3 conv. layers instead of a single 7×7 layer? First, we incorporate three non-linear rectification layers instead of a single one, which makes the decision function more discriminative. Second, we decrease the number of parameters. Refer to the Section 2.3 of this research paper to understand more.

Q5. 1×1 convolution can also help in decreasing the computation cost of a convolution operation?

  1. True
  2. False

Answer: 1
Explanation: 1×1 convolution can also help in decreasing the computation cost of a convolution operation. Refer to this beautiful explanation by Andrew Ng to understand more.

Q6. Can we use Fully Convolutional Neural Networks for object detection?

  1. Yes
  2. No

Answer: 1
Explanation: Yes a Fully Convolutional Neural Networks can be used for object detection. For instance, YOLO etc.

Q7. Which of the following networks can be used for object detection?

  1. Overfeat
  2. Faster RCNN
  3. YOLO
  4. All of the above

Answer: 4
Explanation: All of the above mentioned networks can be used for object detection. For instance, Faster RCNN belongs to Region based methods whereas YOLO, Overfeat belongs to sliding window based methods.

Q8. AlexNet was one of the first networks that uses ReLU activation function in the hidden layers instead of tanh/sigmoid (which were quite common at that time)?

  1. True
  2. False

Answer: 1
Explanation: This was one of the revolutionary ideas that boomed deep learning i.e. using ReLU activation function in the hidden layers instead of tanh/sigmoid (which were quite common at that time).

1 thought on “Computer Vision Quiz-5

  1. Jason Cordova

    You’re so awesome! I don’t believe I have read a single thing like that before. So great to find someone with some original thoughts on this topic. Really.. thank you for starting this up. This website is something that is needed on the internet, someone with a little originality!

    Reply

Leave a Reply