Computer Vision Quiz-4

Q1. The values in a filter/mask are called

  1. Coefficients
  2. Weights
  3. Both of the above
  4. None of the above

Answer: 3
Explanation: The values in a filter/mask are called either coefficients or weights. Both terms refer to the numbers that multiply the pixel values under the mask.
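
To make the role of these weights concrete, here is a minimal sketch that applies a 3x3 averaging mask to a single image patch. The patch values and the names `mask`, `patch`, and `filter_response` are made up for illustration.

```python
# A 3x3 averaging mask: every coefficient (weight) is 1/9.
mask = [[1 / 9] * 3 for _ in range(3)]

# A hypothetical 3x3 patch of pixel intensities.
patch = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]

def filter_response(mask, patch):
    """Sum of each weight multiplied by the pixel it sits on."""
    return sum(
        mask[i][j] * patch[i][j]
        for i in range(len(mask))
        for j in range(len(mask[0]))
    )

response = filter_response(mask, patch)
print(response)  # approximately 50, the mean of the patch
```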

Q2. Which of the following networks uses the idea of Depthwise Separable Convolutions?

  1. AlexNet
  2. MobileNet
  3. ResNet
  4. VGG16

Answer: 2
Explanation: As stated in the MobileNet paper, MobileNets are based on a streamlined architecture that uses depthwise separable convolutions to build lightweight deep neural networks that work even in low-compute environments such as mobile phones. Refer to this research paper to understand more.
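
The saving can be seen by counting weights. The sketch below compares a standard convolution with a depthwise separable one (a per-channel k x k depthwise convolution followed by a 1x1 pointwise convolution). Biases are ignored, and the layer sizes (3x3 kernel, 32 input channels, 64 output channels) are arbitrary values chosen for illustration.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k conv plus a 1x1 pointwise conv (no bias)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 conv that mixes the channels
    return depthwise + pointwise

std = standard_conv_params(3, 32, 64)
sep = depthwise_separable_params(3, 32, 64)
print(std, sep)  # 18432 2336
```

For these sizes the separable version needs roughly one eighth of the weights.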

Q3. What is the output of a Region Proposal Network (RPN) at each sliding window location if we have k anchor boxes?

  1. 2k scores and 4k bounding box coordinates
  2. 4k scores and 2k bounding box coordinates
  3. k scores and 4k bounding box coordinates
  4. 4k scores and 4k bounding box coordinates

Answer: 1
Explanation: In a Region Proposal Network (RPN) with k anchor boxes, each sliding window location yields 2k scores (the estimated probability of object or not-object for each anchor) and 4k bounding box coordinates. Refer to Figure 3 of this research paper to understand more.
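
As a quick check of the arithmetic, this sketch counts the RPN outputs per location. The function name is hypothetical, and k = 9 is used only as an example value.

```python
def rpn_output_sizes(k):
    """Number of values the RPN predicts at one sliding-window location."""
    cls_scores = 2 * k   # object / not-object score for each anchor
    box_coords = 4 * k   # 4 box coordinates for each anchor
    return cls_scores, box_coords

scores, coords = rpn_output_sizes(9)
print(scores, coords)  # 18 36
```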

Q4. Which of the following networks uses Skip-connections?

  1. DenseNet
  2. ResNet
  3. U-Net
  4. All of the above

Answer: 4
Explanation: All of the above networks use skip-connections. ResNet adds the skipped features, while DenseNet and U-Net concatenate them.

Q5. For binary classification, we generally use the ________ activation function in the output layer?

  1. Tanh
  2. ReLU
  3. Sigmoid
  4. Leaky ReLU

Answer: 3
Explanation: For binary classification, we want the output (y) to be either 0 or 1. Sigmoid outputs P(y=1|x), a value between 0 and 1, so it is appropriate for binary classification.
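
A minimal sketch of this idea follows. The 0.5 decision threshold and the helper name `predict` are assumptions for illustration, and the two-branch form is just one common way to avoid overflow for large negative inputs.

```python
import math

def sigmoid(z):
    """Map a raw score z to a value in (0, 1), read as P(y=1|x)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)          # avoids overflow when z is very negative
    return e / (1.0 + e)

def predict(z, threshold=0.5):
    """Turn the probability into a hard 0/1 label (threshold is an assumption)."""
    return 1 if sigmoid(z) >= threshold else 0

print(sigmoid(0.0))   # 0.5
print(predict(2.0))   # 1
print(predict(-2.0))  # 0
```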

Q6. In ResNet’s Skip-connection, the output from the previous layer is ________ to the layer ahead?

  1. added
  2. concatenated
  3. convolved
  4. multiplied

Answer: 1
Explanation: In ResNet’s Skip-connection, the output from the previous layer is added element-wise to the output of the layer ahead. Refer to Figure 2 of this research paper to understand more.
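
The addition can be sketched on plain lists as y = F(x) + x. Here `layer_fn` is a hypothetical stand-in for the stacked layers F; a real block would use convolutions and operate on tensors.

```python
def residual_block(x, layer_fn):
    """Return F(x) + x, adding the skipped input element-wise."""
    fx = layer_fn(x)
    if len(fx) != len(x):
        raise ValueError("F(x) and x must have the same shape to be added")
    return [a + b for a, b in zip(fx, x)]

# Hypothetical F: just halves every value.
def halve(v):
    return [0.5 * t for t in v]

y = residual_block([1.0, 2.0, 3.0], halve)
print(y)  # [1.5, 3.0, 4.5]
```

Because the two terms are added rather than concatenated, the output has the same size as the input.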

Q7. In Fast R-CNN, we extract feature maps from the input image only once as compared to R-CNN where we extract feature maps from each region proposal separately?

  1. True
  2. False

Answer: 1
Explanation: R-CNN extracts features from each region proposal separately using a CNN, which is very time consuming. To counter this, Fast R-CNN extracts feature maps from the input image only once and then projects the region proposals onto this feature map. This saves a lot of time. Refer to this link to understand more.
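
The projection step can be sketched as a simple rescaling of box coordinates by the network's total stride. This is an illustrative simplification: the function name `project_roi`, the stride of 16, and the floor rounding are all assumptions, and actual implementations may round differently.

```python
def project_roi(box, stride=16):
    """Map an image-space box (x1, y1, x2, y2) onto feature-map coordinates.

    stride is the assumed total downsampling factor of the backbone.
    """
    x1, y1, x2, y2 = box
    return (x1 // stride, y1 // stride, x2 // stride, y2 // stride)

# Hypothetical region proposal in image pixels.
roi_on_features = project_roi((32, 64, 160, 224))
print(roi_on_features)  # (2, 4, 10, 14)
```

Every proposal reuses the same feature map, so the CNN runs once per image rather than once per proposal.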

Q8. For multiclass classification, we generally use the ________ activation function in the output layer?

  1. Tanh
  2. ReLU
  3. Sigmoid
  4. Softmax

Answer: 4
Explanation: For multiclass classification, we generally use the softmax activation function in the output layer, because it turns the raw class scores into probabilities that sum to 1. Refer to this beautiful explanation by Andrew Ng to understand more.
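
A minimal softmax sketch is shown below. The logits are made-up values, and subtracting the maximum is a common numerical-stability trick rather than part of the definition.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)                           # shift for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three classes.
probs = softmax([2.0, 1.0, 0.1])
predicted_class = probs.index(max(probs))
print(predicted_class)  # 0
```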
