Machine Learning Quiz-5

Q1. The optimizer is an important part of training neural networks. Which of the following is not a purpose of using optimizers?

  1. Speed up algorithm convergence
  2. Reduce the difficulty of manual parameter setting
  3. Avoid overfitting
  4. Avoid local extremes

Answer: 3
Explanation: To avoid overfitting, we use regularization techniques, not optimizers.
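
For intuition, here is a minimal NumPy sketch (not from the quiz) of gradient descent with momentum; the toy objective and all parameter values are illustrative assumptions. It shows that an optimizer's job is to speed up and stabilize convergence, not to prevent overfitting.

```python
import numpy as np

def grad(w):
    return 2 * w  # derivative of the toy objective f(w) = w**2

w, velocity = 5.0, 0.0  # illustrative starting point
lr, beta = 0.1, 0.9     # learning rate and momentum coefficient (assumed values)
for _ in range(100):
    # Momentum accumulates past gradients, damping oscillations and
    # speeding up convergence -- the optimizer's purpose.
    velocity = beta * velocity + (1 - beta) * grad(w)
    w -= lr * velocity

print(f"w after optimization: {w:.6f}")  # approaches the minimum at w = 0
```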

Q2. Which of the following is not a regularization technique used in machine learning?

  1. L1 regularization
  2. R-squared
  3. L2 regularization
  4. Dropout

Answer: 2
Explanation: Of the options above, R-squared is not a regularization technique; it is a statistical measure of how close the data are to the fitted regression line.
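
To make the distinction concrete, here is a short sketch with made-up numbers contrasting R-squared, a goodness-of-fit statistic, with an L2 penalty, an actual regularization term added to the training loss.

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0, 4.0])  # made-up targets
y_pred = np.array([1.1, 1.9, 3.2, 3.8])  # made-up predictions

# R-squared: describes how well predictions fit the data.
ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
r_squared = 1 - ss_res / ss_tot
print(f"R^2 = {r_squared:.4f}")

# L2 regularization: a penalty on the weights added to the training loss.
w = np.array([0.5, -1.2])          # hypothetical model weights
lam = 0.01                         # regularization strength (assumed value)
l2_penalty = lam * np.sum(w ** 2)  # loss = data_loss + l2_penalty
```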

Q3. Which of the following are hyperparameters in the context of deep learning?

  1. Learning Rate, α
  2. Momentum parameter, β1
  3. Number of units in a layer
  4. All of the above

Answer: 4
Explanation: According to Wikipedia, “In machine learning, a hyperparameter is a parameter whose value is used to control the learning process”. So, all of the above are hyperparameters.
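
For example, in a typical PyTorch setup all three of these are chosen before training starts; this is only a sketch, and the specific values are arbitrary.

```python
import torch
import torch.nn as nn

hidden_units = 64  # number of units in a layer (a hyperparameter)
model = nn.Sequential(
    nn.Linear(10, hidden_units),
    nn.ReLU(),
    nn.Linear(hidden_units, 1),
)

optimizer = torch.optim.Adam(
    model.parameters(),
    lr=1e-3,             # learning rate, alpha
    betas=(0.9, 0.999),  # momentum parameters, beta1 and beta2
)
```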

Q4. Which of the following statements is not true with respect to batch normalization?

  1. Batch normalization helps in decreasing training time
  2. Batch normalization adds a slight regularization effect
  3. After using batch normalization, there is no need to use dropout
  4. Batch normalization helps in reducing covariate shift

Answer: 3
Explanation: Although batch normalization has a slight regularization effect, that is not why we use it. It is used to make the neural network more robust (by reducing covariate shift) and easier to train, while dropout is used for regularization (reducing overfitting). So, the third option is not true.
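
As a sketch (the layer sizes here are arbitrary), batch normalization and dropout can be used together in the same network, since they serve different purposes:

```python
import torch.nn as nn

block = nn.Sequential(
    nn.Linear(128, 64),
    nn.BatchNorm1d(64),  # normalizes activations: faster, more stable training
    nn.ReLU(),
    nn.Dropout(p=0.5),   # regularization: reduces overfitting
    nn.Linear(64, 10),
)
```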

Q5. In a machine learning project, modelling is an iterative process, but deployment is not.

  1. True
  2. False

Answer: 2
Explanation: Deployment is also an iterative process, where you should expect to make multiple adjustments (such as the metrics monitored on dashboards or the percentage of traffic served) as you work toward optimizing the system.

Q6. Which of the following activation functions works better for hidden layers?

  1. Sigmoid
  2. Tanh

Answer: 2
Explanation: The tanh activation function usually works better than the sigmoid activation function for hidden units because the mean of its output is closer to zero, so it centers the data better for the next layer, and its gradients are not restricted to move in a single direction.
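
A quick NumPy check (illustrative only, with simulated zero-mean pre-activations) shows the zero-centering difference:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(10_000)  # zero-mean pre-activations (simulated)

sigmoid_out = 1 / (1 + np.exp(-x))
tanh_out = np.tanh(x)

print(f"mean sigmoid output: {sigmoid_out.mean():.3f}")  # roughly 0.5
print(f"mean tanh output:    {tanh_out.mean():.3f}")     # roughly 0.0
```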

Q7. The softmax function is used to calculate the probability distribution over a discrete variable with n possible values.

  1. True
  2. False

Answer: 1
Explanation: The softmax function is used to calculate the probability distribution over a discrete variable with n possible values. It can be seen as a generalization of the sigmoid function, which is used to represent a probability distribution over a binary variable.
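
A minimal implementation sketch (the logits are made up; subtracting the max is a standard numerical-stability trick and does not change the result):

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)  # stability: avoids overflow in exp
    exp_z = np.exp(z)
    return exp_z / exp_z.sum()

logits = np.array([2.0, 1.0, 0.1, -1.0])  # n = 4 possible values
probs = softmax(logits)
print(probs, probs.sum())  # non-negative and sums to 1: a valid distribution
```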

Q8. Let's say you want to use transfer learning from task A to task B. Which of the following scenarios would support using transfer learning?

  1. Tasks A and B have the same input x
  2. You have a lot more data for task A than for task B
  3. Low-level features from task A could be helpful for learning task B
  4. All of the above

Answer: 4
Explanation: All of the conditions mentioned above support transferring knowledge from task A to task B. Refer to Andrew Ng's explanation of transfer learning to learn more.
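
As a sketch of the usual recipe (the architecture and sizes here are assumptions, not part of the quiz): freeze the layers that learned low-level features on task A, and retrain a new output head on task B's smaller dataset.

```python
import torch.nn as nn

# Model assumed to be pre-trained on task A.
model = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),  # low-level feature layers
    nn.Linear(256, 64), nn.ReLU(),
    nn.Linear(64, 10),               # task-A output head
)

# Freeze the feature layers so task-B training only updates the head.
for param in model[:4].parameters():
    param.requires_grad = False

# Replace the head for task B (3 classes here, an arbitrary choice).
model[4] = nn.Linear(64, 3)
```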
