Deep Learning – IIT Ropar Week 3 Assignment Answers (Jan-Apr 2026)


1. Fill in the blank.

Based on the complete architecture described in the case study, calculate the total number of learnable parameters (including all weights and biases) in the neural network.

Answer : 97
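
The case study's architecture is not reproduced on this page, so the figure above cannot be re-derived here. As a rough sketch of how such a count is usually done for a fully connected network, the helper below (`count_parameters` is a hypothetical name) sums one weight per connection and one bias per receiving neuron. The layer sizes in the example are made up for illustration and are not the case study's.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the number of units in each layer, input layer first.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # one weight per connection between the two layers
        total += n_out         # one bias per neuron in the receiving layer
    return total


# Hypothetical layer sizes, NOT the architecture from the case study.
example_sizes = [2, 3, 1]
print(count_parameters(example_sizes))  # (2*3 + 3) + (3*1 + 1) = 13
```

Passing the case study's own layer sizes to the same function should reproduce the expected total, assuming every layer is dense and has a bias term.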

2. Fill in the blank.

Using the input vector x and the first column of W1, compute the pre-activation value of the first neuron in Hidden Layer 1. Ignore bias terms.
(Round to two decimal places.)

Answer : 0.35

3. Suppose the pre-activation values at the output layer for this patient are

z = [1.2, 0.6, 1.5]

Compute the softmax probabilities for the three severity classes.
(Round each value to two decimal places.)

  • [0.47, 0.19, 0.33]
  • [0.35, 0.19, 0.47]
  • [0.34, 0.19, 0.47]
  • [0.40, 0.20, 0.40]
Answer : c
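
The softmax values can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `softmax` and the max-subtraction step (a common numerical-stability trick that does not change the result) are choices made here, not something the assignment specifies.

```python
import math


def softmax(z):
    """Return softmax probabilities for a list of pre-activation values."""
    m = max(z)  # subtracting the max avoids overflow and leaves the result unchanged
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


z = [1.2, 0.6, 1.5]
probs = softmax(z)
print([round(p, 2) for p in probs])  # [0.34, 0.19, 0.47]
```

The first probability is roughly 0.34499, which is why it rounds to 0.34 rather than 0.35.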

4. Fill in the blank.

If the true class for this patient is Class 2 (Critical), compute the categorical cross-entropy loss for the prediction obtained in Question 3.
(Round to two decimal places.)

Answer : 0.76
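
For a one-hot target, categorical cross-entropy reduces to the negative log of the probability assigned to the true class. The sketch below assumes classes are indexed from 0, so that Class 2 is the third entry of the prediction [0.34, 0.19, 0.47]; the function name is hypothetical.

```python
import math


def categorical_cross_entropy(probs, true_class):
    """Cross-entropy for a one-hot target: -log of the true class probability."""
    return -math.log(probs[true_class])


probs = [0.34, 0.19, 0.47]  # rounded softmax output from Question 3
loss = categorical_cross_entropy(probs, true_class=2)  # assumes 0-based class indices
print(round(loss, 2))  # 0.76
```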

5. From the clinical priorities described in the case study, why is a prediction such as [0.34, 0.19, 0.47] still concerning, even though Class 2 has the highest probability?

  • Because softmax probabilities do not sum to 1
  • Because the confidence gap between classes is small
  • Because sigmoid should have been used instead
  • Because cross-entropy loss cannot be computed
Answer : b

6. Based on the case study description of the input data collected at the registration desk, how many numerical features are provided as input to the neural network for each patient?

  • Three, corresponding to vital signs only
  • Four, excluding the self-reported pain score
  • Five, including vitals and pain score
  • Twelve, corresponding to the size of the dataset
Answer : c

7. The output layer of the neural network uses softmax activation. According to the case study, why is softmax the most appropriate choice for this layer?

  • Because sigmoid cannot be used in the output layer of any neural network
  • Because the system predicts three mutually exclusive urgency classes
  • Because the output values must be unbounded
  • Because softmax eliminates the need for a loss function
Answer : b

8. Based on the number of neurons used in the model architecture, what is the correct dimension of the weight matrix connecting the hidden layer to the output layer?

  • 5 × 3
  • 4 × 3
  • 3 × 4
  • 4 × 4
Answer : b
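
A quick shape check illustrates the 4 × 3 answer. The sketch assumes the convention that rows index the sending layer (4 hidden neurons) and columns index the receiving layer (3 output neurons); some texts use the transposed 3 × 4 layout. All numeric values are hypothetical.

```python
# Hypothetical activations of the 4 hidden neurons.
h = [1.0, 0.0, 2.0, 0.5]

# Hypothetical 4 x 3 weight matrix: one row per hidden neuron, one column per output neuron.
W = [
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
    [0.7, 0.8, 0.9],
    [1.0, 1.1, 1.2],
]

# Pre-activation of output neuron j: z_j = sum over i of h[i] * W[i][j] (bias omitted).
z = [sum(h[i] * W[i][j] for i in range(len(h))) for j in range(len(W[0]))]

print(len(W), "x", len(W[0]))  # 4 x 3
print(len(z))                  # 3
```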

9. Fill in the blank:

During deployment, the system produces the following softmax output for a patient:
[0.15, 0.25, 0.60]
Using the decision rule described in the case study, which urgency class will the system recommend?

Answer : 2
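
The decision rule itself is not quoted on this page. Assuming it is the usual "pick the class with the highest probability" rule with classes numbered from 0, the recommendation can be sketched as:

```python
probs = [0.15, 0.25, 0.60]

# Index of the largest probability (argmax), with classes assumed to be numbered 0, 1, 2.
predicted_class = max(range(len(probs)), key=lambda i: probs[i])
print(predicted_class)  # 2
```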

10. The hospital administration is concerned about situations in which the model assigns low probability to the true urgency class, especially for critical patients. Which of the following situations would result in a higher categorical cross-entropy loss during training?

  • The model assigns a low probability to the correct urgency class
  • The model assigns equal probabilities to all three classes
  • The model assigns probability 1.0 to the correct class
  • The model confidently predicts the wrong urgency class
Answer : a, b, d
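
The four situations can be compared numerically. The probability vectors below are hypothetical examples chosen to match each option, with the true class assumed to be index 2. The loss is written as log(1/p), which equals -log(p).

```python
import math


def cce(probs, true_class):
    """Categorical cross-entropy for a one-hot target: log(1 / p_true) = -log(p_true)."""
    return math.log(1.0 / probs[true_class])


true_class = 2  # assumed index of the correct urgency class
cases = {
    "low probability on the correct class": [0.45, 0.35, 0.20],
    "equal probabilities": [1 / 3, 1 / 3, 1 / 3],
    "probability 1.0 on the correct class": [0.0, 0.0, 1.0],
    "confidently wrong": [0.90, 0.05, 0.05],
}

losses = {name: cce(p, true_class) for name, p in cases.items()}
for name, value in losses.items():
    print(f"{name}: {value:.2f}")
```

Only the case with probability 1.0 on the correct class yields zero loss. The other three all yield a positive loss, and the confidently wrong prediction yields the largest.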

11. Based on the architecture described in the case study, how many learnable parameters (weights + biases) are there in the entire neural network?

  • 43
  • 47
  • 51
  • 55
Answer : b

12. Fill in the blank.

Using the provided input vector x and weight matrix W1, compute the pre-activation value of the first neuron in Hidden Layer 1 (before ReLU). Ignore bias terms.

Answer (rounded to two decimals): ______

Answer : 0.48

13. After applying ReLU activation to the value computed in Question 12, what will be the output of that neuron?

Answer : 0.48
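
The input vector x and weight matrix W1 are not reproduced on this page, so the sketch below uses made-up values purely to show the mechanics: the pre-activation of the first hidden neuron is the dot product of x with the first column of W1, and ReLU leaves positive values unchanged.

```python
def relu(value):
    """ReLU activation: max(0, value)."""
    return max(0.0, value)


# Hypothetical input and weights, NOT the values from the case study.
x = [1.0, 2.0]
W1 = [
    [0.5, -1.0],   # weights from input 0 to hidden neurons 0 and 1
    [0.25, 0.5],   # weights from input 1 to hidden neurons 0 and 1
]

# First column of W1 holds the weights feeding hidden neuron 0.
first_column = [row[0] for row in W1]
z1 = sum(xi * wi for xi, wi in zip(x, first_column))  # bias ignored

print(z1)          # 1.0
print(relu(z1))    # 1.0
print(relu(0.48))  # 0.48, a positive pre-activation passes through unchanged
print(relu(-0.3))  # 0.0, a negative pre-activation is clipped to zero
```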

14. If the true label for this patient is Class 1 (Immediate admission required), compute the binary cross-entropy loss for the prediction 0.78.

  • 0.248
  • 0.5
  • 0.782
  • 0.481
Answer : a
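
Binary cross-entropy for a single sample is -[y·log(p) + (1 − y)·log(1 − p)], which reduces to -log(p) when the true label is 1. A minimal check, with a hypothetical function name:

```python
import math


def binary_cross_entropy(y_true, y_pred):
    """Binary cross-entropy for one sample with label y_true in {0, 1}."""
    return -(y_true * math.log(y_pred) + (1 - y_true) * math.log(1 - y_pred))


loss = binary_cross_entropy(1, 0.78)
print(round(loss, 3))  # 0.248
```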

15. According to the backpropagation algorithm, which layer’s parameters will receive gradient updates first during training for this patient sample?

  • Input layer
  • First hidden layer
  • Second hidden layer
  • Output layer
Answer : d
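
The ordering can be illustrated with a toy scalar network. This is only a sketch under assumed values (one hidden neuron, squared-error loss, made-up weights), not the case study's model. It records the order in which gradients become available during the backward pass.

```python
# Toy network: x -> w1 -> ReLU -> w2 -> y_hat, with squared-error loss.
# All values are hypothetical.
x, y = 1.0, 2.0
w1, w2 = 0.5, 1.5

# Forward pass
z1 = w1 * x                     # hidden pre-activation
h1 = max(0.0, z1)               # hidden activation (ReLU)
y_hat = w2 * h1                 # output
loss = 0.5 * (y_hat - y) ** 2

order = []

# Backward pass: start at the loss and move toward the input.
d_yhat = y_hat - y                           # dL/dy_hat
grad_w2 = d_yhat * h1                        # output-layer gradient is available first
order.append("w2 (output layer)")

d_h1 = d_yhat * w2                           # error propagated back to the hidden activation
d_z1 = d_h1 * (1.0 if z1 > 0 else 0.0)       # ReLU derivative
grad_w1 = d_z1 * x                           # hidden-layer gradient needs the output-layer error
order.append("w1 (hidden layer)")

print(order)             # ['w2 (output layer)', 'w1 (hidden layer)']
print(grad_w2, grad_w1)  # -0.625 -1.875
```

In many frameworks the actual parameter updates are applied together after the full backward pass, but the gradients themselves are computed from the output layer backward, which is the sense in which the output layer comes first.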