Deep Learning – IIT Ropar Week 4 Assignment Answers

2 months ago

Sanket Kumar

Deep Learning – IIT Ropar Week 4 Assignment Answers (Jan-Apr 2026)

Deep Learning – IIT Ropar Week 4

1. If the team switches from batch gradient descent to mini-batch gradient descent with a batch size of 4,000, how many parameter updates occur in one epoch?

a) 2
b) 4,000
c) 2,000
d) 8,000

Answer : c
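
The count behind this answer is the number of mini-batches needed to cover the dataset once. A minimal sketch follows; the dataset size is not given in the question text here, so `ASSUMED_NUM_EXAMPLES` is a hypothetical value chosen only because it is consistent with option c.

```python
import math

def updates_per_epoch(num_examples: int, batch_size: int) -> int:
    """One parameter update per mini-batch; the last batch may be partial."""
    return math.ceil(num_examples / batch_size)

# Assumption: the dataset size is NOT stated in the question above.
# 8,000,000 is a hypothetical value consistent with the 2,000 answer.
ASSUMED_NUM_EXAMPLES = 8_000_000

mini_batch_updates = updates_per_epoch(ASSUMED_NUM_EXAMPLES, 4_000)
full_batch_updates = updates_per_epoch(ASSUMED_NUM_EXAMPLES, ASSUMED_NUM_EXAMPLES)
print(mini_batch_updates, full_batch_updates)  # prints: 2000 1
```

Batch gradient descent makes a single update per epoch, so under this assumed size the switch to mini-batches gives 2,000 times as many updates per pass.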

2. Why does mini-batch gradient descent generally converge faster than batch gradient descent in this scenario?

a) It always finds the global minimum
b) It updates weights more frequently
c) It eliminates noisy gradients completely
d) It does not require tuning learning rates

Answer : b

3. The team observes oscillations in loss while using mini-batch gradient descent. Which change would most directly reduce these oscillations?

a) Increasing learning rate
b) Removing mini-batches
c) Adding momentum
d) Increasing batch size to full dataset

Answer : c
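
Of the options listed, momentum is the one that averages successive gradients, so components that flip sign from step to step partly cancel. The sketch below illustrates this with a made-up gradient sequence that alternates between +1 and -1; the function name and the update form `v = beta * v + g` (one common formulation, with the learning rate left out) are assumptions for illustration.

```python
def momentum_velocities(grads, beta=0.9):
    """Running momentum term v_t = beta * v_{t-1} + g_t for a gradient sequence."""
    v = 0.0
    history = []
    for g in grads:
        v = beta * v + g
        history.append(v)
    return history

# Hypothetical zig-zag gradients: +1, -1, +1, -1, ...
zigzag = [(-1.0) ** t for t in range(200)]
velocities = momentum_velocities(zigzag, beta=0.9)

raw_step_size = abs(zigzag[-1])           # 1.0 without momentum
damped_step_size = abs(velocities[-1])    # settles near 1 / (1 + beta)
print(raw_step_size, round(damped_step_size, 3))
```

With `beta = 0.9` the oscillating component settles at roughly half its raw size, which is the damping effect the question is asking about.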

4. Which optimizer would best handle noisy gradients while also adapting learning rates automatically?

a) Vanilla Gradient Descent
b) Momentum-based Gradient Descent
c) RMSProp
d) Adam

Answer : d
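
Adam combines a momentum-style average of gradients (which smooths noise) with a per-parameter scaling by the running average of squared gradients (which adapts the step size). A minimal single-parameter sketch is shown here, assuming the commonly used default hyperparameters; `adam_step` is an illustrative name, not a library function.

```python
import math

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad          # first moment: smoothed gradient
    v = beta2 * v + (1 - beta2) * grad * grad   # second moment: smoothed squared gradient
    m_hat = m / (1 - beta1 ** t)                # bias correction for zero initialisation
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# The first step has nearly the same size whether the gradient is 2 or 200,
# because the update is normalised by the gradient's own magnitude.
p_small, _, _ = adam_step(1.0, grad=2.0, m=0.0, v=0.0, t=1)
p_large, _, _ = adam_step(1.0, grad=200.0, m=0.0, v=0.0, t=1)
print(1.0 - p_small, 1.0 - p_large)
```

Both steps come out close to the learning rate itself, which is what "adapting learning rates automatically" means in practice.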

5. If each parameter update takes 80 ms, how long (in seconds) will 3 epochs take with the batch size 4000?

Answer : 480
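
This question has no options, so the answer is a number. A short worked calculation, taking the 2,000 updates per epoch from question 1 as given:

```python
UPDATES_PER_EPOCH = 2_000   # from question 1 (batch size 4,000)
EPOCHS = 3
MS_PER_UPDATE = 80

total_updates = UPDATES_PER_EPOCH * EPOCHS            # 6,000 updates
total_seconds = total_updates * MS_PER_UPDATE / 1000  # milliseconds to seconds
print(total_updates, total_seconds)  # prints: 6000 480.0
```

So three epochs take 480 seconds, or 8 minutes.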

6. What is the most likely cause of this behavior?

a) Exploding gradients
b) Vanishing gradients
c) Overfitting
d) High batch variance

Answer : b

7. Which optimizer would most effectively mitigate this issue?

a) Batch Gradient Descent
b) SGD without momentum
c) Adam
d) Fixed learning rate GD

Answer : c

8. Which additional technique would help improve gradient flow in this neural network?

a) Increasing depth further
b) Input normalisation
c) Removing non-linearities
d) Using larger batch sizes

Answer : b

9. If the network performs 250 updates per epoch, how many updates occur after 6 epochs?

Fill in the blank: __________

Answer : 1500

10. Which change would likely worsen the learning problem in this scenario?

a) Reducing learning rate further
b) Adding momentum
c) Switching to Adam
d) Normalizing inputs

Answer : a

11. What is the most likely reason for this behavior?

a) Low momentum
b) High learning rate
c) Large dataset
d) Small batch size

Answer : b

12. Which adjustment would best stabilize training without slowing convergence too much?

a) Remove momentum
b) Reduce learning rate slightly
c) Increase batch size drastically
d) Use batch gradient descent

Answer : b

13. Which optimizer allows computing gradients after a look-ahead step?

a) Adam
b) RMSProp
c) Nesterov Accelerated Gradient
d) Vanilla SGD

Answer : c
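
The "look-ahead" in Nesterov Accelerated Gradient means the gradient is evaluated at the point the momentum term alone would carry the weights to, not at the current weights. A minimal scalar sketch under one common formulation is shown here; `nag_step` and `grad_fn` are illustrative names, and the quadratic loss is a made-up example.

```python
def nag_step(w, velocity, grad_fn, lr=0.1, beta=0.9):
    """One Nesterov step: gradient is taken at the look-ahead point."""
    lookahead = w - beta * velocity          # where momentum alone would land
    g = grad_fn(lookahead)                   # gradient AFTER the look-ahead move
    velocity = beta * velocity + lr * g
    return w - velocity, velocity

# Hypothetical loss f(w) = w**2, so the gradient is 2 * w.
visited = []
def grad_fn(w):
    visited.append(w)                        # record where the gradient was evaluated
    return 2.0 * w

w_new, v_new = nag_step(1.0, 0.5, grad_fn)
print(visited, w_new, v_new)
```

The recorded evaluation point is the look-ahead position (about 0.55), not the starting weight 1.0. Plain momentum would have evaluated the gradient at 1.0 instead.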

14. What happens if the momentum coefficient is set very close to 1?

a) Training stops
b) Training becomes noiseless
c) Model may diverge
d) Convergence is guaranteed

Answer : c
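
One way to see the risk: if the gradient keeps pointing the same way, the momentum term accumulates to about 1 / (1 - beta) times a single gradient, so the effective step grows sharply as beta approaches 1. The sketch below simulates this under a constant gradient; the function name and the constant-gradient setup are assumptions for illustration only.

```python
def steady_state_velocity(beta, grad=1.0, steps=20_000):
    """Accumulate v = beta * v + grad under a constant gradient."""
    v = 0.0
    for _ in range(steps):
        v = beta * v + grad
    return v

# Effective step multiplier for several momentum coefficients.
multipliers = {beta: steady_state_velocity(beta) for beta in (0.5, 0.9, 0.99, 0.999)}
for beta, mult in multipliers.items():
    print(beta, round(mult, 1))
```

Moving from 0.9 to 0.999 raises the multiplier from about 10 to about 1,000, so a learning rate that was safe before can now overshoot badly.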

15. Why does mini-batch gradient descent often generalize better than full batch gradient descent?

a) It uses fewer parameters
b) It introduces controlled noise in updates
c) It always converges faster
d) It removes the need for regularisation

Answer : b
