Deep Learning – IIT Ropar Week 11 Assignment Answers

2 months ago

Sanket Kumar

Deep Learning – IIT Ropar Week 11 Assignment Answers (Jan-Apr 2026)

Deep Learning – IIT Ropar Week 11

1. Why is an RNN more suitable than a feed-forward neural network for this task?

Because it trains faster on large text datasets
Because it removes the need for labeled data
Because it handles sequences with an internal state
Because it avoids weight sharing across layers

Answer : c
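
To illustrate the "internal state" in option c, here is a minimal sketch of a one-unit tanh RNN in plain Python. The weights w_x, w_h and b and the two toy sequences are assumptions chosen for illustration, not values from the course. The point is only that the hidden state h is carried from step to step, so the same inputs in a different order give a different final state.

```python
import math

def rnn_final_state(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Run a one-unit tanh RNN over a sequence and return the last hidden state."""
    h = 0.0  # internal state, empty before the first input
    for x in inputs:
        # The same weights (w_x, w_h, b) are reused at every time step.
        h = math.tanh(w_x * x + w_h * h + b)
    return h

seq_a = [1.0, 0.0, 0.0]  # the "1" arrives first
seq_b = [0.0, 0.0, 1.0]  # same values, but the "1" arrives last

h_a = rnn_final_state(seq_a)
h_b = rnn_final_state(seq_b)
print(h_a, h_b)  # the two final states differ
```

A feed-forward layer that only sums or averages its inputs would treat seq_a and seq_b identically; the recurrent state is what makes the order visible.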

2. If the model forgets the beginning of long emails, what is the most likely reason?

The optimizer updates the weights too slowly
The output layer has too few neurons
The vocabulary size is not large enough
The hidden state changes at every time step

Answer : d

3. Which training phenomenon explains why early words have little influence on later predictions?

The loss function becomes flat during training
The model memorizes only the recent inputs
Gradients become smaller over many time steps
The learning rate decreases automatically

Answer : c
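
The shrinking gradient in option c can be checked numerically. The sketch below assumes a one-unit tanh RNN with a hypothetical recurrent weight w_h = 0.9 and a constant input x = 0.1, and multiplies the per-step derivatives w_h * (1 - h_t^2) from the chain rule. Each factor is below 0.9, so the product decays as the number of steps grows.

```python
import math

def gradient_wrt_initial_state(steps, w_h=0.9, x=0.1):
    """Return d h_T / d h_0 for a one-unit tanh RNN fed a constant input x."""
    h = 0.0
    grad = 1.0
    for _ in range(steps):
        h = math.tanh(w_h * h + x)
        # Chain rule: d h_t / d h_{t-1} = w_h * (1 - h_t^2)
        grad *= w_h * (1.0 - h * h)
    return grad

g1, g10, g50 = (gradient_wrt_initial_state(n) for n in (1, 10, 50))
print(g1, g10, g50)  # each value is smaller than the one before
```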

4. If truncated BPTT is used with a window size of 15, what limitation does this introduce?

The output cannot be computed past 15 steps
The hidden state is reset after 15 steps
Only the last 15 steps influence learning
The network stops training after 15 steps

Answer : c
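
As a rough illustration of option c, the sketch below runs the forward pass over a whole 40-step sequence but backpropagates through at most `window` steps. It assumes a one-unit tanh RNN with hypothetical weights w_x and w_h, and it simplifies real truncated BPTT, which usually processes the sequence in chunks and carries the state forward. Inputs older than the window receive a gradient of exactly zero, even though the output is still computed and the state is never reset.

```python
import math

def input_gradients(inputs, window, w_x=0.5, w_h=0.9):
    """Return (states, grads) where grads[t] = d h_T / d x_t,
    backpropagating through at most `window` time steps."""
    # Forward pass over the full sequence: the state is never reset.
    states = []
    h = 0.0
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h)
        states.append(h)

    grads = [0.0] * len(inputs)
    upstream = 1.0  # d h_T / d h_T
    last = len(inputs) - 1
    # Backward pass: walk back from the last step and stop after `window` steps.
    for t in range(last, max(last - window, -1), -1):
        local = 1.0 - states[t] ** 2       # derivative of tanh at step t
        grads[t] = upstream * local * w_x  # gradient reaching input x_t
        upstream *= local * w_h            # gradient passed on to h_{t-1}
    return states, grads

inputs = [0.1] * 40
states, grads = input_gradients(inputs, window=15)
print(sum(1 for g in grads if g != 0.0))  # number of steps that receive gradient
```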

5. Which architectural change would best help this system retain important early information?

Applying stronger regularization
Reducing the number of layers
Using a gated recurrent model
Increasing the number of output units

Answer : c

6. Why is sequence modeling required for this video-based task?

Because different frames have different resolutions
Because motion depends on changes over time
Because videos always contain multiple objects
Because the images are hard to process

Answer : b

7. What does exploding gradient behavior typically cause during training?

Constant output values
Reduced model capacity
Very large weight updates
Very slow convergence

Answer : c

8. Which technique is commonly used to control exploding gradients?

Increasing the learning rate
Limiting gradient magnitude
Reducing the batch size
Removing nonlinearities

Answer : b
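
"Limiting gradient magnitude" is usually done by gradient clipping. The sketch below shows clipping by L2 norm on a plain Python list; the function name clip_by_norm and the example gradient are illustrative assumptions, and deep learning libraries provide their own versions of this operation. The direction of the gradient is preserved and only its length is capped.

```python
import math

def clip_by_norm(grads, max_norm):
    """Rescale a gradient vector so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)  # already small enough, leave unchanged
    scale = max_norm / norm
    return [g * scale for g in grads]

# A gradient with norm 50 is scaled down to norm 5.
clipped = clip_by_norm([30.0, -40.0], max_norm=5.0)
print(clipped)
```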

9. Why would an LSTM outperform a vanilla RNN in this application?

It reorganizes time-dependent inputs into a layered structure
It suppresses the influence of information from distant time steps
It eliminates recurrent connections to simplify sequence modeling
It stores and controls information using gated memory mechanisms

Answer : d

10. Which LSTM gate controls how much previous information should be erased?

Reset gate
Input gate
Forget gate
Output gate

Answer : c
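
The role of the forget gate can be seen in the standard LSTM cell-state update c_t = f * c_prev + i * g. The sketch below is a scalar version that takes the gate pre-activations directly as arguments, which is a simplification assumed for illustration (in a real LSTM they are computed from the input and the previous hidden state). A forget gate near 1 keeps the old memory, and a forget gate near 0 erases it.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_cell_update(c_prev, forget_logit, input_logit, candidate_logit):
    """Cell-state update of a scalar LSTM, given the gate pre-activations."""
    f = sigmoid(forget_logit)       # forget gate: fraction of old memory kept
    i = sigmoid(input_logit)        # input gate: fraction of the candidate written
    g = math.tanh(candidate_logit)  # candidate value
    return f * c_prev + i * g

c_prev = 2.0
# Forget gate wide open: the old cell state is kept almost unchanged.
kept = lstm_cell_update(c_prev, forget_logit=10.0, input_logit=-10.0, candidate_logit=0.0)
# Forget gate closed: the old cell state is almost completely erased.
erased = lstm_cell_update(c_prev, forget_logit=-10.0, input_logit=-10.0, candidate_logit=0.0)
print(kept, erased)
```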

11. Why is it reasonable to generate only one output after processing the full message?

Because only the final word matters
Because sentiment depends on the full sequence
Because the intermediate outputs are unavailable
Because the model cannot output multiple values

Answer : b

12. What is a key architectural difference between a GRU and an LSTM?

GRUs have no memory at all
GRUs process all data without recurrence
GRUs merge input and forget functions
GRUs use more gates than an LSTM

Answer : c

13. What is the function of the reset gate in a GRU?

It controls how the hidden state contributes to the final output at each time step
It determines how much new input information is added to the hidden state
It regulates the flow of error gradients during backpropagation through time
It controls how much of the past hidden state is used when forming the new state

Answer : d
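
A scalar GRU step makes the reset gate's role concrete. The sketch below assumes one common formulation (candidate = tanh(w_x * x + w_h * (r * h_prev)), h_new = (1 - z) * h_prev + z * candidate); conventions differ slightly between papers and libraries, and the weights and logits here are hypothetical. With the reset gate near 0 the candidate ignores the past state, and the single update gate z plays the combined role of an LSTM's input and forget gates.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gru_step(x, h_prev, reset_logit, update_logit, w_x=1.0, w_h=1.0):
    """One scalar GRU step; returns (candidate, h_new)."""
    r = sigmoid(reset_logit)   # reset gate: how much of h_prev feeds the candidate
    z = sigmoid(update_logit)  # update gate: blend between old state and candidate
    candidate = math.tanh(w_x * x + w_h * (r * h_prev))
    h_new = (1.0 - z) * h_prev + z * candidate
    return candidate, h_new

x, h_prev = 0.5, 0.9
# Reset gate closed: the candidate depends (almost) only on the current input.
cand_reset, _ = gru_step(x, h_prev, reset_logit=-20.0, update_logit=0.0)
# Reset gate open: the candidate uses the full previous state.
cand_keep, h_new = gru_step(x, h_prev, reset_logit=20.0, update_logit=0.0)
print(cand_reset, cand_keep, h_new)
```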

14. Why do LSTM and GRU gates use the sigmoid activation function?

It speeds up the training process
It removes the need for normalization
It outputs values between 0 and 1
It prevents gradient explosion

Answer : c

15. Which design choice best helps the model capture long-range emotional cues?

Reducing the size of the hidden state representation
Using gated recurrent units to regulate information
Removing earlier time steps to simplify the sequence
Relying only on the most recent inputs for prediction

Answer : b
