Deep Learning – IIT Ropar Week 11 Assignment Answers (Jan-Apr 2026)
1. Why is an RNN more suitable than a feed-forward neural network for this task?
- Because it trains faster on large text datasets
- Because it removes the need for labeled data
- Because it handles sequences with an internal state
- Because it avoids weight sharing across layers
Answer : c
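The "internal state" in option c can be illustrated with a minimal sketch of a vanilla RNN reduced to a single scalar hidden unit. The function names and the weight values (w_x, w_h, b) below are arbitrary choices made for illustration; they do not come from the assignment.

```python
import math

def rnn_step(x, h_prev, w_x, w_h, b):
    """One step of a scalar vanilla RNN: h_t = tanh(w_x * x_t + w_h * h_prev + b)."""
    return math.tanh(w_x * x + w_h * h_prev + b)

def run_rnn(xs, w_x=0.5, w_h=0.8, b=0.0):
    """Apply the same weights at every step, carrying the hidden state forward."""
    h = 0.0  # initial hidden state
    states = []
    for x in xs:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states

# The same inputs in a different order lead to a different final state,
# which a feed-forward network over an unordered bag of inputs cannot express.
early = run_rnn([1.0, 0.0, 0.0])
late = run_rnn([0.0, 0.0, 1.0])
print(early[-1], late[-1])
```

Because the hidden state is fed back at each step, the final value depends on the order of the inputs and not only on which inputs were seen.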
2. If the model forgets the beginning of long emails, what is the most likely reason?
- The optimizer updates the weights too slowly
- The output layer has too few neurons
- The vocabulary size is not large enough
- The hidden state changes at every time step
Answer : d
3. Which training phenomenon explains why early words have little influence on later predictions?
- The loss function becomes flat during training
- The model memorizes only the recent inputs
- Gradients become smaller over many time steps
- The learning rate decreases automatically
Answer : c
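The shrinking gradient in option c can be sketched numerically. The example below assumes a scalar tanh RNN with illustrative weights (w_x = 0.5, w_h = 0.8, chosen here, not given in the assignment) and accumulates the chain-rule product that links the final hidden state back to the initial one.

```python
import math

def gradient_through_time(xs, w_x=0.5, w_h=0.8):
    """Return d h_t / d h_0 after each step t of a scalar tanh RNN."""
    h = 0.0
    grad = 1.0
    history = []
    for x in xs:
        h = math.tanh(w_x * x + w_h * h)
        # Chain rule factor for one step: d h_t / d h_{t-1} = w_h * (1 - h_t^2)
        grad *= w_h * (1.0 - h * h)
        history.append(grad)
    return history

grads = gradient_through_time([1.0] * 50)
print(grads[0], grads[9], grads[49])
```

With |w_h| below 1, every factor is below 1 in magnitude, so the product decays roughly geometrically with the number of time steps. This is one simple setting in which early inputs end up with very little influence on the weight updates.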
4. If truncated BPTT is used with a window size of 15, what limitation does this introduce?
- The output cannot be computed past 15 steps
- The hidden state is reset after 15 steps
- Only the last 15 steps influence learning
- The network stops training after 15 steps
Answer : c
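A rough sketch of what the window does, again assuming a scalar tanh RNN with made-up weights: the forward pass still runs over the whole sequence and the hidden state is carried through, but backpropagation stops after `window` steps, so earlier inputs receive a gradient of exactly zero. The function name and parameters are hypothetical.

```python
import math

def input_sensitivities(xs, window=None, w_x=0.5, w_h=0.8):
    """Return d h_T / d x_t for every step t of a scalar tanh RNN.

    If window is given, backpropagation is cut off after that many steps.
    """
    # Forward pass over the full sequence, storing every hidden state.
    hs = []
    h = 0.0
    for x in xs:
        h = math.tanh(w_x * x + w_h * h)
        hs.append(h)

    T = len(xs)
    grads = [0.0] * T
    dh = 1.0  # d h_T / d h_T
    steps = T if window is None else min(window, T)
    # Backward pass, limited to the last `steps` time steps.
    for t in range(T - 1, T - 1 - steps, -1):
        local = 1.0 - hs[t] ** 2     # derivative of tanh at step t
        grads[t] = dh * local * w_x  # d h_T / d x_t
        dh = dh * local * w_h        # gradient passed on to h_{t-1}
    return grads

full = input_sensitivities([1.0] * 40)
truncated = input_sensitivities([1.0] * 40, window=15)
print(sum(1 for g in truncated if g != 0.0))
```

The output at step 40 is still computed and the state is not reset, which is why the first, second, and fourth options do not describe the limitation.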
5. Which architectural change would best help this system retain important early information?
- Applying stronger regularization
- Reducing the number of layers
- Using a gated recurrent model
- Increasing the number of output units
Answer : c
6. Why is sequence modeling required for this video-based task?
- Because different frames have different resolutions
- Because motion depends on changes over time
- Because videos always contain multiple objects
- Because the images are hard to process
Answer : b
7. What does exploding gradient behavior typically cause during training?
- Constant output values
- Reduced model capacity
- Very large weight updates
- Very slow convergence
Answer : c
8. Which technique is commonly used to control exploding gradients?
- Increasing the learning rate
- Limiting gradient magnitude
- Reducing the batch size
- Removing nonlinearities
Answer : b
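"Limiting gradient magnitude" usually refers to gradient clipping. A minimal sketch of clipping by global L2 norm, written in plain Python over a flat list of gradient values; the function name and the threshold are illustrative assumptions, not part of the assignment.

```python
import math

def clip_by_norm(grads, max_norm):
    """Rescale grads so their L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm or norm == 0.0:
        return list(grads)  # already small enough, leave unchanged
    scale = max_norm / norm
    return [g * scale for g in grads]  # same direction, smaller length

exploding = [30.0, 40.0]  # L2 norm is 50
clipped = clip_by_norm(exploding, max_norm=5.0)
print(clipped)
```

Clipping keeps the direction of the update and only shrinks its length, so a single very large gradient cannot produce the very large weight update described in question 7.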
9. Why would an LSTM outperform a vanilla RNN in this application?
- It reorganizes time-dependent inputs into a layered structure
- It suppresses the influence of information from distant time steps
- It eliminates recurrent connections to simplify sequence modeling
- It stores and controls information using gated memory mechanisms
Answer : d
10. Which LSTM gate controls how much previous information should be erased?
- Reset gate
- Input gate
- Forget gate
- Output gate
Answer : c
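The role of the forget gate can be seen in a sketch of one LSTM step, reduced to scalars. The parameter layout (a dict mapping gate names to hypothetical (w_x, w_h, b) triples) is an assumption made for this example. Note that the reset gate listed in the first option belongs to the GRU, not the LSTM.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell; p maps gate names to (w_x, w_h, b)."""
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))       # how much of the old cell state to keep
    i = sigmoid(pre("input"))        # how much new candidate to write
    o = sigmoid(pre("output"))       # how much of the cell state to expose
    g = math.tanh(pre("candidate"))  # candidate cell content
    c = f * c_prev + i * g           # forget gate multiplies the previous cell state
    h = o * math.tanh(c)
    return h, c, f

def params(forget_bias):
    """Illustrative parameters: all weights zero, only the forget bias varies."""
    zero = (0.0, 0.0, 0.0)
    return {"forget": (0.0, 0.0, forget_bias), "input": zero,
            "output": zero, "candidate": zero}

_, c_keep, f_keep = lstm_step(1.0, 0.0, 2.0, params(forget_bias=20.0))
_, c_erase, f_erase = lstm_step(1.0, 0.0, 2.0, params(forget_bias=-20.0))
print(f_keep, c_keep, f_erase, c_erase)
```

A forget gate near 1 carries the previous cell state through almost unchanged, while a gate near 0 erases it.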
11. Why is it reasonable to generate only one output after processing the full message?
- Because only the final word matters
- Because sentiment depends on the full sequence
- Because the intermediate outputs are unavailable
- Because the model cannot output multiple values
Answer : b
12. What is a key architectural difference between a GRU and an LSTM?
- GRUs have no memory at all
- GRUs process all data without recurrence
- GRUs merge input and forget functions
- GRUs use more gates than an LSTM
Answer : c
13. What is the function of the reset gate in a GRU?
- It controls how the hidden state contributes to the final output at each time step
- It determines how much new input information is added to the hidden state
- It regulates the flow of error gradients during backpropagation through time
- It controls how much past hidden state is used when forming the new state
Answer : d
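A scalar sketch of one GRU step shows both the reset gate from this question and the merged input/forget behaviour from question 12. The parameter layout is hypothetical, and the code assumes the convention h = (1 - z) * h_prev + z * h_cand; some references swap the roles of z and 1 - z.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gru_step(x, h_prev, p):
    """One step of a scalar GRU; p maps names to (w_x, w_h, b)."""
    wz_x, wz_h, bz = p["update"]
    wr_x, wr_h, br = p["reset"]
    wc_x, wc_h, bc = p["candidate"]
    z = sigmoid(wz_x * x + wz_h * h_prev + bz)  # update gate
    r = sigmoid(wr_x * x + wr_h * h_prev + br)  # reset gate
    # The reset gate scales the past state before it enters the candidate.
    h_cand = math.tanh(wc_x * x + wc_h * (r * h_prev) + bc)
    # A single update gate both discards old state (1 - z) and admits new state (z).
    h = (1.0 - z) * h_prev + z * h_cand
    return h, r, h_cand

def params(reset_bias):
    """Illustrative parameters: only the reset bias varies."""
    return {"update": (0.0, 0.0, 0.0),
            "reset": (0.0, 0.0, reset_bias),
            "candidate": (1.0, 1.0, 0.0)}

_, r_open, cand_open = gru_step(0.5, 0.9, params(reset_bias=20.0))
_, r_closed, cand_closed = gru_step(0.5, 0.9, params(reset_bias=-20.0))
print(r_open, cand_open, r_closed, cand_closed)
```

With the reset gate near 0, the candidate state is built almost entirely from the current input. With the gate near 1, the full past hidden state is used.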
14. Why do LSTM and GRU gates use the sigmoid activation function?
- It speeds up the training process
- It removes the need for normalization
- It outputs values between 0 and 1
- It prevents gradient explosion
Answer : c
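A short sketch of why a 0-to-1 output is useful for gating: the sigmoid value acts as a soft switch that is multiplied with the signal. The `gate` helper is a hypothetical name used only for this illustration.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def gate(value, pre_activation):
    """Scale a value by a gate in [0, 1]: near 0 blocks it, near 1 passes it."""
    return sigmoid(pre_activation) * value

for z in (-50.0, -5.0, 0.0, 5.0, 50.0):
    print(z, sigmoid(z), gate(3.0, z))
```

Intermediate gate values let the network pass a fraction of the signal, which is what allows the forget, input, output, update, and reset gates to be learned smoothly by gradient descent.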
15. Which design choice best helps the model capture long-range emotional cues?
- Reducing the size of the hidden state representation
- Using gated recurrent units to regulate information
- Removing earlier time steps to simplify the sequence
- Relying only on the most recent inputs for prediction
Answer : b