Deep Learning – IIT Ropar Week 10 Assignment Answers

2 months ago

Sanket Kumar

Deep Learning – IIT Ropar Week 10 Assignment Answers (Jan-Apr 2026)

Deep Learning – IIT Ropar Week 10

1. After applying a 5 × 5 convolution with stride 1 and padding 2 to a 128 × 128 RGB aerial image, what will be the spatial size of the resulting feature maps?

124 × 124
128 × 128
130 × 130
64 × 64

Answer : b
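
The answer follows from the standard output-size formula, out = floor((n + 2p - k) / s) + 1. A minimal sketch of that arithmetic (the function name conv_output_size is illustrative, not from the course material):

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (n + 2 * padding - kernel) // stride + 1


# Question 1: 128-pixel side, 5 x 5 kernel, stride 1, padding 2
side = conv_output_size(128, kernel=5, stride=1, padding=2)
print(side, "x", side)  # 128 x 128

# Without padding the same kernel would shrink the map to 124 x 124
print(conv_output_size(128, kernel=5))  # 124
```

Padding of 2 on each side exactly offsets the 4 pixels a 5 × 5 kernel would otherwise remove, so the spatial size is preserved.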

2. If the first CNN layer uses 16 different convolution filters, how many feature maps will be produced?

Answer : c
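
Each convolution filter spans all input channels and produces exactly one feature map, so 16 filters give 16 maps. The sketch below assumes this question continues the setup of question 1 (RGB input, 5 × 5 kernels, padding 2); the helper names are hypothetical.

```python
def conv_layer_output_shape(height, width, num_filters, kernel, stride=1, padding=0):
    """Return (channels, height, width) of a conv layer's output."""
    out_h = (height + 2 * padding - kernel) // stride + 1
    out_w = (width + 2 * padding - kernel) // stride + 1
    # One output channel (feature map) per filter, regardless of input channels
    return (num_filters, out_h, out_w)


def conv_layer_parameters(in_channels, num_filters, kernel):
    """Weights plus one bias per filter."""
    return num_filters * (kernel * kernel * in_channels) + num_filters


# Assumed setup: 128 x 128 RGB input, 16 filters of size 5 x 5, padding 2
print(conv_layer_output_shape(128, 128, num_filters=16, kernel=5, padding=2))  # (16, 128, 128)
print(conv_layer_parameters(in_channels=3, num_filters=16, kernel=5))  # 1216
```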

3. After applying a 2 × 2 max-pooling layer with stride 2, what will be the new spatial size of the feature maps?

64 × 64
128 × 128
62 × 62
32 × 32

Answer : a
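
A 2 × 2 window with stride 2 halves each spatial dimension. The following is a minimal pure-Python sketch of max pooling, assuming the 128 × 128 feature maps from question 1 as input; the pixel values are arbitrary placeholders.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Max pooling over a 2D list of numbers (no padding)."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            window = [feature_map[r + i][c + j] for i in range(size) for j in range(size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Placeholder 128 x 128 feature map (values are arbitrary)
feature_map = [[(r * 128 + c) % 7 for c in range(128)] for r in range(128)]
pooled = max_pool2d(feature_map)
print(len(pooled), "x", len(pooled[0]))  # 64 x 64
```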

4. Why are pooling layers particularly useful for the drone vision system?

They increase model parameters and computational cost
They reduce spatial size while retaining key visual features
They remove color information from the input images
They sharpen image boundaries using high-frequency filters

Answer : b

5. Which CNN component mainly provides robustness when objects appear slightly shifted or rotated in drone images?

Convolution filters
Softmax layers
Fully connected layers
Pooling layers

Answer : d
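
A toy illustration of why pooling gives this robustness: if an activation moves by a pixel but stays inside the same pooling window, the pooled output does not change. This is only a sketch with a single hypothetical activation; the invariance holds for small shifts and is far more limited for rotations.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Max pooling over a 2D list of numbers (no padding)."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, cols - size + 1, stride)
        ]
        for r in range(0, rows - size + 1, stride)
    ]


def single_activation(rows, cols, r, c):
    """Grid of zeros with a single 1 at (r, c)."""
    grid = [[0] * cols for _ in range(rows)]
    grid[r][c] = 1
    return grid


original = single_activation(4, 4, 0, 0)
shifted = single_activation(4, 4, 1, 1)   # moved, but still in the same 2 x 2 window
crossing = single_activation(4, 4, 2, 2)  # moved into a different window

print(max_pool2d(original) == max_pool2d(shifted))   # True
print(max_pool2d(original) == max_pool2d(crossing))  # False
```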

6. If masking a small region of an MRI scan causes a large drop in tumor probability, what does this imply?

The region is irrelevant for classification
The region contains important tumor-related features
The CNN is unstable during prediction
The CNN has memorized the training data patterns

Answer : b

7. Which visualization technique is used when image patches are masked to analyze prediction sensitivity?

Dropout applied during training to reduce overfitting
Filter visualization to inspect learned weights
Occlusion mapping used by masking selected regions
Batch normalization applied during inference

Answer : c
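
Occlusion mapping, as described in questions 6 and 7, can be sketched as follows: mask one patch at a time, re-run the model, and record how much the predicted probability drops. The function toy_predict is a hypothetical stand-in for a trained CNN, chosen so that only the top-left 2 × 2 block matters.

```python
def occlusion_map(image, predict, patch=2, fill=0.0):
    """Drop in predict(image) when each patch is replaced by `fill`."""
    base = predict(image)
    rows, cols = len(image), len(image[0])
    drops = []
    for r in range(0, rows, patch):
        drop_row = []
        for c in range(0, cols, patch):
            masked = [row[:] for row in image]  # copy so the input stays intact
            for i in range(r, min(r + patch, rows)):
                for j in range(c, min(c + patch, cols)):
                    masked[i][j] = fill
            drop_row.append(base - predict(masked))
        drops.append(drop_row)
    return drops


def toy_predict(image):
    """Hypothetical model: the score depends only on the top-left 2 x 2 block."""
    return sum(image[i][j] for i in range(2) for j in range(2)) / 4.0


image = [[1.0] * 4 for _ in range(4)]
print(occlusion_map(image, toy_predict))  # [[1.0, 0.0], [0.0, 0.0]]
```

A large drop for a patch marks that region as important to the prediction, which is the reasoning behind the answer to question 6.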

8. Why are first-layer CNN filters easier to interpret visually?

They have fewer parameters than deeper layers
They operate directly on pixel patterns
They do not use nonlinearities during activation
They have higher resolution feature maps

Answer : b

9. A filter that detects vertical edges will respond most strongly to which image regions?

Flat areas with edges
Uniform textures only
Vertical boundaries
Random noise patterns

Answer : c
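
To see this concretely, the sketch below applies a simple 3 × 3 vertical-edge kernel (a Prewitt-style filter, chosen here for illustration) to two tiny synthetic images. The response is nonzero only where a vertical boundary falls inside the window.

```python
def correlate2d_valid(image, kernel):
    """Cross-correlation without padding, as used in CNN 'convolution' layers."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(len(image[0]) - kw + 1)
        ]
        for r in range(len(image) - kh + 1)
    ]


vertical_edge_kernel = [[-1, 0, 1],
                        [-1, 0, 1],
                        [-1, 0, 1]]

# Dark left half, bright right half: a vertical boundary in the middle
vertical_boundary = [[0, 0, 0, 1, 1, 1] for _ in range(5)]
# Dark top, bright bottom: a horizontal boundary only
horizontal_boundary = [[0] * 6, [0] * 6, [1] * 6, [1] * 6, [1] * 6]

print(correlate2d_valid(vertical_boundary, vertical_edge_kernel)[0])    # [0, 3, 3, 0]
print(correlate2d_valid(horizontal_boundary, vertical_edge_kernel)[0])  # [0, 0, 0, 0]
```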

10. If certain neurons activate strongly for tumor images but not for healthy scans, what does this indicate?

Overfitting to the training data distribution
Learning of tumor-specific visual features
Random noise in weights caused by initialization
Model instability during gradient updates

Answer : b

11. Why does VGG use many stacked 3 × 3 convolution layers instead of one large filter?

To create deep representations with fewer parameters
To reduce training data requirements
To eliminate pooling layers entirely
To prevent overfitting through architectural constraints

Answer : a
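
The parameter saving can be checked with simple arithmetic: three stacked 3 × 3 layers cover the same 7 × 7 receptive field as one 7 × 7 layer but use fewer weights. The channel count of 64 below is an assumption for illustration, and biases are ignored.

```python
def conv_weights(kernel, in_channels, out_channels):
    """Number of weights in one conv layer (biases ignored)."""
    return kernel * kernel * in_channels * out_channels


def stacked_receptive_field(kernel, layers):
    """Receptive field of `layers` stacked stride-1 convolutions."""
    return 1 + layers * (kernel - 1)


channels = 64  # assumed channel count, kept constant across layers

stacked = 3 * conv_weights(3, channels, channels)
single = conv_weights(7, channels, channels)

print(stacked_receptive_field(3, layers=3))  # 7
print(stacked)  # 110592
print(single)   # 200704
```

The stacked version also inserts a nonlinearity after each 3 × 3 layer, which is where the "deep representations" part of the answer comes from.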

12. What is the main benefit of Inception modules using parallel filters of different sizes?

Lower memory usage across all CNN layers
Removal of nonlinearities between layers
Faster training due to simpler operations
Capturing features at multiple spatial scales

Answer : d
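
A rough sketch of how parallel branches combine: each branch uses a different kernel size with padding chosen to preserve the spatial size, and the outputs are concatenated along the channel axis. The branch channel counts and the 28 × 28 input size below are illustrative assumptions, and the pooling branch of a real Inception module is omitted.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (n + 2 * padding - kernel) // stride + 1


# name -> (kernel size, padding, output channels); values are illustrative
branches = {
    "1x1": (1, 0, 64),
    "3x3": (3, 1, 128),
    "5x5": (5, 2, 32),
}


def inception_output_shape(height, width, branches):
    """Shape (channels, height, width) after concatenating all branches."""
    spatial_sizes = set()
    total_channels = 0
    for kernel, padding, out_channels in branches.values():
        spatial_sizes.add((
            conv_output_size(height, kernel, padding=padding),
            conv_output_size(width, kernel, padding=padding),
        ))
        total_channels += out_channels
    # Concatenation requires every branch to produce the same spatial size
    if len(spatial_sizes) != 1:
        raise ValueError("branches disagree on spatial size")
    out_h, out_w = spatial_sizes.pop()
    return (total_channels, out_h, out_w)


print(inception_output_shape(28, 28, branches))  # (224, 28, 28)
```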

13. Why are skip connections used in ResNet architectures?

To reduce image resolution in later layers
To help gradients flow through deep networks
To remove convolution layers automatically
To improve pooling operations in the network

Answer : b
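
A toy calculation shows the effect on gradients. Suppose every layer is the scalar map f(x) = w·x with a small weight w (a deliberately simplified assumption, not a real network). A plain stack multiplies the gradient by w at each layer, while a residual layer x + f(x) multiplies it by 1 + w, so the identity path keeps the gradient from vanishing.

```python
def plain_gradient(w, depth):
    """d(output)/d(input) for `depth` stacked layers f(x) = w * x."""
    return w ** depth


def residual_gradient(w, depth):
    """d(output)/d(input) for `depth` stacked layers x + w * x."""
    return (1 + w) ** depth


w, depth = 0.01, 20  # illustrative values

print(plain_gradient(w, depth))     # vanishingly small
print(residual_gradient(w, depth))  # stays close to 1
```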

14. If a deep CNN trains poorly but a ResNet version performs well, what is the most likely reason?

Fewer parameters in ResNet models
Reduced overfitting due to regularization
Skip connections which solve most issues
Larger filters used in ResNet layers

Answer : c

15. To recognize both small distant signs and large nearby signs, which architecture choice is most appropriate?

Inception-style parallel convolutions
Global average pooling across the final feature map
Only 3 × 3 filters in all layers
No pooling across the entire network

Answer : a
