Deep Learning – IIT Ropar Week 10 Assignment Answers (Jan-Apr 2026)


1. After applying a 5 × 5 convolution with stride 1 and padding 2 to a 128 × 128 RGB aerial image, what will be the spatial size of the resulting feature maps?

  • 124 × 124
  • 128 × 128
  • 130 × 130
  • 64 × 64
Answer: b (128 × 128)
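
The result follows from the standard output-size formula for a convolution, floor((N + 2P - K) / S) + 1, where N is the input size, K the kernel size, P the padding and S the stride. A minimal sketch in Python (the function name conv_output_size is illustrative and not part of the assignment):

```python
def conv_output_size(n: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial size along one dimension after a convolution."""
    return (n + 2 * padding - kernel) // stride + 1


# Question 1: 128-pixel input, 5 x 5 kernel, stride 1, padding 2
size = conv_output_size(128, kernel=5, stride=1, padding=2)
print(f"{size} x {size}")  # 128 x 128

# Without padding, the same kernel would shrink the map to 124 x 124
print(conv_output_size(128, kernel=5))  # 124
```

Padding of 2 exactly offsets the 4 pixels a 5 × 5 kernel would otherwise trim, which is why the spatial size is preserved.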

2. If the first CNN layer uses 16 different convolution filters, how many feature maps will be produced?

  • 3
  • 5
  • 16
  • 128
Answer: c (16)
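
Each filter produces exactly one feature map, so the output depth equals the number of filters regardless of the 3 input channels. The sketch below assumes the 5 × 5 kernel and size-preserving padding from question 1 also apply to this layer and that each filter has one bias term; those are assumptions for illustration, and conv_layer_summary is a hypothetical helper name.

```python
def conv_layer_summary(in_channels, num_filters, kernel, height, width):
    """Return (output shape, learnable parameter count) for a size-preserving conv layer."""
    output_shape = (num_filters, height, width)         # one feature map per filter
    weights_per_filter = kernel * kernel * in_channels  # each filter spans all input channels
    params = num_filters * (weights_per_filter + 1)     # +1 for the bias of each filter
    return output_shape, params


shape, params = conv_layer_summary(in_channels=3, num_filters=16, kernel=5,
                                   height=128, width=128)
print(shape)   # (16, 128, 128)
print(params)  # 1216
```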

3. After applying a 2 × 2 max-pooling layer with stride 2, what will be the new spatial size of the feature maps?

  • 64 × 64
  • 128 × 128
  • 62 × 62
  • 32 × 32
Answer: a (64 × 64)
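
Pooling uses the same kind of size formula without padding: floor((N - W) / S) + 1 for window W and stride S. A short sketch, assuming the input is the 128 × 128 feature map from question 1 (pool_output_size is an illustrative name):

```python
def pool_output_size(n: int, window: int, stride: int) -> int:
    """Spatial size along one dimension after pooling with no padding."""
    return (n - window) // stride + 1


pooled = pool_output_size(128, window=2, stride=2)
print(f"{pooled} x {pooled}")  # 64 x 64
```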

4. Why are pooling layers particularly useful for the drone vision system?

  • They increase model parameters and computational cost
  • They reduce spatial size while retaining key visual features
  • They remove color information from the input images
  • They sharpen image boundaries using high-frequency filters
Answer: b (They reduce spatial size while retaining key visual features)

5. Which CNN component mainly provides robustness when objects appear slightly shifted or rotated in drone images?

  • Convolution filters
  • Softmax layers
  • Fully connected layers
  • Pooling layers
Answer: d (Pooling layers)
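
The robustness that pooling provides is local: a strong activation can move anywhere inside one pooling window without changing the pooled output. The toy example below illustrates this with a hand-written 2 × 2 max pool on a 4 × 4 map (the data and the helper max_pool_2x2 are made up for illustration). It also shows the limit of the effect: a shift that crosses a window boundary does change the output.

```python
def max_pool_2x2(fmap):
    """2 x 2 max pooling with stride 2 on a list-of-lists feature map."""
    return [
        [max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
         for c in range(0, len(fmap[0]) - 1, 2)]
        for r in range(0, len(fmap) - 1, 2)
    ]


original = [
    [0, 0, 0, 0],
    [0, 9, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
shifted_within_window = [   # activation moved one pixel up and left
    [9, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
shifted_across_windows = [  # activation moved one pixel right, into the next window
    [0, 0, 0, 0],
    [0, 0, 9, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

print(max_pool_2x2(original))                # [[9, 0], [0, 0]]
print(max_pool_2x2(shifted_within_window))   # [[9, 0], [0, 0]]
print(max_pool_2x2(shifted_across_windows))  # [[0, 9], [0, 0]]
```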

6. If masking a small region of an MRI scan causes a large drop in tumor probability, what does this imply?

  • The region is irrelevant for classification
  • The region contains important tumor-related features
  • The CNN is unstable during prediction
  • The CNN has memorized the training data patterns
Answer: b (The region contains important tumor-related features)

7. Which visualization technique is used when image patches are masked to analyze prediction sensitivity?

  • Dropout applied during training to reduce overfitting
  • Filter visualization to inspect learned weights
  • Occlusion mapping used by masking selected regions
  • Batch normalization applied during inference
Answer: c (Occlusion mapping used by masking selected regions)
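
Occlusion mapping, as described in questions 6 and 7, can be sketched as follows: mask one patch at a time, re-score the image, and record how much the score drops. This is a minimal illustration, not a reference implementation. The function toy_tumor_score is a hypothetical stand-in for a trained CNN, and the 4 × 4 "scan" is invented data.

```python
import copy


def occlusion_map(image, score_fn, patch=2, fill=0.0):
    """Mask each patch in turn and record the drop in the model score."""
    baseline = score_fn(image)
    rows, cols = len(image), len(image[0])
    heatmap = []
    for r in range(0, rows, patch):
        row_drops = []
        for c in range(0, cols, patch):
            masked = copy.deepcopy(image)
            for i in range(r, min(r + patch, rows)):
                for j in range(c, min(c + patch, cols)):
                    masked[i][j] = fill
            row_drops.append(baseline - score_fn(masked))
        heatmap.append(row_drops)
    return heatmap


def toy_tumor_score(image):
    """Hypothetical stand-in for a CNN: mean intensity of the lower-right 2 x 2 block."""
    block = [image[i][j] for i in (2, 3) for j in (2, 3)]
    return sum(block) / len(block)


scan = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
heatmap = occlusion_map(scan, toy_tumor_score)
for row in heatmap:
    print(row)
```

The only large drop appears for the lower-right patch, which is the region the toy scorer depends on. With a real model, a large drop for a patch is read the same way: that region carries features the prediction relies on.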

8. Why are first-layer CNN filters easier to interpret visually?

  • They have fewer parameters than deeper layers
  • They operate directly on pixel patterns
  • They do not use nonlinearities during activation
  • They have higher resolution feature maps
Answer: b (They operate directly on pixel patterns)

9. A filter that detects vertical edges will respond most strongly to which image regions?

  • Flat areas with edges
  • Uniform textures only
  • Vertical boundaries
  • Random noise patterns
Answer: c (Vertical boundaries)
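
To see why, the sketch below applies a Sobel-style horizontal-gradient kernel (a common hand-designed vertical-edge detector, used here as a stand-in for a learned filter) to three tiny test images. The images and helper names are illustrative.

```python
SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]


def correlate_valid(image, kernel):
    """Cross-correlation with no padding, as computed by a CNN convolution layer."""
    k = len(kernel)
    return [
        [sum(kernel[i][j] * image[r + i][c + j] for i in range(k) for j in range(k))
         for c in range(len(image[0]) - k + 1)]
        for r in range(len(image) - k + 1)
    ]


def peak_response(image):
    """Largest absolute filter response anywhere in the image."""
    return max(abs(v) for row in correlate_valid(image, SOBEL_X) for v in row)


vertical_boundary = [[0, 0, 1, 1] for _ in range(4)]          # dark left, bright right
horizontal_boundary = [[0] * 4, [0] * 4, [1] * 4, [1] * 4]    # dark top, bright bottom
flat_region = [[1] * 4 for _ in range(4)]

print(peak_response(vertical_boundary))    # 4
print(peak_response(horizontal_boundary))  # 0
print(peak_response(flat_region))          # 0
```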

10. If certain neurons activate strongly for tumor images but not for healthy scans, what does this indicate?

  • Overfitting to the training data distribution
  • Learning of tumor-specific visual features
  • Random noise in weights caused by initialization
  • Model instability during gradient updates
Answer: b (Learning of tumor-specific visual features)

11. Why does VGG use many stacked 3 × 3 convolution layers instead of one large filter?

  • To create deep representations with fewer parameters
  • To reduce training data requirements
  • To eliminate pooling layers entirely
  • To prevent overfitting through architectural constraints
Answer: a (To create deep representations with fewer parameters)
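
The parameter saving can be checked with simple arithmetic. The sketch below compares stacked 3 × 3 layers with a single larger filter covering the same receptive field, assuming stride 1, the same number of input and output channels in every layer, and no biases. The channel width of 64 is chosen only for illustration.

```python
def conv_weights(kernel, channels, layers=1):
    """Weight count for `layers` stacked kernel x kernel convs with `channels` in and out."""
    return layers * kernel * kernel * channels * channels


def receptive_field(kernel, layers):
    """Receptive field of `layers` stacked stride-1 convolutions."""
    return 1 + layers * (kernel - 1)


C = 64  # assumed channel width

print(receptive_field(3, layers=2), conv_weights(3, C, layers=2))  # 5 73728
print(receptive_field(5, layers=1), conv_weights(5, C))            # 5 102400
print(receptive_field(3, layers=3), conv_weights(3, C, layers=3))  # 7 110592
print(receptive_field(7, layers=1), conv_weights(7, C))            # 7 200704
```

The stacked version also inserts a nonlinearity after every 3 × 3 layer, which is where the "deep representations" part of the answer comes from.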

12. What is the main benefit of Inception modules using parallel filters of different sizes?

  • Lower memory usage across all CNN layers
  • Removal of nonlinearities between layers
  • Faster training due to simpler operations
  • Capturing features at multiple spatial scales
Answer: d (Capturing features at multiple spatial scales)
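
The idea of parallel branches can be illustrated with a deliberately simplified 1-D toy: the same input is filtered at several widths, and the size-preserving outputs are stacked as channels. Real Inception modules use learned 2-D filters, 1 × 1 bottleneck convolutions and a pooling branch. The fixed averaging filters and the name inception_like_block below are assumptions made for the sketch.

```python
def same_conv1d(signal, kernel):
    """1-D cross-correlation with zero padding so output length equals input length."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(w * padded[i + j] for j, w in enumerate(kernel))
        for i in range(len(signal))
    ]


def inception_like_block(signal, kernel_sizes=(1, 3, 5)):
    """Apply averaging filters of several widths in parallel and stack them as channels."""
    return [same_conv1d(signal, [1.0 / k] * k) for k in kernel_sizes]


signal = [0, 0, 0, 1, 0, 0, 0]  # a single small feature at position 3
channels = inception_like_block(signal)

print(len(channels), [len(ch) for ch in channels])  # 3 [7, 7, 7]
for width, ch in zip((1, 3, 5), channels):
    # number of positions at which each branch "sees" the feature
    print(width, sum(1 for v in ch if v > 0))
```

Because every branch keeps the spatial size, the outputs can be concatenated along the channel axis, and the next layer can draw on fine and coarse context at the same location.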

13. Why are skip connections used in ResNet architectures?

  • To reduce image resolution in later layers
  • To help gradients flow through deep networks
  • To remove convolution layers automatically
  • To improve pooling operations in the network
Answer: b (To help gradients flow through deep networks)
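
A scalar toy model shows the effect. A residual block computes y = x + F(x), so its local derivative is 1 + F'(x) instead of F'(x), and the backpropagated gradient is the product of these local derivatives over all layers. The sketch assumes every layer has the same small derivative F'(x) = 0.01. Real networks have Jacobian matrices that vary per layer, so this only illustrates the mechanism.

```python
def gradient_through_stack(layer_derivative, depth, residual):
    """Product of local derivatives across `depth` identical scalar layers."""
    grad = 1.0
    for _ in range(depth):
        # plain layer: dy/dx = F'(x); residual block: dy/dx = 1 + F'(x)
        local = layer_derivative + (1.0 if residual else 0.0)
        grad *= local
    return grad


plain = gradient_through_stack(0.01, depth=50, residual=False)
with_skips = gradient_through_stack(0.01, depth=50, residual=True)

print(f"plain stack:    {plain:.3e}")       # vanishingly small
print(f"residual stack: {with_skips:.3f}")  # stays on the order of 1
```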

14. If a deep CNN trains poorly but a ResNet version performs well, what is the most likely reason?

  • Fewer parameters in ResNet models
  • Reduced overfitting due to regularization
  • Skip connections that ease the optimization of very deep networks
  • Larger filters used in ResNet layers
Answer: c (Skip connections)

15. To recognize both small distant signs and large nearby signs, which architecture choice is most appropriate?

  • Inception-style parallel convolutions
  • Global average pooling across the final feature map
  • Only 3 × 3 filters in all layers
  • No pooling across the entire network
Answer: a (Inception-style parallel convolutions)