







Course: CS 725 – Instructor: Preethi Jyothi – Due date: 11:59 pm, October 28, 2024
General Instructions
assgmt2/
|
+- nn_template.py
+- part2.pdf
+- loss_bgd_1.png, loss_bgd_2.png, loss_bgd_3.png
+- loss_adam.png
+- convolution_1d_template.py
+- template_2dconv.py
+- part5.py [EXTRA CREDIT]
+- kaggle.csv [EXTRA CREDIT]
Compress your submission directory using the command: tar -cvzf [rollno1]_[rollno2].tgz assgmt2 and upload this .tgz to Moodle. Make sure the filename is the roll numbers of all team members delimited by "_". This submission is due on or before 11:59 pm on Oct 28, 2024. No extensions will be entertained.
Part I: Implement a Feedforward Neural Network (25 points)
For this problem, you will implement a feedforward neural network training algorithm from scratch.
nn_template.py outlines the structure of the neural network with support for various activation functions and optimizers. Your task is to complete the missing portions of the code, labeled as TODOs in nn_template.py. This neural network is designed to classify data into binary classes (denoted by 0, 1).
Firstly, implement the sigmoid and tanh functions and their derivatives. These will be used in both the forward and the backward passes of the network. Complete the missing functions sigmoid(x) in TODO 1a, sigmoid_derivative(x) in TODO 1b, tanh(x) in TODO 1c and tanh_derivative(x) in TODO 1d. relu(x) and relu_derivative(x) are already implemented. Note that the sigmoid function is $\sigma(x) = \frac{1}{1 + e^{-x}}$, and its derivative is $\sigma'(x) = \sigma(x) \cdot (1 - \sigma(x))$. The tanh function is $\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$, and its derivative is $\tanh'(x) = 1 - \tanh^{2}(x)$. [2 pts]
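For reference, here is a minimal numpy sketch of these four functions. The signatures are assumed to match nn_template.py; some templates pass the pre-activation x to the derivative functions (as assumed here), while others pass the already-activated value, so check the template's docstrings.

    import numpy as np

    def sigmoid(x):
        # sigma(x) = 1 / (1 + e^(-x))
        return 1.0 / (1.0 + np.exp(-x))

    def sigmoid_derivative(x):
        s = sigmoid(x)
        return s * (1.0 - s)            # sigma'(x) = sigma(x) (1 - sigma(x))

    def tanh(x):
        return np.tanh(x)               # (e^x - e^-x) / (e^x + e^-x)

    def tanh_derivative(x):
        return 1.0 - np.tanh(x) ** 2    # tanh'(x) = 1 - tanh^2(x)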
Next, implement the following functions within the NN class.
The binary cross-entropy loss is

$$L(y, \hat{y}) = -\left[\, y \log(\hat{y}) + (1 - y) \log(1 - \hat{y}) \,\right]$$

The derivative of this loss with respect to $\hat{y}$ is:

$$\frac{\partial L}{\partial \hat{y}} = -\frac{y}{\hat{y}} + \frac{1 - y}{1 - \hat{y}}$$
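A minimal numpy sketch of this loss and its derivative follows; the epsilon clipping is an implementation choice for numerical stability, not part of the formula above.

    import numpy as np

    def bce_loss(y, y_hat, eps=1e-12):
        # Clip predictions away from 0 and 1 to avoid log(0).
        y_hat = np.clip(y_hat, eps, 1 - eps)
        return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

    def bce_loss_derivative(y, y_hat, eps=1e-12):
        y_hat = np.clip(y_hat, eps, 1 - eps)
        return -y / y_hat + (1 - y) / (1 - y_hat)   # dL/dy_hat, elementwise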
train: Over num_epochs epochs, the function train runs the forward and backward passes, computes gradients for all the training examples in a batch (stored in X and y), and updates the weights using an optimizer. The training loop also calculates train and test losses after every epoch.
If your implementation is correct, your code will converge in fewer than 30 epochs using batch_size = 100, and your test accuracy will be a perfect 1.0. During evaluation, we will also check your code on a new dataset.
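As a sanity check for your training loop, the following self-contained toy analogue runs batch gradient descent on a single sigmoid unit (logistic regression, the simplest special case of such a network) over linearly separable data; the data and hyperparameters are illustrative only.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(float)        # linearly separable toy labels
    W, b, lr = np.zeros(2), 0.0, 0.5

    for epoch in range(30):
        y_hat = 1.0 / (1.0 + np.exp(-(X @ W + b)))   # forward pass
        grad = y_hat - y                             # dL/dz for sigmoid + BCE
        W -= lr * (X.T @ grad) / len(X)              # gradient step on weights
        b -= lr * grad.mean()                        # gradient step on bias
        loss = -np.mean(y * np.log(y_hat + 1e-12)
                        + (1 - y) * np.log(1 - y_hat + 1e-12))
    print(f"loss after 30 epochs: {loss:.3f}")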
Part II: Explainability in Neural Networks (20 points)
All modern neural networks, despite exhibiting remarkable performance, lack interpretability or explainability.[a] In this section, we will aim to design neural networks for two toy datasets that are both optimal (in not needing more layers than required for perfect separability) and interpretable.
You are given two binary classification datasets shown in Figure 2 and Figure 3. Points within the blue regions are labeled 1 and the rest are labeled 0. Assume that points on the boundaries are labeled 1. Design an optimal neural network for each dataset that perfectly classifies the points, and provide written explanations of what each neuron is aiming to do.
Note the following points when designing your neural network:
$$g(x; T) = \begin{cases} 1 & \text{if } x \ge T \\ 0 & \text{otherwise} \end{cases}$$
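As a hypothetical illustration (not the solution to either dataset): a single threshold neuron with weights (1, 1) and threshold 4 fires exactly on the half-plane $x_1 + x_2 \ge 4$, and a second layer can combine such half-plane detectors into bands or polygons.

    import numpy as np

    def g(x, T):
        # Threshold unit from the problem statement: 1 if x >= T, else 0.
        return (np.asarray(x) >= T).astype(int)

    # Hypothetical single neuron detecting the half-plane x1 + x2 >= 4.
    X = np.array([[1.0, 2.0], [3.0, 1.0], [5.0, 4.0]])
    print(g(X @ np.array([1.0, 1.0]), T=4))  # -> [0 1 1]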
(Figure: decision boundaries $x_1 + x_2 = 1$, $x_1 + x_2 = 4$, $x_1 + x_2 = 6$, and $x_1 + x_2 = 9$ in the $(x_1, x_2)$ plane, with two candidate network diagrams, (A) and (B), whose neurons use the threshold unit $g(x; T)$.)
Figure 2: Bands of blue [8 pts]
Figure 3: Catch the star [12 pts]
Here’s a sample problem and its solution to illustrate what we expect in your answer. Note that you need to submit one pdf file titled part2.pdf that contains drawings of both neural networks and explanations for the hidden neurons. Feel free to use your code written in Part I to validate your answers. However, DO NOT submit any of these files. We will be grading solely based on your written solutions in the pdf.
[a] There is an entire sub-field of deep learning that focuses on explainability. Here's a survey paper.
Part IV: De-convoluting Convolutions (30 points)
1D Convolutions. The 1D convolution in machine learning is actually the discrete convolution between two functions f and g, defined on the set of integers, Z. The convolution of f and g is defined as:
$$(f * g)[n] = \sum_{m=-\infty}^{\infty} f[m]\, g[n - m]$$
In the context of machine learning, since we deal with finite-length arrays, the sum above is computed only over those indices that lie within the bounds of the arrays. (For indices outside the bounds, say an m greater than the length of f, the corresponding value of the discrete function is taken to be zero.) We will call f the "input" of the convolution, and g the "kernel" or "filter" that is performing the convolution.
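As a quick numerical illustration of this definition, numpy's np.convolve computes exactly this sum for finite arrays (the example values are illustrative):

    import numpy as np

    # (f * g)[n] = sum_m f[m] * g[n - m], with out-of-range terms taken as zero.
    f = np.array([1.0, 2.0, 3.0])
    g = np.array([0.0, 1.0, 0.5])
    print(np.convolve(f, g))  # full padding, length 3 + 3 - 1 = 5 -> [0. 1. 2.5 4. 1.5]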
Padding. Let us quickly walk you through the concept of padding. Depending on what we want the output size to be, we can pad the input of the convolution (see Figure 4).
Stride. Next, let us look at stride. The stride of a convolution dictates how many positions the kernel shifts at each step. Stride is typically used to reduce the dimensions of the input; see Figure 5. Note that the stride and kernel size are two independent parameters: if the kernel size in Figure 5 were 4, and assuming valid padding, the output size would be 2 × 2. Finally, if f and g are arrays of length $L_f$ and $L_g$ respectively, then the convolution of f and g with stride 1 and full padding has length $L_f + L_g - 1$.
(A) 1D Conv. Implement a function Convolution_1D in the file convolution_1d_template.py that takes in two 1-dimensional numpy arrays and computes the 1D convolution of the two arrays, with a given stride and 'valid' or 'full' as possible padding modes. [6 pts]
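One possible shape for this function is sketched below, under the assumption that 'valid' computes only full overlaps and 'full' zero-pads f by len(g) − 1 on both sides; the template's exact signature and conventions may differ.

    import numpy as np

    def convolution_1d(f, g, stride=1, padding="valid"):
        # Sketch only; the template's Convolution_1D may use a different signature.
        f, g = np.asarray(f, dtype=float), np.asarray(g, dtype=float)
        if padding == "full":
            f = np.pad(f, len(g) - 1)        # zero-pad so every overlap is counted
        g_flipped = g[::-1]                  # implements (f * g)[n] = sum_m f[m] g[n - m]
        positions = range(0, len(f) - len(g) + 1, stride)
        return np.array([np.dot(f[i:i + len(g)], g_flipped) for i in positions])

    # Matches np.convolve in "full" mode with stride 1:
    print(convolution_1d([1, 2, 3], [0, 1, 0.5], padding="full"))  # [0. 1. 2.5 4. 1.5]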
Figure 4: Full, same and valid padding
Figure 5: Illustration of a convolution with stride 2
(B) Two dice. You are given two dice, an n faced die and an m faced die. Both dice are biased, that is, the probabilities of landing on each face are not equal. You are given the probability mass functions of the two dice as two numpy arrays. For example, consider a 6-faced die A and a 3-faced die B with the following probability mass functions:
pA = [0.1, 0.2, 0.3, 0.1, 0.1, 0.2]
pB = [0.3, 0.4, 0.3]
The two dice are rolled together. Let us try to compute the probability that the sum of the faces of the two dice is equal to k. As a concrete example, for dice A and B above, let us try to compute the probability that the sum of the faces is equal to 7. The possibilities and their associated probabilities are:
$(6_A, 1_B)$: 0.2 × 0.3 = 0.06
$(5_A, 2_B)$: 0.1 × 0.4 = 0.04
$(4_A, 3_B)$: 0.1 × 0.3 = 0.03

Thus, P(sum = 7) = 0.06 + 0.04 + 0.03 = 0.13.
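Note that this is exactly the 1D convolution from part (A): the pmf of the sum of the two dice is the full convolution of the two pmfs. A quick check with numpy (np.convolve is used here only for illustration):

    import numpy as np

    pA = np.array([0.1, 0.2, 0.3, 0.1, 0.1, 0.2])  # P(A = 1), ..., P(A = 6)
    pB = np.array([0.3, 0.4, 0.3])                  # P(B = 1), ..., P(B = 3)

    # Index k of the full convolution corresponds to P(A + B = k + 2).
    p_sum = np.convolve(pA, pB)
    print(p_sum[7 - 2])  # P(sum = 7) = 0.13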
Figure 6: Illustration of edge detection using convolution.
Figure 7: Illustration of noise removal.
(E) Image Denoising. Various filters can be used to pick up high- and low-frequency components of an image; thus, they can be used for image denoising. This is because noise in images is usually high frequency, and certain filters can be used to remove this high-frequency component. You are given an image with noise added to it, noisycutebird.png. Using only our helper functions load_image and save_image and the numpy library, write a function remove_noise that removes this noise while maintaining the sharpness of the image edges; see Figure 7. This function must take a square image patch as input and return a single pixel value. Pass it to your movePatchOverImg function with the provided image and save the resulting image. [2 pts]
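One standard choice (a sketch, not necessarily the intended filter) is a median filter, which suppresses salt-and-pepper-style noise while preserving edges better than a mean filter; the patch function would then simply be:

    import numpy as np

    def remove_noise(patch):
        # Median of a square patch: robust to outlier (noisy) pixels, so edges
        # survive better than under a mean filter. Returns one pixel value.
        return np.median(patch)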
(F) Unsharp masking. Unsharp masking is a technique for creating the illusion of a sharp image by enhancing the intensity of the edges of the image. A mask of the edges is created and added to the image. We will do this using Gaussian blur, which is the 2D convolution of an image with a Gaussian kernel.
$$\mathrm{gaussian}(x, y) = \frac{1}{2\pi\sigma^{2}} \exp\left( -\frac{\left(x - \frac{size - 1}{2}\right)^{2} + \left(y - \frac{size - 1}{2}\right)^{2}}{2\sigma^{2}} \right)$$
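A sketch of the kernel construction and the unsharp-masking step follows. scipy's convolve2d is used purely for illustration (the assignment restricts you to numpy and your own 2D convolution), and amount is a hypothetical sharpening parameter.

    import numpy as np
    from scipy.signal import convolve2d  # illustration only; use your own 2D conv

    def gaussian_kernel(size, sigma):
        # Centered grid: x, y run over 0..size-1, shifted by (size - 1)/2 as in the formula.
        ax = np.arange(size) - (size - 1) / 2.0
        xx, yy = np.meshgrid(ax, ax)
        k = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)
        return k / k.sum()  # normalize so blurring preserves overall brightness

    def unsharp_mask(img, size=5, sigma=1.0, amount=1.0):
        blurred = convolve2d(img, gaussian_kernel(size, sigma),
                             mode="same", boundary="symm")
        return img + amount * (img - blurred)  # add the edge mask back to the image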
Figure 8: Illustration of unsharp masking
Kaggle Competition: Extra Credit (5 points)
Challenge: Classification from Image Features.
Overview. In this Kaggle competition, you will solve a realistic classification task: predict a class label (0 to 99) from a set of visual features extracted from the publicly available ImageNet dataset. The final model will be evaluated on a test dataset via Kaggle, and test performance will be measured using accuracy.

Competition link: You can join the competition on Kaggle: IIT Bombay CS 725 Assignment 2 (Autumn 2024). Please sign up on Kaggle using your IITB LDAP email ID, with your Kaggle "Display Name" set to the roll number of any member of your team. This is important for us to identify you on the leaderboard.

Dataset description. You are given three CSV files: train.csv (labeled training data), a test file containing features without labels, and sample.csv (the expected submission format).
Task description. Implement a classification model for the given problem. You are free to use any of the predefined neural network layers from PyTorch or other ML libraries, with any choice of optimizers and regularization techniques. You do not need to stick to the code you've written in Part I. Tune the hyperparameters on a held-out set from train.csv to achieve the best model performance on the test set. Predict the target label on the test dataset.
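A minimal PyTorch starting point is sketched below; the input feature dimension (512) and the architecture are assumptions to adapt to the actual columns in train.csv.

    import torch
    import torch.nn as nn

    # Hypothetical feature dimension; replace 512 with the real number of features.
    model = nn.Sequential(
        nn.Linear(512, 256), nn.ReLU(), nn.Dropout(0.2),
        nn.Linear(256, 100),  # 100 output classes (labels 0..99)
    )
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()  # expects raw logits and integer class labels

    def train_step(x, y):
        # One gradient step on a batch of features x and labels y.
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
        return loss.item()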
Evaluation. The performance of your model will be evaluated based on the classification accuracy calculated on the test dataset (automatically via Kaggle). Your model will be evaluated on the provided test set, where a random 50% of the examples are marked as private and the rest are public. The final evaluation will be based on the private part of the test set, which will be revealed via the private leaderboard after the competition concludes.
Submission. Submit your source file named part5.py and a CSV file kaggle.csv with your predicted labels for the test dataset, following the format in sample.csv. This is an extra credit problem. Top-scoring performers on the "Private Leaderboard" (with a fairly relaxed threshold determined after the deadline passes) will be awarded up to 5 extra points.