# Neural Networks (part 2)

## 1. Introduction 
In this notebook you create a neural network (NN) that recognizes handwritten digits or fashion items.

#### References:

***MNIST dataset***: Deng, L. (2012). The MNIST database of handwritten digit images for machine learning research. IEEE Signal Processing Magazine, 29(6), 141–142.

***Fashion dataset***: Xiao, H., Rasul, K., & Vollgraf, R. (2017). Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms. arXiv:1708.07747 [cs.LG].


## 2. Loading the data

You use either the MNIST or the Fashion-MNIST dataset. **You get to choose one of them!** Both datasets consist of 70000 images of 28$\times$28 pixels (each pixel has a gray value from 0 to 255). The dataset can be downloaded from the internet (see code below). The text refers to the images as numbers; if you use the fashion dataset, read "fashion items" instead.

In [None]:
# importing the required modules
import numpy as np
import matplotlib.pyplot as plt

# to make matplotlib figures render correctly in the notebook use:
%matplotlib inline 

In [None]:
# Load data from https://www.openml.org/

from sklearn.datasets import fetch_openml

# Pick either the numbers or the fashion database here:
#X, y = fetch_openml('mnist_784', version=1, return_X_y=True, as_frame=False, parser='auto')
X, y = fetch_openml('Fashion-MNIST', version=1, return_X_y=True, as_frame=False)


X = X.T # required as the rows should be the features and the columns the samples; shape (784, 70000).
y = y.astype('int') # y has values as strings ('0', '1', ...'9') and we want integers (0, 1, ..., 9)

# inspect shape.
print(X.shape)
print(y.shape)


To get an idea of the data, let's plot a single sample.

In [None]:
sample = X[:, 1] # sample with index 1 (the second sample)
label = y[1] # label of that sample
plt.imshow(sample.reshape(28,28)) # need to reshape to a square image of 28 by 28
print(f'label = {label}')

## 3. Create the feature matrix X and labels matrix Y
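
As a rough illustration of the two steps in this section, the sketch below rescales and one-hot encodes a tiny toy dataset. The names `X_toy`, `y_toy`, `n_classes`, `X_scaled` and `Y_onehot` are made up for this example, and it assumes the same layout as above (features in rows, samples in columns).

```python
import numpy as np

# Toy stand-ins for the real data (assumption: features in rows, samples in columns).
X_toy = np.array([[0, 255], [128, 64]], dtype=float)  # 2 features, 2 samples
y_toy = np.array([2, 0])                              # integer labels
n_classes = 3                                         # would be 10 for MNIST / Fashion-MNIST

# Rescale the gray values from 0..255 to 0..1.
X_scaled = X_toy / 255.0

# One-hot encode: column j gets a 1 in row y_toy[j] and 0 elsewhere.
Y_onehot = np.zeros((n_classes, y_toy.size))
Y_onehot[y_toy, np.arange(y_toy.size)] = 1.0

print(X_scaled.max(), Y_onehot.shape)
```

Keeping the samples in columns means that a slice such as `X[:, :5000]` selects whole samples, which is convenient for the train, validation and test split.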


In [None]:
# rescale feature matrix to values between 0 and 1
'''YOUR CODE GOES HERE '''

# create one-hot encoded Y matrix
'''YOUR CODE GOES HERE '''

In [None]:
# create a train, validation and test dataset
# for the test data set take the first 5000 samples
# for the train data set take sample 5001 to 60000
# for the validation set take sample 60001 to 70000

X_test = '''YOUR CODE GOES HERE '''
X_train = '''YOUR CODE GOES HERE '''
X_valid = '''YOUR CODE GOES HERE '''

Y_test = '''YOUR CODE GOES HERE '''
Y_train = '''YOUR CODE GOES HERE '''
Y_valid = '''YOUR CODE GOES HERE '''


## 4. The Neural Network

The NN that you will make has the following layout (this is something that will work, but feel free to make changes if you like):

- input layer ($l=0$): 784 nodes
- hidden layer ($l=1$): 300 nodes; ReLU activation function
- hidden layer ($l=2$): 100 nodes; ReLU activation function
- output layer ($l=3$): 10 nodes; Softmax activation function


### Define the activation functions

Below you define the required activation functions and their derivatives.
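
One possible way to write these functions is sketched here. The function names are only a suggestion, and the softmax assumes that each column holds one sample, as in the rest of this notebook.

```python
import numpy as np

def relu(z):
    """ReLU: element-wise max(0, z)."""
    return np.maximum(0.0, z)

def relu_derivative(z):
    """Derivative of ReLU: 1 where z > 0, otherwise 0."""
    return (z > 0).astype(float)

def softmax(z):
    """Softmax over the rows (classes), computed separately for every column (sample)."""
    # Subtracting the column maximum does not change the result
    # but avoids overflow in np.exp for large inputs.
    z_shifted = z - z.max(axis=0, keepdims=True)
    e = np.exp(z_shifted)
    return e / e.sum(axis=0, keepdims=True)

z = np.array([[1.0, -2.0], [1.0, 3.0]])
print(relu(z))
print(softmax(z).sum(axis=0))  # every column sums to 1
```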

In [None]:
# define some activation functions



### Initialize the weights and biases
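
The notebook does not prescribe an initialization scheme. The sketch below uses He initialization (normal distribution with standard deviation $\sqrt{2/n_\text{in}}$), which is a common choice for ReLU layers; treat it as one option, not as the required method. The helper `init_layer` and the list `layer_sizes` are hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed only to make this sketch reproducible

def init_layer(n_in, n_out, rng):
    """Return (W, b) for a layer with n_in inputs and n_out nodes (He initialization)."""
    W = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_out, n_in))
    b = np.zeros((n_out, 1))
    return W, b

# Number of nodes per layer, taken from the layout described above.
layer_sizes = [784, 300, 100, 10]

# One (W, b) pair per layer with weights; the input layer has none.
params = [init_layer(n_in, n_out, rng)
          for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

for W, b in params:
    print(W.shape, b.shape)
```

With `W` of shape `(n_out, n_in)`, the pre-activation of a layer is `W @ A_prev + b`, where `A_prev` has one column per sample.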


In [None]:
# initialize the neural network
rng = np.random.default_rng()

# number of nodes in each layer
'''YOUR CODE GOES HERE '''

# input layer
# no weights and biases

# hidden layer 1 
'''YOUR CODE GOES HERE '''

# hidden layer 2 
'''YOUR CODE GOES HERE '''

# output layer 
'''YOUR CODE GOES HERE '''


## 5. Train the NN

First you define three parameters that can be adjusted for optimal training.
- **learning_rate**: determines to what extent we update the weights and biases in the gradient descent step
- **no_epochs**: the number of times we pass the training data set through the network for training
- **batch_size**: how many samples we pass through the network before doing a gradient descent update
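
To see how these three parameters interact, here is a minimal sketch of the loop structure on random toy data. The values and the names `X_toy`, `n_updates` and `batch_sizes_seen` are illustrative assumptions; the actual forward pass, backpropagation and update are left out.

```python
import numpy as np

# Illustrative values only; tune them for your own network.
learning_rate = 0.1
no_epochs = 2
batch_size = 4

rng = np.random.default_rng(0)
n_samples = 10
X_toy = rng.random((3, n_samples))  # 3 features, 10 samples (columns are samples)

n_updates = 0
batch_sizes_seen = []
for epoch in range(no_epochs):
    # Shuffle the sample order every epoch.
    order = rng.permutation(n_samples)
    for start in range(0, n_samples, batch_size):
        idx = order[start:start + batch_size]
        X_batch = X_toy[:, idx]
        # The forward pass, backpropagation and an update of the form
        # W -= learning_rate * dW would go here.
        batch_sizes_seen.append(X_batch.shape[1])
        n_updates += 1

print(n_updates, batch_sizes_seen)
```

Each epoch performs `ceil(n_samples / batch_size)` updates; the last batch of an epoch is smaller when the batch size does not divide the number of samples.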

In [None]:
# train settings
'''YOUR CODE GOES HERE '''

Below you implement the training of the network. As a start, copy and paste your code from part 1. Note that we use the **Cross Entropy** (CE) loss function.
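
As a reminder of what the CE loss computes, the sketch below evaluates it for a small hand-made batch. It assumes a softmax output `A` and one-hot labels `Y`, both with shape (classes, samples), and averages over the batch; the function name `cross_entropy` and the small constant `eps` are choices made for this example.

```python
import numpy as np

def cross_entropy(A, Y, eps=1e-12):
    """Mean cross-entropy over the batch; A and Y have shape (classes, samples)."""
    # eps guards against log(0) when a predicted probability is exactly zero.
    return -np.sum(Y * np.log(A + eps)) / Y.shape[1]

# Two samples, three classes: predicted probabilities and one-hot labels.
A = np.array([[0.7, 0.1],
              [0.2, 0.8],
              [0.1, 0.1]])
Y = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])

loss = cross_entropy(A, Y)

# For a softmax output layer combined with the mean CE loss, the gradient
# with respect to the output pre-activations simplifies to (A - Y) / m.
dZ = (A - Y) / Y.shape[1]

print(loss)
```

Only the predicted probability of the correct class contributes to the loss, so here it equals $-(\ln 0.7 + \ln 0.8)/2$.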

In [None]:
# perform the training

'''YOUR CODE GOES HERE '''


## 6. Analyze the result
Make a plot of the training and validation loss and the training and validation accuracy as a function of the epoch.

In [None]:
# plot the loss and accuracy as a function of the epoch number
'''YOUR CODE GOES HERE '''

Adapt the hyperparameters and/or NN layout to try to improve the result.

Finally, use the test set to check the performance of the model.
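
Accuracy can be computed by comparing the most probable class with the true class for every sample. A minimal sketch, assuming network outputs `A` and one-hot labels `Y` with shape (classes, samples); the function name `accuracy` is only a suggestion:

```python
import numpy as np

def accuracy(A, Y):
    """Fraction of samples whose most probable class matches the one-hot label."""
    predicted = np.argmax(A, axis=0)  # predicted class per column (sample)
    actual = np.argmax(Y, axis=0)     # true class per column
    return np.mean(predicted == actual)

# Three samples, three classes; the third sample is misclassified.
A = np.array([[0.7, 0.1, 0.3],
              [0.2, 0.8, 0.6],
              [0.1, 0.1, 0.1]])
Y = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0]])

acc = accuracy(A, Y)
print(acc)
```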

In [None]:
# compute the accuracy and losses of the test set
'''YOUR CODE GOES HERE '''

Finally, it is interesting to check a few samples and their predicted labels.

In [None]:
# check the prediction of 10 random images
'''YOUR CODE GOES HERE '''

# Neural Networks - The autoencoder **(5EC mandatory)**

You are going to train a neural network that is able to generate a new picture of any of the digits 0 to 9. The shape of the NN resembles that of a bow tie:

- input layer ($l=0$): 784 nodes
- hidden layer ($l=1$): 300 nodes; ReLU activation function (FIXED)
- hidden layer ($l=2$): 100 nodes; ReLU activation function (FIXED)
- hidden layer ($l=3$): 10 nodes; Softmax activation function (FIXED)
- hidden layer ($l=4$): ? nodes; ? activation function
- hidden layer ($l=5$): ? nodes; ? activation function
- output layer ($l=6$): 784 nodes; ? activation function

You can train it by passing pictures through it; the outgoing picture should be almost identical to the incoming one. Think about what loss function would be most appropriate here. Use the weights of the previously trained NN for the left side of the 'bow tie'. This setup is also known as an **autoencoder**. Layers 1-3 function as the _encoder_, whereas layers 4-6 are the _decoder_. It can be used to find a compressed version of your input data. If the loss is zero, you have obtained a perfect compression.

Once training is complete, you can use the right-hand side of the bow tie to generate a new image starting from an input vector, for example `a3 = np.zeros(10)`, `a3[3] = 1`, and see if you get a figure back that resembles a 3.
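
The generation step could look roughly like the sketch below. Everything about the decoder here is an assumption made for illustration: the mirrored layout 10 → 100 → 300 → 784, the ReLU hidden layers, the sigmoid output (which keeps pixel values between 0 and 1) and the random, untrained weights. Your own choices for the layers marked with "?" may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical decoder layout (layers 4-6) with random, untrained weights.
decoder_sizes = [10, 100, 300, 784]
decoder = [(rng.normal(0.0, 0.1, size=(n_out, n_in)), np.zeros((n_out, 1)))
           for n_in, n_out in zip(decoder_sizes[:-1], decoder_sizes[1:])]

def decode(a3, decoder):
    """Pass the layer-3 activation through the decoder layers only."""
    a = a3.reshape(-1, 1)  # column vector: a single sample
    for i, (W, b) in enumerate(decoder):
        z = W @ a + b
        is_output_layer = (i == len(decoder) - 1)
        a = sigmoid(z) if is_output_layer else relu(z)
    return a

# One-hot input vector with a 1 at index 3.
a3 = np.zeros(10)
a3[3] = 1

image = decode(a3, decoder).reshape(28, 28)
print(image.shape, image.min(), image.max())
```

With trained decoder weights, `plt.imshow(image)` would show the generated picture; with the random weights used here it only shows noise.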

**Deliverables:** 
1. A plot of 10 newly generated numbers with your network.
2. An outputweights.npz file with all weights and biases of your final neural network (encoder and decoder)
3. Code that reads in the outputweights.npz file and uses the image generator

_Deliverable 1 could look like:_

![numbers_small.png](attachment:a70fc836-93cc-41d2-88b0-3fed747ab4e0.png)
_It would be great if some of you manage to find a solution for the dark (missing) pixels!_ 

_Or in case of the fashion set:_

![zalando.png](attachment:cdbe129b-208e-438f-8c21-6d92276b02e6.png)

In [None]:
# Setup the Network



In [None]:
# Train the decoder



In [None]:
# Test your image generation and improve your training procedure (hyperparameter tuning)



### Deliverable 1 

In [None]:
# Prove that your network does what it should do and present your deliverable convincingly!



### Deliverable 2 

In [None]:
## save the obtained weights
#outfile='outputweights.npz'
#np.savez(outfile, W1,...,b1,...)
#npzfile = np.load(outfile)
#npzfile.files
#W1=npzfile['arr_0']
#W1.shape


### Deliverable 3 
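
A minimal sketch of a save and load round trip with `np.savez` and `np.load` is given below. It stores the arrays under explicit names instead of the default `arr_0`, `arr_1`, ..., which makes reading them back less error-prone. The small arrays and the dictionary `weights` are placeholders for your trained parameters, and the file is written to a temporary folder here so that the sketch leaves nothing behind in your working directory.

```python
import os
import tempfile

import numpy as np

rng = np.random.default_rng(0)

# Placeholder parameters; in the notebook these would be your trained arrays.
weights = {
    "W1": rng.normal(size=(4, 3)),
    "b1": np.zeros((4, 1)),
}

# Keyword arguments become the names under which the arrays are stored.
outfile = os.path.join(tempfile.mkdtemp(), "outputweights.npz")
np.savez(outfile, **weights)

# Read the file back into a dictionary of arrays.
with np.load(outfile) as npzfile:
    loaded = {name: npzfile[name] for name in npzfile.files}

print(sorted(loaded))
```

After loading, the decoder arrays can be passed to your image generator in the same way as during training.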