This is the second part in a three-part series. The first part can be read here.
The new look
If you have built neural networks with Perceptron before, you may be surprised to see that it has been completely redesigned (the new version is not available on Mac yet):
To get Perceptron up and running, you will need to make sure you have all its required modules, as I have not yet released it as an application. You can clone the repository here. You can also use this as an example dataset. It contains about 50,000 handwriting samples (MNIST) and is already processed for Perceptron.
Or, you can try and use your own dataset and preprocess it:
Firstly, put your dataset in the original_datasets folder as a text file. Then select it.
From here, you have a few options to make sure your dataset is as ready as possible to be used. Usually, rows are separated by a line break (\n), but if yours aren’t, you’ll need to identify the correct character. Once you do, the live data sample will update and should separate rows correctly. Often the first row of a dataset contains labels (as seen in the sample above), in which case you must set ‘ignore first row’ to ‘yes.’
Then, Perceptron presents you with the option to list fields that should be ignored/removed (where the first field is 0). For example, you don’t want to input a unique ID field with no real relevance to the data into the neural net.
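To make the row and field options above more concrete, here is a minimal Python sketch of the same idea. It is not Perceptron's actual code: the function name `parse_rows`, its parameters, and the comma field separator are all assumptions made for illustration.

```python
def parse_rows(raw, row_sep="\n", field_sep=",", ignore_first_row=True, ignored_fields=()):
    """Split raw text into rows of fields, dropping the label row and unwanted fields.

    Hypothetical helper: the comma field separator is an assumption,
    and field positions start at 0, as in the text above.
    """
    # Split on the row separator and discard empty rows (e.g. a trailing line break).
    rows = [row for row in raw.split(row_sep) if row.strip()]
    if ignore_first_row:
        rows = rows[1:]  # the first row holds labels, not data
    skip = set(ignored_fields)
    return [
        [value.strip() for i, value in enumerate(row.split(field_sep)) if i not in skip]
        for row in rows
    ]


raw = "id,height,weight\n1,180,75\n2,165,60\n"
rows = parse_rows(raw, ignored_fields=[0])  # drop the unique ID field at position 0
print(rows)
```

Here the ID field is removed because it carries no information the neural net could learn from.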
You should minimise (scale down) many values in your dataset so that the neural net is not forced to handle huge numbers, regardless of the activation function. In the new version, with the ‘Fields For Minimisation’ input, you can either enter the word ‘all’, or list the fields that need specific minimisation. Depending on your choice, the software will offer further options so you can choose exceptions, or the divider value.
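As a rough illustration of what minimisation with a divider value could look like, here is a short Python sketch. The function `minimise` and its parameters are hypothetical and only mirror the options described above; the divider of 255 is an assumption that suits 8-bit pixel values.

```python
def minimise(rows, fields="all", divider=255.0, exceptions=()):
    """Divide the selected numeric fields by `divider`; other fields stay unchanged.

    Hypothetical helper: `fields` is either the word "all" or a list of
    field positions, and `exceptions` only applies when `fields` is "all".
    """
    skip = set(exceptions)
    scaled = []
    for row in rows:
        new_row = []
        for i, value in enumerate(row):
            if fields == "all":
                selected = i not in skip
            else:
                selected = i in fields
            new_row.append(float(value) / divider if selected else float(value))
        scaled.append(new_row)
    return scaled


rows = [[0, 128, 255, 7]]
# Scale every field except position 3 (for example, a target value).
scaled = minimise(rows, fields="all", divider=255.0, exceptions=[3])
print(scaled)
```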
Also, your dataset should not end up with any alphabetic characters. Even the new Perceptron does not attempt to handle language in that way. Instead, it converts all alphabetic-valued fields to one-hot vectors. Finally, you must specify your target values (the values to predict), and their design.
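The one-hot conversion of alphabetic fields mentioned above can be sketched in a few lines of Python. This is only an illustration of the general technique, not Perceptron's implementation; the function name `one_hot_column` and the alphabetical ordering of categories are assumptions.

```python
def one_hot_column(values):
    """Map each distinct string in a field to a one-hot vector.

    Hypothetical helper: categories are ordered alphabetically, which is
    an arbitrary choice made for this sketch.
    """
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    return [
        [1 if index[value] == i else 0 for i in range(len(categories))]
        for value in values
    ]


colours = ["red", "green", "blue", "green"]
encoded = one_hot_column(colours)
print(encoded)
```

Each distinct word gets its own position in the vector, so the neural net never has to interpret the letters themselves.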
It may be that you want the neural net to predict real values, in which case you can list the fields’ positions and select ‘real.’ However, you may be solving a classification problem where a binary vector is necessary. If you only have one target integer that represents the class the data falls into, then select your target type as binary and specify how many classes there are in total.
With the handwritten digit recognition, your target values are integers between 0 and 9, inclusive. If you then specify the target type as binary, Perceptron will know to construct a binary vector of the specified length that looks like the following, where the position matching the target value is set to 1, and the rest to 0.
[0,0,0,1,0,0,0,0,0,0] = 3
[0,0,0,0,1,0,0,0,0,0] = 4
[0,0,0,0,0,0,0,1,0,0] = 7
…etc. Again, the live data sample will show what the output vector will look like.
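The binary target vectors shown above could be built along these lines. This is a minimal sketch rather than Perceptron's own code, and the function name `binary_target` is hypothetical.

```python
def binary_target(label, num_classes):
    """Return a vector of `num_classes` zeros with a 1 at position `label`."""
    if not 0 <= label < num_classes:
        raise ValueError(f"label {label} is outside the range 0..{num_classes - 1}")
    vector = [0] * num_classes
    vector[label] = 1
    return vector


for digit in (3, 4, 7):
    print(binary_target(digit, 10), "=", digit)
```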
Training and Testing
For this post, I will use the digit dataset I’ve already processed. Each item in the dataset is a 28×28 matrix of pixel values representing an image of a handwritten digit between 0 and 9. Each pixel will be an input to our neural net, so we design a structure of 784 input neurons, 1 hidden layer of 80 neurons, and an output layer of 10 neurons. This can be written more simply as 784, 80, 10.
There’ll be a total of 784 + 80 + 10 = 874 neurons, and a total of (80 · 784) + (10 · 80) = 63520 weights. This is because if there are 3 neuron layers, there will be 2 weight layers.
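The arithmetic above can be checked for any layer structure with a small Python sketch. The function name `count_neurons_and_weights` is hypothetical, and, like the calculation in the text, it counts only the connections between adjacent layers and leaves out any bias terms.

```python
def count_neurons_and_weights(layers):
    """Count neurons and weights in a fully connected network.

    `layers` lists the neuron count of each layer, e.g. [784, 80, 10].
    Each pair of adjacent layers contributes one weight layer.
    """
    neurons = sum(layers)
    weights = sum(a * b for a, b in zip(layers, layers[1:]))
    return neurons, weights


neurons, weights = count_neurons_and_weights([784, 80, 10])
print(neurons, weights)  # 874 63520
```

Because `zip(layers, layers[1:])` pairs each layer with the next one, 3 neuron layers yield exactly 2 weight layers.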
To get a basic example up and running, follow these steps, which I will explain in depth afterwards.
1. Make sure digits.txt is in the processed_datasets folder. Then open Perceptron and use digits.txt as the dataset file name.
2. Enter values like this:
3. Click ‘Start learning.’
4. Either wait for all 10 epochs to finish, or stop the learning early.
5. You can use ‘test with input’ to write your own number on some paper, hold it up to the camera, and see if it gets it right. This isn’t actually a super helpful tool, but it is satisfying to see your model work on a real example. For best results, make sure the threshold input camera can actually see your writing. You can also test with images, enter values manually, etc.
6. Now, play around with different values and see how they affect the learning success and visual structure of the neural net.
This is an example of how optimising a neural network could look. We entered values, got poor results (yellow), perhaps changed the hidden layer neuron count, retrained, and eventually got purple. Finally, we achieved our best result: dark blue.
In the next part, we will take a closer look at the code behind this magic.