Build a First Neural Network

Neural networks are weirdly good at translating languages and identifying dogs by breed, but they can be intimidating to get started with. In an effort to smooth this on-ramp, I created a neural network framework specifically for teaching and experimentation. It’s called Cottonwood and this notebook shows how to build a first neural network.

What you’ll get out of this case study

You will get to build an honest-to-goodness neural network, a densely connected 8-layer autoencoder, and run it on an image dataset. This will set you up well for subsequent projects using your framework of choice, whether it’s Cottonwood, PyTorch, or TensorFlow.

What you’ll need to get started

To run this case study, you’ll need Python 3, Git, and access to your command line.

No prior machine learning experience necessary.

Get Cottonwood

To get Cottonwood, navigate your command line to the directory you’d like it to live in. Then clone the Git repository and install the Cottonwood package. The “-e” option installs the package in editable mode, which lets you edit your copy of Cottonwood on the fly, opening up all kinds of opportunities for experimentation.

git clone https://gitlab.com/brohrer/cottonwood.git
python3 -m pip install -e cottonwood

It’s also important to specify the version of Cottonwood we want to use. Because Cottonwood is young and designed to encourage experimentation, it’s not guaranteed to be backward compatible. The version matters. This case study runs well on version 14.

cd cottonwood
git checkout v14

When you’re done, navigate to where you want this case study to live and open up a notebook or script there.

Import some packages

The first thing to do is pull in the Cottonwood building blocks we’re going to need. These include three types of layers: a Dense layer, a Difference layer, and a RangeNormalization layer. They also include an ANN (artificial neural network) model and a Nordic runes data set that comes prepackaged with Cottonwood for coding up examples.

from cottonwood.core.layers.dense import Dense
from cottonwood.core.layers.difference import Difference
from cottonwood.core.layers.range_normalization import RangeNormalization
from cottonwood.core.model import ANN
import cottonwood.data.data_loader_nordic_runes as dat

Get some data

Then we need to pull the data in. The get_data_sets() function in our data loader pulls in two generators, a training_set and an evaluation_set. A generator is a convenient Python object that hands you a new example every time you call next() on it. We put this to the test by calling next(training_set) and getting a first example of our data.

training_set, evaluation_set = dat.get_data_sets()

sample = next(training_set)

n_pixels = sample.shape[0] * sample.shape[1]
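To make the generator idea concrete, here is a minimal standalone sketch. It does not use Cottonwood at all: example_generator, toy_examples, and the random selection are all hypothetical stand-ins, and the real data loader may choose its examples differently.

```python
import random


def example_generator(examples, seed=0):
    """Yield one randomly chosen example at a time, forever."""
    rng = random.Random(seed)
    while True:
        yield rng.choice(examples)


# Two tiny hypothetical 2x2 "images" stand in for the real data.
toy_examples = [
    [[0, 1], [1, 0]],
    [[1, 1], [0, 0]],
]

toy_training_set = example_generator(toy_examples)
first = next(toy_training_set)
second = next(toy_training_set)

# Each example is a list of rows, so its pixel count is rows * columns.
n_toy_pixels = len(first) * len(first[0])
print(first, second, n_toy_pixels)
```

Because the generator never runs out, the code that consumes it decides how many examples to draw.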

We can take a quick look at it and see that it’s a two-dimensional array of zeros and ones, a very crude representation of an image. It has seven rows and seven columns, 49 pixels in all. If you squint and tilt your head you can just make out the pattern. This data set is a collection of 24 Norse runes, coarsely pixelized.

print(sample)

[[0 1 0 0 0 1 0]
 [0 1 0 0 0 1 0]
 [0 1 1 0 0 1 0]
 [0 1 0 1 0 1 0]
 [0 1 0 0 1 1 0]
 [0 1 0 0 0 1 0]
 [0 1 0 0 0 1 0]]
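If squinting isn’t working, one option is to draw the zeros and ones with more contrast. The helper below is a hypothetical convenience, not part of Cottonwood. It copies the printed sample into a plain list so that it runs on its own.

```python
# The sample printed above, copied in as a plain list of rows.
sample_rows = [
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 1, 0, 1, 0],
    [0, 1, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
]


def render(rows, on="#", off="."):
    """Turn a grid of zeros and ones into a multi-line string."""
    return "\n".join(
        "".join(on if pixel else off for pixel in row) for row in rows
    )


picture = render(sample_rows)
print(picture)
```

The same function should also accept the sample array directly, since it only relies on iterating over rows and then over pixels.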

 

Create some layers

Now we get to construct the neural network itself. The first step is to add a RangeNormalization layer. This ensures that the data falls in a range that can be meaningfully interpreted by the network.
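As a rough illustration of what a range normalization step can do, here is a standalone sketch. It assumes the layer looks at some data to find the smallest and largest values and then rescales linearly. The function names and the target interval of -0.5 to 0.5 are assumptions for illustration, and Cottonwood’s own layer may differ in its details.

```python
def fit_range(examples):
    """Find the smallest and largest values across a batch of 2D examples."""
    values = [v for example in examples for row in example for v in row]
    return min(values), max(values)


def normalize(example, observed_min, observed_max, low=-0.5, high=0.5):
    """Linearly rescale values from [observed_min, observed_max] to [low, high]."""
    span = observed_max - observed_min
    if span == 0:
        # All values identical: put everything at the middle of the target range.
        return [[(low + high) / 2 for _ in row] for row in example]
    scale = (high - low) / span
    return [[low + (v - observed_min) * scale for v in row] for row in example]


# Hypothetical 2x2 examples made of zeros and ones, like the rune pixels.
toy_batch = [[[0, 1], [1, 0]], [[1, 1], [0, 0]]]
observed_min, observed_max = fit_range(toy_batch)
normalized = normalize(toy_batch[0], observed_min, observed_max)
print(observed_min, observed_max, normalized)
```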

Then we have a sequence of six Dense layers. The first argument is the number of outputs that the layer will have. The second argument explicitly connects each Dense layer to the previous layer. The Dense layer uses the previous layer to know how many inputs it should expect.
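For a feel of what a densely connected layer computes, here is a minimal sketch in plain Python. It is not Cottonwood’s implementation: the uniform random weights, the bias term, and the helper names are assumptions for illustration. The hyperbolic tangent activation matches what the model report later in this case study lists.

```python
import math
import random


def make_dense(n_inputs, n_outputs, seed=0):
    """Create random weights: one row per output, one column per input plus a bias."""
    rng = random.Random(seed)
    return [
        [rng.uniform(-1, 1) for _ in range(n_inputs + 1)]
        for _ in range(n_outputs)
    ]


def dense_forward(weights, inputs):
    """Weighted sum of the inputs (plus bias) passed through tanh, per output node."""
    extended = list(inputs) + [1.0]  # the constant 1 feeds the bias weight
    return [
        math.tanh(sum(w * x for w, x in zip(row, extended)))
        for row in weights
    ]


n_inputs = 49
layer_1 = make_dense(n_inputs, 17)
# The second layer reads its input count from the first layer's output count.
layer_2 = make_dense(len(layer_1), 13, seed=1)

activities = dense_forward(layer_2, dense_forward(layer_1, [0.5] * n_inputs))
print(len(activities))
```

Inferring the input count from the previous layer is the same convenience the previous_layer argument provides: only the number of outputs has to be stated.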

Here we have five hidden layers and an output layer. The hidden layer with the smallest number of nodes is the bottleneck layer. At this layer the information contained in the images is at its highest compression. Instead of 49 separate pixel values, it’s reduced down to eight node activities.
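The arithmetic behind the bottleneck can be checked in a few lines. The layer sizes come from this case study. The weight counts assume one extra bias weight per output node, which is an assumption rather than something the text confirms.

```python
# Node counts from the input through the six Dense layers.
layer_sizes = [49, 17, 13, 8, 12, 19, 49]

bottleneck = min(layer_sizes)
compression_ratio = layer_sizes[0] / bottleneck

# Weights per Dense layer, assuming one weight per input-output pair
# plus one bias weight per output (the bias is an assumption).
weights_per_layer = [
    (n_in + 1) * n_out
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]

print(bottleneck, compression_ratio, weights_per_layer, sum(weights_per_layer))
```

Under these assumptions each image is squeezed by a factor of a little over six on its way through the bottleneck.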

Finally, we add a Difference layer to find how close the output of the autoencoder is to the input. This gives us an error signal that we can try to make as small as possible, re-creating the inputs with as much fidelity as we can.

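layers = []
# Scale the pixel values into a range the network can work with.
layers.append(RangeNormalization(training_set))
# Encoder: squeeze the 49 pixels down toward the bottleneck.
layers.append(Dense(17, previous_layer=layers[-1]))
layers.append(Dense(13, previous_layer=layers[-1]))
# Bottleneck: the most compressed representation, 8 nodes.
layers.append(Dense(8, previous_layer=layers[-1]))
# Decoder: expand back out to one output per pixel.
layers.append(Dense(12, previous_layer=layers[-1]))
layers.append(Dense(19, previous_layer=layers[-1]))
layers.append(Dense(n_pixels, previous_layer=layers[-1]))
# Compare the output of the last Dense layer with that of the first layer.
layers.append(Difference(layers[-1], layers[0]))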
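To see how a difference turns into a single error number, here is a small standalone sketch. It uses mean squared error, which the model report further down lists as the error function. The helper names and the toy values are hypothetical, and Cottonwood’s Difference layer may organize this work differently.

```python
def difference(output, target):
    """Element-wise difference between two equal-length lists."""
    if len(output) != len(target):
        raise ValueError("output and target must have the same length")
    return [o - t for o, t in zip(output, target)]


def mean_squared_error(errors):
    """Average of the squared differences; zero means a perfect reconstruction."""
    return sum(e * e for e in errors) / len(errors)


# Hypothetical normalized pixel values and two attempts at reconstructing them.
target = [0.5, -0.5, 0.5, -0.5]
good_guess = [0.5, -0.5, 0.5, 0.0]
poor_guess = [0.0, 0.0, 0.0, 0.0]

good_error = mean_squared_error(difference(good_guess, target))
poor_error = mean_squared_error(difference(poor_guess, target))
print(good_error, poor_error)
```

The closer reconstruction earns the smaller error, which is the signal training tries to drive down.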

Build and run the model

Finally, it’s time to fire up the autoencoder! We pass the layers to the ANN model to initialize it. Then, thanks to all of our preparatory work, training the model just requires calling its train() method and passing it training_set. Similarly, evaluating the model is as straightforward as calling its evaluate() method and passing it evaluation_set.

When the model is done running, it returns its error history. If the model is behaving well, the error gets smaller over time.

autoencoder = ANN(layers=layers)
autoencoder.train(training_set)
autoencoder.evaluate(evaluation_set)
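One simple way to check whether an error history is heading in the right direction is to compare its early and late averages. The sketch below assumes the history is a plain sequence of numbers. The function name, the window size, and the synthetic history are all made up for illustration.

```python
def is_improving(error_history, window=100):
    """Compare the mean error of the first and last `window` entries."""
    if len(error_history) < 2 * window:
        raise ValueError("need at least two full windows of history")
    early = sum(error_history[:window]) / window
    late = sum(error_history[-window:]) / window
    return late < early


# A made-up, steadily shrinking history stands in for a real one.
synthetic_history = [1.0 / (1 + 0.01 * i) for i in range(1000)]
print(is_improving(synthetic_history))
```

Averaging over a window matters because single-example errors tend to be noisy, so comparing two individual entries can mislead.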

When the model runs, it also saves out some informative reports. In the reports directory, we can inspect model_parameters.txt. This human-readable text file gives all the information necessary to re-create this exact model in a different framework.

type: artificial neural network
number of training iterations: 800000
number of evaluation iterations: 200000
error_function:  mean squared error
layer 0:  range normalization
  range maximum: 1
  range minimum: 0
layer 1:  fully connected
  number of inputs: 49
  number of outputs: 17
  activation function:  hyperbolic tangent
  initialization:  LSUV
  optimizer:  momentum
    learning_rate: 0.001
    momentum amount: 0.9
    minibatch size: 1
layer 2:  fully connected
  number of inputs: 17
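The optimizer entries in the report can be made more tangible with a toy example. The sketch below applies the classical momentum update rule, using the learning rate and momentum amount from the report, to a one-dimensional problem. The function name and the toy loss are hypothetical, and Cottonwood’s optimizer may formulate its update differently.

```python
def momentum_step(weight, velocity, gradient, learning_rate=0.001, momentum=0.9):
    """One classical momentum update; returns the new weight and velocity."""
    velocity = momentum * velocity - learning_rate * gradient
    return weight + velocity, velocity


# Minimize the toy loss (w - 3)^2, whose gradient is 2 * (w - 3).
weight, velocity = 0.0, 0.0
for _ in range(5000):
    gradient = 2.0 * (weight - 3.0)
    weight, velocity = momentum_step(weight, velocity, gradient)

print(round(weight, 3))
```

The velocity carries a fading memory of past gradients, so the weight keeps moving in a consistent direction even when individual steps are small.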

We can also see the performance history in a plot. This is helpful for getting a quick feel for how our model is performing.

To get a more detailed look at the model, including the nodes in each layer and the weights between them, we can browse through the snapshots generated while the model runs. This visualization gives a good pictorial representation of the model and helps to spark research questions and ideas for things to try in the next run.


Next steps


Now you’re in a good place. You have a fully functional autoencoder performing compression on an image dataset. You have a few options if you’d like to take it to the next level. You can experiment with different architectures. You can pull in a new set of images of your own. If you have been bitten by the bug, and really want to know how it all works, I have prepared a series of courses for you at the End-to-End Machine Learning School: 193, 312, 313, and 314.

Or you can sign up for my Introduction to Neural Networks workshop sequence, offered at ODSC East 2020 in Boston, covering theory in part 1 and practice in part 2.

Brandon Rohrer

Brandon got started in machine learning studying robotics at MIT, machine vision and signal processing at Sandia National Laboratories, then predictive modeling at DuPont Pioneer, and cloud computing at Microsoft. At Facebook he built global power distribution models using satellite imagery and unstructured text classifiers. Now at iRobot he helps robots get better at doing their jobs.
