TarzaNN User's Guide

Network structure can be defined in two ways: hard-coded in the C++ source, and through XML files. The former is used by developers while adding new features to the simulator and thus it is documented in the Programmer's Guide. The latter is the recommended way to use the simulator, and will be documented here.

Documentation (including the motion model, Selective Tuning and learning implementations) can be found in the TarzaNN appendix of Albert's dissertation.

 

Network Structure

TarzaNN neural networks are composed of "feature planes" interconnected by "filters."

Feature planes are 2D arrays of neurons characterized by identical properties (neuron type, neuron parameters) and receptive fields.

Filters are 2D definitions of receptive fields, and they describe how the output of a feature plane is connected to the neurons of another one (like the arrays of weights in classical neural networks).
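As a toy illustration of the idea above, the sketch below computes one target neuron's value as the weighted sum of a 3x3 patch of a source feature plane, with the filter playing the role of the weight array. The function name, array sizes and layout are assumptions for this example only, not TarzaNN code.

```cpp
#include <cassert>

// Illustrative only: one neuron of a target feature plane receives the
// weighted sum of a 3x3 receptive field centred at (row, col) in a 5x5
// source feature plane. The filter w supplies the weights.
double applyFilterAt(const double src[5][5], const double w[3][3],
                     int row, int col) {
    double sum = 0.0;
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            sum += w[r][c] * src[row + r - 1][col + c - 1];
    return sum;
}
```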

All the entities that make up a network are described below, and their usage and parameters are presented through examples.

 

Feature planes

Several kinds of feature plane can be defined in TarzaNN.

Regular (default) feature planes are collections of neurons as described above. At each simulation time step, the feature plane will read its inputs and produce an output based on the filters and neuron properties.

Input feature planes are the neural network's connection to the outside world. When declaring an input feature plane in the XML file, users must provide an input file name. TarzaNN supports a wide variety of input formats and will detect the format automatically. For black and white images, the input feature plane will have a single output, corresponding to the pixel values in the image. For colour images, the input feature plane will have 4 outputs, corresponding to the luminosity and to the red, green and blue colour channels. (XXX Andrei - is this so? How about difference of colour? How about video sequences - this probably requires its own section) In all cases, the input values can be scaled by providing optional minimum and maximum pixel values.
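The exact scaling formula is not documented above; one plausible interpretation, sketched here, is a linear map from the 8-bit pixel range onto the user-supplied [min, max] interval. The function name and formula are assumptions for illustration, not TarzaNN's actual implementation.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical input scaling: map an 8-bit pixel value (0..255) linearly
// onto a user-supplied [minVal, maxVal] range. Illustrative only.
double scalePixel(int pixel, double minVal, double maxVal) {
    return minVal + (pixel / 255.0) * (maxVal - minVal);
}
```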

 

Input controller feature planes are similar to regular input feature planes and are meant for active vision applications. Input socket feature planes receive their input from, and send commands to, a camera connected through sockets. Input virtual feature planes simulate a physical camera by providing a window onto a larger image read from a file.

 

Filters

Filters are matrices of weights that connect two feature planes and define the receptive fields of neurons. TarzaNN allows you to define different types of filters; more can be added by subclassing the Filter class (see the Programmer's Guide) or by defining file filters (see below). All filters can be rotated, shifted, scaled, etc. through filter operations. For details on the filters, see the class hierarchy.


File filters allow users to define a filter as an image file. This is work in progress; the format and tools are TBD.

Convolutions are performed in the Filter class, and it is important to note how the border problem is solved: in the XML file, each filter has a "padding_type" attribute with the following values: 2 - mirror the sides of the image (seems to be the best for natural images); 4 - fill with background (currently fills with 0); 8 - repeat border pixels (i.e. the border line is repeated as many times as needed). The last two work best for synthetic images.
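The three border-handling modes can be sketched for a 1-D image of length n as below. Function names and the index-based sampling are illustrative assumptions; TarzaNN's actual border handling lives inside the Filter class.

```cpp
#include <cassert>

// padding_type 8: repeat border pixels (clamp the index to the image).
int repeatIndex(int i, int n) {
    if (i < 0) return 0;
    if (i >= n) return n - 1;
    return i;
}

// padding_type 2: mirror the sides of the image.
int mirrorIndex(int i, int n) {
    if (i < 0) i = -i - 1;          // -1 -> 0, -2 -> 1, ...
    if (i >= n) i = 2 * n - i - 1;  // n -> n-1, n+1 -> n-2, ...
    return i;
}

// padding_type 4: fill with background (currently 0) outside the image.
int sampleZeroPadded(const int* img, int i, int n) {
    return (i < 0 || i >= n) ? 0 : img[i];
}
```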

 

Neurons

To better understand the various neurons available, we start with a primer on neuron internals.

In the general case, a model neuron performs a weighted sum of its inputs, the result of which is called the activation, followed by a non-linearity. We extended this model to allow for significantly more flexibility. Three classes of neuron inputs are possible: regular, non-Fourier, and gating inputs.

For regular and gating inputs, the regular and gating activations are calculated as presented above, by using the filters as weights.

Non-Fourier inputs are first summed in pairs (so order is important!) and each sum is then passed through the associated filter to calculate the activation. The idea behind this is to rectify the signal by summing the responses of on- and off-center neurons, but nothing of the sort is enforced; it is up to the designer to make sure that the inputs and input pairs make sense.

The regular and non-Fourier activations are then summed, clamped at zero (XXX - do we want this in the general case? I say yes), and passed through a non-linearity (more on this below).

The result is gated by the thresholded gating activation, i.e. if for a given neuron the gating activation is above a threshold, the result is passed to the output; otherwise the output is set to zero. (XXX is this what we want? Or just some inhibition?)
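Putting the last two steps together, the output stage can be sketched as below. The function and parameter names are illustrative assumptions; in particular, the all-or-nothing gate mirrors the description above and the open XXX question, not a confirmed design.

```cpp
#include <cassert>
#include <algorithm>

// Sketch of the output stage: regular and non-Fourier activations are
// summed and clamped at zero, then gated by the thresholded gating
// activation (output passes only if gating exceeds the threshold).
double neuronOutput(double regularAct, double nonFourierAct,
                    double gatingAct, double gatingThreshold) {
    double a = std::max(0.0, regularAct + nonFourierAct);  // clamp at zero
    return (gatingAct > gatingThreshold) ? a : 0.0;        // gate
}
```

For a neuron with no gating inputs, the gating test would simply be skipped and the clamped activation passed on directly.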

Any of the above categories can be missing (e.g. we can have pure, unadulterated non-Fourier neurons). Of course, it doesn't make too much sense to have pure gating neurons.

As mentioned above, activations pass through a nonlinearity that can be defined by specifying the neuron type in the XML file. By default, there is no nonlinearity, so the (clamped) activation is passed on as output. The most interesting category of non-linearity is defined by differential equations (currently we simulate Wilson-Cowan and Hodgkin-Huxley model neurons). Other non-linearities can be defined, more on this in the Programmer's Guide.
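To make the differential-equation case concrete, here is a hedged sketch in the spirit of a single Wilson-Cowan population, tau * dR/dt = -R + S(A), integrated with forward Euler. The sigmoid and all parameters are illustrative assumptions; the models actually simulated by TarzaNN are documented in the dissertation appendix.

```cpp
#include <cassert>
#include <cmath>

// Illustrative sigmoid non-linearity S(a).
double sigmoid(double a) { return 1.0 / (1.0 + std::exp(-a)); }

// One forward-Euler step of tau * dR/dt = -R + S(A): the firing rate
// relaxes toward the sigmoid of the activation with time constant tau.
double eulerStep(double rate, double activation, double tau, double dt) {
    return rate + (dt / tau) * (-rate + sigmoid(activation));
}
```

Repeated steps drive the rate toward S(A), which is what lets such neurons settle over simulation time rather than respond instantaneously.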

Synchronization

TarzaNN has a few distinct synchronization modes; a brief description of each follows. The issue is complicated somewhat by the fact that some synchronization options are set in the XML file, at the feature plane level, while others are set through the GUI menus. This needs to be cleaned up.

Asynchronous (XML) - the FP does not wait for anything; it simply iterates, reading inputs and producing outputs. This is probably best for simulations with differential equation neurons, where the workload on FPs is more or less similar.

Synchronous (XML) - the FP waits to be notified that new input is available. This works best for feed-forward networks. One peculiarity: the FP will start as soon as one input becomes available; if it waited for all of its inputs, networks with feedback would hang.

Synchronous mode (GUI) - ? Needs to be checked to activate the modes below.

Pause after each step (GUI) - each FP in the network will run once, then the network stops and waits for input. By default it runs in the Asynchronous mode described above.

Layer-by-layer mode (GUI) - each layer in the network performs one step (i.e. each FP in the layer executes once) before the next layer executes (modulo the number of layers). Layers execute in the order in which they are defined in the XML file. If Pause after each step is checked, the simulation will stop and wait for input after all the layers have executed. Intended mainly for STM-style WTA.