25 Years of Programming
An open source source for C, C++, OWL, BASIC, MDB, XLS, DOT, and more...
Home   Projects   Up   Sitemap   Search   Blog   Forum+Chat   About Us   Privacy   Terms of Use   Feedback   FAQ   Images   Services   Payments   Humor   Music

"Visible" multilayer perceptron backpropagation neural network, Part 2 - C++ Network code

This page has the source code for classes that implement the multilayer perceptron (MLP) backpropagation neural network used by the VCNEURAL project. There are many source code comments to try to make it easy to understand.

The code is mostly the same as that used in the WNeural project for Borland OWL from which the VCNeural project was converted, but this code uses the Standard Template Library (STL) instead of the Borland BIDS container classes.

The classes on this page have been tested to work with Visual C++ for Windows XP and GNU GCC g++ for Linux. You will need the corresponding versions of my.h and mylib.cpp:

I don't have an application for Linux equivalent to the Windows "visible neural net" project that allows watching it as it "thinks". In Linux, what I'd prefer to create is an application that puts the network to a useful purpose.

There are various notes about my neural net projects in neural.doc, complex.doc, and biology.doc.

There are many online sources for reading about neural net theory and practice. See especially ftp://ftp.sas.com/pub/neural/FAQ.html, which is the FAQ for http://groups.google.com/group/comp.ai.neural-nets/topics.

network.h

/*	network.h				1-27-2010		Microsoft Visual C++ or GNU GCC g++
	This is part of the VCNeural project.
	Copyright (C)1995-97, 2007, 2010 Steven Whitney.
	Published under GNU GPL (General Public License) Version 3, with ABSOLUTELY NO WARRANTY.
	Initially published by http://25yearsofprogramming.com.

class declarations for the neural network.

*/
#ifndef  __NETWORK_H
#define  __NETWORK_H

#ifdef WIN32
	#include "..\..\mylib\my.h"
#else
	#include "my.h"
#endif

using namespace std;

//////////////////////////////////////////////////////////////////////////////
// globals
//----------------------------------------------------------------------------
enum { INPUT, HIDDEN, OUTPUT };	// neuron types
class Node;						// forward reference

double RandomWeight();	// calculates a random weight for Synapses and Node.bias

//////////////////////////////////////////////////////////////////////////////
/* a Synapse holds a pointer to the source node and the associated weight of the connection.
A synapse (rather than a whole neuron) is either excitatory  or inhibitory.
A single real neuron has both types of synapses.
Allowing negative weights is correct. A neuron firing CAN have an
excitatory effect (positive weight) on some of its downline neighbors and an
inhibitory effect (negative weight) on others, based on the type of synapse.
As a side effect, a negative weight allows a node to invert its input (desirable, I think).
*/
class Synapse
{
public:
	Synapse() { from = 0; weight = 0.; }		// default ctor required by TArray (don't use it)
	Synapse(Node* fromnode);					// normal constructor
	Synapse(Node* fromnode, double startweight);// for specifying weight when building from a file

	bool operator == (const Synapse& other) const;

	Node* from;		// pointer to the donor node
	double weight;	// weighting: input from donor = from->output * weight

};                  	    // class Synapse
//////////////////////////////////////////////////////////////////////////////
// a single neuron
class Node
{
public:							// not sure everything needs to be public anymore
	Node();						// default constructor required for container (don't use!)
	Node(int nodetype, int HiddenLayerMembership = 1);	// constructor
	~Node();											// destructor

	bool operator == (const Node& other) const;
	friend ostream& operator << (ostream&, const Node&);

	bool OutputsTo(Node* tonode);	// whether this node already outputs to tonode
	bool Compute();					// sum its inputs and determine its activation state
	bool Backpropagate();			// determine errors at donor nodes

	vector<Synapse> inlist;	// it collects inputs from these nodes
	vector<Node*> outlist;	// these nodes collect inputs from this one

	int Type;				// type of node (INPUT, HIDDEN, OUTPUT)
	int HiddenLayer;        // A Hidden Node is simply assigned to a layer, which is >=1.
	double input;	   		// sum of all inputs
	double output;	   		// output sent when fired
	double bias;			// bias, threshold, or theta.
	double error;	   		// backpropagated error at this node
	double LearningRate;	// (alpha)
	int x, y;				// screen coordinates for display

};                    	// end class Node
//////////////////////////////////////////////////////////////////////////////
/* a Case consists of an array of input values that define ONE problem for the network
and an array of the output values that define the correct answer.
If the Targets array is empty, it is a "real" (not a training) problem,
and the correct answer is unknown.

Note that logical AND is actually 4 training Cases, not 1.  That is, there
are 4 separate problems (each one a Case) that must be solved to know "AND".
*/
class Case
{
public:
	Case();
	~Case();
 
	bool operator == (const Case& other) const { return(&other == this); }
	friend ostream& operator << (ostream& os, const Case&);
	friend istream& operator >> (istream& is, Case&);

	bool HasSameInputs(const Case& other) const;

	vector<double> Inputs;  	// list of values destined for input nodes
	vector<double> Targets;		// list of Onode values that constitute correct answer

	// This is a convenient place to store this status variable:
	// whether the NeuralNet has solved this Case correctly.
	// When they are all true simultaneously, the network is fully trained.
	bool IsSolved;

};                    	// end class Case
//////////////////////////////////////////////////////////////////////////////
// the neural network
class NeuralNet
{
public:

	NeuralNet(); // note the ctor no longer calls CreateNetwork. You must do it explicitly.
	~NeuralNet();

	bool operator == (const NeuralNet& other) const { return(&other == this); }
	friend ostream& operator << (ostream& os, const NeuralNet&);
	friend istream& operator >> (istream& is, NeuralNet&);

	// all are fully regularly-connected networks
	// # of nodes in each layer. 1 hidden layer is default, 3 permitted.
	bool CreateNetwork(size_t icount = 2, size_t h1count = 4, size_t h2count = 0, size_t h3count = 0, size_t ocount = 1);

	int IsBad();								// 0 if all is OK, else error codes.
	bool IsOK();								// true if Ok, false if not, but no error codes.
	int Run();									// execution loop. Call repeatedly to cycle the net.
	void ForwardPass();            				// 1 forward pass to produce outputs
	void Backprop();           					// 1 backward pass to adjust weights
	int Count(int type, int layer = 0) const; 	// how many nodes of this type exist

	void FlushCases();							// empty the data set array
	size_t LoadCases(const string& filename);	// load training data set (Cases) from disk
	bool LoadNextCase();						// load the input Nodes to run the next Case
	bool IsFullyTrained();				// true if solutions have been found for all the training Cases.
	void MarkAllCasesAsTrained(bool trainedvalue);	// Mark all Cases as trained or not.

	vector<Case> Cases;         // the data set containing all Cases

	vector<double> oset;		// translated from Onodes, using TOLERANCE

	// a double-linked list would be a better implementation for this because it allows
	// unlimited node counts (some applications could use many thousands of nodes),
	// yet still allows easy forward and backward propagation.
	vector<Node*> nw; 			// the master array containing all neurons (nodes)

	//-----------------------------------------
	// VARIABLES FOR USE BY THE CALLING APPLICATION FOR SETTING GRAPHIC DISPLAY MODES.
	int graphicon;					// how often to display graphics while processing
	// 0 no display                		FAST but invisible
	// 1 after all Cases are finished  	connection colors don't change, but new ones appear
	// 2 after each set is solved		a little slow, but the cycling shows up
	// 3 every iteration               	very SLOW
	bool shownodenumbers;			// display the node #s (or if false, as dots (slower))
	bool CurrentCaseWasSolved();	// This is public and ONLY for determining whether to
									// show graphics when graphicon == 2.
	//-----------------------------------------

protected:

	void nwFlush();					// empty the nw[] array (deleting its objects, too)
	// initialize a new node and add to master list
	Node* CreateNode(int nodetype, int HiddenLayerMembership);	// set HLM=0 for INPUT,OUTPUT
	bool connect(Node* from, Node* to);							// connect two nodes
	bool connect(Node* from, Node* to, double weight);			// connect two nodes w/weight

	bool IsCurrentCaseSolved(); // calculate raw errors in Onodes; true if all are within TOLERANCE.
	void copyoutputs();			// copy translated o-node values to oset array
	bool CurrentCaseIndexIsLegal();	// Tests for being in legal range.
	bool IsATrainingRun();   	// Cases will be used for training.
	bool IsARealRun();          // Cases has no target values, and so is a real run.
	int WriteOutputsToFile();	// placeholder fn. Writes ending Onode values to disk, or whatever.

	size_t CurrentCaseIndex;	// index into Cases. If -1, no Case in this data set has yet been run.
	double TOLERANCE;			// maximum allowable error for a correct solution

};                		// class NeuralNet
//////////////////////////////////////////////////////////////////////////////

#endif		// network.h

network.cpp

/*	network.cpp					1-27-2010		Microsoft Visual C++ or GNU GCC g++
	This is part of the VCNeural project.
	Copyright (C)1995-97, 2007, 2010 Steven Whitney.
	Published under GNU GPL (General Public License) Version 3, with ABSOLUTELY NO WARRANTY.
	Initially published by http://25yearsofprogramming.com.

Class code implementing the neural network.

--Call Run() repeatedly to cycle the net. With each call, Run() performs the
  next task it needs to do. In Windows, this allows running the network from IdleAction() or
  OnIdle() so it doesn't hog CPU time and coexists peacefully with other applications.
  
------
To Do:

Test by creating and training a network, save it to disk, reload it, and
retest on target datasets. It should solve all with no backprop (although with the current
configuration, it's impossible to be sure there was no backprop without adding a test to the
code; Run() doesn't report it.)
Do several different networks. If it works, then the logic is probably correct.

------------
OTHER NOTES:
--other notes are in neural.doc, complex.doc, biology.doc
--There are many online sources for reading about neural net theory and practice.
  See especially ftp://ftp.sas.com/pub/neural/FAQ.html, which is the FAQ for
  http://groups.google.com/group/comp.ai.neural-nets/topics
  
*/
// The stdafx include cannot be inside a condition directive (gives weird error),
// so you must manually uncomment it for Windows.
#include "stdafx.h"

#include "network.h"

// turn off VC++ warnings about insecure strcpy(). strcpy_s() didn't work properly.
#pragma warning(disable:4996)

using namespace std;

//////////////////////////////////////////////////////////////////////////////
// global code
//----------------------------------------------------------------------------
/* Calculates a random weight for Synapses and Node.bias. It is a global fn
to ensure that the same calculation is always used in both places.
The [-2,2] values are based on an example in another program. Can be modified however needed.
*/
double RandomWeight()
{
// random starting weight from -2 to +2 inclusive, 4 digit precision
double weight = ((double)random(20001)) / 10000.;	// 0 to 2
if(random(2))            							// allow negative weights
	weight = -weight;
return(weight);
}
//////////////////////////////////////////////////////////////////////////////
// class Synapse code
//----------------------------------------------------------------------------
// constructor
Synapse::Synapse(Node* fromnode)
{
from = fromnode;
weight = RandomWeight();
}
//----------------------------------------------------------------------------
// constructor where weight is specified.  (for building net from disk data)
Synapse::Synapse(Node* fromnode, double startweight)
{
from = fromnode;
weight = startweight;
}
//----------------------------------------------------------------------------
bool Synapse::operator == (const Synapse& other) const
{
return(&other == this);
}
//----------------------------------------------------------------------------
// 						end class Synapse
//////////////////////////////////////////////////////////////////////////////
// class Node code
//----------------------------------------------------------------------------
// default constructor that was required by BIDS container classes.
// don't know if the STL vector<> uses it. 
Node::Node()  {}
//----------------------------------------------------------------------------
// constructor for normal use
Node::Node(int nodetype, int HiddenLayerMembership) 
{
Type = nodetype;
input = output = error = 0.;

// Hidden layer membership attribute
HiddenLayer = 0;
if(Type == HIDDEN)
	HiddenLayer = max(HiddenLayerMembership, 1);// start at 1, not 0. max() prevents value < 1

// I thought LearningRate should be 0 < LR <=1, but some examples use > 1.
// One source says different node types should have different rates, so these do.
switch(Type)
{
	case INPUT : LearningRate = 1.00; break;	// value irrelevant for input nodes
	case HIDDEN: LearningRate = 0.15; break;
	case OUTPUT: LearningRate = 0.20; break;
}
/* This is an experimental alternative: random learning rates. Some problems benefit
from a high learning rate, others from a low rate. The idea here is a shotgun approach:
have multiple nodes working on the problem, each with a different learning
characteristic. It seems to help, at least with preventing stuck networks.
It might require a larger number of nodes to have a chance of working well since you want
a good mixture of high and low LR nodes.
*/
LearningRate = 0.001 + ((double)random(5001)) / 10000.;	// 0.001 to .5001, mean = ~0.25

// SET BIAS VALUE
if(Type == INPUT)	// INPUT are always 0.
	bias = 0.;
else
	bias = RandomWeight();	// same as a Synapse because in this app the bias IS the weight.

x = y = 0;	// caller can assign them at any time later, to keep Node platform-independent.
}              						// constructor
//----------------------------------------------------------------------------
// destructor
Node::~Node()
{
outlist.clear();	
inlist.clear();
}
//----------------------------------------------------------------------------
bool Node::operator == (const Node& other) const
{
return(&other == this);
}
//----------------------------------------------------------------------------
/* Write node data to a stream.  This is (improperly) only for formatted display to screen,
or maybe to a file for later review.  A node, along with its connection lists,
is only saved to disk by the NeuralNet object, which doesn't use this fn for its output.
*/
ostream& operator << (ostream& os, const Node& n)
{
char ch;
switch(n.Type)
{
	case INPUT:		ch = 'I'; break;
	case HIDDEN:	ch = 'H'; break;
	case OUTPUT:	ch = 'O'; break;
}
os << ch;
if(n.Type == HIDDEN)
	os << n.HiddenLayer;
os << ": In: " << n.input << "  Out: " << n.output << "  Err: " << n.error
   << "  Bias: " << n.bias << "  LR: " << n.LearningRate << "  WtIn:";

// connection weights
for(size_t j = 0 ; j < n.inlist.size() ; j++)
	os << " " << setprecision(3) << n.inlist[j].weight;
return(os);
}              					// Node operator <<
//----------------------------------------------------------------------------
/* Gather its inputs and determine its output. Making each node responsible
for collecting its inputs at the time it needs them may make feedback
loops more achievable. (The alternative was to make each node inject its output
into the receiving node.)
returns true if computation was done, false if input node list is empty.
*/
bool Node::Compute()
{
// EACH node type can have its own computation and activation function, but each activation
// fn requires its own associated error fn for Backprop.
if(Type == INPUT)
{
	// sensory nodes already have their inputs, but must calculate their output
	// Passthrough activation (= no, or identity, activation function)
	// will work for the current binary training Cases as-is.
	// I'm unclear whether a different fn is better for other types of inputs; it's possible.
	output = input;     					// passthrough
	return(true);
}
// HIDDEN AND OUTPUT NODES
output = 0;									// pre-assign the basic output level
size_t count = inlist.size();				// the container of Synapses.
if(!count)									// no computations to perform
	return(false);

// GATHER ITS INPUTS: GET OUTPUT VALUE FROM EACH SOURCE NODE,

// THIS IS THE "COMBINATION FUNCTION", IN THIS CASE A SIMPLE SUM.
input = bias;         						// initialize to the bias level
size_t i;
for(i = 0 ; i < count ; i++)
	input += inlist[i].from->output * inlist[i].weight;

// APPLY AN ACTIVATION/SQUASHING/TRANSFER FUNCTION APPROPRIATE TO THE NODE TYPE.
if(Type == HIDDEN)
{
	// A source: "Use a tanh (hyperbolic tangent) activation function for the hidden units".
	// Find it and try it. But that means its error calculation method and Backprop will be different.

	if(input < -45)		// floating point overflow avoidance
		output = 0;
	else
		if(input > 45)
			output = 1;
		else output = 1 / (1 + exp(-input));  	// this is the logistic fn
}
else	// OUTPUT NODES
{
	// A source: "For binary targets, use logistic fn"
	// "For continuous-valued targets with no known bounds, use the identity or "linear"
	// activation function (which amounts to no activation function)
	// unless you have a very good reason to do otherwise."
	// (This is so the output values can have arbitrarily large or small values,
	// corresponding to the target values desired.)

	// The activation rule. The function used (if any) is the "activation function".
	// The output rule.
	// Note that logistic function output can never be negative (which is biologically correct),
	// The output is only a magnitude. The synapse determines the +/- effect.
	// (but OUTPUT nodes don't output thru synapses!)

	if(input < -45)		// floating point overflow avoidance
		output = 0;
	else
		if(input > 45)
			output = 1;
		else output = 1 / (1 + exp(-input));  	// this is the logistic fn

	// it is copyoutputs() that makes allowance for the fact that the raw outputs are not exactly 0,1
}
return(true);
}										// Compute
//----------------------------------------------------------------------------
/* Backpropagate the error at this node back to its donor nodes,
then adjust the connection weights accordingly.
On entry here, this->error is the RAW value (correct target answer - this->output)
returns true if any connection weight was changed, false if not
*/
bool Node::Backpropagate()
{
if(Type == INPUT)		// Backprop stops at sensory nodes anyway (no inputs)
	return(false);

if(error == 0.)			// if no error at our node, there's nothing to propagate
	return(false);

if(!inlist.size())		// no nodes to propagate the error to
	return(false);

size_t i;

// APPLY A BACKPROP METHOD APPROPRIATE TO THE NODE TYPE AND THE ACTIVATION FN IT USED.
if(Type == HIDDEN)
{
	// on entry, for a H node, error contains the summed (errorsignal * weight) from all downline nodes.
	// the derivative calculation must be done HERE (not where the error signals are summed) because
	// the calculation depends on the activation function used by THIS node, not its recipient nodes,
	// and different nodes (node types) may have different activation fns.)
	error = error * output * (1 - output);	// logistic activation function
	double LRE = LearningRate * error;      // precalculate to save time in loop.
	for(i = 0 ; i < inlist.size() ; i++)
	{
		// Other programs use a shared bias with fixed output of 1, and adjust the weight to it.
		// I gave each node its own member bias instead.
		// Rather than adjusting the weight of this node to the fixed bias of 1,
		// we can just modify the bias value directly, with the same result.

		bias += LRE * bias;

		// Calculate error at donor node, which it will then backpropagate.
		// All inlist[i].from->error were set to zero at start of Backprop() run.
		// Note that this calculation uses the OLD value of inlist[i]->weight, BEFORE we change it.
		// I don't know if it's right. If it's wrong, reverse the order of the next two lines.

		inlist[i].from->error += error * inlist[i].weight;

		inlist[i].weight += LRE * inlist[i].from->output;

/*
		// If it is ever necessary to keep weight in bounds.
		double limit = MAXDOUBLE / 10;
		if(inlist[i].weight > 0)
			inlist[i].weight = min(inlist[i].weight, limit);
		else
			if(inlist[i].weight < 0)
				inlist[i].weight = max(inlist[i].weight, -limit);
		// I don't know if underflow checks are required (fractions too close to zero).
*/
	}
}
else	// THIS IS FOR OUTPUT NODES
{
	// revise error to be the "error signal" rather than the raw difference error.
	// this calculation depends on the activation function of this node.
	error = error * output * (1 - output);
	double LRE = LearningRate * error;
	for(i = 0 ; i < inlist.size() ; i++)
	{
		bias += LRE * bias;
		inlist[i].from->error += error * inlist[i].weight;
		inlist[i].weight += LRE * inlist[i].from->output;
	}
}
return(true);
}									// Backpropagate
//----------------------------------------------------------------------------
/* tests whether two nodes are already connected, whether this node outputs to the given node.
*/
bool Node::OutputsTo(Node* tonode)
{
// search our outlist for tonode
for(size_t i = 0 ; i < outlist.size() ; i++)
	if(outlist[i] == tonode)
		return(true);
return(false);
}									// OutputsTo
//----------------------------------------------------------------------------
// 									end class Node
//////////////////////////////////////////////////////////////////////////////
// class Case code
//----------------------------------------------------------------------------
// constructor
Case::Case() 
{
IsSolved = false;
}
//----------------------------------------------------------------------------
// destructor
Case::~Case()
{
Inputs.clear();
Targets.clear();
}
//----------------------------------------------------------------------------
// write to a stream.  			// File format is simply 2 lines:
// input1 input2 input3...\n 	// list of input values
// answer1 answer2...\n			// list of output values that are the correct answer
ostream& operator << (ostream& os, const Case& c)
{
size_t count = c.Inputs.size();
size_t i;
for(i = 0 ; i < count ; i++)
	os << c.Inputs[i] << " ";
os << endl;

count = c.Targets.size();
for(i = 0 ; i < count ; i++)
	os << c.Targets[i] << " ";
os << endl;

return(os);
}
//----------------------------------------------------------------------------
// read from a stream.
istream& operator >> (istream& is, Case& c)
{
string s;					// to receive unknown-length file line (can be huge)
double d;
char *buf;

c.Inputs.clear();   		// empty any existing data
c.Targets.clear();

is >> ws;					// eat whitespace in case there are leading blank lines

// READ THE INPUT VALUES

getline(is,s,'\n');            		// read the file line
buf = new char[s.length() * 2 + 1];	// make a matching size char array
strcpy(buf, s.c_str());             // copy the string to it
s.clear();							// free some memory
istringstream inbuf(buf);			// a stream we can read values from
while(inbuf >> d)					// while we can read doubles,
	c.Inputs.push_back(d);			// add them to the Inputs array
delete[] buf;                       // deallocate buf
buf = 0;							// make it null

// READ OUTPUT VALUES
// IF THE LINE IS BLANK, NONE ARE READ, AND THE SET HAS NO CORRECT ANSWER,
// WHICH MAKES IT A "REAL" PROBLEM.

getline(is,s,'\n');
buf = new char[s.length() * 2 + 1];
strcpy(buf, s.c_str());
s.clear();
istringstream inbuf2(buf);  		// can't reuse inbuf
while(inbuf2 >> d)
	c.Targets.push_back(d);
delete[] buf;

// set other variables that were not saved with the file data
c.IsSolved = false;

return(is);
}                        			// operator >>
//----------------------------------------------------------------------------
// a less rigorous alternative to ==
// returns true if both Cases have the same Inputs[] elements in the same order.
// In an array of Cases, all the inputs must be unique.  That is, a
// particular set of inputs can't appear twice with different
// correct answers or the training will be ambiguous.
bool Case::HasSameInputs(const Case& other) const
{
size_t thiscount = Inputs.size();
size_t othercount = other.Inputs.size();

size_t shorter = min(thiscount, othercount);	// just in case
for(size_t i = 0 ; i < shorter ; i++)
	if(Inputs[i] != other.Inputs[i])    	// if 1 doesn't match, we're ok
		return(false);

return(true);
}                     			// HasSameInputs
//----------------------------------------------------------------------------
// 									class Case
//////////////////////////////////////////////////////////////////////////////
// class NeuralNet code
//----------------------------------------------------------------------------
// constructor
NeuralNet::NeuralNet() 
{
graphicon = 2;				// 2 is the best default.
shownodenumbers = true;		// this starts out checked in WNeural.cpp View menu

// Starts at -1 because it is (and must be) incremented BEFORE each use so it always
// refers to the Case currently being run.
// However note that NO value is legal for it when Cases is empty (as it is now).
// The start of value of -1 is a FLAG that LoadNextCase() has never been run on this data set.
CurrentCaseIndex = -1;

// maximum difference between two doubles to be considered equal, .1, .01 work well.
// 0 probably risks overflow error, but anything greater seems ok.  .0001 worked.
// lower tolerance takes longer, but gives you longer to watch.

TOLERANCE = 0.1;		// parameters could become inheritable later

}		                        	// constructor
//----------------------------------------------------------------------------
// destructor
NeuralNet::~NeuralNet()
{
Cases.clear();
oset.clear();
nwFlush();
}
//----------------------------------------------------------------------------
/* write to a stream all data required for recreating the network.
Format:
inodecount h1count h2count h3count onodecount TOLERANCE\n
node# nodeBias nodeLearningRate\n	<-list of nodes with their Bias and LearningRate
node# nodeBias nodeLearningRate\n
etc...
fromnode# tonode# weight\n  		<-list of connections with their weights
fromnode# tonode# weight\n
etc...
*/
ostream& operator << (ostream& os, const NeuralNet& n)
{
// write how many nodes of each type
os << n.Count(INPUT) << " "
   << n.Count(HIDDEN,1) << " "
   << n.Count(HIDDEN,2) << " "
   << n.Count(HIDDEN,3) << " "
   << n.Count(OUTPUT) << " "
   << n.TOLERANCE << endl;

// write each tonode's bias, LearningRate
size_t nwcount = n.nw.size();
size_t i, j, k;
for(i = 0 ; i < nwcount ; i++)
{
	os << n.nw[i]->bias << " " << n.nw[i]->LearningRate << endl;
}
// write all connection data
for(i = 0 ; i < nwcount ; i++)				// i is the tonode
{
	// locate (the array index of) each of this node's source nodes and the weight
	size_t incount = n.nw[i]->inlist.size();
	for(j = 0 ; j < incount ; j++)			// j iterates source nodes that output to us
	{										// but we need to find its index in nw[]
		for(k = 0 ; k < nwcount ; k++)		// on loop exit, k is the fromnode's index in nw[]
			if(n.nw[i]->inlist[j].from == n.nw[k])		
				break;
		
		os << k << " " << i << " " << n.nw[i]->inlist[j].weight << endl;
	}
}
return(os);
}                      	// end <<
//----------------------------------------------------------------------------
/* read from a stream and reconstruct the identical network.
could be the basis for a NeuralNet::NeuralNet(const string& filename) constructor
*/
istream& operator >> (istream& is, NeuralNet& n)
{
size_t i, icount, h1count, h2count, h3count, ocount, tolerance;

// read specified # of each node type, and tolerance
is >> icount >> h1count >> h2count >> h3count >> ocount >> tolerance;

if(is.fail()) 		// avoid destroying existing network if read already failed
	return(is);

// destroy any existing network
n.FlushCases();
n.oset.clear();
n.nwFlush();

// create network to match
n.TOLERANCE = tolerance;

for(i = 0 ; i < icount ; i++)  n.CreateNode(INPUT,0);
for(i = 0 ; i < h1count ; i++) n.CreateNode(HIDDEN,1);
for(i = 0 ; i < h2count ; i++) n.CreateNode(HIDDEN,2);
for(i = 0 ; i < h3count ; i++) n.CreateNode(HIDDEN,3);
for(i = 0 ; i < ocount ; i++)  n.CreateNode(OUTPUT,0);

double bias, learningrate;
size_t total = n.nw.size();
for(i = 0 ; i < total ; i++)
{
	is >> bias >> learningrate;
	n.nw[i]->bias = bias;
	n.nw[i]->LearningRate = learningrate;
}

// connect the node #s using weight
int fromnode, tonode;
double weight;
while(is >> bias >> learningrate >> fromnode >> tonode >> weight)
	n.connect(n.nw[fromnode],n.nw[tonode],weight);
return(is);
}                     	// end >>
//----------------------------------------------------------------------------
/* Placeholder function to use for writing the ending Onode values to disk file for analysis,
display, or further processing.
For a "real" run, it is called after the ForwardPass of each Case. Otherwise the output is lost,
replaced when the next Case is processed.
There is not much point calling it for a training run.

Currently it writes the output in the 2-line format of a .TRS (training set) file:
inputs (that it started with)...\n
outputs (the calculated output node values)...\n
*/
int NeuralNet::WriteOutputsToFile()
{
// There is currently no method to change this file name, except here.
ofstream outfile("results.trs",ios::app);	// APPEND mode

size_t count = nw.size();
size_t i;
for(i = 0 ; i < count ; i++)
	if(nw[i]->Type == INPUT)
		outfile << nw[i]->input << " ";
outfile << endl;

for(i = 0 ; i < count ; i++)
	if(nw[i]->Type == OUTPUT)
		outfile << nw[i]->output << " ";
outfile << endl;

return(1);
}                     				// WriteOutputsToFile
//----------------------------------------------------------------------------
/* Tests CurrentCaseIndex for being in legal range.
*/
bool NeuralNet::CurrentCaseIndexIsLegal()
{
return((Cases.size() > 0) && (CurrentCaseIndex >= 0) &&
	(CurrentCaseIndex < Cases.size()));
}                     				// CurrentCaseIndexIsLegal
//----------------------------------------------------------------------------
/* Call this with false after every action that renders prior training no longer valid.
It is called with true when LoadCases loads a set of cases that have no associated correct
outputs. That means it is a "real" (not training) run, and the network is assumed to
be trained.
*/
void NeuralNet::MarkAllCasesAsTrained(bool trainedvalue)
{
for(size_t i = 0 ; i < Cases.size() ; i++)
	Cases[i].IsSolved = trainedvalue;
}                     				// MarkAllCasesAsTrained
//----------------------------------------------------------------------------
/* The loaded Cases will be used for training.
*/
bool NeuralNet::IsATrainingRun()
{
return(!IsARealRun());
}                     				// IsATrainingRun
//----------------------------------------------------------------------------
/* Cases contains no Cases that have targets, so it is a real run.
You should not mix training and real runs in one training set.
*/
bool NeuralNet::IsARealRun()
{
size_t count = Cases.size(), i;
for(i = 0 ; i < count ; i++)
	if(Cases[i].Targets.size())
		return(false);
return(true);
}                     				// IsARealRun
//----------------------------------------------------------------------------
/*  returns 0 if OK, else one negative error code (at a time). There can be multiple errors.
*/
int NeuralNet::IsBad()
{
if(!nw.size())			// -1 = network is not created
	return(-1);
if(!Cases.size())   	// -2 = there is no data set (training or real) loaded
	return(-2);

size_t i, inodes = Count(INPUT), onodes = Count(OUTPUT);
for(i = 0 ; i < Cases.size() ; i++)
{
	if(Cases[i].Inputs.size() != inodes)		// -3 = Inputs != input nodes
		return(-3);
	if(Cases[i].Targets.size() != onodes)		// -4 = Outputs != output nodes
		return(-4);
}
return(0);							// everything is ok, ready to start processing.
}                     				// IsBad
//----------------------------------------------------------------------------
// true if Ok, false if not, but no error codes.
bool NeuralNet::IsOK()
{
return(!IsBad());
}                     				// IsOK
//----------------------------------------------------------------------------
// count how many nodes of this type are in the net. layer is ignored for INPUT, OUTPUT.
int NeuralNet::Count(int type, int layer) const
{
int tally = 0;
size_t i, count = nw.size();
for(i = 0 ; i < count ; i++)
{
	switch(type)
	{
		case INPUT:
		case OUTPUT:
			if(nw[i]->Type == type)
				tally++;
			break;
		case HIDDEN:
			if((nw[i]->Type == type) && (nw[i]->HiddenLayer == layer))
				tally++;
		break;
	}
}
return(tally);
}                     				// Count
//----------------------------------------------------------------------------
/* Initialize a new neuron and add to master list.
Set HiddenLayerMembership to 0 for INPUT, OUTPUT nodes.
returns a pointer to the new node so it can be wired into the net.
*/
Node* NeuralNet::CreateNode(int nodetype, int HiddenLayerMembership)
{
Node* newnode = new Node(nodetype, HiddenLayerMembership);	// create and initialize the node
nw.push_back(newnode);
return(newnode);
}									// CreateNode
//----------------------------------------------------------------------------
/* deletes all the (newed) objects in nw, then empties the array.
*/
void NeuralNet::nwFlush()
{
// using gcnew would eliminate need to delete them
for(vector<Node*>::iterator i = nw.begin() ; i != nw.end() ; i++)
	delete *i;
nw.clear();	
}									// nwFlush
//----------------------------------------------------------------------------
/* Create (or recreate) all the neurons and establish their connections to each other.
returns true if successful, false if not. If it returns false, the network is EMPTY and invalid.
*/
bool NeuralNet::CreateNetwork(size_t icount, size_t h1count, size_t h2count, size_t h3count, size_t ocount)
{
// destroy existing network, if any.
nwFlush();
MarkAllCasesAsTrained(false);		// the new network is not trained on the loaded set.

// STRICTLY ENFORCE RULES.
if( (icount < 1) || (h1count < 0) || (h2count < 0) || (h3count < 0) || (ocount < 1) ||
	( (h1count < 1) && ((h2count > 0) || (h3count > 0)) ) ||
	( (h2count < 1) && (h3count > 0) ) )
		return(false);

// create the neurons.  The creation order determines calculation order, too,
// and several routines depend on them having been created in this order.
// There is no need for an explicit layer class.
size_t i;
for(i = 0 ; i < icount ; i++)	CreateNode(INPUT,0);
for(i = 0 ; i < h1count ; i++)	CreateNode(HIDDEN,1);
for(i = 0 ; i < h2count ; i++)	CreateNode(HIDDEN,2);
for(i = 0 ; i < h3count ; i++)	CreateNode(HIDDEN,3);
for(i = 0 ; i < ocount ; i++)	CreateNode(OUTPUT,0);

// NOW FULLY CONNECT EACH NODE TO DESIGNATED NUMBER OF OTHER NODES.
size_t k, nodecount = nw.size();
bool usenextlayer;						// used in hidden nodes case below.
int nextlayer;							// the "next" layer when connecting layers
for(k = 0 ; k < nodecount ; k++)
{
	switch(nw[k]->Type)
	{
		case INPUT:		// each input node outputs to each H1 node...
			if(h1count > 0)
			{
				for(i = k+1 ; i < nodecount ; i++)
					if((nw[i]->Type == HIDDEN) && (nw[i]->HiddenLayer == 1))
						if(!connect(nw[k],nw[i]))
						{
							nwFlush();
							return(false);
						}
			}
			else		// but if there are NO hidden nodes, they connect to OUTPUT.
			{
				for(i = k+1 ; i < nodecount ; i++)
					if(nw[i]->Type == OUTPUT)
						if(!connect(nw[k],nw[i]))
						{
							nwFlush();
							return(false);
						}
			}
			break;
		case HIDDEN:	// each H node connects to each node in next hidden layer, OR to OUTPUT if none.
			switch(nw[k]->HiddenLayer)
			{
				case 1: usenextlayer = (h2count > 0); break;
				case 2: usenextlayer = (h3count > 0); break;
				case 3: usenextlayer = false;		  break;
			}
			if(usenextlayer)
			{
				nextlayer = nw[k]->HiddenLayer + 1;
				for(i = k+1 ; i < nodecount ; i++)
					if((nw[i]->Type == HIDDEN) && (nw[i]->HiddenLayer == nextlayer))
						if(!connect(nw[k],nw[i]))
						{
							nwFlush();
							return(false);
						}
			}
			else  	// but if there are NO more hidden layers, they connect to OUTPUT.
			{
				for(i = k+1 ; i < nodecount ; i++)
					if(nw[i]->Type == OUTPUT)
						if(!connect(nw[k],nw[i]))
						{
							nwFlush();
							return(false);
						}
			}
			break;
		case OUTPUT:		// output nodes connect to nothing
			break;
	}
}											// end for(connect all nodes)
return(true);
}						// end CreateNetwork
//----------------------------------------------------------------------------
/* connect two specified existing neurons
doesn't check whether connection between the two already exists, so check before calling.
returns true
*/
bool NeuralNet::connect(Node* fromnode, Node* tonode)
{
fromnode->outlist.push_back(tonode);			// add to fromnode's output list
tonode->inlist.push_back(Synapse(fromnode)); 	// add to tonode's donor list
return(true);
}									// connect
//----------------------------------------------------------------------------
/* connect two specified existing neurons with a specified weight.
(for constructing a net from disk file)
doesn't check whether connection between the two already exists.
returns true.
*/
bool NeuralNet::connect(Node* fromnode, Node* tonode, double weight)
{
fromnode->outlist.push_back(tonode);					// add to fromnode's output list
tonode->inlist.push_back(Synapse(fromnode,weight));	// add to tonode's donor list
return(true);
}									// connect
//----------------------------------------------------------------------------
/* Determines whether the NeuralNet is fully trained on all data sets.
returns true if all Cases.IsSolved are true, false if any is false.
*/
bool NeuralNet::IsFullyTrained()
{
for(size_t i = 0 ; i < Cases.size() ; i++)
	if(!Cases[i].IsSolved)
		return(false);
return(true);
}									// IsFullyTrained
//----------------------------------------------------------------------------
/* This is public and ONLY for use by the calling app to determine whether to
show graphics when graphicon == 2. Otherwise there's no way to tell because
that isn't one of Run()'s return values.
returns true if Cases[CurrentCaseIndex].IsSolved is true.
*/
bool NeuralNet::CurrentCaseWasSolved()
{
if(CurrentCaseIndexIsLegal())				// prevent fatal indexing error in next line
	return(Cases[CurrentCaseIndex].IsSolved);
return(false);
}									// CurrentCaseWasSolved
//----------------------------------------------------------------------------
/* Determine whether output nodes contain the correct answer, if known, for one Case,
and sets the RAW error value of each o-node as the starting point for Backprop().
RAW error = desired output - actual output. It is translated to ERROR SIGNAL later.
Returns true if solution for this set was found, false if not.
The result is also stored in the Case object IsSolved variable.
*/
bool NeuralNet::IsCurrentCaseSolved()
{
if(!CurrentCaseIndexIsLegal())				// prevent fatal indexing error below
	return(false);

Cases[CurrentCaseIndex].IsSolved = true;	// any node with an error turns it false
int tindex = 0;                     		// index into Targets array
size_t i, count = nw.size();
for(i = 0 ; i < count ; i++)
	if(nw[i]->Type == OUTPUT)   			// only test output nodes
	{
		// REMEMBER NOT TO BREAK FROM THIS LOOP AS SOON AS YOU KNOW THE SOLUTION WASN'T FOUND,
		// BECAUSE YOU'RE CALCULATING THE ERRORS ALSO, FOR *ALL* THE NODES.
		// calculate the error.  nw[i]->output was already calculated in Compute()
		nw[i]->error = Cases[CurrentCaseIndex].Targets[tindex] - nw[i]->output;
		tindex++;

		if(fabs(nw[i]->error) > TOLERANCE)
			Cases[CurrentCaseIndex].IsSolved = false;
	}
return(Cases[CurrentCaseIndex].IsSolved);
}									// IsCurrentCaseSolved
//----------------------------------------------------------------------------
/* make ONE computational pass forward through the network,
Traverses nodes in sequential order, the order in which they were born, which makes this
synchronous updating. For asynchronous, you'd select some number of the nodes at random
and only calculate those (like the agent cells in a WAdapt colony).
*/
void NeuralNet::ForwardPass()
{
size_t i, count = nw.size();
for(i = 0 ; i < count ; i++)
	nw[i]->Compute();

if(IsARealRun())		// For a real run, the Onodes contain the answer, whatever it is.
	WriteOutputsToFile();
}									// ForwardPass
//----------------------------------------------------------------------------
/* make ONE computational pass backwards through the network,
calculating errors and adjusting connection weights.
*/
void NeuralNet::Backprop()
{
// first reset all errors to 0 because error calculations are additive.
int i, count = (int)nw.size();
for(i = 0 ; i < count ; i++)
	if(nw[i]->Type != OUTPUT)		// must preserve o-node errors set by IsCurrentCaseSolved()
		nw[i]->error = 0;           // because they are our starting point.

for(i = count - 1 ; i >= 0 ; i--) 	// start at end and traverse backwards
	nw[i]->Backpropagate();			// propagate node's error to its inputs
									// and adjust connection weights.

// This sets ALL Cases[i].IsSolved = false here, because every time
// you run backprop, adjusting the weights destroys the training to some degree.
// So it is only fully trained if it has gotten thru all the sets
// without calling Backprop even one time.
MarkAllCasesAsTrained(false);

}									// Backprop
//----------------------------------------------------------------------------
/* Transfer the current o-node values to the oset output array.  Values in
the array are translated using TOLERANCE to exactly 0 or exactly 1.
They can then be used 1) to build a compound output (byte or string),
which could then be used to trigger a complex set of actions) or
2) as passthrough input to another network.
It assumes that output does not have a mid-range value (i.e. it really can
be translated to either 0 or 1).  A midrange value is translated to 0.
*/
void NeuralNet::copyoutputs()
{
oset.clear();						// empty previous values

size_t i, count = nw.size();
for(i = 0 ; i < count ; i++)
	if(nw[i]->Type == OUTPUT)
	{
		// this calculation isn't very good, although it works for the binary sets.
		oset.push_back((nw[i]->output < (1. - TOLERANCE)) ? 0. : 1.);
	}
}                    				// copyoutputs
//----------------------------------------------------------------------------
void NeuralNet::FlushCases()
{
Cases.clear();
CurrentCaseIndex = -1;
oset.clear(); 			// any existing oset is from an old Case
}                             		// FlushCases
//----------------------------------------------------------------------------
/* read all data set info from disk.  A file containing multiple training
Cases simply has multiple 2-line pairs, as described above for one set.
returns the number of Cases loaded (0 if bad file or ANY input Case was not unique).
If it returns 0, Cases is empty and invalid.
*/
size_t NeuralNet::LoadCases(const string& filename)
{
FlushCases();	// this sets CurrentCaseIndex to its "unused" value of -1

ifstream infile(filename.c_str());
if(!infile)
	return(0);

Case t;
while(infile >> t)					// while you can load a case
{
	for(size_t i = 0 ; i < Cases.size() ; i++)
		if(Cases[i].HasSameInputs(t))
		{
			FlushCases();           // empty ALL the Cases as invalid
			return(0);				// failure
		}
	Cases.push_back(t);
}

// if this is a real run, the net is assumed to be trained and ready for it (whether it is or not).
// if it is a training set, they are all Untrained.
MarkAllCasesAsTrained(IsARealRun());

return(Cases.size());
}					 				// LoadCases
//----------------------------------------------------------------------------
/* Set CurrentCaseIndex to the NEXT Case and load the input Nodes with the input variables for it.
It is no longer necessary to load an answer array because it is part of the Case, so
CurrentCaseIndex inherently provides a path to it. (this fn was formerly poseproblem()).
returns true if it succeeded, false if no Case was loaded (Cases is empty or other reasons).
*/
bool NeuralNet::LoadNextCase()
{
if(!Cases.size())
	return(false);

// These calculations also ensure that CurrentCaseIndex never has an out-of-bounds value,
// except for its initial -1 value which flags that no Case has ever been run.
// You must never set/increment/decrement its value anywhere else, except the limited
// places where it is set to -1.
if(IsARealRun())
{
	// Each Case is only run once, and all have been run. No more to do.
	if(CurrentCaseIndex == (Cases.size() - 1))
		return(false);
}
else	// a training set run cycles through all the cases until all are solved
{
	if(CurrentCaseIndex == (Cases.size() - 1))
		CurrentCaseIndex = -1;
}
CurrentCaseIndex++;	// This was the standard method: increment to the next case.
// This is an experimental alternative: present the cases in random order.
// It might help keep the net from getting stuck, or it might only make it less likely
// to START with a set that will make it get stuck, which isn't really much help.
// It will require more iterations due to not efficiently traversing the sets 0 to N in order,
// but repeating already-trained Cases shouldn't hurt much. It also makes the reports of
// iteration counts inexact. But it does seem to help.
CurrentCaseIndex = random(Cases.size());

int index = 0;							// index into Inputs[]
size_t i, count = nw.size();
for(i = 0 ; i < count ; i++)			// run through all the nodes
	if(nw[i]->Type == INPUT)			// but you only fill INPUT nodes
		nw[i]->input = Cases[CurrentCaseIndex].Inputs[index++];
return(true);
}          				       		// LoadNextCase
//----------------------------------------------------------------------------
/*  This is the NeuralNet's execution loop. Call repeatedly to cycle the net.
If there is nothing to do, it does nothing and returns.
This was nearly called DoNextThing() because that's what it does. It tracks what it's in the
middle of doing, and does the next small step of it.
returns:
	2 = FINISHED because all Cases in a REAL RUN have been processed.
	1 = FINISHED because the network is fully trained (training run)
	0 = you need to call Run() again because processing is not completed.
		Either a REAL run has not finished processing all its Cases, OR
		A TRAINING run has not yet solved all Cases.
   -N = Negative error that encodes the reason it could not do any processing at all.

*/
int NeuralNet::Run()
{
int errorcode = IsBad();	// NeuralNet not created or no data set loaded, etc = not ready.
if(errorcode)
	return(errorcode);		// Inform caller why not ready. Error codes are negative numbers.

// If it's a REAL run, do one forward pass of one Case. Multiple Cases are allowed so
// they can all be read from one file and the output of each written to a disk file.
if(IsARealRun())
{
	if(LoadNextCase())  // this will try to run one Case. returns 0 if there are none left to do.
	{
		ForwardPass();	// this fn writes the result to disk file; else it would be lost.
		return(0);		// not finished processing
	}
	else
		return(2);  	// the entire real run is completed, no more Cases to process.
}
else	// If it's a TRAINING run
{
	if(IsFullyTrained())		// If it's fully trained, there's nothing to do.
		return(1);

	// These ifs are not collapsed into one, to make the logic more clear.
	if(CurrentCaseIndex == -1)	// If this is the first-ever Case processed, force it to load.
		LoadNextCase();
	else
	{
		// Whenever training is successful for one Case, IsSolved for that Case is set true,
		// so you can use that as a flag for whether the current Case is finished processing
		// and you can move on to the next.
		if(CurrentCaseIndexIsLegal())	// prevent next line causing fatal indexing error
			if((Cases[CurrentCaseIndex].IsSolved))
				LoadNextCase();
	}
	// Most of the time, the fn will run only these few lines: Forward, Backprop, and return.
	ForwardPass();
	if(!IsCurrentCaseSolved())	// calculates Onode error values for backprop use.
		Backprop();
	else
		copyoutputs();	// the program currently doesn't use these output values for anything.

	return(IsFullyTrained());	// 0 or 1
}
}									// Run
//----------------------------------------------------------------------------
// 									 class NeuralNet
//////////////////////////////////////////////////////////////////////////////

 

Valid HTML 4.01 Transitional Valid CSS
Yahoo! Search
Search the web Search this site
View content labeling at ICRA.