25 Years of Programming
An open source source for C, C++, OWL, BASIC, MDB, XLS, DOT, and more...
Home   Projects   Up   Sitemap   Search   Blog   Forum+Chat   About Us   Privacy   Terms of Use   Feedback   FAQ   Images   Services   Ads   Donate   Humor

Classifier (Rule Set) classes for the natural language processing chatbot program in C++

This is the "classifier module" for the WTalk.cpp natural language processing chatbot project. It defines the C++ classes that encode the elements of the current situational state (potential stimuli), encode the rules the program uses to generate responses, and a class that manages the set of rules, performs the lottery, etc.

This module was inspired by John Holland's classifier system concept. I first implemented it in my MSDOS program Classify.cpp. While working with it, I realized that it had potential for expansion and generalization for multiple uses. The problem with the original, however, was in its numeric coding of states and responses. This module was the result of generalizing it.

Unless memory is at a premium, there's no point using tightly packed numeric strings which must be translated to and from whatever they encode. What developed here was a system that uses plain text strings and keywords to describe states (stimuli) and plain text action lists to define the corresponding responses or list of responses. This wandered away from the original concept of a classifier system, probably turning into a traditional rule based system. But it evolved so directly from the original classifier system, and still seemed so closely related to it, that while developing it I wondered seriously what the difference was between a classifier system and a rule-based system, or whether there really was any difference.

The idea of text strings for stimuli and responses is hardly revolutionary today, when textual property lists of program objects are common, but at the time it was very exciting. Today, in addition to property lists, there are other things that were previously compactly coded but are now being encoded and utilized in the form of plain text: RTF (Rich Text Format), HTML, XML, etc.

There were two very exciting aspects of this usage in this program:

  1. It is, after all, a program whose purpose is to understand natural language.  What could be more important, then, than to encode the program's internal descriptions of existing environmental states using natural language phrases which the program (in some later stage of development) could use its own natural language parsing capabilities to interpret and understand?  That is, maybe now the encoded phrase meaning it is raining outside would be "raining outside". At the moment, that would be interpreted by an interpreting engine and the rule-based response generator.  But in the future, "raining outside" is a meaningful English phrase that the program itself should easily be able to interpret itself, eliminating the engine, or rather becoming the engine. And doing so is not that great a jump in programming methodology.
     
  2. The action lists are similarly able to be transferred into the database eventually. They are meta-programs that consist of simple lists of the elemental subunits that the program is capable of executing, which are called when encountered. Currently, the response-generating engine translates the textual action lists into actual actions, but at some point it will be only a small step to transfer those action lists into the database as the operational definitions of those words and phrases. An operational definition is a procedure (program or subroutine) that when followed actually causes the action to be performed, as contrasted with a lexical definition, which is just a text description.

    One of the goals of the program is to reach a point where
  • Words and phrases that refer to sensory states are operationally defined in that they are represented in the database as sets of the values that program variables had when that state existed, i.e. as memories of what the program was experiencing at that time, and the only thing it can truly experience is the values of its variables.
     
  • Words and phrases that refer to actions are operationally defined in that they are represented in the database as actual subroutines that the program can follow to perform those actions.

Once a word or phrase is defined in these terms, it really is defined in terms of the program's experience, and is not just a bunch of text.

 

clasfier.h

/*	clasfier.h	        3/2/02
	This is part of the WTalk project.
	Copyright (C)1999-2000, 2002 Steven Whitney.
	Published under GNU GPL (General Public License) Version 3, with ABSOLUTELY NO WARRANTY.
	Initially published by http://25yearsofprogramming.com.

*/
#ifndef __CLASFIER_H
#define __CLASFIER_H

#include <classlib\arrays.h>
#include "c:\bcs\my.h"
#pragma hdrstop

//////////////////////////////////////////////////////////////////////////////
// globals
//----------------------------------------------------------------------------
// These are logical operators used for defining a Rule's stimulus clause, to
// determine whether a Rule is pertinent to the current environmental state, i.e. is it
// a bidder?  does the current state match its stimulus clause?
// CONTAINS is for finding key phrases in the user input string.  It is the
// only operator that isn't equally valid for string and int types,
// BUT it's ok to use it generally, and even use it in mutations:
// even though it's meaningless when applied to int types, it CAN be tested
// for without error, and will never evaluate TRUE, anyway.
// (a keyword will never be found inside an int).
enum ops { LT = 1, LE, EQ, GE, GT, NE, CONTAINS, DNCONTAIN };	// StateNode operators
//          1       2   3   4   5   6     7          8
//////////////////////////////////////////////////////////////////////////////
// StateNode is a dual-purpose class, used as a state-reporting node in a StateArray,
// and as the "if" clause in a Rule.
class StateNode
{
public:
	StateNode(); 									// default ctor sets nothing
	StateNode(const string& name, const string& val, int opr = EQ);
	StateNode(const string& name, uint val, int opr = EQ);
	StateNode(const StateNode& other) { *this = other; };
	virtual ~StateNode();

	virtual StateNode& operator = (const StateNode& other);
	virtual BOOL operator == (const StateNode& other) const;
	virtual BOOL operator != (const StateNode& other) const { return(!(other == *this)); }
	virtual BOOL operator < (const StateNode& other) const;

	friend ostream& operator << (ostream& os, const StateNode& s);	// write
	friend istream& operator >> (istream& is, StateNode& s);		// read

	// functions
	virtual int mutate();

	// variables
	string varname;		// name of the variable (word or phrase ok)
	int op;		  		// (encoded) relationship to test for (or that exists)
	string value;		// variable's value, as text.  if numeric, positive only, and
						// 0-padded to 5 digits so operators work properly.
protected:
};
//----------------------------------------------------------------------------
// 						end class StateNode
//////////////////////////////////////////////////////////////////////////////
// StateArray contains the current values of all the state variables,
// in text form readable and searchable by the rules. (i.e., a central bulletin board).
// many state variables may exist here alone, not as permanent members anywhere else.
class StateArray : public TArrayAsVector<StateNode>		// maybe sorted by varname?
{
public:
	StateArray(int upper = 100);
	StateArray(const StateArray& other);
	virtual ~StateArray();

	virtual StateArray& operator = (const StateArray& other);

	friend ostream& operator << (ostream& os, const StateArray& s);	// write
	friend istream& operator >> (istream& is, StateArray& s);		// read

	// array functions
	virtual int Add(const StateNode&);

	// functions
	string GetValue(const string& varname);	// return value of the given variable
	string Remove(const string& varname);	// delete node, if any, holding varname

	// variables

protected:
};
//----------------------------------------------------------------------------
// 						end class StateArray
//////////////////////////////////////////////////////////////////////////////
// a rule links a set of conditions with the action(s) to take when those conditions exist.
class Rule
{
public:
	Rule(int newbid = 1000);
	Rule(const Rule& other);
	Rule(Rule*, Rule*, int newbid = 1000, int mutationrate = 1);	// mating constructor
	virtual ~Rule() {}
	virtual void init();											// common setup

	virtual Rule& operator = (const Rule& other);             // assignment
	virtual BOOL operator == (const Rule& other) const;
	virtual BOOL operator < (const Rule& other) const;

	friend ostream& operator << (ostream& os, const Rule&); // read
	friend istream& operator >> (istream& is, Rule&);		// write

	// functions
	virtual BOOL SameAs(const Rule& other) const;	// like operator ==, but compares members
	virtual double Similarity(const Rule& other);	// measure of similarity.  not yet written
	void paysource(int denominator);		// pay source 1/denominator of its bid value
	void AdjustBid(int amount);				// receive feedback amount w/o overflowing or < 0
	// #error if this works, give it a better name to avoid confusion
	void AdjustBid2(double factor);			// adjust by a percentage instead of absolute
	void SupportBid(int amount);			// price support: bring bid up to at least amount

	// variables
	static ulong LastID;// for consecutive numbering of Rules
	ulong id;			// unique identifier, which the rule keeps when written to disk.
	int bid;    		// strength rating.  must be int.  see complex.doc.
	uint wincount;		// # of times it's won lottery. not yet used.
	BOOL isabidder;		// is eligible in next lottery
	BOOL wasawinner;	// whether it was a winner in the last lottery
	Rule* source;		// the rule that this rule will pay passback to.
	BOOL locked;		// if locked, it is an essential rule.
						// the rules that give the pgm its minimum required functionality,
						// (askaboutprevious,etc.) must never be deleted.
						// (and ruleset must never become empty.)
						// later, could be int: 0=not locked, others code reason for locking
						// Currently, except for the essential rules, which start and
						// stay locked, no rule should ever be locked OR unlocked
						// by the pgm.

	TArrayAsVector<StateNode> IfNodes;	// "AND" nodes of a possibly compound "IF" clause
										// empty means if(1) (always true)
	TArrayAsVector<string> Actions;		// list of actions. empty means do nothing.
										// If you learn a name for a set of actions, you
										// can copy the set to dic as its op.def.,
										// and call it by name(?).
protected:
	virtual int mutate();				// mutate IfNodes or Actions
};
//----------------------------------------------------------------------------
// 							end class Rule
//////////////////////////////////////////////////////////////////////////////
// a RuleSet performs all functions required to manage a set of rules.
class RuleSet
{
public:
	RuleSet(uint startcount = 1000, string filename = "");
	RuleSet(const RuleSet&);						// copy constructor
	virtual ~RuleSet();                             // destructor

	friend ostream& operator << (ostream& os, const RuleSet&);	// write
	friend istream& operator >> (istream& is, RuleSet&);		// read

	// array functions allow treating the set as though it were derived from its array
	Rule * & operator [] (int loc) const { return(Array[loc]); }// matches TIArray[]
	uint GetItemsInContainer() { return(Array.GetItemsInContainer()); }
	Rule* Find(ulong ruleid);				   	// ptr to rule with this id, or 0

	// functions
	BOOL Add(const Rule&);						// add a copy of a specififed rule to Array
	BOOL Addchild(BOOL preferbidders = FALSE);	// select parents & add a child rule to Array
	BOOL AddCustom();							// make rule to respond to current situation
	int CountBidders();							// count number of current bidders
	int HighestBid();							// return highest bid any Rule has
	BOOL IsASource(const Rule* r);				// is r a source for any other rule?
	BOOL IsOK();								// successfully loaded?
	int KillWeakest(int dcount = 1);	// delete the given number of weakest non-locked rules
	Rule* Lottery(BOOL biddersonly);			// chance of winning is proportional to bid
	int MarkBidders();							// mark all Rules that address the current state
	int PayAllWinners(int amount, int denominator = 0);	// returns number of winners paid
	// #error if this works, give it a better name to avoid confusion
	int PayAllWinners2(double factor, int denominator = 0);	// returns number of winners paid
	long SumBids(BOOL biddersonly); 			// TRUE=bidders only, FALSE=all rules eligible
	int SupportAllWinners(int amount);			// returns number of winners paid
	BOOL Sweep(int minbid = 100);				// delete all rules with bid < minimum
	Rule* Winner();								// find latest winner
	void ZeroFlags(BOOL waw, BOOL iab, BOOL src);	// reset wasawinner, isabidder, source
	uint Load(const string& filename);

	// variables
	// the ruleset contains the StateArray that its Rules address
	StateArray State;

	// Don't use sorted: first rule encountered in KillWeakest() must be the oldest.
	// indirect: rules must be individually accessible outside the container
	// keep considering protected derivation from TIArray, and/or as TArray<Rule*>
	// Don't derive it: Eventually, the rule set could become so large that it must be
	// broken into subsets, each specialized for a particular situation.
	// So Talker COULD wind up with an array OF RuleSets.
	TIArrayAsVector<Rule> Array; 	// the set of rules (or "classifiers")

protected:
	string sourcefile;  	// file the set was loaded from
	BOOL writeonexit;
};
//----------------------------------------------------------------------------
// 							end class RuleSet
//////////////////////////////////////////////////////////////////////////////

#endif			// clasfier.h

clasfier.cpp

/*	clasfier.cpp         3/1/02
	This is part of the WTalk project.
	Copyright (C)1999-2000, 2002 Steven Whitney.
	Published under GNU GPL (General Public License) Version 3, with ABSOLUTELY NO WARRANTY.
	Initially published by http://25yearsofprogramming.com.

*/
#include "clasfier.h"
#include "c:\bcs\library\filearay.h"

//////////////////////////////////////////////////////////////////////////////
// class StateNode
//----------------------------------------------------------------------------
// default ctor is useless except required for array use, and sets nothing.
StateNode::StateNode() {}
//----------------------------------------------------------------------------
// constructor for strings
StateNode::StateNode(const string& name, const string& val, int opr)
{
varname = name;
op = opr;
value = val;
}                       	// constructor
//----------------------------------------------------------------------------
// constructor for uint
StateNode::StateNode(const string& name, uint val, int opr)
{
varname = name;
op = opr;
char buf[6];  			// create an equivalent string, 0-padded to 5 digits
ostrstream(buf,sizeof(buf)) << setw(5) << setfill('0') << val << ends;
value = buf;
}                       	// constructor
//----------------------------------------------------------------------------
// destructor
StateNode::~StateNode()
{
}                       	// destructor
//----------------------------------------------------------------------------
// assignment operator
StateNode& StateNode::operator = (const StateNode& other)
{
if(this != &other)
{
	varname = other.varname;
	op = other.op;
	value = other.value;
}
return(*this);
}                    		// assignment operator
//----------------------------------------------------------------------------
// equality
BOOL StateNode::operator == (const StateNode& other) const
{
	return(&other == this);
}                          	//operator ==
//----------------------------------------------------------------------------
// less than: alphabetical, based on varname
BOOL StateNode::operator < (const StateNode& other) const
{
	return(varname < other.varname);
}                         	//operator <
//----------------------------------------------------------------------------
// output to a stream
// the [] bracket delimiters are easy to read in manual file editing,
// have an advantage in the opening and closing symbols being different,
// and only rarely appear in incoming text. Quotes, which are common,
// may be useful to test for, so they must be allowed to become embedded,
// and thus can't be the delims.
// #error make s.op a string?: "EQ", etc.: much easier to read during manual file editing.
// you will have to translate it back to numeric on input.
// maybe get rid of the commas & just use spaces instead.
ostream& operator << (ostream& os, StateNode& s)
{
	os << bstring(s.varname) << "," <<  s.op << "," << bstring(s.value);
	return(os);
}                         	//operator << write
//----------------------------------------------------------------------------
// read from a stream.
istream& operator >> (istream& is, StateNode& s)
{
	is.ignore(MAXINT,'[');			// discard all chars to opening bracket.
									// (also skips extra spacer lines between rules)
	getline(is,s.varname,']');		// variable name(read to closing bracket)
	is.ignore(MAXINT,',');			// discard all chars up to comma
	is >> s.op;						// operator
	is.ignore(MAXINT,'[');			// discard all chars up to opening bracket.
	getline(is,s.value,']');		// value
	return(is);
}                  			//operator >>
//----------------------------------------------------------------------------
// op is the only member that it makes sense to mutate.
// the idea here is to make op values of existing (so presumably useful) rules
// drift rather than jump wildly so that the new operator is just a logical step
// away from the old one, thus usually continuing to encompass the relation
// that made it useful in the first place.
int StateNode::mutate()
{
switch(op)
{
	case LT:
		op = (random(2) ? LE : NE);
		break;
	case LE:
		switch(random(3))
		{
			case 0: op = LT; break;
			case 1: op = EQ; break;
			case 2: op = NE; break;
		}
		break;
	case EQ:
		switch(random(4))
		{
			case 0:					// 50% chance of CONTAINS, generalizes keywords
			case 1: op = CONTAINS; break;
			case 2: op = LE; break;
			case 3: op = GE; break;
		}
		break;
	case GE:
		switch(random(3))
		{
			case 0: op = GT; break;
			case 1: op = EQ; break;
			case 2: op = NE; break;
		}
		break;
	case GT:
		op = (random(2) ? GE : NE);
		break;
	case NE:
		op = (random(2) ? GT : LT);
		break;
	case CONTAINS:
		op = (random(2) ? EQ : CONTAINS);
		break;
	// #error needs more thought on logical values.  Should it mutate at all?
	case DNCONTAIN:
		op = (random(2) ? NE : CONTAINS);
		break;
	default:
		logerror("Unhandled operator in StateNode::mutate()");
		break;
}
return(1);
}                         	//mutate
//----------------------------------------------------------------------------
// 						end class StateNode
//////////////////////////////////////////////////////////////////////////////
// class StateArray
//----------------------------------------------------------------------------
// constructor
StateArray::StateArray(int upper)
	: TArrayAsVector<StateNode>(upper,0,upper)
{
// auto-load from disk.  You must already be logged into the correct dir.
ifstream("states.dat") >> *this;
}                       	// constructor
//----------------------------------------------------------------------------
// copy constructor
StateArray::StateArray(const StateArray& other)
	: TArrayAsVector<StateNode>(other.GetItemsInContainer(),0,100)
{
	*this = other;
}							// copy constructor
//----------------------------------------------------------------------------
// destructor
StateArray::~StateArray()
{
// auto-write to disk.  You must already be logged into the correct dir.
ofstream(backupfile("states.dat").c_str()) << *this;
}                       	// destructor
//----------------------------------------------------------------------------
// assignment operator
// if Rule::IfNodes is changed to a StateArray, you can
// use this to easily create a new Rule from the existing State.
StateArray& StateArray::operator = (const StateArray& other)
{
if(this != &other)
{
	Flush();
	for(int i = 0 ; i < other.GetItemsInContainer() ; i++)
		Add(other[i]);
}
return(*this);
}                    		// assignment operator
//----------------------------------------------------------------------------
// output to a stream
ostream& operator << (ostream& os, const StateArray& s)
{
int count = s.GetItemsInContainer();
os << count << endl;
for(int i = 0 ; i < count ; i++)
	os << s[i] << endl;
return(os);
}                     		//operator <<
//----------------------------------------------------------------------------
// read from a stream
// be sure to set ALL members, even those not read from the stream
istream& operator >> (istream& is, StateArray& s)
{
s.Flush();			  	// probably not necessary.  unsure if desirable.
int count = 0;
StateNode node;
is >> count;
for(int i = 0 ; is.good() && (i < count) ; i++)
{
	is >> node;
	s.Add(node);      	// replaces any existing node for the same varname
}
return(is);
}                  			// operator >>
//----------------------------------------------------------------------------
// add a node, first deleting node (if any, there should be just one!) that
// hold a previous value for the same variable.
int StateArray::Add(const StateNode& s)
{
for(int i = 0 ; i < GetItemsInContainer() ; )
	if((*this)[i].varname == s.varname)
	{
		Destroy(i);
		break;			// assume there's only 1
	}
	else
		i++;
return(TArrayAsVector<StateNode>::Add(s));
}                      		//Add
//----------------------------------------------------------------------------
// returns value of the given variable, or an empty string if not found.
string StateArray::GetValue(const string& varname)
{
for(int i = 0 ; i < GetItemsInContainer() ; i++)
	if((*this)[i].varname == varname)
		return((*this)[i].value);
return("");
}                    		//GetValue
//----------------------------------------------------------------------------
// delete node, if any, holding varname.
// returns value, if any, that it had before removal.
string StateArray::Remove(const string& varname)
{
string s;
for(int i = 0 ; i < GetItemsInContainer() ; i++)
	if((*this)[i].varname == varname)
	{
		s =(*this)[i].value;
		Destroy(i);
		break;
	}
return(s);
}                    		//Remove
//----------------------------------------------------------------------------
// 						end class StateArray
//////////////////////////////////////////////////////////////////////////////
// class Rule
//----------------------------------------------------------------------------
// static members
ulong Rule::LastID = 0;
//--------------------------------------------------------------------------
// constructor.
// When loading from disk, this constructor is used, then >> overwrites some members.
Rule::Rule(int newbid)
	: IfNodes(20,0,20), Actions(5,0,5)		// you must fill these AFTER construction
{
init();
bid = newbid;
id = ++LastID;
}          					// constructor
//--------------------------------------------------------------------------
// initialization common to all ctors, and can also be used elsewhere.
void Rule::init()
{
wincount = 0;
isabidder = FALSE;
wasawinner = FALSE;
source = 0;
locked = FALSE;
}          					// init
//--------------------------------------------------------------------------
#if 0
// 5/18/2006 These 2 mating constructors appear to be vestiges of the Classify.cpp
// versions, not yet revised for this program, but kept in case they prove usable.
// crossover mating constructor, mate 2 existing Rules to create the new one
// this is a sort of peculiar type of copy constructor that requires 2 sources
// if used, this must apply a similar method to IfNodes and/or Actions arrays.
//      stimulus response
//      -------- --------
// e.g. #00 #000# 00000000
// and  #1# 1111# 11111111, cut at 3 would produce children:
//      -----------------
//      #00 1111# 11111111
// or   #1# #000# 00000000. (in book, both offspring are produced. see complex.doc)
Rule::Rule(Rule* first, Rule* second, int newbid, int mutationrate)
	: IfNodes(20,0,20), Actions(5,0,5)		// you must fill these AFTER construction
{
init();
bid = newbid;
if(!first || !second)           // one or both pointers is null
{
	if(!first && !second)       		// both null.  random values at least
	{									// guarantee a valid rule
	}
	else	// only one is null.  use the valid one as the only parent
	{
		Rule* parent = (first ? first : second);
	}
}
else 									// both parents valid
{
	// eliminate bias that can be introduced by the Classifier selection process:
	// first is likely to be one of the strongest rules, since it
	// won an "open to all" lottery.  if first was the only strong rule in a field of
	// very weak ones, second may only have been chosen because first couldn't
	// mate with itself.  Since the starting sequence always comes from
	// first, it is more likely to have come from the stronger rule,
	// when in fact its trailing sequence may have been the reason for its strength.
	// The rules should have an equal chance of providing either portion.
	if(random(2))
		swap(first,second);

	// create the crossover.
	// copy first (cutpoint) nodes from first, the rest of nodes from second
	int cutpoint = random(15)+1;		// 1-15, at least one char from each parent
}
if(!random(mutationrate)) 				// introduce a random mutation
	mutate();
}                  			//mating constructor (crossover)
#endif	// 0
//--------------------------------------------------------------------------
#if 0
// mating constructor, mate 2 Rules to create the new one
// this is a sort of peculiar type of copy constructor that requires 2 sources
// new version, selects each position from one parent or the other
// #error bring over method from WADAPT.CPP for different length sources,
// and use for IfNodes(?) and Actions(definitely).
// You must define what is a null or segment-end marker IfNode.
// for Actions, use an empty string(NO), or a very unusual one ("ZZZZ", or whatever).
Rule::Rule(Rule* first, Rule* second, int newbid, int mutationrate)
	: IfNodes(20,0,20), Actions(5,0,5)		// you must fill these AFTER construction
{
init();
bid = newbid;
if(!first || !second)           // one or both pointers is null
{
	if(!first && !second)       		// both null.  random values at least
	{									// guarantee a valid rule
	}
	else	// only one is null.  use the valid one as the only parent
	{
		Rule* parent = (first ? first : second);
	}
}
else									// both parents valid
{
}
if(!random(mutationrate)) 				// introduce a random mutation
	mutate();
} 						// mating constructor (normal, mine)
#endif	// 0
//--------------------------------------------------------------------------
Rule::Rule(const Rule& other)
	: IfNodes(20,0,20), Actions(5,0,5)		// you must fill these AFTER construction
{
*this = other;
}							// copy constructor
//--------------------------------------------------------------------------
// assignment
Rule& Rule::operator = (const Rule& other)
{
if(this != &other)
{
	bid = other.bid;
	isabidder = other.isabidder;
	wasawinner = other.wasawinner;
	source = other.source;
	locked = other.locked;
	wincount = other.wincount;
	id = other.id;				// you must duplicate the id, not give it a new one

	// now copy the arrays
	IfNodes.Flush();
	Actions.Flush();
	for(int i = 0 ; i < other.IfNodes.GetItemsInContainer() ; i++)
		IfNodes.Add(other.IfNodes[i]);
	for(i = 0 ; i < other.Actions.GetItemsInContainer() ; i++)
		Actions.Add(other.Actions[i]);
}
return(*this);
}    						//operator =
//--------------------------------------------------------------------------
// You MUST use &other == this, not compare members.  See RuleSet::Kill().
// The exact rule at the given ptr must be Destroyed, not just an identical one.
// Otherwise, you might leave wild ptrs lying around.
BOOL Rule::operator == (const Rule& other) const
{
return(&other == this);
}                    		//operator ==
//--------------------------------------------------------------------------
// for sorted container or comparison by bid.
// The only use for a sorted container is for a routine that finds the highest bid rules.
// (RuleSet.Array must not be sorted)
BOOL Rule::operator < (const Rule& other) const
{
return(bid < other.bid);
}                      		//operator <
//----------------------------------------------------------------------------
// output a rule to any ostream.
ostream& operator << (ostream& os, const Rule& r)
{
os << r.id << " " << r.bid << " " << r.wincount << " " << r.locked << endl;
int count = r.IfNodes.GetItemsInContainer();
os << count << endl;
for(int i = 0 ; i < count ; i++)
	os << r.IfNodes[i] << endl;
count = r.Actions.GetItemsInContainer();
os << count << endl;
for(i = 0 ; i < count ; i++)
	os << bstring(r.Actions[i]) << endl;	// brackets emphasize intentional blank lines
return(os);
}                			//operator << write
//-----------------------------------------------------------------------
// read a rule from an istream.
istream& operator >> (istream& is, Rule& r)
{
r.init();
r.IfNodes.Flush();		// you DO want to completely replace any existing contents,
r.Actions.Flush();		// NOT append.

int count = 0;
StateNode node;
string s;
is 	>> r.id				// overwrites the id it was given on construction
	>> r.bid >> r.wincount >> r.locked;
is >> count;
for(int i = 0 ; is.good() && (i < count) ; i++)
{
	is >> node;
	r.IfNodes.Add(node);
}
count = 0;
is >> count;
for(i = 0 ; is.good() && (i < count) ; i++)
{
	is.ignore(MAXINT,'[');		// discard all chars to opening bracket.
	getline(is,s,']');			// variable name(read to closing bracket)
	r.Actions.Add(s);
}
Rule::LastID = max(Rule::LastID,r.id);		// ? ensure no id is ever below the top
return(is);
}                     	//operator >> read
//--------------------------------------------------------------------------
// alternative to operator ==, actually compares some of the members.
// they're the same if their Ifs and Actions are the same. (order not significant)
BOOL Rule::SameAs(const Rule& other) const
{
int ifcount = IfNodes.GetItemsInContainer();
int actioncount = Actions.GetItemsInContainer();

// first test: counts must be the same
if( (ifcount != other.IfNodes.GetItemsInContainer()) ||
	(actioncount != other.Actions.GetItemsInContainer()) )
		return(FALSE);

// now compare nodes.  any non-matching node returns FALSE
for(int i = 0 ; i < ifcount ; i++)
	if(other.IfNodes.Find(IfNodes[i]) == INT_MAX)
		return(FALSE);

for(i = 0 ; i < actioncount ; i++)
	if(other.Actions.Find(Actions[i]) == INT_MAX)
		return(FALSE);

return(TRUE);
}                      		// SameAs
//----------------------------------------------------------------------------
// measure of similarity.  not yet written
double Rule::Similarity(const Rule& other)
{
logerror("Rule::Similarity(const Rule& other) does nothing.");
if(SameAs(other))
	return(1);
return(0);
}                  			//Similarity
//--------------------------------------------------------------------------
int Rule::mutate()
{
int ifcount = IfNodes.GetItemsInContainer();
int acount = Actions.GetItemsInContainer();
if(!ifcount && !acount)						// nothing to mutate
	return 0;
int which;					// which to mutate (0=Actions, 1=IfNodes)
if(!ifcount)
	which = 0;
else
	if(!acount)
		which = 1;
	else
		// IfNodes are more plentiful than Actions.
		// to mutate an IfNode 2/3 of the time, just change 2 to 3.
		which = random(2);
if(which)       			// mutate IfNode(s)
{
	// you can't add a node because a Rule has no knowledge of State variables

	if(random(2))                        		// mutate a node's op
		IfNodes[random(ifcount)].mutate();
	else                                    	// delete a random # of IfNodes
	{
		// most new rules will be created from State, and have far too many IfNodes.
		// so delete a random # of randomly selected ones.
		for(int delcount = random(ifcount + 1); delcount > 0 ; delcount--)
			IfNodes.Destroy(random(IfNodes.GetItemsInContainer()));
	}
}
else						// mutate Action(s)
{
	// if rules are usually created by mating, interleaving action lists by the
	// wadapt.cpp method, action lists will tend to be long,
	// so delete a random # of randomly selected ones.
	for(int delcount = random(acount + 1); delcount > 0 ; delcount--)
		Actions.Destroy(random(Actions.GetItemsInContainer()));

	// you could also reorder actions: I can't rule it out, but it seems very
	// unlikely to achieve anything useful.

	// insert end-of-segment marker node?
}
return(1);
}   						//mutate
//--------------------------------------------------------------------------
// pay source 1/denominator of its bid value, OR the most that source can
// accept without overflowing.  Note that this payment is always positive (or 0).
// Negative feedback only propagates backward into the system by
// "end of chain" rules eventually going broke and reducing or stopping payments.
// because of integer math, payment is zero when bid is too low.
// that's why denominator should remain an int, not a double, fraction.
void Rule::paysource(int denominator)
{
// can't pay sensor, and no point in paying self
if(!source || (source == this))
	return;

// how much of proposed payment can source accept without overflowing?
// 					proposed			max acceptable
int payment = min(bid / denominator, MAXINT - source->bid);

source->bid += payment;     // pay our "supplier"
bid -= payment;				// subtract it from our wallet

// #error you could zero out source here, since it's served its purpose.
}                     		//paysource
//--------------------------------------------------------------------------
// receive positive or negative feedback payment, OR whatever portion of it
// that it can accept without overflowing or going negative.
// this is NOT related to paysource.  it's for receiving SUCCESSREWARD or
// other feedback payments from the *environment*.
void Rule::AdjustBid(int amount)
{
if(amount >= 0)
	bid += min(amount,MAXINT - bid);
else
	// 1 is the minimum for a locked (permanent) rule; others can fall to 0.
	bid = max(bid - amount, (locked ? 1 : 0));
}                     		// AdjustBid
//--------------------------------------------------------------------------
// receive positive or negative feedback %.
// this is NOT related to paysource.  it's for receiving SUCCESSREWARD or
// other feedback payments from the *environment*.
// factor is used directly as a multiplier: for +10% payment, use 1.1.
void Rule::AdjustBid2(double factor)
{
double newbid = bid * factor;
bid = min(newbid,(double)MAXINT);
// 1 is the minimum for a locked (permanent) rule; others can fall to 0.
bid = max(bid, (locked ? 1 : 0));
}                     		// AdjustBid
//--------------------------------------------------------------------------
// price support: bring bid up to at least given amount
void Rule::SupportBid(int amount)
{
bid = max(bid,amount);
}                     		//SupportBid
//--------------------------------------------------------------------------
// 						end class rule
//////////////////////////////////////////////////////////////////////////////
// Code for class RuleSet
//----------------------------------------------------------------------------
// constructor for normal use.
RuleSet::RuleSet(uint startcount, string filename) : Array(startcount,0,1000)
{
Load(filename);
}							//constructor
//--------------------------------------------------------------------------
// copy constructor
RuleSet::RuleSet(const RuleSet& other) : Array(other.Array.GetItemsInContainer(),0,1000)
{
writeonexit = other.writeonexit;

// copy the other's rules
int count = other.Array.GetItemsInContainer();	// how many rules to copy
for(int i = 0 ; i < count ; i++)				// copy all the other's rules
	Add(*(other.Array[i]));                     // this is RuleSet::Add
}                 			//copy constructor
//--------------------------------------------------------------------------
// destructor
RuleSet::~RuleSet()
{
if(writeonexit)
	ofstream(backupfile(sourcefile).c_str()) << *this;	// auto-write to disk
}							//destructor
//--------------------------------------------------------------------------
// write to a stream
ostream& operator << (ostream& os, const RuleSet& rs)
{
os << Rule::LastID << endl; 	// highest rule # used is saved with the SET

// write the individual rules.  No count written.  1 file = 1 ruleset.
uint count = rs.Array.GetItemsInContainer();
os << count << endl;					// count is used for corrupt file test.
for(uint i = 0 ; i < count ; i++)
	os << *(rs.Array[i]) << endl;		// this will put a blank line between rules
return(os);
}                     		//operator << write
//--------------------------------------------------------------------------
// read RuleSet from a stream
istream& operator >> (istream& is, RuleSet& rs)
{
uint targetcount = 0;			// number of rules there should be

is >> Rule::LastID;				// highest rule # ever used is saved with the SET
is >> targetcount;

rs.Array.Flush(TShouldDelete::Delete);	// empty it before filling
rs.writeonexit = TRUE;					// reset
Rule r;
while(is >> r)					// read until file is used up
	rs.Add(r);                  // this is RuleSet::Add, not the array Add

// RULES.DAT is extremely important, since it now contains much of the program's function.
// you could make this test < rather than != , to allow for manual additions,
// but it could allow some errors to slip by.
uint count = rs.GetItemsInContainer();
if(!count || (count != targetcount))	// an empty ruleset is also a fatal error
{
	ostrstream os;
	os 	<< "Data file not found, empty, or corrupt.  Will not be written on exit." << endl;
	os	<< "Expected rule count = " << targetcount << ".  Actual count = " << count << endl;
	os	<< "If you recently made manual changes, update the count manually" << endl
		<< "on 2nd line of file, and rerun.  Last rule read was: " << endl;
	if(count)
		os << *(rs[count - 1]) << endl;
	else
		os << "None." << endl;
	string s = os.str();
	delete [] os.str();
	logerror(s);
	rs.writeonexit = FALSE;
}
return(is);						// it will always return as failed
}              			 		// operator >> read
//--------------------------------------------------------------------------
uint RuleSet::Load(const string& filename)
{
Array.Flush();
sourcefile = "";
writeonexit = FALSE;			// its final value is determined in operator >>
if(filename.length())
{
	FileArray FileList;			// makes filename fully qualified in case dir changes
	FileList.AddFile(filename);
	sourcefile = FileList.GetNext();
	ifstream(sourcefile.c_str()) >> *this;
}
return Array.GetItemsInContainer();
}         						//Load
//--------------------------------------------------------------------------
// previously only tested "successfully loaded?"; now tests for subsequent corruption.
BOOL RuleSet::IsOK()
{
if(!Array.GetItemsInContainer())	// in case accidentally emptied by manual editing
	writeonexit = FALSE;
return(writeonexit);
}                           	//IsOK
//--------------------------------------------------------------------------
// return highest bid any Rule has.  maybe useful for scaling screen display or other.
int RuleSet::HighestBid()
{
int highbid = 0;
for(int i = 0 ; i < Array.GetItemsInContainer() ; i++)
	highbid = max(highbid,Array[i]->bid);
return(highbid);
}               			//HighestBid
//--------------------------------------------------------------------------
// get pointer to rule with this id, or 0.
// primary use is to find whether a rule with the given id already exists.
Rule* RuleSet::Find(ulong ruleid)
{
for(uint i = 0 ; i < Array.GetItemsInContainer() ; i++)
	if(Array[i]->id == ruleid)
		return(Array[i]);
return(0);
}               			//Find
//--------------------------------------------------------------------------
// add a newed copy of the provided Rule to Array.  Because the actual new rule is newed here,
// the parameter r should be an auto variable, so it vanishes.
// Returns TRUE if rule was successfully added, FALSE if not.
// But since Array is expandable, the only reason it can fail is if out of memory,
// which will throw an exception and end pgm anyway.
BOOL RuleSet::Add(const Rule& r) 			// the rule to copy
{
if(Array.GetItemsInContainer() >= 16000)	// keep below 16k absolute limit
	KillWeakest(1);
Rule* newrule = new Rule(r);

// this makes operator >> slow, but auto-corrects any errors from manual file-editing.
if(Find(newrule->id) != 0)			// enforce unique rule id:
{
	ulong oldid = newrule->id;
	newrule->id = ++Rule::LastID;	// if it's a dup, give it the next available #.
	logerror("(RuleSet::Add) Duplicate RuleID# " + tostring(oldid) +
				" was renumbered to " + tostring(newrule->id));
}
if(Array.Add(newrule))				// add a newed COPY of r
	return(TRUE);
return(FALSE);
}               			//Add
//--------------------------------------------------------------------------
#if 0
// 5/18/2006 I believe these fns were also directly from Classify.cpp.
// select 2 parents by lottery and add their child to Array.
// strong rules are most often selected, but a weak rule has a chance.
// It can either select the parents from all the rules, or give preference to
// bidders.  If there aren't enough bidders, it will use any parent.  Theory
// is that it's pointless to create rules that aren't immediately useful.
// returns TRUE if a new child was created, FALSE if not
BOOL RuleSet::Addchild(BOOL preferbidders)
{
// #error you can't do this
while(Array.GetItemsInContainer() < 2)	// you must have at least 2 to mate
	Add(Rule());

BOOL firstabidder = FALSE;
BOOL secondabidder = FALSE;

if(preferbidders)
{
	int biddercount = CountBidders();
	firstabidder = (biddercount > 0);	// if any bidder, it can be first parent
	secondabidder = (biddercount > 1);  // if another, it can be the second
}

Rule *first = Lottery(firstabidder);  	// select the pair of rules to mate
Rule *second = first;
while(second == first)              	// prevent first mating with itself
	second = Lottery(secondabidder);

// constructor does the mating and possible mutation
Rule newrule(first,second,INITIALBID,MUTATIONRATE);

return(Add(newrule));						// it should always succeed
}                       //Addchild
//----------------------------------------------------------------------------
// make rule to respond to the exact current situation in State.
// And whose response is preferably based on the responses of
// existing rules also responsive to State: i.e. bidders!
BOOL RuleSet::AddCustom()
{
Rule r;
int N = State.GetItemsInContainer();
for(int i = 0 ; i < N ; i++)
{
	StateNode s(State[i]);						// make a COPY
	// alter op, if desired (it's currently EQ)
	// or, to create a bunch of similar rules, add s multiple times with diff ops
	s.op = EQ;
	r.IfNodes.Add(s);

#error  do Actions: use method from Addchild. copy its code AND comments, and mod.
	// build response, from the 2 most similar rules in the set, as
	// parents, or the 1 most similar, or generated randomly as the last resort.

}
return(Add(Rule()));			// it should always succeed
}                       //AddCustom
#endif // 0
//--------------------------------------------------------------------------
// Destroy weakest non-locked rule(s) to make room for new one(s).
// returns number deleted
int RuleSet::KillWeakest(int dcount) 		// number of rules to eliminate
{
int deleted = 0;
for(int i = 0 ; i < dcount ; i++)
{
	int rulecount = Array.GetItemsInContainer();
	if(!rulecount)								// nothing there to delete
		return(deleted);

	Rule* weakest = 0;   				// starts undefined
	for(int j = 0 ; j < rulecount ; j++)
	{
		// start with first NON-LOCKED element as weakest
		if(weakest == 0)
		{
			if(!Array[j]->locked && !IsASource(Array[j]))
				weakest = Array[j];
			continue;
		}
		// only eligible for deletion if it's weakest AND unlocked AND not another rule's source
		if(!Array[j]->locked && (Array[j]->bid < weakest->bid) && !IsASource(Array[j]))
			weakest = Array[j];	   		// simultaneously mark it for deletion.
	}
	// we found the first (oldest) non-locked rule in Array with the lowest bid.
	// it had the longest to accumulate wealth, and didn't.
	if(weakest)
	{
		Array.Destroy(weakest);     	// kill the rule
		deleted++;
	}
}
return(deleted);
}							//KillWeakest
//--------------------------------------------------------------------------
// delete all non-locked, nonwinner, nonsource rules with bid < minimum.
// a winner is preserved so it gets one last pass as "previous winner",
// maybe getting its big break for survival through passback it receives.
BOOL RuleSet::Sweep(int minbid)
{
for(int i = 0 ; i < Array.GetItemsInContainer() ; )
	if(!Array[i]->locked && !Array[i]->wasawinner && !IsASource(Array[i]) &&
		(Array[i]->bid < minbid) )
			Array.Destroy(i);	// kill it, and i gets reused for the next element,
	else
		i++;				// else go to next
return(TRUE);
}               			//Sweep
//--------------------------------------------------------------------------
// count the number of currently-marked bidders.
int RuleSet::CountBidders()
{
int bidders = 0;
int N = Array.GetItemsInContainer();
for(int i = 0 ; i < N ; i++)
	if(Array[i]->isabidder)
		bidders++;
return(bidders);
}                      		//CountBidders
//--------------------------------------------------------------------------
// compute the total of all the rules's bid fields.  Used by Lottery().
// total won't exceed LONG_MAX: 32767 rules * 32767 each's bid is still less.
long RuleSet::SumBids(BOOL biddersonly) // TRUE=bidders only, FALSE=all rules eligible
{
long bidtotal = 0L;
int N = Array.GetItemsInContainer();
for(int i = 0 ; i < N ; i++)
	if(Array[i]->isabidder || !biddersonly)
		bidtotal += (long)(Array[i]->bid);
return(max(bidtotal,0L));			// just in case, prevent returning negative
}							//SumBids
//--------------------------------------------------------------------------
// lottery: create a pool into which each rule puts "bid" number of entries,
// so its chance of being chosen is proportional to its strength.
// only determines the lottery winner, so result can be used however needed.
// returns pointer to the rule that won the lottery, or 0 if no bidders
Rule* RuleSet::Lottery(BOOL biddersonly)   // TRUE=bidding, FALSE=mating
{
long bidtotal = SumBids(biddersonly);	// sum of all ->bid fields
if(bidtotal <= 0L)						// no point in a lottery if no bidders OR no rules
	return(0);							// (and a < 0 test to avoid catastrophe)
long r = lrandom(bidtotal);				// random number from 0 to bidtotal-1

// run through rules to find which one "exhausts" the random number
// (it's more likely to happen while subtracting a large bid)
int N = Array.GetItemsInContainer();
for(int i = 0 ; i < N ; i++)
{
	if(Array[i]->bid <= 0)        	  	// convenient place for test
		logerror("ERROR in RuleSet::Lottery: A bid <= 0");

	// if it's a bidder OR if we're allowing all rules
	if(Array[i]->isabidder || !biddersonly)
	{
		r -= (long)(Array[i]->bid);    	// subtract the bid
		if(r < 0L)						// did it exhaust the random number?
		{
			// a possibility, but what if lottery was for mating only?
// 			Array[i]->wasawinner = TRUE;
			return(Array[i]);			// return pointer to winner
		}
	}
}
if(!biddersonly)  			// an all-rules lottery shouldn't fall through to here
	logerror("ERROR: No RuleSet::Lottery winner.");
return(0);
}               		   	//Lottery
//--------------------------------------------------------------------------
// pay all rules with wasawinner marked true.
// if passback denominator is provided, rules also pay their sources now.
// (not a good idea for general use).
// returns number of winners paid
int RuleSet::PayAllWinners(int amount, int denominator)
{
int winners = 0;
int N = GetItemsInContainer();
for(int i = 0 ; i < N ; i++)
	if(Array[i]->wasawinner)
	{
		Array[i]->AdjustBid(amount);

		// paying after receiving reward means source also gets its share of reward now.
		// which speeds passback up by 1 pass.

		if(denominator)
			Array[i]->paysource(denominator);
		winners++;
	}
return(winners);
}                  			// PayAllWinners
//--------------------------------------------------------------------------
// pay (by a percentage) all rules with wasawinner marked true.
// if passback denominator is provided, rules also pay their sources now.
// (not a good idea for general use).
// returns number of winners paid
int RuleSet::PayAllWinners2(double factor, int denominator)
{
int winners = 0;
int N = GetItemsInContainer();
for(int i = 0 ; i < N ; i++)
	if(Array[i]->wasawinner)
	{
		Array[i]->AdjustBid2(factor);

		// paying after receiving reward means source also gets its share of reward now.
		// which speeds passback up by 1 pass.

		if(denominator)
			Array[i]->paysource(denominator);
		winners++;
	}
return(winners);
}                  			// PayAllWinners
//--------------------------------------------------------------------------
// pay using price support
// returns number of winners paid
int RuleSet::SupportAllWinners(int amount)
{
int winners = 0;
int N = GetItemsInContainer();
for(int i = 0 ; i < N ; i++)
	if(Array[i]->wasawinner)
	{
		Array[i]->bid = max(Array[i]->bid,amount);
		winners++;
	}
return(winners);
}                  			// SupportAllWinners
//--------------------------------------------------------------------------
// returns pointer to the Rule that won the last lottery, or 0 if none found.
// It returns the first found, so you must know that there is only one winner.
Rule* RuleSet::Winner()
{
int N = GetItemsInContainer();
for(int i = 0 ; i < N ; i++)
	if(Array[i]->wasawinner)
		return(Array[i]);
return(0);
}                  			//Winner
//--------------------------------------------------------------------------
// returns number of bidders marked.
// a rule only becomes a bidder if ALL of its IfNodes conditions are satisfied.
// if the value of a needed varname can't be determined, the node is not satisfied.
int RuleSet::MarkBidders()
{
int bidders = 0;
int N = GetItemsInContainer();
for(int i = 0 ; i < N ; i++)			// for each rule
{
	// if the rule is already a bidder on entry, just leave it as a bidder.
	// You can call MarkBidders multiple times to accumulate bidders before the lottery.
	if(!Array[i]->isabidder)
	{
		// it stays TRUE if IdNodes is empty, meaning ALWAYS a bidder.
		Array[i]->isabidder = TRUE;  		// any unmet node turns it FALSE
		for(int j = 0 ; j < Array[i]->IfNodes.GetItemsInContainer() ; j++)	// for each node
		{
			// look up the CURRENT value of the variable that this IfNode tests
			string currval = State.GetValue(Array[i]->IfNodes[j].varname);

			// get the comparison value that this IfNode tests FOR.
			// it can be a constant OR the name of another VARIABLE.
			// (currently, rules testing against another variable can only be created manually)
			string testval = Array[i]->IfNodes[j].value;

			// try looking it up in State.  It seems unlikely that any state variable
			// will have the same name as any constant value.  So, IF it's listed there,
			// assume that it WAS a variable name, and get THAT variable's value.
			// This may become less safe if FACTs ever get translated into StateNodes.
			string temp = State.GetValue(testval);
			if(temp.length())
				testval = temp;
			if(currval.length())	// state of varname was found, so do comparison test
			{
				// see if the varname's value that was found
				// meets the condition specified by the IfNode's clause.
				switch(Array[i]->IfNodes[j].op)
				{
					case 1:										// LT
						if(!(currval < testval))
							Array[i]->isabidder = FALSE;
						break;
					case 2:										// LE
						if(!(currval <= testval))
							Array[i]->isabidder = FALSE;
						break;
					case 3:										// EQ
						if(!(currval == testval))
							Array[i]->isabidder = FALSE;
						break;
					case 4:										// GE
						if(!(currval >= testval))
							Array[i]->isabidder = FALSE;
						break;
					case 5:										// GT
						if(!(currval > testval))
							Array[i]->isabidder = FALSE;
						break;
					case 6:										// NE
						if(currval == testval)
							Array[i]->isabidder = FALSE;
						break;
					case 7:										// CONTAINS
						if(!currval.contains(testval))
							Array[i]->isabidder = FALSE;
						break;
					case 8:										// DNCONTAIN
						if(currval.contains(testval))
							Array[i]->isabidder = FALSE;
						break;
					default:
						logerror("Unhandled op inRuleSet::MarkBidders()");
						break;
				}               // switch
			}
			else	// varname not found, IfNode automatically FALSE, Rule can't be a bidder
				Array[i]->isabidder = FALSE;
			if(!Array[i]->isabidder)		// if node false, rule can't be a bidder,
				break;						// so quit trying nodes, and go to next Rule.
		}									// for(each ifnode)
	}         								// if(!Array[i]->isabidder)
	if(Array[i]->isabidder)                 // whether it already was, or was just set.
		bidders++;
}						// for(each rule)
return(bidders);
}                    			// MarkBidders
//--------------------------------------------------------------------------
// is r a source for any other rule?
// this is useful in determining whether a rule can be deleted.
BOOL RuleSet::IsASource(const Rule* r)
{
int N = GetItemsInContainer();
for(int i = 0 ; i < N ; i++)
	if(Array[i]->source == r)
		return(TRUE);
return(FALSE);
}             			// IsASource
//--------------------------------------------------------------------------
// reset flags whose corresponding parameters are TRUE.
// no default values.  you must specify all parameters on each call.
void RuleSet::ZeroFlags(BOOL waw, BOOL iab, BOOL src)
{
int N = GetItemsInContainer();
for(int i = 0 ; i < N ; i++)
{
	if(waw) Array[i]->wasawinner = FALSE;
	if(iab) Array[i]->isabidder = FALSE;
	if(src) Array[i]->source = 0;
}
}             				//ZeroFlags
//--------------------------------------------------------------------------
// 						end class RuleSet
////////////////////////////////////////////////////////////////////////////

 

 

Valid HTML 4.01 Transitional Valid CSS
View content labeling at ICRA.
Copyright ©2008 Steven Whitney. Last modified 10/07/2008.