C++ Decision Tree Implementation Question: Think In Code
I've been coding for a few years but I still haven't gotten the hang of pseudo-coding or actually thinking things out in code yet. Due to this problem, I'm having trouble figuring out exactly what to do in creating a learning Decision Tree.
Here are a few sites that I have looked at and trust me there were plenty more
Decision Tree Tutorials
DMS Tutorials
Along with several books such as Ian Millington's AI for Games which includes a decent run-down of the different learning algorithms used in decision trees and Behavioral Mathematics for Game Programming which is basically all about Decision Trees and theory. I understand the concepts for a decision tree along with Entropy, ID3 and a little on how to intertwine a genetic algorithm and have a decision tree decide the nodes for the GA. They give good insight, but not what I am looking for really.
I do have some basic code that creates the nodes for the decision tree, and I believe I know how to implement actual logic but it's no use if I don't have a purpose to the program or have entropy or a learning algorithm involved.
What I am asking is, can someone help me figure out what I need to do to create this learning decision tree. I have my nodes in a class of their own flowing through functions to create the tree, but how would I put entropy into this and should it have a class, a struct I'm not sure how to put it together. Pseudo-code and an Idea of where I am going with all this theory and numbers. I can put the code together if only I knew what I needed to code. Any guidance would be appreciated.
How Would I Go About This, Basically.
Adding a learning algorithm such as ID3 and Entropy. How should it be set up?
Once I do have this figured out on how to go about all this,I plan to implement this into a state machine that goes through different states in a game/simulation format. All of this is already set up, I just figure this could be stand-alone and once I figure it out I can just move it to the other project.
Here is the source code I have for now.
Thanks in advance!
Main.cpp:
int main()
{
//create the new decision tree object
DecisionTree* NewTree = new DecisionTree();
//add root node the very first 'Question' or decision to be made
//is monster health greater than player health?
NewTree->CreateRootNode(1);
//add nodes depending on decisions
//2nd decision to be made
//is monster strength greater than player strength?
NewTree->AddNode1(1, 2);
//3rd decision
//is the monster closer than home base?
NewTree->AddNode2(1, 3);
//depending on the weights of all three decisions, will return certain node result
//results!
//Run, Attack,
NewTree->AddNode1(2, 4);
NewTree->AddNode2(2, 5);
NewTree->AddNode1(3, 开发者_开发技巧6);
NewTree->AddNode2(3, 7);
//Others: Run to Base ++ Strength, Surrender Monster/Player,
//needs to be made recursive, that way when strength++ it affects decisions second time around DT
//display information after creating all the nodes
//display the entire tree, i want to make it look like the actual diagram!
NewTree->Output();
//ask/answer question decision making process
NewTree->Query();
cout << "Decision Made. Press Any Key To Quit." << endl;
//pause quit, oh wait how did you do that again...look it up and put here
//release memory!
delete NewTree;
//return random value
//return 1;
}
Decision Tree.h:
//the decision tree class
class DecisionTree
{
public:
//functions
void RemoveNode(TreeNodes* node);
void DisplayTree(TreeNodes* CurrentNode);
void Output();
void Query();
void QueryTree(TreeNodes* rootNode);
void AddNode1(int ExistingNodeID, int NewNodeID);
void AddNode2(int ExistingNodeID, int NewNodeID);
void CreateRootNode(int NodeID);
void MakeDecision(TreeNodes* node);
bool SearchAddNode1(TreeNodes* CurrentNode, int ExistingNodeID, int NewNodeID);
bool SearchAddNode2(TreeNodes* CurrentNode, int ExistingNodeID, int NewNodeID);
TreeNodes* m_RootNode;
DecisionTree();
virtual ~DecisionTree();
};
Decisions.cpp:
int random(int upperLimit);
//for random variables that will effect decisions/node values/weights
int random(int upperLimit)
{
int randNum = rand() % upperLimit;
return randNum;
}
//constructor
//Step 1!
DecisionTree::DecisionTree()
{
//set root node to null on tree creation
//beginning of tree creation
m_RootNode = NULL;
}
//destructor
//Final Step in a sense
DecisionTree::~DecisionTree()
{
RemoveNode(m_RootNode);
}
//Step 2!
void DecisionTree::CreateRootNode(int NodeID)
{
//create root node with specific ID
// In MO, you may want to use thestatic creation of IDs like with entities. depends on how many nodes you plan to have
//or have instantaneously created nodes/changing nodes
m_RootNode = new TreeNodes(NodeID);
}
//Step 5.1!~
void DecisionTree::AddNode1(int ExistingNodeID, int NewNodeID)
{
//check to make sure you have a root node. can't add another node without a root node
if(m_RootNode == NULL)
{
cout << "ERROR - No Root Node";
return;
}
if(SearchAddNode1(m_RootNode, ExistingNodeID, NewNodeID))
{
cout << "Added Node Type1 With ID " << NewNodeID << " onto Branch Level " << ExistingNodeID << endl;
}
else
{
//check
cout << "Node: " << ExistingNodeID << " Not Found.";
}
}
//Step 6.1!~ search and add new node to current node
bool DecisionTree::SearchAddNode1(TreeNodes *CurrentNode, int ExistingNodeID, int NewNodeID)
{
//if there is a node
if(CurrentNode->m_NodeID == ExistingNodeID)
{
//create the node
if(CurrentNode->NewBranch1 == NULL)
{
CurrentNode->NewBranch1 = new TreeNodes(NewNodeID);
}
else
{
CurrentNode->NewBranch1 = new TreeNodes(NewNodeID);
}
return true;
}
else
{
//try branch if it exists
//for a third, add another one of these too!
if(CurrentNode->NewBranch1 != NULL)
{
if(SearchAddNode1(CurrentNode->NewBranch1, ExistingNodeID, NewNodeID))
{
return true;
}
else
{
//try second branch if it exists
if(CurrentNode->NewBranch2 != NULL)
{
return(SearchAddNode2(CurrentNode->NewBranch2, ExistingNodeID, NewNodeID));
}
else
{
return false;
}
}
}
return false;
}
}
//Step 5.2!~ does same thing as node 1. if you wanted to have more decisions,
//create a node 3 which would be the same as this maybe with small differences
void DecisionTree::AddNode2(int ExistingNodeID, int NewNodeID)
{
if(m_RootNode == NULL)
{
cout << "ERROR - No Root Node";
}
if(SearchAddNode2(m_RootNode, ExistingNodeID, NewNodeID))
{
cout << "Added Node Type2 With ID " << NewNodeID << " onto Branch Level " << ExistingNodeID << endl;
}
else
{
cout << "Node: " << ExistingNodeID << " Not Found.";
}
}
//Step 6.2!~ search and add new node to current node
//as stated earlier, make one for 3rd node if there was meant to be one
bool DecisionTree::SearchAddNode2(TreeNodes *CurrentNode, int ExistingNodeID, int NewNodeID)
{
if(CurrentNode->m_NodeID == ExistingNodeID)
{
//create the node
if(CurrentNode->NewBranch2 == NULL)
{
CurrentNode->NewBranch2 = new TreeNodes(NewNodeID);
}
else
{
CurrentNode->NewBranch2 = new TreeNodes(NewNodeID);
}
return true;
}
else
{
//try branch if it exists
if(CurrentNode->NewBranch1 != NULL)
{
if(SearchAddNode2(CurrentNode->NewBranch1, ExistingNodeID, NewNodeID))
{
return true;
}
else
{
//try second branch if it exists
if(CurrentNode->NewBranch2 != NULL)
{
return(SearchAddNode2(CurrentNode->NewBranch2, ExistingNodeID, NewNodeID));
}
else
{
return false;
}
}
}
return false;
}
}
//Step 11
void DecisionTree::QueryTree(TreeNodes* CurrentNode)
{
if(CurrentNode->NewBranch1 == NULL)
{
//if both branches are null, tree is at a decision outcome state
if(CurrentNode->NewBranch2 == NULL)
{
//output decision 'question'
///////////////////////////////////////////////////////////////////////////////////////
}
else
{
cout << "Missing Branch 1";
}
return;
}
if(CurrentNode->NewBranch2 == NULL)
{
cout << "Missing Branch 2";
return;
}
//otherwise test decisions at current node
MakeDecision(CurrentNode);
}
//Step 10
void DecisionTree::Query()
{
QueryTree(m_RootNode);
}
////////////////////////////////////////////////////////////
//debate decisions create new function for decision logic
// cout << node->stringforquestion;
//Step 12
void DecisionTree::MakeDecision(TreeNodes *node)
{
//should I declare variables here or inside of decisions.h
int PHealth;
int MHealth;
int PStrength;
int MStrength;
int DistanceFBase;
int DistanceFMonster;
////sets random!
srand(time(NULL));
//randomly create the numbers for health, strength and distance for each variable
PHealth = random(60);
MHealth = random(60);
PStrength = random(50);
MStrength = random(50);
DistanceFBase = random(75);
DistanceFMonster = random(75);
//the decision to be made string example: Player health: Monster Health: player health is lower/higher
cout << "Player Health: " << PHealth << endl;
cout << "Monster Health: " << MHealth << endl;
cout << "Player Strength: " << PStrength << endl;
cout << "Monster Strength: " << MStrength << endl;
cout << "Distance Player is From Base: " << DistanceFBase << endl;
cout << "Disntace Player is From Monster: " << DistanceFMonster << endl;
//MH > PH
//MH < PH
//PS > MS
//PS > MS
//DB > DM
//DB < DM
//good place to break off into different decision nodes, not just 'binary'
//if statement lower/higher query respective branch
if(PHealth > MHealth)
{
}
else
{
}
//re-do question for next branch. Player strength: Monster strength: Player strength is lower/higher
//if statement lower/higher query respective branch
if(PStrength > MStrength)
{
}
else
{
}
//recursive question for next branch. Player distance from base/monster.
if(DistanceFBase > DistanceFMonster)
{
}
else
{
}
//DECISION WOULD BE MADE
//if statement?
// inside query output decision?
//cout << <<
//QueryTree(node->NewBranch2);
//MakeDecision(node);
}
//Step.....8ish?
void DecisionTree::Output()
{
//take repsective node
DisplayTree(m_RootNode);
}
//Step 9
void DecisionTree::DisplayTree(TreeNodes* CurrentNode)
{
//if it doesn't exist, don't display of course
if(CurrentNode == NULL)
{
return;
}
//////////////////////////////////////////////////////////////////////////////////////////////////
//need to make a string to display for each branch
cout << "Node ID " << CurrentNode->m_NodeID << "Decision Display: " << endl;
//go down branch 1
DisplayTree(CurrentNode->NewBranch1);
//go down branch 2
DisplayTree(CurrentNode->NewBranch2);
}
//Final step at least in this case. A way to Delete node from tree. Can't think of a way to use it yet but i know it's needed
void DecisionTree::RemoveNode(TreeNodes *node)
{
//could probably even make it to where you delete a specific node by using it's ID
if(node != NULL)
{
if(node->NewBranch1 != NULL)
{
RemoveNode(node->NewBranch1);
}
if(node->NewBranch2 != NULL)
{
RemoveNode(node->NewBranch2);
}
cout << "Deleting Node" << node->m_NodeID << endl;
//delete node from memory
delete node;
//reset node
node = NULL;
}
}
TreeNodes.h:
using namespace std;
//tree node class
class TreeNodes
{
public:
//tree node functions
TreeNodes(int nodeID/*, string QA*/);
TreeNodes();
virtual ~TreeNodes();
int m_NodeID;
TreeNodes* NewBranch1;
TreeNodes* NewBranch2;
};
TreeNodes.cpp:
//contrctor
TreeNodes::TreeNodes()
{
NewBranch1 = NULL;
NewBranch2 = NULL;
m_NodeID = 0;
}
//deconstructor
TreeNodes::~TreeNodes()
{ }
//Step 3! Also step 7 hah!
TreeNodes::TreeNodes(int nodeID/*, string NQA*/)
{
//create tree node with a specific node ID
m_NodeID = nodeID;
//reset nodes/make sure! that they are null. I wont have any funny business #s -_-
NewBranch1 = NULL;
NewBranch2 = NULL;
}
Please correct me if I'm wrong but judging from the images at http://dms.irb.hr/tutorial/tut_dtrees.php and http://www.decisiontrees.net/?q=node/21 the actual decision logic should go in the nodes, not in the tree. You could model that by having polymorphic nodes, one for each decision there is to be made. With a bit of change to the tree construction and a minor minor modification for the decision delegation your code should be fine.
Basically you need to break everything down into stages and then modularise each part of the algorithm that you are trying to implement.
You can do this in pseudo-code or in code itself using functions/classes and stubs.
Each part of the algorithm you can then code up in some function and even test that function before adding it all together. You will basically end up with various functions or classes that are used for specific purposes in the algorithm implementation. So in your case for tree construction you will have one function that calculates entropy at a node, another that partitions the data into subsets at each node, etc.
I am talking in the general case here rather than specifically with respect to decision tree construction. Check out the book on Machine Learning by Mitchell if you need specific information on decision tree algorithms and the steps involved.
The pseudocode to implement a decision tree is as follows
createdecisiontree(data, attributes)
Select the attribute a with the highest information gain
for each value v of the attribute a
Create subset of data where data.a.val==v ( call it data2)
Remove the attribute a from the attribute list resulting in attribute_list2
Call CreateDecisionTree(data2, attribute_list2)
You will have to store nodes at each level with some code like
decisiontree[attr][val]=new_node
精彩评论