After I build my Huffman tree, the root's weight is 700k even though I've read through 5 MB of data
// Huffman Tree.cpp
#include "stdafx.h"
#include <iostream>
#include <string>//Necessary to do any string comparisons
#include <fstream>
#include <iomanip>
#include <cstdlib>//for exit() function
#include <climits>//for INT_MAX
using namespace std;
class BinaryTree{
private:
struct treenode{
char data;
int weight;
treenode *LChild;
treenode *RChild;
};
treenode * root;
int freq[256];
treenode* leaves[256];
string path[256];
string longestpath;
void BuildHuffmanStrings(treenode *p, string path);
public:
void InitializeFromFile(string FileName);
void EncodeFile(string InFile, string OutFile);
void DecodeFile(string InFile, string OutFile);
BinaryTree()
{
for(int i=0;i<256;i++){
freq[i]=0;
leaves[i] = new treenode;
}
root=NULL;
}
};//Class end
/*Takes supplied filename and builds Huffman tree, table of encoding strings, etc.
Should print number of bytes read.*/
void BinaryTree::InitializeFromFile(string Filename){
int CHAR_RANGE = 256;
ifstream inFile;
inFile.open(Filename.c_str(), fstream::binary);
if(inFile.fail()){
cout<<"Error in opening file "<<Filename;
return;
}
char c;
inFile.get(c);
int bytesread = 0;
while(!inFile.eof()){
bytesread++;
freq[(unsigned char)c] ++;//unsigned so bytes >= 128 don't give a negative index
inFile.get(c);
}
for(int i=0;i<CHAR_RANGE;i++){//makes a leafnode for each char
leaves[i]->weight=freq[i];
leaves[i]->data=(char)i;
}
int wheremin1, wheremin2, min1, min2;
/*Builds the Huffman Tree by finding the first two minimum values and makes a parent
node linking to both*/
for(int k=0;k<256;k++){
wheremin1=0; wheremin2=0;
min1 = INT_MAX; min2 = INT_MAX;
//Finding the smallest values to make the branches/tree
for(int i=0;i<CHAR_RANGE;i++){
if(leaves[i] && freq[i]<min1){
min1=leaves[i]->weight; wheremin1=i;
}
}for(int i=0;i<CHAR_RANGE;i++){
if(leaves[i] && freq[i]<min2 && i!=wheremin1){
min2=leaves[i]->weight; wheremin2=i;
}
}
if(leaves[wheremin1] && leaves[wheremin2]){
treenode* p= new treenode;
p->LChild=leaves[wheremin1]; p->RChild=leaves[wheremin2];//Setting p to point at the two min nodes
p->weight=min1 + min2;
leaves[wheremin2]=NULL;
leaves[wheremin1]=p;
root=p;
}
}//end for(build tree)
cout<<" Bytes read: "<<bytesread;
cout<<" Weight of the root: "<<root->weight;
}
/*Takes supplied file names and encodes the InFile, placing the result in OutFile. Also
checks to make sure InitializeFromFile ran properly. Prints in/out byte counts. Also
computes the size of the encoded file as a % of the original.*/
void BinaryTree::EncodeFile(string InFile, string OutFile){
}
/*Takes supplied file names and decodes the InFile, placing the result in OutFile. Also
checks to make sure InitializeFromFile ran properly. Prints in/out byte counts.*/
void BinaryTree::DecodeFile(string InFile, string OutFile){
}
int main(array<System::String ^> ^args){
BinaryTree BT;
BT.InitializeFromFile(filename);
return 0;
}
So my bytesread variable is around 5 million bytes, but my root's weight is equal to 0 by the end of all this code.
If you can't figure it out (I'm going to spend at least another hour looking for the bug before bed), could you give me some tips for improving efficiency?
Edit: The problem was if(freq[i]<min1). The comparison against min1 should use leaves[i]->weight, because leaves[] is the array I'm actually manipulating to create the tree (freq[] just holds the weights, not treenode pointers). To fix it I changed that line and the if statement after it to if(leaves[i] && leaves[i]->weight<=min1) and if(leaves[i] && (leaves[i]->weight)<=min2 && i!=wheremin1).
If you have more suggestions for cleaning up my code (i.e. more comments in certain places, different ways to compare, etc.), please share them. I'm not a great coder, but I'd like to be, and I'm trying to work towards writing good code.
Edit2: I posted the new/fixed code. My root's weight is now equal to bytesread. I'm still open to suggestions to clean up this code.
A few things I could find:
if(freq[i]<min1){
should be
if(freq[i]<=min1){
as you can't say for sure that all your frequencies will be less than INT_MAX. Similarly:
if(freq[i]<min2 && i!=wheremin1){
should be:
if(freq[i]<=min2 && i!=wheremin1){
as min1 and min2 can be equal too.
Once you start combining nodes, you handle deleting the combined nodes and inserting the new parent node by changing the leaves array. But you are not changing the freq array, which needs to change as well so that the frequencies of the deleted nodes do not participate again.
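One way to avoid keeping two arrays in sync is to compare node weights directly, which is also what the asker's edit ends up doing. The following is only a minimal sketch under that assumption: Node, TwoSmallest and BuildTree are hypothetical names, not the question's code, and the selection is done in a single pass instead of two loops.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for the question's treenode.
struct Node {
    int weight;
    Node* left;
    Node* right;
};

// Returns the indices of the two lowest-weight non-null slots.
// An index of -1 means no such slot was found.
std::pair<int, int> TwoSmallest(Node* const slots[], int count) {
    int first = -1, second = -1;
    for (int i = 0; i < count; ++i) {
        if (!slots[i]) continue;  // slot was already merged away
        if (first == -1 || slots[i]->weight < slots[first]->weight) {
            second = first;       // old minimum becomes the runner-up
            first = i;
        } else if (second == -1 || slots[i]->weight < slots[second]->weight) {
            second = i;
        }
    }
    return std::make_pair(first, second);
}

// Merges the two lightest nodes until one node is left and returns it.
// Only node weights are consulted, so there is no second array to update.
Node* BuildTree(Node* slots[], int count) {
    assert(slots != nullptr);
    for (;;) {
        const std::pair<int, int> m = TwoSmallest(slots, count);
        if (m.first == -1) return nullptr;          // no nodes at all
        if (m.second == -1) return slots[m.first];  // one node left: the root
        Node* parent = new Node{
            slots[m.first]->weight + slots[m.second]->weight,
            slots[m.first], slots[m.second]};
        slots[m.first] = parent;    // parent takes the place of one child
        slots[m.second] = nullptr;  // the other slot is retired
    }
}
```

Because every merge keeps the total weight unchanged, the returned root's weight should equal the sum of the input weights, which is the check the asker uses (root weight equal to bytesread).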
A couple hints:
1) Write a function "DumpState()" that produces output (to cout) looking roughly like this:
============START==================
freq[0] = <some number>
freq[1] = <some number>
...
freq[255] = <some number>
leaves[0] = null
leaves[1] = { data = 'B', weight = 3 }
...
leaves[255] = null
============= END ================
Call this function before your main loop, after one iteration, after two iterations, etc.
2) Create an input file that's really, really simple. Something like:
aabc
Run your program, and save the log file (created with 1 above). Work through what should be happening before the first loop, in the first loop, etc. Compare that with your log file, of what actually is happening. You might want to print some other variables too (min1, min2, wheremin1, wheremin2).
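A rough sketch of what such a DumpState() could look like is below. It is an illustration, not the asker's code: Node mirrors the question's treenode, the function takes the arrays as parameters instead of being a class member, and DumpStateToString is a hypothetical helper. To keep the log short, it skips zero frequencies and null or zero-weight slots, which is a deviation from the full listing suggested above.

```cpp
#include <cassert>
#include <cctype>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the question's treenode.
struct Node {
    char data;
    int weight;
    Node* left;
    Node* right;
};

// Writes the non-zero frequencies and the live, non-zero-weight nodes.
void DumpState(std::ostream& out, const int freq[], Node* const leaves[],
               int count) {
    assert(freq != nullptr && leaves != nullptr);
    out << "============START==================\n";
    for (int i = 0; i < count; ++i) {
        if (freq[i] != 0) out << "freq[" << i << "] = " << freq[i] << '\n';
    }
    for (int i = 0; i < count; ++i) {
        const Node* n = leaves[i];
        if (!n || n->weight == 0) continue;  // skip empty or unused slots
        out << "leaves[" << i << "] = { ";
        if (!n->left && !n->right) {
            // True leaf: show the character, or its code if unprintable.
            const unsigned char ch = static_cast<unsigned char>(n->data);
            if (std::isprint(ch)) out << "data = '" << n->data << "', ";
            else out << "data = #" << static_cast<int>(ch) << ", ";
        } else {
            out << "internal, ";  // merged node: data is not meaningful
        }
        out << "weight = " << n->weight << " }\n";
    }
    out << "============= END ================\n";
}

// Convenience wrapper that returns the dump as a string.
std::string DumpStateToString(const int freq[], Node* const leaves[],
                              int count) {
    std::ostringstream out;
    DumpState(out, freq, leaves, count);
    return out.str();
}
```

Passing std::cout as the first argument gives the console log described in hint 1; passing a std::ofstream saves it to a file for the comparison in hint 2.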
I don't have a solution yet, but I have a few comments. This is quite a long piece of code and, to be honest, a little clumsy. I would suggest refactoring your code into proper methods. (Many times, problems just get solved while refactoring!)
For example, the following lines in BinaryTree::InitializeFromFile()
for(int i=0;i<256;i++){
freq[i]=0;
leaves[i] = new treenode;
}
may be more appropriate in the BinaryTree constructor. Also, BinaryTree contains both of the following:
treenode * root;
treenode * leaves[256];
Can you comment on what each one is for? The magic number 256 appears in multiple places. Can you use a suitably named constant for that?
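As an illustration of both suggestions (a named constant and a small, single-purpose function), the counting step could be pulled out roughly as follows. The names kCharRange and CountBytes are hypothetical. The sketch also casts each byte to unsigned char before indexing, on the assumption that plain char may be signed on the asker's platform, in which case bytes of 128 and above would otherwise index before the start of the array.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical named constant replacing the repeated literal 256.
const int kCharRange = 256;

// Counts how often each byte value occurs in the stream.
// Returns the total number of bytes read.
long CountBytes(std::istream& in, int freq[kCharRange]) {
    assert(freq != nullptr);
    for (int i = 0; i < kCharRange; ++i) freq[i] = 0;
    long bytesRead = 0;
    char c;
    // get() fails at end of input, so no separate eof() check is needed.
    while (in.get(c)) {
        ++freq[static_cast<unsigned char>(c)];  // index is always 0..255
        ++bytesRead;
    }
    return bytesRead;
}
```

Taking a std::istream rather than a file name means the same function works on an std::ifstream opened in binary mode and on an in-memory std::istringstream for small test inputs such as aabc.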