After I build my Huffman tree, the root's weight is 700k even though I've read through 5 MB of data

// Huffman Tree.cpp

#include "stdafx.h"
#include <iostream>
#include <string>//Necessary to do any string comparisons
#include <fstream>
#include <iomanip>
#include <cstdlib>//for exit() function

using namespace std;

class BinaryTree{

private:
    struct treenode{
        char data;
        int weight;     
        treenode *LChild;
        treenode *RChild;
    };
    treenode * root;
    int freq[256];
    treenode* leaves[256];
    string path[256];
    string longestpath;
    void BuildHuffmanStrings(treenode *p, string path);

public:
    void InitializeFromFile(string FileName);
    void EncodeFile(string InFile, string OutFile);
    void DecodeFile(string InFile, string OutFile);


BinaryTree()
{
    for(int i=0;i<256;i++){
        freq[i]=0;
        leaves[i] = new treenode;
    }
    root=NULL;
}
};//Class end

    /*Takes supplied filename and builds Huffman tree, table of encoding strings, etc.
    Should print number of bytes read.*/
void BinaryTree::InitializeFromFile(string Filename){
    int CHAR_RANGE = 256;
    ifstream inFile;
    inFile.open(Filename.c_str(), fstream::binary);
    if(inFile.fail()){
        cout<<"Error in opening file "<<Filename;
        return;
    }
    char c;
    inFile.get(c);
    int bytesread = 0;
    while(!inFile.eof()){
        bytesread++;
        freq[(int)c] ++;
        inFile.get(c);
    }
    for(int i=0;i<CHAR_RANGE;i++){//makes a leafnode for each char
        leaves[i]->weight=freq[i];
        leaves[i]->data=(char)i;
    }
    int wheremin1, wheremin2, min1, min2;
    /*Builds the Huffman Tree by finding the first two minimum values and makes a parent
    node linking to both*/
    for(int k=0;k<256;k++){
        wheremin1=0; wheremin2=0;
        min1 = INT_MAX; min2 = INT_MAX;
        //Finding the smallest values to make the branches/tree
        for(int i=0;i<CHAR_RANGE;i++){
            if(leaves[i] && freq[i]<min1){
                min1=leaves[i]->weight; wheremin1=i;
            }
        }for(int i=0;i<CHAR_RANGE;i++){
            if(leaves[i] && freq[i]<min2 && i!=wheremin1){
                min2=leaves[i]->weight; wheremin2=i;
            }
        }
        if(leaves[wheremin1] && leaves[wheremin2]){
            treenode* p= new treenode;
            p->LChild=leaves[wheremin1]; p->RChild=leaves[wheremin2];//Setting p to point at the two min nodes
            p->weight=min1 + min2;
            leaves[wheremin2]=NULL;
            leaves[wheremin1]=p;
            root=p;
        }
    }//end for(build tree)
    cout<<" Bytes read: "<<bytesread;
    cout<<" Weight of the root: "<<root->weight;
}

/*Takes supplied file names and encodes the InFile, placing the result in OutFile. Also
checks to make sure InitializeFromFile ran properly. Prints in/out byte counts. Also 
computes the size of the encoded file as a % of the original.*/
void BinaryTree::EncodeFile(string InFile, string OutFile){

}

/*Takes supplied file names and decodes the InFile, placing the result in OutFile. Also
checks to make sure InitializeFromFile ran properly. Prints in/out byte counts.*/
void BinaryTree::DecodeFile(string InFile, string OutFile){

}

int main(array<System::String ^> ^args){
    BinaryTree BT;
    BT.InitializeFromFile(filename);
    return 0;
}

So my bytesread variable is around 5 million, but my root's weight is 0 by the end of all this code.

If you can't figure it out (I'm going to be spending at least another hour looking for the bug before bed), could you give me some tips for improving efficiency?

Edit: The problem was if(freq[i]<min1). The comparison should be between leaves[i]->weight and min1, because leaves[] is the array I'm actually manipulating to build the tree (freq[] only holds the original counts, not treenode pointers). To fix it, I changed that line and the if statement after it to: if(leaves[i] && leaves[i]->weight<=min1) and if(leaves[i] && (leaves[i]->weight)<=min2 && i!=wheremin1)

If you have more suggestions for cleaning up my code (i.e., more comments in certain places, different ways to compare, etc.), please share them. I'm not a great coder, but I'd like to be, and I'm working towards writing good code.

Edit2: I posted the new/fixed code. My root's weight is now equal to bytesread. I'm still open to suggestions to clean up this code.


A few things I could find:

if(freq[i]<min1){

should be

if(freq[i]<=min1){

as you can't say for sure that all your frequencies will be less than INT_MAX. Similarly:

if(freq[i]<min2 && i!=wheremin1){

should be:

if(freq[i]<=min2 && i!=wheremin1){

as min1 and min2 can be equal too.

Once you start combining nodes, you handle removing the two merged nodes and inserting the new combined node by updating the leaves array. But you are not updating the freq array, which needs to change as well so that the frequencies of the removed nodes do not take part in later comparisons.
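
One way to get the same effect without touching freq[] at all is to compare the weights stored in the live nodes, which is what the asker's edit ended up doing. Here is a minimal sketch of that merge loop. BuildTree, MakeLeaf, and the named constant CHAR_RANGE are hypothetical names for illustration, not the asker's code, and the single-pass search for both minima is a variation on the original two-loop search.

```cpp
#include <cassert>
#include <climits>   // INT_MAX
#include <cstddef>   // NULL

const int CHAR_RANGE = 256;  // hypothetical named constant: one slot per byte value

struct treenode {
    char data;
    int weight;
    treenode *LChild;
    treenode *RChild;
};

// Hypothetical helper: allocates a leaf with both child pointers cleared.
treenode *MakeLeaf(char data, int weight) {
    assert(weight >= 0);
    treenode *n = new treenode;
    n->data = data;
    n->weight = weight;
    n->LChild = NULL;
    n->RChild = NULL;
    return n;
}

// Repeatedly merges the two lightest live entries of nodes[] until one is left.
// Returns that last node (the root), or NULL if nodes[] had no live entries.
treenode *BuildTree(treenode *nodes[CHAR_RANGE]) {
    for (;;) {
        int wheremin1 = -1, wheremin2 = -1;
        int min1 = INT_MAX, min2 = INT_MAX;
        for (int i = 0; i < CHAR_RANGE; i++) {
            if (!nodes[i]) continue;       // slot already merged away
            int w = nodes[i]->weight;      // node weight, not freq[i]
            if (w <= min1) {               // new smallest: old smallest becomes second
                min2 = min1; wheremin2 = wheremin1;
                min1 = w;    wheremin1 = i;
            } else if (w <= min2) {        // new second smallest
                min2 = w;    wheremin2 = i;
            }
        }
        if (wheremin1 < 0) return NULL;               // nothing to build from
        if (wheremin2 < 0) return nodes[wheremin1];   // one node left: the root

        treenode *p = MakeLeaf(0, min1 + min2);       // parent of the two minima
        p->LChild = nodes[wheremin1];
        p->RChild = nodes[wheremin2];
        nodes[wheremin1] = p;      // parent takes over one slot
        nodes[wheremin2] = NULL;   // the other slot drops out of the search
    }
}
```

With leaves for the input "aabc" (a=2, b=1, c=1) the returned root has weight 4, the number of bytes read. The sketch never frees its nodes, so real code would still need a destructor.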


A couple hints:

1) Write a function "DumpState()" that produces output (to cout) looking roughly like this:

 ============START==================
 freq[0] = <some number>
 freq[1] = <some number>
 ...
 freq[255] = <some number>
 leaves[0] = null
 leaves[1] = { data = 'B', weight = 3 }
 ...
 leaves[255] = null
 ============= END ================

Call this function before your main loop, after one iteration, after two iterations, etc.

2) Create an input file that's really, really simple. Something like:

aabc

Run your program, and save the log file (created with 1 above). Work through what should be happening before the first loop, in the first loop, etc. Compare that with your log file, of what actually is happening. You might want to print some other variables too (min1, min2, wheremin1, wheremin2).
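
A rough sketch of what such a DumpState() could look like is below. It is written as a free function that takes the arrays and an output stream as parameters, which is an assumption made so it can be tried outside the class; inside BinaryTree it would read the members directly and write to cout. The count parameter is also hypothetical and just lets you dump a shorter range.

```cpp
#include <cassert>
#include <cstddef>   // NULL
#include <iostream>
#include <sstream>   // std::ostringstream, handy for capturing the log
#include <string>

struct treenode {
    char data;
    int weight;
    treenode *LChild;
    treenode *RChild;
};

// Writes the first `count` entries of freq[] and leaves[] to `out`
// in the log format suggested above.
void DumpState(std::ostream &out, const int freq[],
               treenode *const leaves[], int count) {
    assert(count >= 0);
    out << "============START==================\n";
    for (int i = 0; i < count; i++)
        out << "freq[" << i << "] = " << freq[i] << "\n";
    for (int i = 0; i < count; i++) {
        out << "leaves[" << i << "] = ";
        if (leaves[i] == NULL)
            out << "null\n";
        else
            out << "{ data = '" << leaves[i]->data
                << "', weight = " << leaves[i]->weight << " }\n";
    }
    out << "============= END ================\n";
}
```

Calling DumpState(std::cout, freq, leaves, 256) before the loop and again at the end of each iteration gives you the log to compare against your hand-worked "aabc" example. Non-printable byte values will look odd inside the quotes, so you may want to print the index or the numeric value as well.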


I don't have a solution yet, but I have a few comments. This is quite a long piece of code and, to be honest, a little clumsy. I would suggest refactoring your code into proper methods. (Many times, problems just get solved while refactoring!)

For example, the following lines in BinaryTree::InitializeFromFile()

for(int i=0;i<256;i++){
    freq[i]=0;
    leaves[i] = new treenode;
}

may be more appropriate in the BinaryTree constructor. Also, BinaryTree contains both of the following:

treenode * root;
treenode * leaves[256];

Can you add a comment saying what each one is for? The magic number 256 appears in multiple places. Can you use a suitably named constant for it?
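
As a rough illustration of those suggestions, here is a sketch with a named constant, commented members, and all initialization in the constructor. BinaryTreeSketch, CountBytes, Frequency, BytesRead, and HasTree are hypothetical names, not part of the original class. The comments on root and leaves are my reading of how the posted code uses them. The cast to unsigned char is an extra precaution I added: where char is signed, bytes of 128 and above would otherwise index freq[] with a negative value.

```cpp
#include <cassert>
#include <cstddef>   // NULL
#include <istream>
#include <sstream>   // std::istringstream, handy for feeding test input
#include <string>

class BinaryTreeSketch {
public:
    static const int CHAR_RANGE = 256;  // one slot per possible byte value

    BinaryTreeSketch() : root(NULL), bytesread(0) {
        for (int i = 0; i < CHAR_RANGE; i++) {
            freq[i] = 0;
            leaves[i] = NULL;
        }
    }

    // Counts every byte available from `in`.
    void CountBytes(std::istream &in) {
        char c;
        while (in.get(c)) {  // loop ends cleanly at end of file
            freq[static_cast<unsigned char>(c)]++;
            bytesread++;
        }
    }

    int Frequency(int symbol) const {
        assert(symbol >= 0 && symbol < CHAR_RANGE);
        return freq[symbol];
    }
    long BytesRead() const { return bytesread; }
    bool HasTree() const { return root != NULL; }

private:
    struct treenode {
        char data;
        int weight;
        treenode *LChild;
        treenode *RChild;
    };
    treenode *root;                // top of the finished tree; NULL until it is built
    treenode *leaves[CHAR_RANGE];  // work list of subtrees still waiting to be merged
    int freq[CHAR_RANGE];          // how often each byte value occurred in the input
    long bytesread;                // total number of bytes counted
};
```

Taking a std::istream instead of a file name is a design choice for the sketch: the same function then works on an ifstream opened in binary mode and on an in-memory stream for small test inputs.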
