Build a huffman tree in scheme
I'm suffering with this problem for a few days now. How can you build a tree with the da开发者_运维知识库ta as specified on the following site:
http://www.impulseadventure.com/photo/jpeg-huffman-coding.html, under the topic:
The actual DHT in the JPEG file
I'll reexplain it shortly here so,
You have :
- a table with lengths (bytesvector)
- a table with data (bytesvector as well)
Now I want to build a binary tree with these two arguments. Filled everytime from left to right with the data for the corresponding length. The deeper you go into the tree the longer your lengths. Lengths vary from 1 - 16. Take a look at the site and it should become clear.
Now I want to make such a tree in Scheme/Racket so I can walk to the tree and build a table for each encoded value.
The tree I have in my mind would look like:
'((x01 x02)((x03 (x11 x04))(((x00 ...)(...)))))
That was fun!
Okay, I really hope that wasn't homework.
It turns out that there's a very simple recursive solution. What you want at each level is to take a list of trees, gather them by pairs into one-deeper trees, then append the new leaves at this level. This could be written using 'foldr', but I thought it would be a bit less clear.
I should clarify the input a bit; on the page you mention, the specifications look like
leaves at level 0 :
leaves at level 1 :
leaves at level 2 : x23, x42, x23
leaves at level 3 : x24, x23
This would correspond to the input
'(() () (x23 x42 x23) (x24 x23))
to the program below.
Also, the only thing going on here is the mapping of this table to the binary tree, which will be of help only when decoding. For encoding, this binary tree will be useless.
Finally, a big shout-out to How To Design Programs; I followed the design recipe carefully, dotting all my i's and crossing all my t's. Test cases first, please!
Cheers!
John Clements
#lang racket
(require rackunit)
;; a tree is either
;; a symbol, or
;; (list tree tree)
;; a specification is
;; (listof (listof symbol))
;; spec->tree : specification -> tree
;; run spec->treelist, ensure that it's a list of length 1, return it.
(define (spec->tree spec)
(match (spec->treelist spec)
[(list tree) tree]
[other (error 'spec->tree "multiple trees produced")]))
;; spec->treelist : specification -> (listof tree)
;; given a *legal* specification, produce
;; the corresponding tree. ONLY WORKS FOR LEGAL SPECIFICATIONS...
(define (spec->treelist spec)
(cond [(empty? spec) empty]
[else (append (first spec) (gather-pairs (spec->treelist (rest spec))))]))
;; go "up one level" by grouping each pair of trees into one tree.
;; The length of the list must be a number divisible by two.
(define (gather-pairs trees)
(match trees
[(list) empty]
[(list-rest a b remaining) (cons (list a b) (gather-pairs remaining))]
[other (error 'gather "improperly formed specification")]))
;; TEST CASES
(check-equal? (gather-pairs '(a b c d)) '((a b) (c d)))
(check-equal? (spec->treelist '((top))) '(top))
(check-equal? (spec->treelist '(() (two-a two-b))) '((two-a two-b)))
(check-equal? (spec->treelist '(() (two-a) (three-a three-b)))
'((two-a (three-a three-b))))
(check-equal? (spec->treelist '(() () (three-a three-b three-c) (four-a four-b)))
'(((three-a three-b) (three-c (four-a four-b)))))
(check-equal? (spec->tree '(() () (three-a three-b three-c) (four-a four-b)))
'((three-a three-b) (three-c (four-a four-b))))
First count every symbol then sort the resulting list then make a node out of the first 2 entries in the sorted listed and delete them out of the list. Continue until your list is empty. Building a tree is quite simple: If you have all symbols and frequency you can group 2 symbols to a node and make the left value the number of the left frequency and the right number the number of the left + right frequency. This also called a nested set or a Celko-Tree.
#lang r6rs
(library
(huffman-table)
(export make-table find)
(import (rnrs base (6))
(rnrs io simple)
(only (racket base) bytes bytes-length bytes-ref make-hash hash-set! hash-ref do)
(rnrs mutable-pairs (6)))
(define (make-node left right)
(list left right))
(define (left node)
(car node))
(define (right node)
(cadr node))
(define (left! node left)
(set-car! node left)
left)
(define (right! node right)
(set-car! (cdr node) right)
right)
(define (node? object)
(eq? (car object) 'node))
(define (make-leaf value)
(list 'leaf value))
(define (value leaf)
(cadr leaf))
(define (leaf? object)
(eq? (car object) 'leaf))
(define (generate-pairs lengths data)
(define length (bytes-length lengths))
(let out-loop ((l-idx 0)
(d-idx 0)
(res '()))
(if (= l-idx length)
(reverse res)
(let in-loop
((t 0)
(amt (bytes-ref lengths l-idx))
(temp-res '()))
(if (= t amt)
(out-loop (+ l-idx 1)(+ d-idx (bytes-ref lengths l-idx))(cons temp-res res))
(in-loop (+ t 1) amt (cons (bytes-ref data (+ d-idx t)) temp-res)))))))
(define (add-nodes node-lst)
(let loop ((added-nodes '())
(node-lst node-lst))
(cond ((null? node-lst) (reverse added-nodes))
(else (let ((node (car node-lst))
(left-child (make-node '() '()))
(right-child (make-node '() '())))
(if (null? (left node))
(begin (left! node left-child)
(right! node right-child)
(loop (cons right-child (cons left-child added-nodes))
(cdr node-lst)))
(begin (right! node right-child)
(loop (cons right-child added-nodes)
(cdr node-lst)))))))))
(define (label-nodes! node-lst values)
(let loop ((node-lst node-lst)
(values values))
(cond ((null? values) node-lst)
((null? (cdr values))(if (null? (left (car node-lst)))
(left! (car node-lst) (car values))
(right! (car node-lst) (car values)))
node-lst)
(else (if (null? (left (car node-lst)))
(begin (left! (car node-lst) (car values))
(right! (car node-lst) (cadr values))
(loop (cdr node-lst)(cddr values)))
(begin (right! (car node-lst)(make-leaf (car values)))
(loop (cdr node-lst)(cdr values))))))))
(define (make-tree pairs)
(define root (make-node '() '()))
;(define curr-nodes (list root))
(let loop ((curr-nodes (list root))
(pairs pairs))
(cond
((null? pairs) root)
(else (loop (add-nodes (label-nodes! curr-nodes (car pairs)))
(cdr pairs))))))
(define (atom? el)
(not (pair? el)))
(define (add bit bitstr)
(if bitstr
(string-append (number->string bit) bitstr)
#f))
(define (code symbol tree)
(cond ((null? tree) #f)
((atom? tree) (if (= tree symbol)
""
#f))
(else (or (add 0 (code symbol (left tree)))
(add 1 (code symbol (right tree)))))))
(define (make-table lengths data)
(define pairs (generate-pairs lengths data))
(define tree (make-tree pairs))
(define table (make-hash))
(do ((i 0 (+ i 1)))
((= i (bytes-length data)) table)
(let ((val (bytes-ref data i)))
(hash-set! table (code val tree) val))))
(define (find table bitstring)
(hash-ref table bitstring #f))
)
精彩评论