首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >哈夫曼编码和数据压缩Java

哈夫曼编码和数据压缩Java

原创
作者头像
成都小展
修改2020-07-21 17:57:00
5040
修改2020-07-21 17:57:00
举报
文章被收录于专栏:CS-Data StructureCS-Data Structure

哈夫曼的实现和代码:

Huffman Node 的类定义:

class HFMTreeNode {
    int freq; //权值
    String s; //节点数据

    HFMTreeNode parent;
    HFMTreeNode left;
    HFMTreeNode  right;

    public  HFMTreeNode(int freq, String s){  //构造函数
        this.freq = freq;
        this.s = s;
    }
}

Huffman Tree 的类定义:

map: Map, 储存每个string及其对应的频率

encode_map: Map, 储存每个string及其编码

pq: 优先队列,每次插入按照HFMTreeNode的频率,从小到大排列

public class HFMTree {
    private  String[] data;
    private HFMTreeNode root;

    //Use dictionary to store: string-frequency 
    private Map<String, Integer> map = new HashMap<>();
    
    //Use dictionary to store: string-encoded string
    private Map<String, String> encode_map = new HashMap<>();
    
    //Use priority queque to store: HFMTreeNode 
    private PriorityQueue<HFMTreeNode> pq = new PriorityQueue<>(
            (HFMTreeNode a, HFMTreeNode b) -> {
                if( a.freq < b.freq)
                    return -1;
                else
                    return 1;
            }
    );
    
    //构造函数
    public HFMTree(String[] data){
        this.data = data;
    }
    
    //类方法:
    public void count_freq();                                                       //第一步
    public void create_node_in_pq();                                                //第二步
    public HFMTreeNode build_Tree();                                                //第三步
    public void setEncode_map(HFMTreeNode root, String code);                       //第四步
    public String encoded(String[] data);                                           //第五步
    public String decoded(String encoded_string);                                   //第六步
    public String search_encoded_map(String value);                  //第六步的hepler function 
    
    //Spilt string into string array
    static public String[] String_to_StringArr(String string);
    public void tree_print(HFMTreeNode root); //traversal the tree 
    public void map_print(); //print the string-freq information
    public void encode_map_print(); //print the string-encoded string information  

}

具体实现:

第一步:计算每个string的权值(出现的频率)

第二步:构造优先队列

第三步:构造Huffman Tree

第四步:遍历Huffman Tree, 计算并保存每个string对应的编码

第五步:加密

第六步:解密

第一步:计算每个string的权值(出现的频率), 存储到map结构里面

//Step1: Count each string freq and put them in the map set
    public void count_freq(){
        for(int i = 0; i<data.length; i++){
            if(map.containsKey(data[i]) ){
                int freq = map.get(data[i]) + 1;
                map.put(data[i], freq);
            }else
                map.put(data[i], 1);
        }
    }

第二步:创建新的HFMTreeNode, 在优先队列里面添加每一个Huffman Node

//Step2 : Create each  tree node from map into pq
    public void create_node_in_pq(){
        for( String s: map.keySet()){
            HFMTreeNode hfmTreeNode = new HFMTreeNode(map.get(s), s);
            pq.add(hfmTreeNode);
        }
    }

第三步:每次去优先队列里面的前两个,组成一个新的Huffman Node, 添加到优先队列里面,再重复第三步,直到优先队列只剩下一个Huffman Node

//Step3: return the root node of Huffman tree
    public HFMTreeNode build_Tree(){
        if(pq.size() == 0)
            return null;
        if(pq.size() == 1)
            return pq.peek();

        while(pq.size() >= 2){
            HFMTreeNode first = pq.poll();
            HFMTreeNode second = pq.poll();

            int new_freq = first.freq + second.freq;
            HFMTreeNode new_node = new HFMTreeNode(new_freq, null);
            new_node.left = first;
            new_node.right = second;

            first.parent = new_node;
            second.parent = new_node;

            pq.offer(new_node);
        }

        //最后只剩一个,就是 root;
        this.root = pq.peek();
        return pq.peek();

    }

第四步:遍历Huffman Tree, 找到每个叶节点,存入每个string对应的编码

//Step4: encode the string and store them into map
    public void setEncode_map(HFMTreeNode root, String code){
        if(root ==null)
            return;

        if(root.left == null && root.right ==null){
            encode_map.put(root.s, code);
        }

        setEncode_map(root.left, code + "0");
        setEncode_map(root.right, code+"1");

    }
    

第五步:对输入的string(data)进行加密,并输出保存到新的文件

//Step5: encode the string 
    public String encoded(String[] data){
        String out = "";
        for(int i = 0;i<data.length;i++){
            String temp = data[i];
            out += encode_map.get(temp);
        }
        return out;
    }

第六步:对加密的文件进行解密,并输出保存到新的文件

//Step6: decode the encoded string
    public String decoded(String encoded_string){
        String de_code = "";
        String temp = "";
        
        for(int i  = 0;i<encoded_string.length();i++){
            temp += encoded_string.charAt(i);
            //只要在encode_map 里面有这个value,我们就添加对应的key
            if(encode_map.containsValue(temp)){
                de_code+= search_encoded_map(temp);
                temp = "";
            }
        }
        return de_code;
    }

    //Helper function:
    //Traversal encoded_map, return the key by searching the value
    //通过value来找key
    public String search_encoded_map(String value){
        for(String key: encode_map.keySet()){
            if(encode_map.get(key).equals(value))
                return key;
        }
        return null;
    }

各种信息的打印:

//BFS print the tree information
    public void tree_print(HFMTreeNode root){
        System.out.println("Tree Print: ");
        Queue<HFMTreeNode> queue= new LinkedList<>();
        if(root == null){
            return;
        }

        queue.add(root);
        while( ! queue.isEmpty()) {
            int size = queue.size();
            for(int i= 0; i<size;i++){

                HFMTreeNode temp = queue.poll();
                if(temp.left != null ) queue.add(temp.left);
                if(temp.right != null) queue.add(temp.right);

                System.out.print(temp.s + " "+ temp.freq+" ");
            }
            System.out.println();
        }
    }
    
    
    //print the map information
    public void map_print(){
        System.out.println("Map Print: ");
        for(String k : map.keySet())
            System.out.println( k + " " + map.get(k));
    }

    //print the encode map information
    public void encode_map_print(){
        System.out.println("Encode_map Print: ");
        for(String k : encode_map.keySet())
            System.out.println( k + " " + encode_map.get(k));
    }


    //print the pq information
    public void pq_print(){
        System.out.println("Pq Print: ");
        HFMTreeNode cur = pq.poll();
        while(! pq.isEmpty()){
            System.out.println(cur.s + " "+ cur.freq);
            cur = pq.poll();
        }
    }
    
    
    

把String 分成一个一个的,组成String[]

//Spilt string into string array
static public String[] String_to_StringArr(String string){
    char[] chars = string.toCharArray();
    String[] s = new String[chars.length];

    for(int i = 0; i< chars.length;i++){
        s[i] = String.valueOf(chars[i]);
    }

    return s;
}

后面会有ReadFile类来帮助我们读取和存储文件,我在这儿就不一一介绍啦~

完整代码

1. HFMTree类:

package 哈夫曼;

import java.util.*;

public class HFMTree {
    private  String[] data;
    private HFMTreeNode root;

    //using dictionary to store the freq of each string
    private Map<String, Integer> map = new HashMap<>();

    private Map<String, String> encode_map = new HashMap<>();

    //
    private PriorityQueue<HFMTreeNode> pq = new PriorityQueue<>(
            (HFMTreeNode a, HFMTreeNode b) -> {
                if( a.freq < b.freq)
                    return -1;
                else
                    return 1;
            }
    );


    public HFMTree(String[] data){
        this.data = data;
    }

    //-----------------------核心代码-----------------------
    //Step1: Count each string freq and put them in the map set
    public void count_freq(){
        for(int i = 0; i<data.length; i++){
            if(map.containsKey(data[i]) ){
                int freq = map.get(data[i]) + 1;
                map.put(data[i], freq);
            }else
                map.put(data[i], 1);
        }
    }


    //Step2 : Create each  tree node from map into pq
    public void create_node_in_pq(){
        for( String s: map.keySet()){
            HFMTreeNode hfmTreeNode = new HFMTreeNode(map.get(s), s);
            pq.add(hfmTreeNode);
        }
    }


    //Step3: return the root node of Huffman tree
    public HFMTreeNode build_Tree(){
        if(pq.size() == 0)
            return null;


        if(pq.size() == 1)
            return pq.peek();

        while(pq.size() >= 2){
            HFMTreeNode first = pq.poll();
            HFMTreeNode second = pq.poll();

            int new_freq = first.freq + second.freq;
            HFMTreeNode new_node = new HFMTreeNode(new_freq, null);
            new_node.left = first;
            new_node.right = second;

            first.parent = new_node;
            second.parent = new_node;

            pq.offer(new_node);
        }

        //最后只剩一个,就是 root;
        this.root = pq.peek();
        return pq.peek();

    }

    //Step4: encode the string and store them into map
    public void setEncode_map(HFMTreeNode root, String code){
        if(root ==null)
            return;

        if(root.left == null && root.right ==null){
            encode_map.put(root.s, code);
        }

        setEncode_map(root.left, code + "0");
        setEncode_map(root.right, code+"1");

    }

    //Step5: encode the string
    public String encoded(String[] data){
        String out = "";
        for(int i = 0;i<data.length;i++){
            String temp = data[i];
            out += encode_map.get(temp);
        }
        return out;
    }

    //Step6: decode the encoded string
    public String decoded(String encoded_string){
        String de_code = "";
        String temp = "";
        for(int i  = 0;i<encoded_string.length();i++){
            temp += encoded_string.charAt(i);
            //只要在encode_map 里面有这个value,我们就添加对应的key
            if(encode_map.containsValue(temp)){
                de_code+= search_encoded_map(temp);
                temp = "";
            }
        }
        return de_code;
    }
    
    //Helper function:
    //Traversal encoded_map, return the key by searching the value
    //通过value来找key
    public String search_encoded_map(String value){
        for(String key: encode_map.keySet()){
            if(encode_map.get(key).equals(value))
                return key;
        }
        return null;
    }


    //-----------------------核心代码-----------------------




    //BFS print the tree information
    public void tree_print(HFMTreeNode root){
        System.out.println("Tree Print: ");
        Queue<HFMTreeNode> queue= new LinkedList<>();
        if(root == null){
            return;
        }

        queue.add(root);
        while( ! queue.isEmpty()) {
            int size = queue.size();
            for(int i= 0; i<size;i++){

                HFMTreeNode temp = queue.poll();
                if(temp.left != null ) queue.add(temp.left);
                if(temp.right != null) queue.add(temp.right);

                System.out.print(temp.s + " "+ temp.freq+" ");
            }
            System.out.println();
        }
    }


    //print the map information
    public void map_print(){
        System.out.println("Map Print: ");
        for(String k : map.keySet())
            System.out.println( k + " " + map.get(k));
    }

    //print the encode map information
    public void encode_map_print(){
        System.out.println("Encode_map Print: " );
        for(String k : encode_map.keySet())
            System.out.println( k + " " + encode_map.get(k));
    }


    //print the pq information
    public void pq_print(){
        System.out.println("Pq Print: ");
        HFMTreeNode cur = pq.poll();
        while(! pq.isEmpty()){
            System.out.println(cur.s + " "+ cur.freq);
            cur = pq.poll();
        }
    }


    //Spilt string into string array
    static public String[] String_to_StringArr(String string){
        char[] chars = string.toCharArray();
        String[] s = new String[chars.length];

        for(int i = 0; i< chars.length;i++){
            s[i] = String.valueOf(chars[i]);
        }

        return s;
    }

    public static void main(String[] args) {
        ReadFile readFile = new ReadFile();

        //从in.txt 读取String,保存到read_string
        String read_string = readFile.readFileByArray("in.txt");
        //对数据进行预处理
        String[] data = String_to_StringArr(read_string);

        //创建对象
        HFMTree hfmTree = new HFMTree(data);
        //计算权值
        hfmTree.count_freq();
        //构造优先队列
        hfmTree.create_node_in_pq();
        //创建树
        HFMTreeNode root = hfmTree.build_Tree();

        //编码
//        hfmTree.tree_print(root);
        hfmTree.setEncode_map(root, "");
        //打印加密map
        hfmTree.encode_map_print();

        //加密
        String encoded_string = hfmTree.encoded(data);
        //把编译码输入到文件里面
        readFile.save(encoded_string, "encode_out.txt");


        //解码
        String decoded_string = hfmTree.decoded(encoded_string);
        //把解码输入到文件里面
        readFile.save(decoded_string, "decode_out.txt");

    }

}


class HFMTreeNode {
    int freq;
    String s;

    HFMTreeNode parent;
    HFMTreeNode left;
    HFMTreeNode  right;

    public  HFMTreeNode(int freq, String s){
        this.freq = freq;
        this.s = s;
    }
}

2. ReadFile类

package 哈夫曼;


//文件数据读写的基本模式java/c++/go
//    I/O模型

//1.读写一下文件,熟练,写,出错,然后改正,是个固定的模式
//2. 读取一个文本文件(统计每个字符的出现频率),输出每个字符的哈夫曼编码
//    读取一个文件,统计每个字节的出现频率,输出不同频率字节的哈夫曼编码


import java.io.FileInputStream;
import java.io.FileOutputStream;

public class ReadFile {


    /**
     *
     * @param fileName 要读取的文件名
     * @return  文件字符串(只能是文本文件)
     */

    public  String  readFileByArray(String fileName){
        try{
            //用输入流对象
            java.io.FileInputStream  fins=new FileInputStream(fileName);
            //得知道文件有多长,多少个字节
            int len=fins.available();
            //无论什么文件,在盘上,都是字节序列
            byte[] data=new byte[len];
            fins.read(data);//将文件中的字节,读进缓冲区
            //将字节数组,转码为文本
            String txt=new String(data);
//            System.out.println("读到的文件是 "+txt);
            return txt;
        }catch(Exception ef){
            ef.printStackTrace();//打印出错的信息
            System.out.println("读取文件出错");
        }
        return null;
    }


    //保存文件
    public boolean  save(String content,String fileName){
        try{
            //输出流
            FileOutputStream fous=new FileOutputStream(fileName);
            //将字符串转为字节数组
            byte[] data=content.getBytes();
            //写入
            fous.write(data);
            fous.close();//水管子用完了,要关
            return true;
        }catch(Exception ef){
            ef.printStackTrace();//输出错误
        }
        return false;

    }

}

原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。

如有侵权,请联系 cloudcommunity@tencent.com 删除。

原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。

如有侵权,请联系 cloudcommunity@tencent.com 删除。

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
目录
  • 哈夫曼的实现和代码:
    • Huffman Node 的类定义:
      • Huffman Tree 的类定义:
        • 具体实现:
        领券
        问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档