Huffman Coding Analysis

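Huffman Coding produces an optimal prefix code and is commonly used for lossless data compression (reducing file size). It follows a greedy approach: at each step it merges the two least frequent symbols. Lossless means that no information is lost, so the original data can be reconstructed exactly.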

There are two major parts in Huffman Coding:
1) Build a Huffman Tree from input characters.
2) Traverse the Huffman Tree and assign codes to characters.
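
The two parts above can be sketched in Python as follows. This is only a minimal illustration, not a reference implementation: the function names build_huffman_tree and assign_codes, the tuple-based tree, and the sample string are all assumptions made for this example, and the input is assumed to be non-empty.

```python
import heapq
from collections import Counter
from itertools import count


def build_huffman_tree(text):
    """Part 1: build a Huffman tree from the characters of `text`.

    A leaf is a single character; an internal node is a (left, right) tuple.
    Assumes `text` is non-empty.
    """
    freq = Counter(text)
    tie = count()  # unique tie-breaker so the heap never compares nodes
    heap = [(f, next(tie), ch) for ch, f in freq.items()]
    heapq.heapify(heap)
    # Greedy step: repeatedly merge the two least frequent nodes.
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), (left, right)))
    return heap[0][2]


def assign_codes(node, prefix="", codes=None):
    """Part 2: traverse the tree, appending 0 for left and 1 for right."""
    if codes is None:
        codes = {}
    if isinstance(node, str):        # leaf: one character
        codes[node] = prefix or "0"  # a lone symbol still needs one bit
    else:
        left, right = node
        assign_codes(left, prefix + "0", codes)
        assign_codes(right, prefix + "1", codes)
    return codes


sample = "abracadabra"  # hypothetical input, not from the chart in this post
codes = assign_codes(build_huffman_tree(sample))
encoded = "".join(codes[ch] for ch in sample)
print(codes)
print(len(encoded), "bits instead of", 8 * len(sample))
```

For this sample the encoded string is 23 bits long, against 11 × 8 = 88 bits at a fixed 8 bits per character. The exact bit patterns depend on how ties between equal frequencies are broken, so another implementation may assign different codes with the same total length.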

Let's say we are considering a chart with a few characters. Without compression, every character is stored using a fixed 8 bits.

After Huffman Coding, we get a chart like this:
Character    Code    Code length (bits)
A            1100    4
B            1101    4
C            100     3
D            101     3
E            111     3
F            0       1
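
As a quick check, the codes in the chart can be tested for the prefix property (no code is the beginning of another), which is what lets a decoder read the bit stream without separators. The sketch below copies the codes from the chart; the helper names and the message "FACE" are assumptions made for illustration.

```python
# Codes copied from the chart above.
CODES = {"A": "1100", "B": "1101", "C": "100", "D": "101", "E": "111", "F": "0"}


def is_prefix_free(codes):
    """Return True if no code is a prefix of a different code."""
    words = list(codes.values())
    return not any(a != b and b.startswith(a) for a in words for b in words)


def encode(text, codes):
    """Concatenate the code of every character in `text`."""
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    """Read bits left to right, emitting a character at each complete code."""
    lookup = {code: ch for ch, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:  # a complete code has been read
            out.append(lookup[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code")
    return "".join(out)


message = "FACE"  # hypothetical message
bits = encode(message, CODES)
print(is_prefix_free(CODES))  # True
print(bits)                   # 01100100111
print(decode(bits, CODES))    # FACE
```

Because the codes are prefix-free, the decoder never has to look ahead or backtrack: the first match is always the right one.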

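Adding up the code lengths in the last column gives 4+4+3+3+3+1 = 18 bits, which is the cost of encoding each of the six characters once.

So, without Huffman Coding, those six characters take 6 × 8 = 48 bits; the figure of 144 bits corresponds to a message of 18 characters (18 × 8 = 144).
After Huffman Coding, the total is the sum of (number of occurrences × code length) over all characters. Since every code here is shorter than 8 bits, the encoded message is always smaller than the fixed-length one.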
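
The arithmetic can be checked with a few lines of Python. The codes are taken from the chart; the character counts in sample_counts are made up for this example (they are not from the original chart) and are chosen only so that the message has 18 characters.

```python
# Codes copied from the chart; the counts further down are hypothetical.
CODES = {"A": "1100", "B": "1101", "C": "100", "D": "101", "E": "111", "F": "0"}
FIXED_BITS = 8  # bits per character without compression


def fixed_size_bits(counts):
    """Size of the message when every character uses FIXED_BITS bits."""
    return FIXED_BITS * sum(counts.values())


def huffman_size_bits(counts, codes):
    """Size of the message as sum(occurrences * code length)."""
    return sum(n * len(codes[ch]) for ch, n in counts.items())


# Each of the six characters exactly once.
once_each = {ch: 1 for ch in CODES}
print(fixed_size_bits(once_each), huffman_size_bits(once_each, CODES))  # 48 18

# Hypothetical 18-character message (assumed counts).
sample_counts = {"F": 8, "C": 3, "D": 2, "E": 2, "A": 2, "B": 1}
print(fixed_size_bits(sample_counts), huffman_size_bits(sample_counts, CODES))  # 144 41
```

With these assumed counts, the 18-character message shrinks from 144 bits to 41 bits. A different set of counts would give a different total, which is why the frequencies matter as much as the codes.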

This is how Huffman Coding reduces (compresses) the size of data (a file).

Soumik's Tech Blog
