10/08/2007

Huffman encoding algorithm24Jan07

The first version of Huffman encoding


C++ CodeC Code



#include <iostream>

#include <map> // a multimap will be used so we have to include it


using namespace std;


void binary(int number) { //thise procedure is used to convert a

int remainder; //number into a binary number


if(number <= 1) {

cout << number;

return;

}


remainder = number%2;

binary(number >> 1);

cout << remainder;

}


int main()

{

char ch, alphabet[31]; //store each char in ch, 31 chars used because of 26 alphabet, dot, comma, space

typedef multimap<int,char> huffman; //make a multimap that will be sorted by integers, we use a multimap

huffman sentence; //because some characters may have the same number of occurences


int i=0,j=0;


for(int j=0; j<31; j++) //alphabet[0..30] = 0;

alphabet[j]=0;


while((ch=getchar())!=\n) //insert chars into ch until enter(return) is pressed

{

if(ch==‘ ‘)

alphabet[28]++; //increment number of spaces

else if(ch==‘,’)

alphabet[29]++; //increment number of commas

else if(ch==‘.’)

alphabet[30]++; //increment number of dots

else{

ch = toupper(ch); //convert all chars to upper case

alphabet[(static_cast<int>(ch))-65]++;//increment (char ASCII value) - 65 (this is the index number)

}

}


for(int j=0; j<31; j++)

{

if(alphabet[j]>0) //check if char occured in the text

{

i++; //count the number of found chars, commas, dots, spaces

if(j==28)

sentence.insert(make_pair(alphabet[j],‘ ‘)); //make the pair from space and its occurences

else if(j==29)

sentence.insert(make_pair(alphabet[j],‘,’)); //make the part from comma and its occurences

else if(j==30)

sentence.insert(make_pair(alphabet[j],‘.’)); //make the pair from dot and its occurences

else

sentence.insert(make_pair(alphabet[j],static_cast<char>(65+j))); //make the pair from chars and its occurences

}

}


huffman::iterator pos; //make an iterator for the multimap

for(pos = sentence.begin(); pos!=sentence.end(); ++pos)

{

j++; //number of different chars

cout << “Letter “ << pos->second << ” “; //print out the letter and then its occurence code

binary(i-j); //convert the number to binary

cout << endl;

}

return 0;

}

What is a Binary seach algorithm and how does it work?

A binary search algorithm (or binary chop) is a technique for finding a particular value in a linear array, by ruling out half of the data at each step, widely but not exclusively used in computer science.

A binary search finds the median, makes a comparison to determine whether the desired value comes before or after it, and then searches the remaining half in the same manner. Another explanation would be: Search a sorted array by repeatedly dividing the search interval in half. Begin with an interval covering the whole array. If the value of the search key is less than the item in the middle of the interval, narrow the interval to the lower half. Otherwise narrow it to the upper half.
Its running time is O(logn).

#include
using namespace std;

//Divide and conquer strategy - Binary Search
/*This algorithm is the simplest application of divide and conquer.
It finds a particular value in a linear array by ruling out half
of the data at each step.A binary search finds the median, makes
a comparision to determine whether the desired value comes before or
after it, and then searches the remaining half in the same manner.
However, the binary search algorithm applies only to sorted arrays.*/

int sorted_array[100];
int i=0;

int binary_search(int target)
{
int first = 0;
int last = i-1;
int mid;
while(first<=last) //while we haven’t reached the end of
{
mid = (first+last)/2; //rule out half of the data by spliting the array
if(sorted_array[mid]==target) //if we have found the target
return mid;
else if(sorted_array[mid]>target)
last = mid-1; //the desired value is in the lower part of the array
else
first = mid+1;//the desired value is in the upper part of the array
}
return -1; //there is no such element in the array
}

int main()
{
int j, n;
while(cin>>j)
sorted_array[i++]=j;
while(cin>>n) //search for an element n in the array
{
int result = binary_search(n); //position of the element n in the array
printf("%d",result);
}
}

What are Hash Tables and how do they work?


As we saw with binary search, certain data structures such as a binary search tree can help improve the efficiency of searches. From linear search to binary search, we improved our search efficiency from O(n) to O(logn). We now present a new data structure, called a hash table, that will increase our efficiency to O(1), or constant time.

In computer science, a hash table, or a hash map, is a data structure that associates keys with values. The primary operation it supports efficiently is a lookup: given a key (e.g. a person’s name), find the corresponding value (e.g. that person’s telephone number). It works by transforming the key using a hash function into a hash, a number that is used to index into an array to locate the desired location (”bucket”) where the values should be.

Hash tables are often used to implement associative arrays, sets and caches. Like arrays, hash tables provide constant-time O(1) lookup on average, regardless of the number of items in the table. (O(1) means that it takes a constant amount of time independent of the number of items involved.) However, the rare worst-case lookup time can be as bad as O(n). Compared to other associative array data structures, hash tables are most useful when large numbers of records are to be stored, especially if the size of the data set can be predicted.

Hash tables may be used as in-memory data structures. Hash tables may also be adopted for use with persistent data structures; database indexes sometimes use disk-based data structures based on hash tables, although balanced trees are more popular.

//this program creates a hash table as a set of linked lists
// here 10 pointers are used to address 10 linked lists which form a hash tablw
#include
#include

int main()
{
int i,k,num;
struct node
{
int n;
struct node* next;
};

struct node *temp,*p1,*AOP[10];

for(i=0;i<10;i++)
AOP[i]= NULL;
for( i=0;i<20;i++)
{
scanf(“%d”, &num);
k=num%10;
p1=( struct node*) malloc ( sizeof(struct node));

p1->n=num;
p1->next=NULL;

if(AOP[k]==NULL)
AOP[k]=p1;
else
{
p1->next=AOP[k];
AOP[k]=p1;
}
}

for(i=0;i<10;i++)
{
temp=AOP[i];
printf(“List number %d :”, i);
while(temp!=NULL)
{
printf(“%5d”,temp->n);
temp=temp->next;
}
printf(“\n“);
}
}