ELEC 301 Projects Fall 2005
Collection edited by: Richard Baraniuk and Rice University ELEC 301
Content authors: Danny Blanco, Elliot Ng, Charlie Ice, Bryan Grandy, Sara Joiner, Austin
Bratton, Ray Hwong, Jeanne Guillory, Richard Hall, Jared Flatow, Siddharth Gupta, Veena
Padmanabhan, Grant Lee, Heather Johnston, Deborah Miller, Warren Scott, _ _, Chris
Lamontagne, Bryce Luna, David Newell, William Howison, Patrick Kruse, Kyle Ringgenberg,
Michael Lawrence, Yi-Chieh Wu, Scott Novich, Andrea Trevino, and Phil Repicky
Online: <http://cnx.org/content/col10380/1.3>
This selection and arrangement of content as a collection is copyrighted by Richard Baraniuk and Rice University ELEC 301.
It is licensed under the Creative Commons Attribution License: http://creativecommons.org/licenses/by/2.0/
Collection structure revised: 2007/09/25
For copyright and attribution information for the modules contained in this collection, see the "Attributions" section at the end of the collection.
Chapter 1. Steganography - What's In Your Picture
1.1. Abstract and History*
For years, people have devised techniques for encrypting data while others have attempted to
break the resulting codes. For our project we decided to put our DSP knowledge to use in the art
of steganography. Steganography is a technique for hiding binary data within an image while
introducing few noticeable changes. Technological advancements over the past decade or so have
brought terms like “mp3,” “jpeg,” and “mpeg” into our everyday vocabulary. These lossy
compression techniques lend themselves well to hiding data. We chose this project because it
gave us a chance to study several aspects of DSP. First, we devised our own compression
technique, loosely based on JPEG. Many steganographic techniques already exist, which motivated
us to create two strategies of our own for hiding data in the images we compress. Our first
method, zero hiding, places the binary data in the DCT coefficients dropped during compression.
Our second method, which we called bit-o-steg, uses a key to change the values of coefficients
that remain after compression. Finally, we needed ways to analyze the success of our data hiding
strategies, so through our research we found both DSP and statistical methods to measure our
results.
A Brief History of Steganography
Steganography, or “hidden writing,” can be traced back to 440 BC in ancient Greece. The Greeks
would often write a message on a wooden panel, cover it in wax, and then write another message
on the wax. These wax tablets were already in common use as writing surfaces, so hiding a
message in such an everyday object drew very little suspicion. In addition to its use by the
Greeks, steganography was practiced by spies in World War II. There were even rumors that
terrorists made use of steganography early in 2001 to plan the attacks of September 11.
1.2. Compression Framework*
Images can be saved in many file formats; however, much of the research in steganography is done
using the JPEG format. JPEG is very common and uses a relatively straightforward compression
algorithm. Although there are several JPEG compression scripts written for MATLAB, customizing
them for our purposes and getting the output to work with the JPEG format would have shifted the
focus of our project from steganography to implementing JPEG compression. Thus we decided to
implement our own custom image framework that would be similar to JPEG but much more
straightforward.
1.3. Compression - Dropping the DCT Coefficients*
Dropping DCT Coefficients
Our framework and JPEG are both based around the discrete cosine transform. Just as with sound,
certain frequencies in an image are less noticeable than others, so taking them out of the image
doesn’t change the image much. We used the 2D discrete cosine transform (DCT), as seen in
equation 1, to convert an image into the frequencies that make it up; in other words, it takes
us into the frequency domain.
There are several transforms that could have been used to get the image into the frequency
domain. The DCT, however, is a purely real transform, so manipulating the frequencies is much
more straightforward than with other transforms. From here we could take the DCT of the entire
image and then throw away the frequencies that are less noticeable. Unfortunately, this would
make the image blurry and cause it to lose edges. To solve this problem, the image is divided
into 8x8 blocks and the DCT is taken of each block, which preserves the integrity of the image.
To drop insignificant frequencies, JPEG compression uses a quantization matrix. We simplified
this process by choosing a threshold value and dropping the coefficients whose magnitude falls
below it. Thus our compression algorithm models the basic functionality of the JPEG standard.
The result of taking the DCT. The numbers in red are the coefficients that fall below the specified threshold of 10.
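The block transform and thresholding described above can be sketched in a few lines. This is a minimal illustration, not the project's MATLAB code: it assumes the orthonormal form of the 2D DCT-II, a threshold of 10 as in the figure, and a comparison on coefficient magnitude. The names `dct2`, `drop_small`, and the sample gradient block are hypothetical.

```python
import math

N = 8  # block size used by the framework


def dct2(block):
    """Orthonormal 2D DCT-II of an N x N block given as a list of lists."""
    def alpha(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


def drop_small(coeffs, threshold=10):
    """Zero every coefficient whose magnitude is below the threshold."""
    return [[c if abs(c) >= threshold else 0.0 for c in row] for row in coeffs]


# Hypothetical test block: a smooth horizontal gradient (every row identical).
block = [[16 * y for y in range(N)] for _ in range(N)]
kept = drop_small(dct2(block))
nonzero = sum(1 for row in kept for c in row if c != 0.0)
```

Because every row of the sample block is identical, all of its energy lands in the first row of the coefficient matrix, and thresholding leaves only a handful of nonzero values out of 64.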
1.4. Compression - Zeros Grouping*
The second part of our image framework is zeros grouping. Just like the JPEG standard, the
algorithm uses a zig-zag pattern that traverses each 8x8 DCT matrix and creates a 64-length
vector for each matrix. The advantage of the zig-zag pattern is that it orders the resulting
vector from low frequencies to high frequencies, so the dropped high-frequency coefficients tend
to end up next to each other. Each group of consecutive zeros is then replaced with an ASCII
character that represents how many zeros the group contains.
Zig-zag method traverses the matrix and vectorizes the matrix. After grouping zeros the resulting bitstream is sent to a file.
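As a rough sketch of the zig-zag traversal and zeros grouping, the following uses the JPEG-style scan order. The exact ASCII encoding used for the zero counts is not given here, so this example stands in a hypothetical `("Z", count)` token for each run of zeros; `zigzag_indices`, `zigzag`, and `group_zeros` are illustrative names.

```python
def zigzag_indices(n=8):
    """(row, col) pairs in JPEG-style zig-zag order for an n x n matrix."""
    order = []
    for s in range(2 * n - 1):
        # All cells on the anti-diagonal row + col == s, row ascending.
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:
            diag.reverse()  # even diagonals are walked bottom-left to top-right
        order.extend(diag)
    return order


def zigzag(matrix):
    """Flatten a square matrix into a vector ordered low to high frequency."""
    return [matrix[i][j] for i, j in zigzag_indices(len(matrix))]


def group_zeros(vec):
    """Replace each run of zeros with a ("Z", run_length) placeholder token."""
    out, run = [], 0
    for v in vec:
        if v == 0:
            run += 1
            continue
        if run:
            out.append(("Z", run))
            run = 0
        out.append(v)
    if run:
        out.append(("Z", run))
    return out


# Hypothetical thresholded block with three surviving coefficients.
m = [[0] * 8 for _ in range(8)]
m[0][0], m[0][1], m[2][0] = 448, -145, 12
vec = zigzag(m)
grouped = group_zeros(vec)
```

Here the 64-entry vector collapses to five tokens, which is where the compression gain in this scheme comes from.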
With this simple framework in place, we are able to model a real-world image compression
algorithm and focus on implementing steganography.
1.5. Zeros Hiding Method*
Data Hiding Methods
We arrived at our first data hiding method, which we called “zero hiding,” quite intuitively.
Recall that our compression algorithm removes the least important DCT coefficients. It follows,
then, that we could put the bit stream we wish to hide back into these dropped coefficients
without changing the image drastically. For this to work, there must be a way to distinguish a
zero that resulted from a dropped coefficient from a coefficient that was zero to begin with. We
therefore ran the image through a modified compressor that, instead of dropping coefficients
below the specified threshold, replaced them with either plus or minus one, depending on the
sign of the coefficient.
The DCT is taken and then each coefficient under the specified threshold (10) will be dropped. These coefficients are shown in blue in the picture on the right.
Next, the hiding algorithm is given a binary data stream and the threshold value. The data
stream is divided into words. The maximum decimal value of a word must be less than the
threshold, since values over the threshold signify an important coefficient in the picture. We
then increment each word’s decimal value by one to avoid inserting zero-valued coefficients,
which would otherwise be indistinguishable from zero-valued coefficients in the original image.
Finally, we go back to the coefficient matrix and replace each plus or minus one with the
incremented value of the next data word, maintaining the sign throughout.
The dropped coefficients are replaced with words created from the data stream. The IDCT is then taken, transforming the coefficient matrix back to a picture matrix.
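The hiding steps can be sketched on a flat list of DCT coefficients. This is an illustration under stated assumptions rather than the project's script: words are 3 bits wide, the threshold is 10, the bit string length is a multiple of the word size, and the word size is chosen so that the incremented value (at most 2^3 = 8) still falls below the threshold. The names `mark_dropped` and `hide_bits` and the sample coefficients are hypothetical.

```python
def mark_dropped(coeffs, threshold=10):
    """Modified compressor: small nonzero coefficients become +1 or -1."""
    marked = []
    for c in coeffs:
        if c != 0 and abs(c) < threshold:
            marked.append(1 if c > 0 else -1)  # keep only the sign
        else:
            marked.append(c)  # important coefficients and true zeros unchanged
    return marked


def hide_bits(marked, bits, threshold=10, word_bits=3):
    """Write bits, one word per +/-1 placeholder, as (word value + 1)."""
    # Assumption: the incremented word must stay below the threshold.
    if 2 ** word_bits >= threshold:
        raise ValueError("word size too large for this threshold")
    if len(bits) % word_bits:
        raise ValueError("bit string length must be a multiple of word_bits")
    words = [bits[i:i + word_bits] for i in range(0, len(bits), word_bits)]
    out = list(marked)
    w = 0
    for i, c in enumerate(out):
        if w == len(words):
            break
        if c in (1, -1):
            value = int(words[w], 2) + 1      # +1 so a word is never stored as 0
            out[i] = value if c > 0 else -value
            w += 1
    if w < len(words):
        raise ValueError("not enough dropped coefficients to hold the data")
    return out


# Hypothetical coefficient vector and payload.
coeffs = [448.0, -145.2, 3.7, 12.0, -0.4, 0, 6.1, -2.2]
marked = mark_dropped(coeffs)
stego = hide_bits(marked, "101000111")
```

Placeholders that are not needed for the payload stay at plus or minus one, so in this sketch the receiver would need to know the message length separately.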
To recover the hidden data, the recovery script is given the threshold. It subtracts one from
every DCT coefficient below that threshold and concatenates the resulting binary values, forming
the original data stream.
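A minimal sketch of the recovery step, assuming 3-bit words, a threshold of 10, and coefficients that have already been brought back into the DCT domain and rounded to integers. The function name `recover_bits` and the sample vector are hypothetical, and how the receiver learns the message length is not covered here.

```python
def recover_bits(coeffs, threshold=10, word_bits=3):
    """Read one word from every nonzero coefficient below the threshold."""
    bits = []
    for c in coeffs:
        magnitude = abs(int(round(c)))
        if 0 < magnitude < threshold:
            # Undo the +1 applied during hiding, then pad to the word width.
            bits.append(format(magnitude - 1, "0{}b".format(word_bits)))
    return "".join(bits)


# Hypothetical coefficient vector carrying the words 101, 000, 111,
# followed by one unused placeholder (-1), which decodes as 000.
stego = [448.0, -145.2, 6, 12.0, -1, 0, 8, -1]
recovered = recover_bits(stego)
payload = recovered[:9]  # assumes the 9-bit message length is known
```

True zeros and coefficients at or above the threshold are skipped, which is why the hiding step stores each word as its value plus one.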
1.6. Bit-O-Steg Method - Background*
Data Hiding Methods