ELEC 301 Projects Fall 2005
Collection edited by: Richard Baraniuk and Rice University ELEC 301
Content authors: Danny Blanco, Elliot Ng, Charlie Ice, Bryan Grandy, Sara Joiner, Austin
Bratton, Ray Hwong, Jeanne Guillory, Richard Hall, Jared Flatow, Siddharth Gupta, Veena
Padmanabhan, Grant Lee, Heather Johnston, Deborah Miller, Warren Scott, _ _, Chris
Lamontagne, Bryce Luna, David Newell, William Howison, Patrick Kruse, Kyle Ringgenberg,
Michael Lawrence, Yi-Chieh Wu, Scott Novich, Andrea Trevino, and Phil Repicky
Online: <http://cnx.org/content/col10380/1.3> This selection and arrangement of content as a collection is copyrighted by Richard Baraniuk and Rice University ELEC 301.
It is licensed under the Creative Commons Attribution License: http://creativecommons.org/licenses/by/2.0/
Collection structure revised: 2007/09/25
For copyright and attribution information for the modules contained in this collection, see the "Attributions" section at the end of the collection.
Chapter 1. Steganography - What's In Your Picture
1.1. Abstract and History*
For years, people have devised different techniques for encrypting data while others have attempted to break these encrypted codes. For our project we decided to put our wealth of DSP knowledge to use in the art of steganography. Steganography is a technique that allows one to hide binary data within an image while introducing few noticeable changes. Technological advancements over the past decade or so have brought terms like “mp3,” “jpeg,” and “mpeg” into our everyday vocabulary. These lossy compression techniques lend themselves perfectly to hiding data. We chose this project because it gave us a chance to study several aspects of DSP. First, we devised our own compression technique, loosely based on JPEG. Many steganographic techniques have already been created, which compelled us to design two of our own strategies for hiding data in the images we compress. Our first method, zero hiding, adds the binary data into the DCT coefficients dropped in compression. Our other method, which we called bit-o-steg, uses a key to change the values of coefficients that remain after compression. Finally, we had to find ways to analyze the success of our data hiding strategies, so through our research we found both DSP and statistical methods to quantitatively measure our work.
A Brief History of Steganography
Steganography, or “hidden writing,” can be traced back to 440 BC in ancient Greece. The Greeks would often write a message on a wooden panel, cover it in wax, and then write an innocuous message on the wax. Since wax tablets were already in common use as writing surfaces, hiding a message inside one drew very little suspicion. In addition to its use by the Greeks, steganography was employed by spies in World War II. There were even rumors that terrorists made use of steganography early in 2001 to plan the attacks of September 11.
1.2. Compression Framework*
There are many file formats in which images can be saved; however, much of the research in steganography is done using the JPEG format. JPEG is very common and uses a relatively straightforward compression algorithm. Although there are several JPEG compression scripts written for MATLAB, customizing them for our purposes and getting the output to work with the JPEG format would have shifted the focus of our project from steganography to implementing JPEG compression. Thus we decided to implement our own custom image framework, similar to JPEG but much more straightforward.
1.3. Compression - Dropping the DCT Coefficients*
Dropping DCT Coefficients
Our framework and JPEG are both based around the discrete cosine transform. Just as with sound, certain frequencies in an image are more noticeable than others, so taking them out of the image does not change it much. We used the 2D discrete cosine transform (DCT), as seen in equation 1, to convert an image into the frequencies that make it up; in other words, it takes us into the frequency domain.
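For reference, the standard 8x8 two-dimensional DCT (presumably the form equation 1 showed) is

F(u,v) = (1/4) C(u) C(v) sum_{x=0}^{7} sum_{y=0}^{7} f(x,y) cos[(2x+1)u*pi/16] cos[(2y+1)v*pi/16]

where f(x,y) is the 8x8 image block and C(0) = 1/sqrt(2), C(k) = 1 for k > 0.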
There are several transforms that could have been used to move the image into the frequency domain. The DCT, however, is a purely real transform, so manipulating the frequencies is much more straightforward than with other transforms. From here we could take the DCT of the entire image and then throw away the frequencies that are less noticeable. Unfortunately, this would blur the image and destroy its edges. To solve this problem, the image is divided into 8x8 blocks, which preserves the integrity of the image. To drop insignificant frequencies, JPEG compression uses a quantization matrix. We simplified this process by using a single threshold value and dropping frequencies whose coefficients fall below it. Thus our compression algorithm models the basic functionality of the JPEG standard.
The result of taking the DCT. The numbers in red are the coefficients that fall below the specified threshold of 10.
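A minimal MATLAB sketch of this block-DCT thresholding, assuming the Image Processing Toolbox (dct2, idct2, blockproc); the threshold of 10 matches the figure, and the project's own compressor is mat2DCEB.m:

img = double(imread('test.png'));                % hypothetical grayscale test image
threshold = 10;
coeffs = blockproc(img, [8 8], @(b) dct2(b.data));                             % 8x8 block DCT
coeffs = blockproc(coeffs, [8 8], @(b) b.data .* (abs(b.data) >= threshold));  % drop small coefficients
recon = blockproc(coeffs, [8 8], @(b) idct2(b.data));                          % reconstructed image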
1.4. Compression - Zeros Grouping*
The second part of our image framework is zeros grouping. Just like the JPEG standard, the algorithm uses a zig-zag pattern that traverses each DCT matrix and creates a 64-length vector for each block. The advantage of the zig-zag pattern is that it orders the resulting vector from low frequencies to high frequencies. Runs of zeros are then replaced with an ASCII character representing how many zeros that group contains.
Zig-zag method traverses the matrix and vectorizes the matrix. After grouping zeros the resulting bitstream is sent to a file.
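In MATLAB, the zig-zag ordering (the role jvector.m plays in our code listing) can be sketched as follows; the exact tie-breaking on the diagonals is our assumption:

[r, c] = ndgrid(1:8, 1:8);
d = r + c;                                 % anti-diagonal number of each entry
key = (1 - 2*(mod(d, 2) == 0)) .* r;       % walk down odd diagonals, up even ones
order = sortrows([d(:), key(:), (1:64)'], [1 2]);
zigzag = order(:, 3);                      % linear indices in zig-zag order
vec = block(zigzag);                       % 'block': an 8x8 DCT matrix -> 64-length vector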
With this simple framework in place, we are able to model a real world image compression
algorithm and focus on implementing steganography.
1.5. Zeros Hiding Method*
Data Hiding Methods
We arrived at our first data hiding method, which we called “zero hiding,” quite intuitively. Recall that our compression algorithm removes the least important DCT coefficients. It follows, then, that we could put the bit stream we wish to hide back into these dropped coefficients without changing the image drastically. To do this, though, there must be a way to distinguish a zero that resulted from a dropped coefficient from a coefficient that was zero to begin with. We therefore ran the image through a modified compressor that, instead of dropping coefficients below the specified threshold, replaced them with plus or minus one, depending on the sign of the coefficient.
The DCT is taken and then each coefficient under the specified threshold (10) is dropped. These coefficients are shown in blue in the picture on the right.
Next, the hiding algorithm is given a binary data stream and the threshold value. The data stream is divided up into words whose maximum decimal value must be less than the threshold, since values over the threshold signify an important coefficient in the picture. We then increment each word's decimal value by one so that no data word produces a zero-valued coefficient, which would be indistinguishable from a genuine zero in the original image. Finally, we go back to the original coefficient matrix and replace each plus or minus one with the new value of the data word, maintaining the sign throughout.
The dropped coefficients are replaced with words created from the data stream. The IDCT is then taken, transforming the coefficient matrix back to a picture matrix.
To recover the hidden data, the recovery script is given the threshold; it subtracts one from every DCT coefficient below that threshold and concatenates their binary values, reconstructing the original data stream.
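A hedged MATLAB sketch of the embedding step on one block whose dropped coefficients have been marked +/-1 by the modified compressor (signed_mat2DCEB.m); the word-size rule is our reading of the threshold constraint:

T = 10;                                   % threshold from the example above
wordLen = floor(log2(T - 1));             % word value + 1 must stay below T
slots = find(abs(coeffBlock) == 1);       % dropped coefficients, marked +/-1
nWords = min(numel(slots), floor(numel(bits) / wordLen));
for k = 1:nWords
    w = bits((k-1)*wordLen + (1:wordLen));        % next data word (bits)
    val = w(:)' * 2.^(wordLen-1:-1:0)' + 1;       % +1 avoids zero-valued coefficients
    coeffBlock(slots(k)) = sign(coeffBlock(slots(k))) * val;   % keep the sign
end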
1.6. Bit-O-Steg Method - Background*
Previous Work and Background
In our research we found a steganographic method known as JSteg, created by Derek Upham. The basic premise behind JSteg is that its algorithm hides the data sequentially within the least significant bits of the DCT coefficients (Provos and Honeyman). The problem with JSteg is that it is not very secure: there is no secret key with which the data is encrypted. Therefore, anybody who knows an image contains data hidden with JSteg can easily retrieve the message. Our second hiding method, which we have called bit-o-steg, improves upon the JSteg algorithm by employing a key when hiding the data.
1.7. Bit-O-Steg Hiding*
As you should recall, our zero hiding method inserts data into the dropped coefficients of the DCT. The bit-o-steg algorithm instead hides data within the coefficients that were not dropped. The critical part of bit-o-steg is the key used to encrypt the data. This user-defined key selects which nonzero coefficients to change and which bit to change within each coefficient. The simplest key would be a key of [1], which would visit each coefficient sequentially and change the last bit in each one.
The key is what makes bit-o-steg unique from other algorithms. Here a key of [1 2] is applied to hide the data.
As you can see in figure 1, we chose a key of [1 2]. The key selects the first coefficient and inputs the first bit of the hidden data into that coefficient's least significant bit. The key then counts forward two coefficients and repeats the hiding process in the second least significant bit. Since this is the end of the key, the key repeats, selecting the next coefficient. The length of the key has no real bound, except that all data must be hidden before the last DCT coefficient in the image is reached. There is, however, a range of values the key entries must take: since the key selects bits, only values between one and eight can be used, and the larger the value, the more significant the bit it alters and the more the image is changed.
Minimal changes have been made to the picture matrix after the application of the bit-o-steg algorithm.
Retrieving the Data
Retrieving the data is practically impossible without the special key used to hide it. Once you have the key, you simply apply it in reverse, extracting rather than inserting the bits, and reconstruct the hidden data stream from those bits.
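A hedged MATLAB sketch of this walk over a vector coefs of the kept nonzero coefficients, with payload vector bits (the project's actual implementation is stegcompress.m); integer-valued coefficients are assumed:

key = [1 2];                               % user-defined key from figure 1
pos = 0;
for n = 1:numel(bits)
    step = key(mod(n-1, numel(key)) + 1);  % cycle through the key entries
    pos = pos + step;                      % advance 'step' coefficients
    c = bitset(abs(coefs(pos)), step, bits(n));   % key entry doubles as bit index (1 = LSB)
    coefs(pos) = sign(coefs(pos)) * c;     % preserve the coefficient's sign
end
% Extraction repeats the same walk, reading bitget(abs(coefs(pos)), step).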
1.8. Importance of Steganalysis*
Image steganalysis is the science of analyzing images in order to discover and detect hidden messages and data within them. Statistical digital signal processing is often used to detect data within images.
It is important to be able to detect hidden messages within images. On the steganography side, detection exposes the flaws of a hiding algorithm; by knowing these flaws, the user can improve the algorithm and make it more difficult to detect whether or not data is hidden in an image.
Steganalysis is also especially important for security, namely monitoring a user's communication with the outside world. In the age of the Internet, images are sent via email or posted on websites. Detecting whether data is hidden in an image allows the monitor to further analyze the suspicious image and find the hidden message.
Can you tell if there is hidden data?
1.9. Steganalysis - Zeros Hiding Detection*
To detect data hidden with our zeros hiding method, we first analyzed the histogram of the DCT coefficients of an uncompressed image, a compressed image without data, and a compressed image with hidden data. The histogram of the DCT coefficients shows how many times each coefficient value appears within the DCT matrix. For an uncompressed image (Figure 1), the histogram is a smooth curve. In the histogram of a compressed image (Figure 2), values below the threshold are dropped, so those bins fall to zero. The histogram of a compressed image with data (Figure 3) has a shape similar to that of an uncompressed image, but the counts are much lower. This makes sense: we replace the values that were going to be dropped with data words, and it is statistically unlikely that a dropped value gets replaced with exactly the same value.
After analyzing the histograms of the different types of images, we analyzed the l2 norm of the one-valued DCT coefficients. If there is no power in the one-valued coefficients, the image is a compressed image, since one is the smallest nonzero value the compressor drops. If there is power in the ones, the image is either uncompressed or contains hidden data. The key difference between the two is the magnitude of that power: it is statistically unlikely that every dropped coefficient gets replaced with a one, so the power in the ones of an image with hidden data is lower than that of an uncompressed image and on average falls below a certain threshold. This threshold depends on the image size. Figure 4 shows plots of the power without data, the power with data, and the threshold. Clearly, the power without data is greater than the power with data. We found our detection program to have a 90% detection rate, with a false-positive rate of 12%.
L2 norm equation: ||x||_2 = sqrt( sum_n |x_n|^2 )
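A hedged MATLAB sketch of this test on the block-DCT matrix coeffs; the threshold rule here is a stand-in for the size-dependent one that image_stat.m derives:

onesPower = sum(coeffs(abs(coeffs) == 1).^2);   % power in the +/-1 coefficients
tau = 0.05 * numel(coeffs);                     % hypothetical size-based threshold
if onesPower == 0
    verdict = 'compressed, no hidden data';
elseif onesPower < tau
    verdict = 'hidden data likely';             % fewer ones than a clean image
else
    verdict = 'uncompressed';
end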
1.10. Steganalysis - Bit-O-Steg Detection*
Due to the complexity of bit-o-steg, we turned to previous research to find a viable detection method. Each entry position in the 8x8 blocks has a specific probability distribution, found by looking at the values of that entry slot across the entire image. Figure 1 shows a histogram of an entry without data: it counts how often each DCT coefficient value appears in that entry slot. Figure 2 shows a histogram of an entry with data. Comparing the two figures, there is a sudden drop around the zero value in the histogram of an entry with data, and that histogram also appears smoother.
These distributions are described by their own characteristic functions. Bit-o-steg hiding distorts a distribution by randomly changing certain entries, thus altering its characteristic function. Using the inner product, we could test for a match between the characteristic function and the suspect image's probability distribution. Unfortunately, the distribution functions vary with the subject of the picture. Furthermore, we lack the statistical background necessary to classify these distributions and properly identify their characteristic functions. Thus, implementing bit-o-steg detection proved to be beyond the scope of this project.
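For concreteness, a MATLAB sketch of the proposed inner-product test for one entry slot (r, c); refPdf is an assumed clean-image reference distribution of the kind we were ultimately unable to characterize:

vals = coeffs(r:8:end, c:8:end);              % slot (r,c) of every 8x8 block
pdf = histcounts(vals(:), -50.5:50.5, 'Normalization', 'probability');  % assumed value range
match = dot(pdf, refPdf) / (norm(pdf) * norm(refPdf));   % normalized inner product
% A low match against the clean-image reference would flag hidden data.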
1.11. Future Considerations and Conclusions*
Due to time and computing limitations, we could not explore all facets of steganography and detection techniques. As you saw, we studied the power in our pictures to test for hidden data. Another method, which we were unable to explore, is to analyze the noise in the pictures: adding hidden data adds random noise, so it follows that a properly tuned noise detection algorithm could recognize whether or not a picture carries steganographic data.
We explored several steganography techniques and the various detection algorithms associated
with them. By using the properties of the DCT and our understanding of the frequency domain we
developed the zeros hiding method. Zeros hiding proved to be easier to analyze than bit-o-steg and
can hide significantly more data. Unfortunately its ease of detection makes it a less secure
method. After researching various techniques already implemented, we chose to improve upon
one, thus creating our bit-o-steg method. Bit-o-steg can only hide data in coefficients that were
not dropped, thus limiting the amount of data we can hide. However, it greatly enhances the effectiveness of the steganography since it uses a key, making it much more challenging to detect. In the end we found both methods effective, but the complexity of bit-o-steg makes it the more promising of the two.
Detection of our methods was critical to the breadth of our project. By investigating the power in
various components of our images we discovered how to detect data hidden via the zero hiding
method. Detecting bit-o-steg required us to draw on past steganography research and statistically
analyze the effects of this type of data hiding. The methods and accompanying detection schemes
we developed broadened our understanding of steganography, which, unlike encryption, allows
secret data to be traded hands without raising an eyebrow.
1.12. Works Cited*
Cabeen, Ken, and Peter Gent. “Image Compression and the Discrete Cosine Transform.” College of the Redwoods.
Johnson, Neil F., and Sushil Jajodia. “Exploring Steganography: Seeing the Unseen.” George Mason University. <http://www.jjtc.com/pub/r2026.pdf>
Johnson, Neil F., and Sushil Jajodia. “Steganalysis: The Investigation of Hidden Information.”
Judge, James C. “Steganography: Past, Present, Future.”
Provos, Niels, and Peter Honeyman. “Detecting Steganographic Content on the Internet.” CITI Technical Report 01-11. University of Michigan.
Provos, Niels, and Peter Honeyman. “Hide and Seek: An Introduction to Steganography.” University of Michigan. <http://niels.xtdnet.nl/papers/practical.pdf>
Sallee, Phil. “Model-Based Steganography.” University of California, Davis. <http://redwood.ucdavis.edu/phil/papers/iwdw03.pdf>
Silman, Joshua. “Steganography and Steganalysis: An Overview.”
Wang, Huaiqing, and Shuozhong Wang. “Cyber Warfare: Steganography vs. Steganalysis.”
1.13. Steganography Matlab Code*
detect_data_power.m Detects if the given image has data hidden in it with the zeros hiding method
hidden_zeros_read.m Reads data hidden by zeros hiding method from an image
image_hist.m Creates histogram of DCT coefficients
image_stat.m Determines the threshold value for zero hiding detection
invimageproc.m Takes tiled matrix and converts to image matrix
jvector.m Traverses 8x8 block using the zig-zag pattern
mat2DCEB.m Compresses image and returns tiled matrix
readdata.m Reads in binary data from a file
secread.m Reads in data hidden by bit-o-steg from an image
signed_mat2DCEB.m Modified compressor used in zeros hiding
std_stegcompress.m Groups zeros together and saves image
stegcompress.m Takes a compressed image and hides data in it using bit-o-steg
Figure 1.14. Elliot Ng, Jones 2007
Figure 1.15. Bryan Grandy, Brown 2007
Figure 1.16. Charlie Ice, Brown 2007
Figure 1.17. Danny Blanco, Jones 2006
Chapter 2. Investigation of Delay and Sum
Beamforming Using a Two-Dimensional Array
2.1. Delay and Sum Beamforming with a 2D Array
Beamforming is the discipline that takes a set of microphones, usually in an array, and a set of point source signals (in a space assumed to be R3, or very nearly so) and tries to focus on one signal source to the exclusion of both noise and other signal sources.
In this project, we assume a single source and use the old and powerful technique of delay and sum beamforming, implemented over a two-dimensional array arranged as a 3 by 3 square with the center missing. Having a two-dimensional array allows the location of a source to be determined up to an ambiguity of reflection across the plane of the array.
2. Delay and Sum Beamforming
Delay and sum beamforming is quite true to its name: it merely takes the set of signals, delays (and possibly weights) them by varying amounts, and then adds them all together. The size of the delays is determined by the direction (for farfield) or point (for nearfield) at which the set of microphones is aimed. Despite its simplicity, delay and sum achieves optimal noise suppression for the case of a point source in a background of white noise. Of course, normal signal processing still applies, and one can do better than plain delay and sum if information about the signal other than its location is known a priori. For example, if the signal is known to be bandlimited and baseband, a suitable lowpass filter can be applied to further suppress noise.
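For concreteness, a minimal MATLAB sketch of the operation (the project's actual implementation, in Labview, appears later); x holds one channel per column and delays are integer sample shifts:

function y = delayAndSum(x, delays)
% x: samples-by-microphones matrix; delays: integer sample delay per channel
y = zeros(size(x, 1), 1);
for m = 1:size(x, 2)
    y = y + circshift(x(:, m), -delays(m));   % advance channel m (wraps at the edges)
end
y = y / size(x, 2);                           % uniform weights
end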
2.1 Nearfield Processing
Though not implemented, nearfield calculations are both more computationally intensive and more accurate. If the microphone array is assumed to have some sort of center, that center can be designated as the origin of the coordinate system. A point source at a point (xs, ys, zs) emits a signal s(t), and a microphone at a point (xm, ym, zm) receives a signal m(t). Assuming that the signal propagates uniformly with speed v and that received signal strength equals the original signal strength divided by the square of the distance, we can conclude that the received signal is:
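m(t) = s(t - d/v) / d^2, where d = sqrt((xs - xm)^2 + (ys - ym)^2 + (zs - zm)^2) is the distance from the source to the microphone (our reconstruction from the two assumptions above).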
2.2 Farfield Processing
In this project, it was assumed that the array was always operating in farfield, an approximation in
which the source is assumed to be far enough away that the spherical waves it emits can be
approximated with plane waves. It is accurate in the limit where the distance between the
microphones and the source is large enough so that the angle between the source and each
microphone does not change significantly.
3.1 Time Quantization
Since all of the processing is done in a digital environment, we must work with samples of the signals rather than the signals themselves. Because of this, an arbitrary time shift cannot be implemented: any shift must be done in increments of the sample period. To remedy this, the signals were interpolated digitally by upsampling them and then putting them through a lowpass filter with a cutoff corresponding to the amount of upsampling. An equiripple filter was chosen for the lowpass filter because there were no constraints on the exact shape of the filter and because an equiripple design avoids the Gibbs phenomenon found in a direct approximation of an ideal lowpass filter. Using this interpolation, greater resolution can be achieved in the time shifts, though the drawback is the large amount of additional data that must now be processed. In fact, even though the concept of delay and sum is incredibly simple, the amount of computation required by the upsampling is often prohibitively high. From an accuracy standpoint there is no such thing as too much interpolation, but if there is too little, the computed direction of the source may be inaccurate or entirely wrong, as the algorithm will be unable to shift the signals closely enough into alignment.
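A hedged MATLAB sketch of this interpolation step; the filter order and band edges are our assumptions:

N = 10;                                        % upsampling factor used in the project
xu = zeros(N * numel(x), 1);
xu(1:N:end) = x;                               % zero-stuff between samples
h = firpm(128, [0 0.8/N 1.2/N 1], [N N 0 0]);  % equiripple lowpass, gain N, cutoff ~pi/N
xi = filter(h, 1, xu);                         % interpolated signal at N times fs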
3.2 Aliasing, Resolution, and Sampling Frequency
Given a fixed sampling frequency, there is always the "normal" aliasing associated with the Nyquist theorem, which restricts the fully reconstructable signals to those bandlimited to half the sampling frequency. Something similar occurs with array spacing, and if proper care is not taken, aliasing may occur in the spatial dimensions. By the spatial analogue of the Nyquist theorem, the spacing between microphones must be at most half the wavelength corresponding to the highest frequency present. Thus, to achieve any resolution at all for higher frequency signals, smaller arrays must be used; however, with a smaller array, the precision with which a direction can be determined is diminished. It appears that there is an uncertainty principle at work in the spatial dimensions of beamforming.
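Written out (our restatement, with v the propagation speed and f_max the highest frequency present):

spacing <= lambda_min / 2 = v / (2 * f_max)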
3.3 Unknown Source Location
This is the main focus of the project: to try to locate a source using an array of microphones and
then focus the array in the direction of the source, obtaining greater suppression of noise than
would be possible using only one microphone. Since the direction of the source is unknown, we
decided to scan for the source by sweeping all possibilities. This is where the far field
approximation significantly reduces computational complexity. Using nearfield, any algorithm
would be forced to evaluate all possible combinations of three coordinates. With farfield, there are
only two angles to deal with as opposed to three coordinates so there is far less to compute.
Due to a lack of computing power, we were forced to make a few less-than-desirable assumptions in order to make the algorithm run at all without crashing. One simplification was using only three of the microphones to perform the sweep of possible angles. A further simplification was to assume that those three microphones could be broken into two pairs in the calculations that determine the pair of angles from which the maximum signal is coming. Further, hardware and computer limitations restricted sampling to a rate of 8000 Hz from each of eight microphones and made the processing cost of upsampling prohibitive beyond a factor of around 10.
2.2. Hardware Setup for Perimeter Array of Microphones
Before you can do cool things such as beamforming by processing signals from microphones, you need a way to gather the signals. A few essential components here are the microphones themselves, the analog-to-digital converter, and, in many cases, a preamplifier for the microphones.
The microphones we used in our project were electret microphone elements. We chose these because they:
have a good, even frequency response
run off a battery
are small (and inexpensive too)
The analog-to-digital converter we used was a DAQPad device generously lent to us by National Instruments.
The DAQ device requires an input signal of 50 mV to 10 V, while the microphone elements output signals in the 20-200 μV range. So we built preamplifiers that took in the signal from the microphones and amplified it 3158.44x (presumably two inverting stages of gain 56.2 each, since 56.2^2 = 3158.44). We used LM324 quad operational amplifiers with 56.2 kOhm and 1 kOhm resistors. At the input of each amplifier is a 2.2 μF capacitor designed to eliminate a slight DC offset produced by each microphone.
Figure 2.2. PreAmplifier Schematic with LM324
The configuration for preamplifiers with a LM324 opamp.
Putting it all Together
The next step is putting everything together. We built the perimeter array of microphones in a shallow box. To make the perimeter array, we placed 8 microphones in a pattern around the edge of a square, 5.5 centimeters apart. The preamps and batteries were housed under the box, with the outputs coming out the side to connect to the DAQ card.
Figure 2.3. Microphone Perimeter Array
The box housing the perimeter microphone array and preamps.
2.3. Labview Implementation of 2D Array Delay and Sum Beamforming
Our project involves using a two-dimensional array of microphones to determine the direction from which a mystery signal comes. This involves taking data from the microphones, analyzing the data, and then outputting the results. The second step we chose to implement in Labview. In order to interface correctly with the DAQ card we had available, we used Labview 5.1.
The Labview implementation of our project involved several stages. First, we wrote a VI designed to get the input from the microphones by sampling the eight inputs to the DAQ card and separating the resulting data into eight arrays, each holding a digitized signal. We then upsampled each signal.
The main analysis VI tests three of the signals by taking two of them and testing each individually against the third. This test involves delaying the two signals and taking the norm of each delayed signal with the third signal. Each norm is collected in an array, from which the maximum norm -- corresponding to the correct delay between the two signals -- can be found.
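In MATLAB terms, this sweep is essentially a cross-correlation peak search; a hedged sketch, with x1 and x2 two upsampled channels:

maxLag = 200;                           % assumed search range in samples
[r, lags] = xcorr(x1, x2, maxLag);      % inner products at every shift
[~, i] = max(r);
k12 = lags(i);                          % index delay between microphones 1 and 2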
If we know the correct delays between three signals, we can do some mathematics (explained in greater depth in the section on the delay generation VI) and derive the angles the signal is coming from. As a two-dimensional array has the ability to discern a point in three-dimensional space, two angles are found here: theta, the angle along the xy plane, and phi, the angle relative to the z axis.
From the angles found, we can then calculate the appropriate delays to be applied to the other five
signals. Finally, we take all eight signals, delay them appropriately, and add them together to get
our final result. This is known as delay and sum beamforming.
Waveform Generation VI
Most of the work here was done for us already by Labview's Generate Waveforms VI, a module that, given certain information (the attached DAQ card, sampling rate, time to be sampled, etc.), will seek out that DAQ card, sample the requested channels, and return the results in a two-dimensional array of doubles, where one dimension corresponds to the sample at a particular point in time and the other to the channel it was sampled from.
Figure 2.4. Waveform Generation VI
VI we created to sample the microphones and upsample the resulting arrays
Our module took the data from said VI and separated it into eight one-dimensional arrays, one for each microphone. (This was an essential step, as many of the array analysis functions we wished to use work only with one-dimensional arrays.) Using our Upsampling VI (discussed below), we then upsampled the signals, lowpass filtered them to interpolate, and set the eight filtered and upsampled signals as the output of this VI. The module takes as inputs N, the amount by which the signals should be upsampled, and fs, the sampling rate.
From the beginning, we knew that the sampling rate of the DAQ card would be restricted and that the buffer would be effectively reduced by a factor of eight for any one signal (since the data from all eight signals comes into the same buffer). Whatever sampling rate was left would meet the most basic Nyquist requirements and avoid aliasing in that fashion; however, the resultant signal was unlikely to possess much resolution beyond that. Thus, upsampling would be a necessity.
We initially searched Labview itself for a premade upsampling VI, presuming that one would
exist, as it is a fairly common signal processing algorithm. However, we were unable to find one
and so set about creating a module that would do the job. Our module takes as inputs the signal
(array of points) to be upsampled and N, the amount the signal was to be upsampled, and passes as
an output the upsampled signal.
Figure 2.5. Upsampling VI
Sub-VI used to upsample a signal
Following upsampling theory discussed in class (ELEC 301: Signals and Systems), the first step of our upsampler was to zero-pad, that is, to add zeros between each point of the signal being upsampled. Instead of attempting to implement a dynamic array, this was accomplished by creating a new array of the appropriate length (N times the length of the original array, where N is the amount the signal is being upsampled) and using a for loop to place the original signal elements into the new array, spaced N points apart.
This enlarged array of data is then passed back to the Waveform Generation VI, where it is lowpass filtered in order to fill in (interpolate) the new zeroed-out positions, and passed onward as an output of the Waveform Generation VI. The filter used in this operation is an equiripple FIR lowpass filter.
Delay Generation VI
This VI does the bulk of the mathematical analysis of the input signals. It takes as inputs the two delays between microphones one and two and between one and four (derived from the max-norm calculations in the Main Analysis VI) and outputs an array that contains theta, phi, and the corresponding delays for the seven other microphones (the delay of the first microphone is automatically set to zero). In all cases, the delays are scaled to correspond to the number of indices the corresponding signal should be shifted, rather than the actual real-time delay (as we cannot shift a signal by fractional indices).
d12 = k12 / (fs * N);    // convert index shifts to time delays (fs = sampling rate, N = upsampling factor)
d14 = k14 / (fs * N);
phi = acos( sqrt (v^2 * (d12 ^ 2 + d14 ^ 2) ) / d ) * sign (d12 * d14);    // elevation; v = propagation speed, d presumably the microphone spacing
theta = atan (d14 / d12);    // azimuth in the array plane
d13 = d12 * 2;    // remaining delays follow from the square array geometry
d16 = d14 * 2;
d18 = d13 + d16;
d15 = d13 + d14;
d17 = d16 + d12;
The first part of the above code calculates the angles based on the spatial relations between the three microphones (microphone 1, used, as we said before, as the origin, and microphones 2 and 4, which sit directly adjacent to microphone 1 in the two directions). As you can see, it is fairly simple geometry, complicated primarily by the scaling necessary to match the 'k' values (integer index shifts used to iterate the for loop) to their corresponding 'd' values (actual delays in time).
The second part of the code uses the angles mentioned above to calculate the delay values, although again, due to the regular geometry of our array, it is possible to calculate only two of the delays outright and extrapolate the rest from those two (which is indeed what we have done in an effort to reduce calculations and make the algorithm more efficient).
The final part of the code, not shown above but visible in the function node in the figure below, is the reciprocal of the first two lines: rescaling all the 'd' values found to 'k' values that can actually be used when shifting the signals prior to adding them together.
Figure 2.6. Delay Generation VI
Sub-VI that, given the values 'k12' and 'k14', will generate the shifts (in indices) of the seven microphones (with the shift of microphone 1 assumed to be zero) and the angles theta and phi that the signal came from.
Main Analysis VI
This is our top-end module, where all the modules mentioned in the previous section are brought
together in the same vi and linked together in the proper ways so as to create a working project.
Figure 2.7. Main Analysis VI
The culmination of our struggles with Labview 5.1, our top-end module which does ... well ... everything.
First, not unexpectedly, there is a call to the Waveform Generation VI, which provides us with our collected and upsampled signals. From that sub-VI, the signals from microphones 1, 2, and 4 are taken; microphones 1 and 2 are passed to one for loop, and 1 and 4 to the other. Within each for loop, as mentioned before, one signal is shifted relative to the other and the norm taken, for all possible delay values. The results are concatenated into an array, the maximum norm is found, and from the location of the maximum norm we obtain the value of the delay, or as close to it as the sampling resolution allows.
These shift values (the integer indices corresponding as closely as possible to the ideal time delays) are passed to the Delay Generation VI, which then returns an array of values. The theta and phi values function as outputs to the front panel, and the delay (shift) values are used to set the necessary shift for their corresponding microphones. Finally, the shifted output arrays are all summed (using a for loop, as a point-by-point summing module also seemed to be among those useful things not premade in Labview 5.1), and the output of the for loop, the array that is the sum of all the previous ones, is attached to a waveform graph, also on the front panel.
Figure 2.8. An Example Result
As the titles state, the upper waveform is that of the first signal (unmodified in any way), and the second that of the final, delayed and summed signal. Note how the latter signal is somewhat smoother and the noise level reduced in comparison to the signal itself (a series of claps). The two numbers at the bottom correspond to the computer's calculation of what direction the signal is coming from.
Phi is measured such that straight up is zero degrees and the xy plane is at 90 degrees. Theta is measured with the "bottom" of the array (that is, the negative y direction, although the array can of course be reoriented as the user pleases) as zero degrees. The signs of the angles indicate the direction of propagation of the wave and are thus opposite to conventional intuition; the sign of phi is, of course, impossible to determine with any degree of accuracy due to the up-down ambiguity inherent in a two-dimensional array.
Success! (For a deeper exploration of our results, please continue to the results module.)
2.4. Results of the Testing of 2D Array Beamformer*
In this section, we discuss the results that we received upon testing our two-dimensional array
delay and sum beamformer. The output we designed for our system was relatively simple: the waveform for the delayed and summed signal, the waveform for the first signal (displayed for
comparison purposes), and the calculated values for theta and phi, the two angles used to express where in space our signal could be found.
Our first example of successful output.
Note how the delayed and summed signal, as it should, bears a striking similarity to the initial signal. However, due to the nature of the delay and sum algorithm, the final result has a much higher magnitude, and its signal-to-noise ratio is higher as well. This is because the signal contributes the most to the inner product calculations done to find the proper delays, and thus the signal components are matched up properly; the noise, on the other hand, being essentially random, is additive in some places and destructive in others, leading to an overall relative decrease in noise.
A similar example, with a somewhat different signal
This signal, as can be seen, came from somewhat closer to zero degrees and a bit higher off the xy plane (smaller phi) than the previous example. We noticed rather quickly that, although it is a relatively simple matter to vary theta (as that just involves moving around), it is less simple to gain significant variation in phi: at farfield distances, one must get a great deal higher off the ground before the angle of the signal reaching the array changes significantly.
As it is, this particular phi value (at a significantly higher angle than before) was accomplished by coming a great deal nearer to the array -- thus bringing in possible complications, since the signal source could now reasonably be considered nearfield. However, even for middling-near field sources, the farfield approximation still holds to a certain extent, as shown by the accuracy this data maintained.
Did we mention we happened to choose the computer nearest to a loud fan?
Speaking of noise ... due to the existence of a relatively noisy fan quite near to our work area, the