Tuesday, 21 October 2014

Audio Lab 1

Task 1

The first part of this lab was to look at the chosen sound wave as it originally appeared, and then at how it looked once we had zoomed in and zoomed out.

        Here is the wave originally:

     Once we zoom in on the wave we can see that the track disappears off screen. Originally all 1 minute and 4 seconds of the track were visible on screen, but after zooming in we can only see 0.25 seconds of it. Comparing the amplitude of the original and the zoomed-in track, the crests and troughs are much clearer to the eye, and the wavelength can now be made out as well. Overall, zooming in makes the wavelength and amplitude far easier to spot and observe.

      Once we zoom out it is immensely difficult, if not impossible, to distinguish between individual waves, because the whole track now fits into a much smaller view. The amplitude appears compressed, making it nigh impossible to differentiate between individual crests and troughs. Unlike the previous example, where the wave was stretched out, here it has been squashed together.

Task 2

For task 2 we were to add our own effects to our chosen sound file.

        I chose to add a "Fade in" and "Fade out" effect to the file. I added the fade in at the beginning, and if you compare it with the original you can see that the amplitude of the waves is lower than in the rest of the piece but gradually builds up. The build-up is heard as an increase in volume. Similarly, I added a fade out effect at the end so that it has the opposite effect on the piece: the amplitude grows smaller, which is heard as a decrease in volume towards the end.
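To see what the effect does to the samples, a linear fade can be sketched in code. This is only an illustration in Python, assuming the audio has already been loaded into a NumPy array; the function name and fade lengths are made up:

    import numpy as np

    def apply_fades(samples, sample_rate, fade_in_s=2.0, fade_out_s=2.0):
        # Scale the start and end of the waveform by linear ramps
        # (hypothetical helper; editors such as Audacity do this for you).
        out = samples.astype(np.float64).copy()
        n_in = int(fade_in_s * sample_rate)
        n_out = int(fade_out_s * sample_rate)
        if n_in:
            # Amplitude climbs from 0 to full over the fade-in region...
            out[:n_in] *= np.linspace(0.0, 1.0, n_in)
        if n_out:
            # ...and falls from full back to 0 over the fade-out region.
            out[-n_out:] *= np.linspace(1.0, 0.0, n_out)
        return out
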
Task 3

Task 3 asked us to display the plot spectrum of our piece.
                                                                                
      Below is the analysis of the piece. Rather than looking at timelines, waves, amplitudes, troughs and crests, we can see the piece measured in Decibels and Hertz. A Decibel is the unit we use to measure the power of a sound. If we change from a linear frequency axis to a log frequency axis we immediately see a change of shape: the chart displays a more curved and rounded outline rather than sharp peaks and points.
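A plot spectrum of this kind can be approximated with a Fourier transform. Below is a minimal Python sketch, assuming NumPy and a mono signal array; using the loudest bin as the 0 dB reference is my assumption, not necessarily what the tool does:

    import numpy as np

    def spectrum_db(samples, sample_rate):
        # Return (frequencies in Hz, levels in dB) for a mono signal.
        window = np.hanning(len(samples))            # reduces spectral leakage
        mags = np.abs(np.fft.rfft(samples * window))
        freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
        # Decibels are 20*log10 of an amplitude ratio; the largest
        # magnitude is used as the 0 dB reference (an assumption).
        levels = 20 * np.log10(mags / mags.max() + 1e-12)
        return freqs, levels

Plotting the levels against the frequencies on a log frequency axis gives the rounded shape described above.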

Task 4

Task 4 is to apply a reverb to our original piece and compare the results.

       Small room reverb shows that the wave has increased in amplitude compared with the original. The tone of the piece remains mostly unchanged, but there is an echo effect that alters how it sounds.
       Large room reverb shows the complete opposite of a small room: the amplitude seems to have decreased dramatically. Once again the tone of the piece does not change much, but there is a change in how it sounds because of the effect that was added.

Wednesday, 15 October 2014

Lecture - Hearing

Audio, Image and Video processing

Overview

1. The Human Ear
2. How The Ears Work
3. Impedance Matching Mechanisms
4. Human Hearing

The Human Ear

There are three levels of hearing that we experience as humans.

The first is called the Primitive Level. An example would be sitting in a room with a window that faces passing traffic: the ear can find the sound mellow and comforting or harsh and disruptive. Regardless of your reaction, your ear still processes the sound, and this is the Primitive Level.

The second is called the Symbolic Level. An example would be deciding whether to leave the house after hearing a report that a road is busy, and then noticing the noise of cars growing louder as you come nearer to that road. It is this background noise of the cars, together with the spoken report, that your brain processes to conclude that the road will be congested with traffic. This is the Symbolic Level.

Finally, the last level is called the Warning Level, and a prime example is hearing the siren of an ambulance or a police car while driving: you instinctively know to move aside, as the siren has warned you that the vehicle needs to reach its destination quickly and unhindered. Your brain processes this sound in the same manner as an instruction. This level of sound is called the Warning Level.

How The Ears Work

The human ear is composed of three main parts. The outer ear is made up of the Pinna, the Ear Canal and the Eardrum. The middle ear is made up of the Ossicles and the Eardrum. The inner ear is made up of the Cochlea, the Auditory Nerve and the Brain.
source - http://pikeslaneprimary.weebly.com/uploads/2/3/9/5/23958176/7089296_orig.jpg

Sound is taken in when the ear flap funnels sound waves into the outer ear canal. The waves then travel along the passage until they reach the Eardrum and make it vibrate. Once these vibrations are picked up, the Ossicles start to move and pass the vibrations on to a thin layer of tissue of the inner ear called the Oval Window. The movement passed through the Oval Window creates a wave motion in the fluid in the Cochlea.

Inside the Cochlea there is a spiral-shaped organ called the organ of Corti. It is made up of thousands of tiny sensory hair cells attached to a single membrane, and these hair cells face another set of sensory hair cells attached to a membrane of their own. When the fluid of the Cochlea moves in the wave-like motion created by the Oval Window, it presses the hairs against the second membrane, and the movement of these hairs is converted into nerve impulses which travel along the Cochlear nerve to the brain.

The fact that humans have two ears gives us the ability to locate where a sound originates. If a sound is louder in the left ear and quieter in the right, our brain tells us that the sound is coming from our left.

source - http://classconnection.s3.amazonaws.com/184/flashcards/1904184/jpg/51352566999599.jpg

Impedance Matching Mechanisms

Impedance matching is an important function performed by the middle ear. The middle ear's primary job is to transfer incoming vibrations from the large, low impedance eardrum to the smaller, high impedance oval window. The middle ear acts as a transformer of sorts, converting low pressure, high displacement vibrations into high pressure, low displacement vibrations at a level suitable for the cochlear fluids.

Auditory processing works on two channels of contiguous frequency bands: the ear separates the left and right signals, the low and high frequency information, and the timing and intensity information.

Ohm's acoustic law states that the perceived tone of a sound is a function of the amplitudes of its harmonics rather than of the phase relationships between them.

Human Hearing 

Human hearing covers a wide range of frequencies, starting at about 20 Hz and going all the way up to 20,000 Hz. Unless you are a bat, though, the highest frequencies of 18,000 Hz - 20,000 Hz are of little use, as they fall outwith most of our day to day life.


source - the lecture slides

Humans have a Decibel range as well as a Hertz range. An example of the normal dB range would be a conversation with a friend, at roughly 60 dB, whereas a 110 dB noise would be the sound of a jackhammer on concrete or a rock concert. At 120 dB the ear starts to experience discomfort and we no longer want to listen to the noise, and at 140 dB the sound starts to cause physical pain.

Wednesday, 1 October 2014

Lecture - Digital Signal Processing

Audio, Image and Video processing

Overview

1. Digital Signal Processing
2. Filters
3. Pitch
4. System Needs of DSP
5. Sound Cards
6. Sampling
7. Audio file sizes and contents

Digital Signal Processing

Digital Signal Processors take everyday signals such as voice, video, pressure, audio, temperature or position that have been digitised. Each of these signals is then manipulated using mathematical operations. A DSP is designed so that functions such as adding, subtracting, multiplying and dividing are completed very efficiently and quickly. Signals are processed so that the information they contain can be displayed, analysed or converted into another signal that may be of use.

Analogue to digital converters take real world signals (analogue) and convert them into a binary form of 1s and 0s (digital). After the signals are captured, the information is taken for processing, and once it has been processed it is sent back for use. The conversion happens at very high speed.

An example of how a DSP is used would be an MP3 player. First the audio is recorded and sent through a receiver. The signal is then converted to digital by an analogue to digital converter, passed to the DSP, encoded and finally saved. When playing the file back from memory, the DSP decodes the file and converts it back to an analogue signal so it can be played through a speaker.

An example of a DSP system


source - the lecture slides

Filters

Electronic filters are circuits used for signal processing functions, specifically for removing unwanted sounds or for boosting and enhancing wanted ones. A low pass filter is an electronic filter that passes low frequency signals but reduces the amplitude of signals with frequencies higher than the cutoff frequency.

In smoothing, points of the data are modified so that points higher than their neighbours on either side are lowered towards the surrounding points. This produces a smoother looking signal; the signal is not distorted by the smoothing, but the noise is controlled and reduced. An input signal must also be band-limited before sampling, because otherwise frequencies that are too high will be recorded as lower frequency waves. Band-limiting prevents this aliasing.
source - the lecture slides
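The smoothing described above can be illustrated with a moving-average filter, one of the simplest kinds of low pass filter. A minimal NumPy sketch; the window size of 5 is an arbitrary choice:

    import numpy as np

    def moving_average(signal, window=5):
        # Replace each point with the mean of its neighbours, which
        # lowers isolated peaks and reduces noise without gross distortion.
        kernel = np.ones(window) / window
        return np.convolve(signal, kernel, mode="same")

A larger window smooths more aggressively, but it also blurs genuine detail in the signal.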

Pitch

A sound with a high frequency will have a high pitch and a sound with a low frequency will have a low pitch. Some people who have been taught musical theory and practice can detect a difference in frequency between two sounds as small as 2 Hertz. When learning musical theory it is important to learn that every note that can be played is represented by one of the first seven letters of the alphabet, as shown below.


Each line on the bass clef and the treble clef represents a note, or a pitch, just as each space on both clefs represents a note. In between neighbouring notes there can be a semitone, written as a sharp or a flat, which is half the distance of a whole tone. For example, A sharp is a semitone lower than B.



source - the lecture slides

System Needs of DSP

The precision of a DSP is limited only by the conversion processes at the analogue to digital and digital to analogue converters, through the sampling rate and the word length restrictions. Increasing the operating speed and the word length opens up more areas of application for the digital logic.

The robustness of DSP comes from the fact that digital systems are less likely to be affected by electrical noise or by component tolerance variations. Problems such as component ageing and electrical drift are essentially removed.

The flexibility of DSP means that the processing operations can be upgraded and expanded without needing to change a lot of hardware.

Sound Cards

Before sound cards were created, computers could make only one sound: a beep. The frequency and the duration of the beep could be altered, but the volume could not, and no other sound could be produced. Initially the beep was a warning sign, but after the first few PC games were created, developers incorporated the beeps into music for the games. It wasn't all that good or realistic, but it was a start. In the 1980s sound cards were developed, meaning more sounds could be produced. Now PCs can use sound cards for surround sound, and they can be used to capture and record sound.

An example of an Analogue wave is shown below

This wave is a recording of the word "Hello". The diagram shows the wave vibrating at an impressive rate of about 1000 oscillations per second.




source - the lecture slides

A pure tone (shown below) is a wave that vibrates at one specific frequency, taking the shape of a sine wave. The wave below is a 500 Hz wave, which means it completes 500 oscillations per second; a short code sketch after the diagram shows how such a tone can be generated.

source - the lecture slides
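A pure tone like this is straightforward to generate in code. A short Python sketch, assuming NumPy; the 500 Hz frequency is taken from the example and the sample rate is an assumption:

    import numpy as np

    sample_rate = 44100                       # samples per second (assumed)
    t = np.arange(sample_rate) / sample_rate  # one second of time points
    # A 500 Hz sine wave: 500 complete oscillations in that one second.
    tone = np.sin(2 * np.pi * 500 * t)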

Sampling


When sampling a wave with an ADC you have control over the sampling rate and the sampling precision. The sampling rate controls how many samples are taken in one second. The sampling precision controls how many different gradations are possible when taking each sample.
source - the lecture slides

To understand sampling we need to know about the two different sample rates: low and high. A low sample rate will distort the original sound wave (as shown in diagram A), whereas a high sample rate will reproduce the original sound wave just as it was (as shown in diagram B).

To reproduce a given frequency, the sample rate must be at least double that frequency; the sketch below the diagram checks this numerically.

A CD has a sample rate of 44100 samples per second, meaning it can reproduce frequencies of up to 22050 Hz.


source - the lecture slides
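The "at least double" rule can be checked numerically: a tone above half the sample rate shows up at the wrong (aliased) frequency. A small Python sketch with made-up numbers:

    import numpy as np

    def dominant_freq(tone_hz, sample_rate, seconds=1):
        # Sample a sine wave, then return the strongest frequency the FFT sees.
        t = np.arange(int(sample_rate * seconds)) / sample_rate
        samples = np.sin(2 * np.pi * tone_hz * t)
        spectrum = np.abs(np.fft.rfft(samples))
        freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
        return freqs[np.argmax(spectrum)]

    print(dominant_freq(3000, 8000))  # 3000.0 -- below half of 8000, reproduced correctly
    print(dominant_freq(5000, 8000))  # 3000.0 -- above half of 8000, aliased downwards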

Below are the most commonly found sampling rates and frequency ranges.

source - the lecture slides

Bit depth refers to the fact that when a sound wave is sampled, each sample is given the amplitude value closest to its original amplitude. The higher the bit depth, the more amplitude values are available, meaning a greater dynamic range can be produced.


source - the lecture slides
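Bit depth can be imitated by snapping each sample to the nearest available amplitude value. A rough Python sketch, assuming samples normalised to the range -1 to 1:

    import numpy as np

    def quantise(samples, bits):
        # Snap each sample to the nearest of 2**bits evenly spaced levels.
        levels = 2 ** bits            # e.g. 65536 levels at 16 bits
        step = 2.0 / levels           # spacing of levels across -1..1
        return np.round(samples / step) * step

Fewer bits means coarser steps, more quantisation error and a smaller dynamic range.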



Audio file sizes and contents

Audio can be saved on a hard drive as a WAV, MP3 or AVI file. A WAV file consists of a small header, which indicates the sample rate and the bit depth, followed by the sample data. WAV files are often large: at 44,100 samples/sec and 16 bits per sample, a mono file requires about 86 KB per second, which is roughly 5 MB per minute. Since stereo has two channels, that value is doubled, so the 5 MB becomes 10 MB.
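That arithmetic can be written out directly. A worked Python sketch of the size of the raw sample data (the WAV header itself adds only a few extra bytes):

    def wav_data_bytes(sample_rate=44100, bits=16, channels=1, seconds=60):
        # Bytes of sample data = rate x bytes-per-sample x channels x time.
        return sample_rate * (bits // 8) * channels * seconds

    print(wav_data_bytes(channels=1) / 1024 / 1024)  # ~5 MB for a minute of mono
    print(wav_data_bytes(channels=2) / 1024 / 1024)  # ~10 MB for a minute of stereo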

MIDI files tend to be much smaller because MIDI sounds are synthesised rather than recordings of real instruments or voices. MIDI files can be as small as 10 KB per minute of audio. A MIDI file works by storing which note should be played, and on what instrument; the sound card takes this saved information and synthesises the sound.

A sound card which supports 16 bit word length coding of sample values will allow 65536 (2 to the power of 16) different signal levels within the input voltage range of the sound card.

Lecture - Sound

Audio, Image and Video processing

Overview

1. Sound Waves
2. Wavelength and Amplitude
3. Frequency, Velocity and Wavelength
4. Sound Intensity and Level
5. Echoes and Reverberation
6. Anechoic Chamber

Sound Waves

A sound wave is created when a vibration is transmitted as a wave through a gas, a solid or a liquid. An example can be found in musical instruments: when a plectrum strums a guitar string, the vibration creates a sound wave. Sound waves are categorised by their direction of displacement into two classifications: transverse and longitudinal.

An example of transverse waves would be the ripples that appear on the surface of a liquid. When a transverse wave passes through a material it causes the particles to move at right angles to the direction the wave travels. The particles do not travel with the wave; they move up and down as the wave passes.
Below is an example of a transverse wave.
source - the lecture slides
An example of longitudinal waves would be someone knocking on a door. With a longitudinal wave it is the individual air molecules that move back and forth as the vibrations radiate from the original disturbance.
Below is an example of a longitudinal wave.
source - the lecture slides

Wavelength and Amplitude

The wavelength of a Transverse wave is the distance between two repeating Crests or Troughs, whereas for Longitudinal waves it is the shortest distance between two Peak Compressions. The amplitude is the maximum displacement of the wave from its rest position.

Here is an example of a Transverse wave

source - the lecture slides

Here is an example of a Longitudinal wave


source - the lecture slides

Frequency, Velocity and Wavelength

Velocity 

Sound waves travel at different speeds through different materials, objects and substances; these are called mediums. Velocity is the rate of change of an object's position, equivalent to a specification of its speed together with the direction it is travelling in.
Sound travels at its fastest through a solid.
An example is that through Steel sound moves at just below 5000 m/sec
Sound travels slower when moving through a liquid.
In water sound moves at about 1500 m/sec
Sound travels the slowest when moving through a Gas.
In air, sound travels at only about 1/3 of a kilometre per second, which is roughly 333 m/sec.
So to travel 3 m, sound takes:
 t = 3/333 s
 = 0.009 seconds
 = 9 milliseconds
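The same calculation for each medium, as a quick Python sketch using the rounded speeds above:

    # Time for sound to travel 3 metres: t = distance / speed.
    speeds = {"steel": 5000, "water": 1500, "air": 333}  # metres per second
    for medium, v in speeds.items():
        print(medium, 3 / v * 1000, "milliseconds")      # air: ~9 ms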

Frequency

The Frequency of a wave is the number of vibrations that occur in one second, measured in Hertz. The speed the wave moves at is known as the Velocity of the wave.

Velocity (metres/second) = wavelength (m) x frequency (Hz)
 So v = lambda x f
For example, a 1000 Hz tone in air has a wavelength of lambda = v / f = 333/1000 m = 0.333 m
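The rearrangement used in that example, written as a tiny Python sketch:

    def wavelength(velocity_m_s, frequency_hz):
        # lambda = v / f, rearranged from v = lambda x f.
        return velocity_m_s / frequency_hz

    print(wavelength(333, 1000))  # 0.333 m for a 1000 Hz tone in air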

Wavelength

Standing waves disturb the solid, liquid or gas but do not travel through it. They vibrate in a similar way to the vibrations found on stringed musical instruments such as a violin or guitar. A violin or guitar string vibrates as one whole object, creating a node at each end and an anti-node in the middle.


source - the lecture slides

The vibration of the string as a whole produces the fundamental tone. The other vibrations produce harmonic tones. A harmonic is an integer multiple of the fundamental, and each successive harmonic adds another anti-node: a 2nd harmonic has two anti-nodes, a 3rd harmonic has three, and so on (see the sketch after the diagram below).

source - the lecture slides
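Because harmonics are integer multiples, they can be listed directly. A small Python sketch; the 110 Hz fundamental is just an example value:

    fundamental = 110  # Hz (an assumed example)
    for n in range(1, 5):
        # The nth harmonic is n times the fundamental and has n anti-nodes.
        print(f"harmonic {n}: {n * fundamental} Hz, {n} anti-node(s)")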

Sound Intensity and Level

Some measurements are very difficult to make, so the intensity of a sound is commonly expressed as a sound level. This is done by comparing any sound against a Standard Sound Intensity, the quietest sound that human hearing can detect. This is known as the threshold of hearing, and it is equivalent to a change in air pressure of 20 micro Pascals. Any sound at 10 Pa is harmful to human hearing and can cause damage.

A day to day conversation has about 100,000 times the sound intensity of whispering. Sound intensity level is defined with a logarithm and measured in decibels (dB).

Sound intensity level = 10 log10(I_sound / I_standard) dB

This means that multiplying the sound intensity by 10 corresponds to an extra 10 dB of sound level.
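The formula above, written as code; the example intensity is made up, and the 1e-12 W/m^2 reference is the usual threshold-of-hearing value:

    import math

    def sound_intensity_level(i_sound, i_standard=1e-12):
        # Sound intensity level = 10 * log10(I_sound / I_standard) dB.
        return 10 * math.log10(i_sound / i_standard)

    print(sound_intensity_level(1e-11))  # 10.0 dB: ten times the intensity adds 10 dB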

Echoes and Reverberation

Echo
An echo is perceived as the reflection of a sound from a surface. If a is the fraction of the sound level that is reflected, then 0 <= a <= 1.
The time difference between the echo and the direct sound depends on how far the sound travels and on the speed at which sound travels.
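As a quick sketch of that dependence, the delay of an echo off a surface is the round-trip distance divided by the speed of sound (the numbers here are example values):

    def echo_delay(distance_m, speed_m_s=333):
        # Sound travels to the surface and back, so the path is twice the distance.
        return 2 * distance_m / speed_m_s

    print(echo_delay(50))  # ~0.3 seconds for a surface 50 m away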

Reverberation

Reverberation is the persistence of sound in a space after the original sound has dissipated. When a sound is produced, a large number of echoes build up and then gradually fade away over time.
The diagram below provides a better explanation: the original sound travels directly to the destination, but it also bounces off all the other surfaces before reaching the destination. This is what reverberation is.
source - the lecture slides

Anechoic Chambers

An Anechoic chamber is a room that completely absorbs reflections of sound waves. More often than not these rooms are also insulated to prevent outside noise from breaching the walls. This combination of absorbing reflections and insulating the walls simulates an open space of infinite dimension, which is incredibly useful because exterior influences would otherwise distort the final results.

There is no definite size for an Anechoic chamber. They can be as small and compact as a household microwave or as big as an aircraft hangar. The size of the chamber depends on what testing will take place and what frequency of signals will be used; a smaller chamber can sometimes be used when testing with shorter wavelengths.

The term 'Anechoic' was coined by Leo Beranek, in the context of minimising the reflections of sound waves in a room. Now that the idea has been put into practice there are many types of chamber, each built for its own unique purpose.