Fourier Analysis, Where to Start?

studiorat · 25-02-2010 8:56pm #1

Ok, I'm starting Masters project soon, involving FFT's and Wavlets. Conceptually I understand what's going on. I can pop an expression into a line of code but I really haven't a clue what's happening mathematically speaking.

Whenever I open a DSP book on the subject I may as well be looking into a ditch.

So I need to go right back to the begining regarding the Maths. So what do I really need to concentrate on, Algebra, Triginometery, Calclus, Complex Numbers. Anything else?

Can anyone suggest some good resources, been on the MIT courseware site all over, but I can't seem to find an in point anywhere. Most of it is just too advanced.

BTW, I'm willing to swap some recording studio time and some lessons in recording and mixing if anyone fancies giving me a few lessons.

Thanx.

bnt · 26-02-2010 2:00am

I suppose it depends on how detailed you want to get. I'm not an expert on the Maths, I have to say; some aspects of the problem are only becoming clearer now, and I wish I'd seen the "big picture" earlier. I've worked with audio for years, putting audio samples through Fast Fourier Transform (FFT) and playing with the results, but when you study the basic Fourier Transforms themselves, they are something else entirely.

The Fourier Transforms in their purest sense are Calculus i.e. they transform time domain functions in to frequency domain functions, and vice versa. For example, f(t)=sin(t) is a periodic function, while f(x)=x^2 is a non-periodic function. If you look at the Wikipedia page on Fourier Transforms, the basic transforms are in the Definition section, so you can probably tell you're going to need some Calculus to understand what's going on there and do anything with them directly.

They are not much help to you if you're trying to analyse e.g. sampled audio, which does not come in the form of a function (periodic or otherwise), but in a sequence of effectively random numbers. When you run a FT on real data, the formulas involved were derived from Calculus, but don't require Calculus to use. These are referred to as Discrete Time Fourier Transform (DTFT), and a typical DTFT formula looks like this:

[latex]\displaystyle X[k] = X(\omega_k) = \sum_{n=0}^{L-1} x[n] \,e^{-i 2 \pi \frac{k}{N} n}[/latex]

You'd do this summation (n = 0...L-1) for each frequency band (k) you want, which can mean a lot of number-crunching - hence the common use of the Fast Fourier Transform FFT, an algorithm for calculating a DTFT fairly efficiently. The limitation of FFTs is that the window size must always be a power of 2. The results are phasors: complex numbers with a real and imaginary component, though for audio purposes you usually want their amplitude (obtained by the Pythagoras formula on the real and imaginary components).

The way you select data to analyse needs a little explanation too, under the heading of "windowing". e.g. if you're analysing data in 256-sample chunks, simply grabbing a chunk of data is called "rectangular windowing", but it's been found that the accuracy of the results can be improved by tweaking the data before analysis using a different "windowing" method e.g. Hamming or Cheybyshev. (There are plenty of them to choose from!). Another buzzword you'll hear is "oversampling", when you overlap your data windows e.g. 256-sample chunks starting at 128-sample intervals is 2x oversampling. This gives smoother results, but takes longer.

If you want to try some number-crunching, I can recommend Octave, which is a near-clone of MATLAB. FFT is supported in the base pack, but you might want to add extra Packages e.g. "audio" to handle audio files, and "signal" for windowing functions.

studiorat · 26-02-2010 12:05pm

Detailed enough to look like I know what I'm doing! Hope to build an impulse generator. Was considering changing the window size during the analysis. Smaller at the start.

Thanks for the concise reply. Defining k was another part in the puzzle.

I only discovered phasors the other day, I guess that shows how far behind I am. Still trying to get my head around them. Starting to see the imaginary part if you will. Vector addition just went down on the list of things to do.

Meanwhile thanks for suggesting Octave, please someone tell me theres a front end available for it!!! Mac OSx Tiger 10.4

Hope to build an impulse generator. Was considering changing the window size during the analysis.

timbrophy · 26-02-2010 12:50pm

I don't think that there is a front end for it but there are very good instructions.

If you are using an intel mac, which I would say you are if it is Tiger, download one of the i386.dmg files. After you download Octave the .dmg file will expand itself and tell you to drag octave.app into an alias for the Applications folder. Do this. Go to the Applications folder. Hold the ctrl-key down when you select Octave with the mouse. Choose "Show Package Contents". Open the Contents folder. In here there is a Docs folder and in here are all the help files in pdf format. In particular, the octave.pdf file is a full manual.

Good luck.

Tim

bnt · 26-02-2010 1:11pm

studiorat wrote: »

Meanwhile thanks for suggesting Octave, please someone tell me theres a front end available for it!!! Mac OSx Tiger 10.4

There is OctaveDE, which I use on Linux and is said to work on Macs, but you'd have to compile it from the source code, since they don't provide binaries.

(That's what I get for drinking too much Pepsi in the early evening: up till 2AM writing about Fourier Analysis. :cool: )

DáireM · 26-02-2010 3:30pm

https://www.youtube.com/watch?v=gZNm7L96pfY

I watched a few lectures of this and thought it was ok, honours LC is as far as I've gone mathematically. It could get much harder later in the series though, I only watched 6 or so lectures.

studiorat · 26-02-2010 3:58pm

Thanks Daire, I watched some of that. Going over my head already. I blagged onto the course from experience, not my maths results.

Octave downloaded and ran like a dream. In the Terminal!!! Already have a stack of PDF's to get through. Really should be actually doing something than just looking for new software!!! My desktop is like a war zone again...

I was hoping for it to have some sort of a IDE or GUI or something. Although I'm wondering now if there was some way to use the mex.h in X-code, I'm comfortable enough with C. I digress...

Anyway, I found this in case anybody's interested http://www.lmac.utc.fr/~mottelet/Darwin/. It's a scilab front end for 10.4 ppc. The latest version is Leopard only.

If it doesn't work I'll try and build an OctaveDE X-code rig . Might be a nice thing to put out there for the greater good.

For the moment I'm going to try and hack together a simple FFT program that outputs something I can stick into Excel and get a graph or at least some numbers!

UPDATE:Ok I think I have it now... You write the code in a text editor and compile? No?

Update 2: Maybe, parse and syntax errors everywhere.

|

studiorat · 26-02-2010 5:53pm

So I got Scilab working with Octave. It uses X11 the API linky thing-a-mabob, luckily I had it installed already. Not quite used to having variables with no type but there ya go. It's got a decent enough text-editor, compiler and debugger too. The graphics are very OS 7 though!

Anyway I thought I'd post the code from my first little experiment, I've commented some questions and observations into it. If anybody's on the pepsi again hopefully they'd have a quick shufti at it.

The code writes a few sines and then analyses the function with what seems to be a FFT object.

Next? A way to get it to read a file, allocate memory, chop it into blocks and I dunno, define a window and hop size? Could I read values into the fft(s) with a while loop?

// 1khz Sample Rate
// sinusoids at 50hz and 70hz.

sample_rate=2000;

t = 0:1/sample_rate:0.6;

///what is t?

N=size(t,'*'); //number of samples. Is '*' an pointer value?

///printf("Enter frequency values:");
///scanf ("%d",&freq); /*syntax seems to work*/
/// Will this with "rb"? For a raw audio file?

create sinusoids
s=0.8*sin(2*%pi*50*t)+ 0.3*sin(2*%pi*70*t+%pi/4)+ 0.7*sin(2*%pi*800*t) + 0.2*grand(1,N,'nor',0,1);

/// 0.8 and 0.3 are amplitudes. What is 0.2 grand etc.?
/// NB: 1000hz gives fold over at 200Hz for 800Hz! Ah-Ha!.

y = fft(s);//the fft response is symetric we retain only the first N/2 points
//^^^ N/2 = Real values??? Is (s) the function?

f = sample_rate * (0:(N/2))/N ; // frequency vector

n = size(f,'*')

yy = y*2/N;// ??

clf()// opens screen

plot2d (f,abs(yy(1:n))) //draws graph. What are abs?

/// What does ":" mean?

bnt · 26-02-2010 5:59pm

Octave is a "scripting" language - no compilation involved. What I tend to do is enter the commands manually, and if they work as I like, I copy & paste them in to the text editor to form the script. To run the script you just CD to the directory and enter the filename (without extension).

I just remembered that I played with audio files in Octave a while back, and dug out a basic example that shows how to do a FFT on a WAV file. There are bits in there that are optional: I do a "resample" on the file so that there's an exact number of wave cycles in the data before it goes to the FFT. This means that the FFT results are real audio harmonics - something I wanted because I was playing around with additive synthesis at the time, trying to recreate samples. Here's the code I used: it needs the "audio" package added.

clear
clc

# Load the file, return the data (w1), sample rate (Fs), and bit depth (bits)
[w1,Fs,bits] = wavread ("ARPC4.wav");

l1 = length(w1)
subplot(2, 1, 1)
plot(w1)  # plot audio

c4 = 440*2^(3/12) # frequency of note C4 
q = 256*c4/Fs     # ratio to use for resampling

# resample to align wave cycles to 256-sample FFT window
w2 = resample(w1,ceil(q*Fs),Fs);
l2 = length(w2)

f=abs(fft(w2(1:256)));  # FFT returns phasor, use ABS to extract magnitude to plot

subplot(2, 1, 2)
plot(f(1:16))  # plot spectrum

studiorat · 26-02-2010 6:13pm

YA LEGEND!!!

Will try it as a command line so.

THinking out loud here, but I bet I have a bit of code that will stick those harmonic amplitudes into a csound instrument on the fly.

A way to use the Octave library to extract the values and pop em straight in might be a nice thing.

Of course csound already has pvsanal to do that!!

bnt · 04-03-2010 11:25pm

A couple of follow-up comments on those bits of code posted earlier:

The colon (:) is how you extract data from a matrix. w2(1:256) refers to the first 1... 256 samples of w2.

"Abs" of a simple number is its absolute value e.g. abs(-1.2) = 1.2. However, when you give it a complex vector or phasor, it returns the polar magnitude e.g. abs(3 + 4i) = 5. FFT returns complex phasors, where each value has a magnitude and a phase. If you're analysing spectra, the phase doesn't generally matter.

The line "t = 0:1/sample_rate:0.6;" returns a sequence in a matrix: from 0 to 1/sample_rate, step size 0.6. The semicolon at the end just means "don't print out the result on screen".

"N=size(t,'*'); " - "size" extracts the size of the matrix, the * seems to mean "all dimensions". Not sure that's there, I just used "length".

The matrix "s" is basically wave data, generated by that sine function. If you sent it to a WAV file it would sound like 3 sine waves mixed with a little noise (from the "grand" bit).

That comment about the FFT being symmetrical is correct, so it's sensible to use only the first half when doing a full spectrum analysis. (See also the Nyquist theorem - it's why CD-quality audio has 44100 samples per second, but can only represent audio up to 22050 Hz (44100/2) in theory, and less in practice due to filtering.

So, if you analyse 44100 Hz audio with a 256-bit FFT, you get 256 phasors back, spaced 44100/256 Hz apart, and the first half is the usable audio range (128 bands to plot). In the code I used, I resampled the audio so that the FFT magnitudes corresponded to the real audio harmonics of a sample tuned to C4, and I pulled out only the first 16 harmonics to plot.

"yy = y*2/N" - just scales the FFT results so that they range between 0 and 1, I think.

sponsoredwalk · 07-03-2010 1:19am

Proper Fourier analysis is a graduate topic & extremely scary.

You mentioned that you need to start at the basics, well I have a few online links that should help you out & keep you busy for quite a while (as they still are for me

).

www.khanacademy.org

This site will correct all the mathematical mistakes you learned in the leaving cert & really burn off all feelings of antipathy toward math you were taught to foster

.

http://www.uccs.edu/~math/vidarchive.html

This site is a goldmine for getting deeper into each particular mathematical topic. You must register but it's no problem, it's free. It's 10 times better than the MIT site & everything that is taught is actually taught correctly as opposed to the MIT teachers leaving you guessing & furiously researching to find out the extremely easy intuition behind each concept, (If you don't believe me, watch their calculus video on linear approximation :eek:).

I have attempted the lecture on the Fourier Transform by Stanford that were linked to in another post twice & just given up due to lack of mathematical understanding, maybe I was jumping the gun but they seemed pretty hard & I think you'd need a good book & a solid understanding of Analysis to really understand Fourier. I'd also advise working with Mathematica as it seems popular but I've never used it.

Mozerst · 18-03-2010 10:20pm

Thanks to everyone who's contributed to this thread so far - it was the only resource I could find on boards.ie that answered some of my questions. I have an idea for an upcoming public exhibition, and have a few implementation-related questions.

As this is a maths forum, I'll keep my tech questions to a minimum.

I'm wondering if it's possible to perform real-time FFT on up to 8 vocal inputs through an XLR interface with the simple goal of eliciting the underlying pitch being sung from each mic. Simply, is there enough processing horsepower in a recently built CPU to do that amount of DSP without a lag of more than 100ms? The other tasks required of the CPU would be relatively trivial in comparison.

Also, is it possible to run a package like Octave through Windows Vista, or do I need a leaner OS with fewer background apps?

I realise I'm being threadbare on the details here, but I don't want to start a full-scale discussion if I'm on the wrong thread. If any of you think you could help answer some basic questions, please PM me! I'd be very grateful.

studiorat, I've PMed you already :-)

studiorat · 18-03-2010 11:17pm

Hi,
There's a couple of patches in Pure Data that will do this, probably in Max too if you are in Trinity!! It depends what you need to with the data. I was playing with the sigmund~ object today this analyses an audio input and spits out pitch information. It's glitch city, in terms of the pitch tracking very closely, but you could mask it off a bit with a few arguments or a float to int type of thing. It seemed pretty fast certainly well under 100ms, which is a huge latency for a musician IMO.

Csound also has similar objects. So I didn't look any further at the Octave/Csound api thingy. I do have some resources for getting into the cSound api in C, and C++; which I must say I'm begining to loathe.

Would be happy to discuss the technical implementation on or off thread.

bnt · 19-03-2010 2:00am

Pitch detection has been the subject of quite a bit of academic research over the years. Have a look at this paper for a few ideas - it has a good description of some pitch detection methods. I think FFTs are a bit too clumsy and lack the precision you'd need for pitch detection.

I had a quick look for open source code, and there's a package called Aubio that looks like it could be quite useful. There are no Windows binaries (EXE) files available for download, but I think it can still be compiled on Windows if you install a UNIX environment. (I use Cygwin for this kind of thing, another known option is MingW32.) I'll give it a go tomorrow.

I know PureData for Linux (Debian/Ubuntu) has a pd-aubio module, don't know about Windows at the moment. I don't know Vista very well, but I don't know of any problems running Octave. However, it's not the right tool for real-time analysis - not sure it can even do real-time. PureData sounds like it's a better fit.

Mozerst · 19-03-2010 10:06am

Studiorat and bnt, thank you both - I'll give Aubio and PureData a go. It's useful to know the keywords I should be searching with, as the only audio processing program I'm familiar with is Cubase 4 LE

.

If anyone is interested, I'd be happy to keep you all updated on my progress. I'm going to try my idea out with one mic first on a Tascam US-144 interface, before moving on to two mics. One software challenge will be creating a program that can interface with PD (or whatever solution appears most apt) as well as performing some rudimentary pitch analysis (i.e. how far off from a given reference pitch each singer is). Then I'll need some kind of graphical front end that can display the results nicely - although I'm also considering a lighting display for each singer, if resources permit.

My overall goal is to create an interactive choral environment with no rules, other than the overriding imperative to stay in-tune with the people in your register (soprano, alto, tenor, or bass).

studiorat · 19-03-2010 10:31am

PD will run on Mac, Linux and Windows. You just download it and off you go, it's also graphical so it's not as daunting as some others.

Sigmund~ is a "sinusiodal analysis" and "pitch detection" object. It can detect the loudest partial or group of. And a few other useful things. There's also a Fiddle~ object. Either would be simple enough to put audio in one end and get MIDI and DMX (lighting midi!) out the other end.

The graphics environment is called GEM, it's alright. I suppose you could put some sort of graphical/frequency aspect in it.

Anyway, download pd, and in the help menu theres examples of these. There's also lots of stuff on You Tube for it.

bnt · 19-03-2010 11:18am

Or, if you have lots of money, you could run eight Auto-Tune VST plugins in Cubase. I last tried Auto-Tune years ago, but I think it can display the deviation from perfect pitch even if you don't correct it.

If you've never heard of Auto-Tune, here's an explanation:

https://www.youtube.com/watch?v=fYzv-AVi78E

Mozerst · 19-03-2010 2:27pm

bnt, I had heard of AutoTune, although I'd never experienced it directly until we recorded some demos in January. I would say our engineer used it judiciously, although I felt artistically undermined afterwards :pac:

As for using the AutoTune-style display to give each singer in the setup a sense of his/her pitch accuracy, I'm afraid it might be more of a distraction than a help given how detailed it would be. I was thinking of a far blunter mechanism for delivering feedback, so that a singer could focus more on vocal delivery than on the nuances of pitch correction. It was great to read that paper from Stanford as well to get some kind of theoretical framework in place for what the different algorithms achieve.

Thank you both for recommending pd - I think it will help a lot with what I'm trying to implement!

Fourier Analysis, Where to Start?

Comments