somewhere near the beginning.

2d convolution via frequency domain multiplication

Filed under: Mathematics — Alex @ 9:11 am 7/11/2006

I spent this morning tangoing with convolution and the discrete fourier transform. My goal when I first came in was to see if applying a linear filter to a Markov field gives you another Markov field (intuitively, it seems like it should give you one with neighborhoods of larger width). Since applying a linear filter is the same as convolution, I needed a nice way to handle the application of an energy function to convolution products– this seemed like the good time to apply the oft quoted ‘convolution is multiplication in the frequency domain’ trick: since we’re dealing with finite everything (the field is on a finite graph, and has finite states), I could use matrix multiplication to express the convolution.

Surprisingly, I couldn’t find a single reference which had a matrix formulation of the DFT; I’m recording it here for anyone who might find it useful. Let \omega = exp^{-j \frac{2\pi}{n}} be the n-th primitive root of unity, and define the Fourier matrix \Omega by \Omega_{ik} = \omega^{(i-1)(k-1)}, then the (2d) DFT of the matrix A \in M_n(\C) is

 \hat{A} = \Omega A \Omega

and the IDFT is

 \check{A} =  \frac{1}{n^2} \overline{\Omega}A\overline{\Omega}.

That is one step in finding a matrix formulation of convolution. Assume we have convolution kernel F \in M_m and we’d like to calculate F \star A where A \in M_n is an image, and  n > m as is usual. If we naively take the DFTs, then \hat{F} and \hat{A} are not compatible, but notice that convolving with F is the same as convolving with a bigger kernel \mathcal{Z}_n(F), where we define the zero-padding operator to satisfy

\mathcal{Z}_n(F)} = \begin{pmatrix} F & 0_{n-m} \\ 0_{n-m} & 0_{n-m} \end{pmatrix} \in M_n.

A further point to notice is that we need to worry about aliasing, since we are dealing with discrete sums of frequencies. In particular, if when we multiply in the frequency domain we get frequencies (u,v) that don’t fit in the range {0, \ldots, n-1}^2, when we attempt to reconstruct, we’ll get higher frequencies masquerading as lower frequencies. The solution is to see that the highest frequencies you can get are the sum of the highest frequencies in both the kernel and the matrix. So if you zeropad the two matrices appropriately and then perform the frequency domain multiplication, you should be set.

Possibly relevant posts:

1 Comment »

RSS feed for comments on this post. TrackBack URL

Leave a comment