Understanding Multitouch/Convolution and Deconvolution

Convolution Overview

Mathematically, convolution is an operator which takes two functions f and g and produces a third function that represents the amount of overlap between f and g as one is shifted across the other. Convolution is used quite a bit in signal analysis because convolution operations with small kernels are cheap to perform, can be performed in parallel, and are easy to adjust to fit the situation.

In our case, we will be talking about convolution in the 2D sense, where our image and a known convolution function (known as a filter) will be combined to make an image that better represents the data we are looking for. As you should already be familiar with these concepts, this will be a brief overview of how convolution works, and the specific convolution filters we are interested in.
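For reference, the standard discrete forms of the operator can be written as below. The index names m, n, i, j, x and y are notation chosen for this sketch rather than taken from the text; in the 2D form, f plays the role of the image and g the role of the filter.

```latex
(f * g)[n] = \sum_{m} f[m]\, g[n - m]
\qquad
(f * g)[x, y] = \sum_{i} \sum_{j} f[i, j]\, g[x - i,\, y - j]
```

The minus signs mean the filter is reversed before it is slid across the image. The pseudocode later in this section indexes the kernel without that reversal, which only changes the orientation of the kernel being applied.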

If you recall, our theoretical sensor model inputs data into the system as an 8x8 matrix of 8-bit data, looking similar to the following:

            Horizontal Address 
          0  1  2  3  4  5  6  7   
 V A   0 00 00 00 00 00 00 00 00   
 e d   1 00 00 00 00 00 00 00 00   
 r d   2 00 00 00 00 00 00 00 00   
 t r   3 00 00 00 00 00 00 00 00   
 i e   4 00 00 00 00 00 00 00 00   
 c s   5 00 00 00 00 00 00 00 00   
 a s   6 00 00 00 00 00 00 00 00   
 l     7 00 00 00 00 00 00 00 00

Our convolution kernels, however, will be much smaller in size. For the algorithms we will be using on the theoretical model, a 3x3 convolution kernel is sufficient.

The important thing to remember when implementing a convolution kernel is that we are taking two functions and generating a third. The original functions can be discarded, or they can go back into the system for other purposes, such as decreasing noise. When we implement our filters in software, we will therefore want a function that returns to the user a filled output buffer the same size as the input buffer.

A convolution operation should look very similar to this pseudocode:

function Convolution ( in_buffer[8][8], out_buffer[8][8], divisor )
{
   define kernel[3][3]; //the constant 3x3 filter coefficients
   define sum;          //accumulator wider than 8 bits, so partial sums cannot overflow

   for (x from 0 to 7)
      for (y from 0 to 7)
         sum = 0; //make sure the accumulator is zeroed

         if (x-1 >= 0 && y-1 >= 0)
            sum += in_buffer[x-1][y-1] * kernel[0][0];
         if (x-1 >= 0)
            sum += in_buffer[x-1][y] * kernel[0][1];
         if (x-1 >= 0 && y+1 <= 7)
            sum += in_buffer[x-1][y+1] * kernel[0][2];

         if (y-1 >= 0)
            sum += in_buffer[x][y-1] * kernel[1][0];
         //this is our "free" operation, the only one that never needs a bounds test.
         sum += in_buffer[x][y] * kernel[1][1];
         if (y+1 <= 7)
            sum += in_buffer[x][y+1] * kernel[1][2];

         if (x+1 <= 7 && y-1 >= 0)
            sum += in_buffer[x+1][y-1] * kernel[2][0];
         if (x+1 <= 7)
            sum += in_buffer[x+1][y] * kernel[2][1];
         if (x+1 <= 7 && y+1 <= 7)
            sum += in_buffer[x+1][y+1] * kernel[2][2];

         //divide once, after all terms have been accumulated
         if (divisor && divisor > 0)
            out_buffer[x][y] = sum / divisor;
         else
            out_buffer[x][y] = sum / 9;

      end for
   end for
}

The above may be optimized on your hardware: you can often avoid the branch statements by ensuring your buffer is big enough (padded by one element on every side) to catch possible overruns without needing to do the bounds testing, trading memory for branch operations. Your algorithm will generally require little temporary data besides the constant kernel, and you will probably already have functions similar to this in your toolkits if you have worked with signal processing in the past.
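The padding idea can be sketched in Python as follows. This is a minimal illustration rather than a tuned implementation: the function name `convolve_padded`, the plain list-of-lists buffers, and the choice to pass the kernel as a parameter are all assumptions made for the example.

```python
def convolve_padded(in_buffer, kernel, divisor=9):
    """Apply a 3x3 kernel to a 2D list of ints, treating out-of-range pixels as zero."""
    rows, cols = len(in_buffer), len(in_buffer[0])

    # Surround the image with a one-pixel border of zeros so that every
    # neighbor lookup below stays inside the padded buffer.
    padded = [[0] * (cols + 2) for _ in range(rows + 2)]
    for x in range(rows):
        for y in range(cols):
            padded[x + 1][y + 1] = in_buffer[x][y]

    out_buffer = [[0] * cols for _ in range(rows)]
    for x in range(rows):
        for y in range(cols):
            total = 0
            for i in range(3):
                for j in range(3):
                    # padded[x + i][y + j] holds in_buffer[x + i - 1][y + j - 1]
                    total += padded[x + i][y + j] * kernel[i][j]
            # Divide once, truncating toward zero as C-style integer division would.
            out_buffer[x][y] = int(total / divisor)
    return out_buffer


box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]   # simple averaging kernel (assumed for the demo)
image = [[0] * 8 for _ in range(8)]
image[3][3] = 90                          # a single bright pixel
result = convolve_padded(image, box)
```

The inner loops contain no bounds tests at all; the extra border of zeros costs memory for a 10x10 buffer instead of an 8x8 one.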

A question that is often asked is what to do at the edges of the image (known as Edge Conditions). There are many options to choose from. One is to pick a known "background intensity" and apply it for each element that falls outside the image; in the pseudocode above that intensity is effectively zero, so the missing elements do not affect the kernel operation, which in this special case is exactly the same as truncating the filter. Other ways of dealing with this condition include extending the data over the edge of the image, truncating the kernel and not running operations over the region, or simply not applying the kernel to points where it would extend over the edge (often known as "Copy in Place").
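Two of these edge conditions can be compared with a short Python sketch. The helper names `sample` and `convolve_at` and the mode strings "zero" and "extend" are hypothetical, chosen only for this illustration; the other strategies mentioned above are not shown.

```python
def sample(image, x, y, mode="zero", background=0):
    """Return image[x][y], applying an edge condition when (x, y) is outside the image."""
    rows, cols = len(image), len(image[0])
    if 0 <= x < rows and 0 <= y < cols:
        return image[x][y]
    if mode == "extend":
        # Clamp the coordinates so the nearest edge pixel is repeated outward.
        cx = min(max(x, 0), rows - 1)
        cy = min(max(y, 0), cols - 1)
        return image[cx][cy]
    # Any other mode: fall back to a fixed background intensity.
    return background


def convolve_at(image, kernel, x, y, mode="zero"):
    """Weighted 3x3 sum centered on (x, y), with no divisor applied."""
    return sum(
        sample(image, x + i - 1, y + j - 1, mode) * kernel[i][j]
        for i in range(3)
        for j in range(3)
    )


image = [[10] * 8 for _ in range(8)]      # a flat image of intensity 10
box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

corner_zero = convolve_at(image, box, 0, 0, mode="zero")      # only 4 real pixels contribute
corner_extend = convolve_at(image, box, 0, 0, mode="extend")  # all 9 taps see intensity 10
```

On a flat image the zero background darkens the corner (a sum of 40 instead of 90), while extending the data keeps the corner consistent with the interior.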

You will also notice that we specified no else branches during the pixel operations; in all cases where we aren't doing an operation, we're assuming the output of that operation would have been zero. The same reasoning applies to zero terms in the kernel: if the convolution kernel's center term is zero, as is often the case, there is no point in doing the multiply only to yield zero. Simply skip that calculation and move to the next.

In certain cases, for example if our data is in integer format, we need to apply a divisor so that the result satisfies the constraints of the number format (in this case, our buffer consists of 8-bit data). A simple default divisor is the kernel's width times the kernel's height (9 for a 3x3, 25 for a 5x5, etc). This choice is exact for an averaging kernel whose coefficients are all 1; for kernels with other coefficients a different divisor may be desirable.
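One way to reason about the divisor is to compute how large the weighted sum can get for 8-bit input. The sketch below does this in Python; the helper names `peak_weight` and `largest_result` are assumptions for the example, and the rule it uses (the larger of the positive and negative coefficient totals) is one possible choice, not the only one.

```python
def peak_weight(kernel):
    """Largest magnitude of the weighted sum per unit of pixel intensity."""
    positive = sum(c for row in kernel for c in row if c > 0)
    negative = -sum(c for row in kernel for c in row if c < 0)
    return max(positive, negative)


def largest_result(kernel, divisor, max_pixel=255):
    """Largest magnitude the divided result can reach for pixels in 0..max_pixel."""
    return (max_pixel * peak_weight(kernel)) // divisor


box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
sobel = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]

box_peak = largest_result(box, 9)       # width times height fits the box kernel exactly
sobel_default = largest_result(sobel, 9)  # the same divisor leaves part of the range unused
sobel_tuned = largest_result(sobel, peak_weight(sobel))
```

For the all-ones kernel a divisor of 9 uses the full 8-bit range, whereas for an edge kernel the same divisor leaves part of the range unused, which is one reason another divisor may be preferred.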

Sobel Kernel

The first kernel we will look at is a Sobel Kernel, also known as an Edge Detection kernel. A Sobel Kernel emphasizes high frequency changes in the image function: it reduces the intensity of pixels in regions of low frequency and increases the intensity of pixels in regions of higher frequency. Essentially, the Sobel Kernel approximates the first derivative of the image's intensity along one axis at every coordinate. However, to those of us less mathematically minded, and to some of us who are, it's much easier to understand as a convolution filter:

 -1 0 1
 -2 0 2
 -1 0 1

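We can think about this by looking at the one dimensional first derivative of a function f(x), approximated by a central difference: $f^{\prime }(x)\approx -1*f(x-1)+0*f(x)+1*f(x+1)$ (ignoring a constant scale factor of 1/2). By taking the constants and storing them in a matrix, we construct our one dimensional convolution kernel: ${\begin{bmatrix}-1&0&1\end{bmatrix}}$ , whose entries are simply the signed offsets from the point we are convolving. We make this into a 2-dimensional convolution kernel by matrix multiplication with the vertical matrix ${\begin{bmatrix}1\\2\\1\end{bmatrix}}$ on the left, which averages the difference over three neighboring rows and weights the center row most heavily: ${\begin{bmatrix}1\\2\\1\end{bmatrix}}{\begin{bmatrix}-1&0&1\end{bmatrix}}={\begin{bmatrix}-1&0&1\\-2&0&2\\-1&0&1\end{bmatrix}}$ .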
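As a quick check of this factorization, the Python sketch below multiplies the column [1, 2, 1] by the row [-1, 0, 1] and compares the product with the 3x3 kernel shown at the start of this section. The helper name `outer` and the variable names are assumptions for the example.

```python
def outer(column, row):
    """Matrix product of a column vector and a row vector (an outer product)."""
    return [[c * r for r in row] for c in column]


smooth = [1, 2, 1]        # weights the center row most heavily
difference = [-1, 0, 1]   # central-difference coefficients

sobel_x = outer(smooth, difference)   # smoothing down the rows, differencing across the columns
sobel_y = outer(difference, smooth)   # the two roles swapped
```

Swapping the two vectors gives the vertically oriented kernel instead, so the order of the multiplication decides which axis is differenced.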

There are several variations of this relatively simple filter. For example, one common adjustment is to the direction of emphasis:

 1 0 -1
 2 0 -2
 1 0 -1

This adaptation will change the leading edge of the filter to the right side rather than the left as is emphasized in the former example. It is the same filter as above, "flipped" by multiplying every coefficient by -1.
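The effect of the sign flip can be seen on a simple step edge. The Python sketch below assumes the image is indexed as image[row][column] and applies the kernel without reversing it, as the pseudocode in the overview does; the helper name `response_at` and the test image are made up for the illustration.

```python
def response_at(image, kernel, x, y):
    """Weighted 3x3 sum centered on an interior pixel (x = row, y = column)."""
    return sum(
        image[x + i - 1][y + j - 1] * kernel[i][j]
        for i in range(3)
        for j in range(3)
    )


sobel = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
flipped = [[-c for c in row] for row in sobel]

# 8x8 test image: dark (0) in columns 0-3, bright (100) in columns 4-7.
step = [[0] * 4 + [100] * 4 for _ in range(8)]

on_edge = response_at(step, sobel, 3, 4)            # straddles the dark-to-bright edge
on_edge_flipped = response_at(step, flipped, 3, 4)  # same pixel, negated kernel
flat = response_at(step, sobel, 3, 1)               # entirely inside the dark region
```

Under these assumptions the first kernel responds positively where intensity rises from left to right, the negated kernel gives the same magnitude with the opposite sign, and flat regions give zero. Because the response is signed, storing it in an unsigned 8-bit buffer would need an extra step such as taking the absolute value; that step is a suggestion here, not something the text prescribes.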

  1  2  1        -1 -2 -1 
  0  0  0   or    0  0  0
 -1 -2 -1         1  2  1

These are rotations of the original filter, which emphasize the upward and downward directions respectively.
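All four variants can be generated from the original kernel rather than typed by hand. The Python sketch below shows one way to do so; the helper names `negate` and `rotate_ccw` are assumptions for the example.

```python
def negate(kernel):
    """Multiply every coefficient by -1."""
    return [[-c for c in row] for row in kernel]


def rotate_ccw(kernel):
    """Rotate a square kernel 90 degrees counterclockwise."""
    n = len(kernel)
    return [[kernel[j][n - 1 - i] for j in range(n)] for i in range(n)]


sobel = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]

first_rotation = rotate_ccw(sobel)          # positive row on top
second_rotation = negate(first_rotation)    # positive row on the bottom
half_turn = rotate_ccw(rotate_ccw(sobel))   # two quarter turns
```

Two quarter turns reproduce the negated kernel, so for this filter flipping the sign and rotating by 180 degrees are the same operation.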
