Deconvolution in Image Processing

Can it make our lives less miserable?

It was only last July when, getting familiar with the new Panasonic GX9 camera, I stumbled upon an interesting piece of information: the GX9 is said to address the problem of lens diffraction in its firmware.

Wait a minute. For all those years I was taught to believe that diffraction is a fact of life, like paying taxes or dying. You can't avoid it, or reduce its effects, by improving the lens design or using some special, exotic glass. The problem occurs at a much lower level; it is a part of the mechanism of wave propagation.

On Diffraction

Simplifying things just a little, an image of a subject's point P is a superposition of partial images, created by light passing through all points, Ai, within the lens aperture. For a hypothetical, ideal lens there is one point, P*, where all those partial images overlap; it defines the focus plane for P. Placing the light sensor there, we get our point rendered sharp.

The light passing through a point Ai close enough to some obstruction (aperture, lens edge) behaves slightly differently. Some of it still travels straight to P*, while some is randomly (if with known probability distribution) spread around that direction, causing a diffraction blur. This is just a feature of the wave nature of light, and it can't be helped.

The final result for any P will be a superposition of partial images of that point, created by light passing through all possible Ai. As it happens, for a circular aperture it is shaped like a circular blob, surrounded by a series of weakening rings.

The blob is called the Airy disk (after a 19th-century British astronomer, who also established the Greenwich meridian as the reference); the rings form the diffraction pattern.

The Airy disk radius (measured up to the first minimum around the blob) can be approximated as r = 1.22 λF (where λ is the light wavelength, and F stands for the F-number, like 2.8).

For an average visible wavelength of 550 nanometers and F/4.0, this yields 2684 nm, or 2.7 micrometers. Assuming (Rayleigh criterion) the Airy radius to define image resolution, it follows that:

  • Spacing pixels (or photosites) by less than half this value will not further improve the actual resolution at this aperture, regardless of the lens and camera used.
  • Closing the aperture down beyond this value will not further improve resolution at the resulting pixel spacing... [ditto]
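
If you like to check such numbers yourself, here is a minimal Python sketch of the formula above; the 550 nm wavelength and the half-radius rule come straight from the text, while the function name and the list of apertures are just my own choices.

  WAVELENGTH_NM = 550   # the "average" visible wavelength used above

  def airy_radius_um(f_number, wavelength_nm=WAVELENGTH_NM):
      # First-minimum radius of the Airy disk, r = 1.22 * lambda * F, in micrometers.
      return 1.22 * wavelength_nm * f_number / 1000.0

  for f in (2.8, 4.0, 5.6, 8.0, 11.0):
      r = airy_radius_um(f)
      # Half the Airy radius: the pixel pitch below which finer sampling
      # (at this aperture) no longer buys any real resolution.
      print(f"F/{f}: Airy radius {r:.2f} um, limiting pixel pitch ~{r / 2:.2f} um")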

Note, however, that cameras with various sensor (or film) sizes require various degrees of image magnification for the same viewing size. This enlarges the image detail, but also does the same for diffraction effects.

This is why nobody was worried about diffraction in full-plate cameras. With images sized at 8½×6½ inches (a diagonal of 27 cm) they required six times less magnification than today's "full frame" for the same viewing size. Ansel Adams and his buddies of the famous Group f/64 would now be Group F/11 if they were using "full frame" cameras (referred to, of course, as 1/40-plate!).

At the other extreme, most digital cameras have really tiny sensors and, therefore, require huge magnifications. Most of the models referred to as "compact" or "superzoom" use 1/1.7" (focal length multiplier of 4.6) or even smaller imagers. When these images are properly magnified, there will be effectively as much diffraction at F/2.4 on this sensor as at F/11 on a "full frame" one, or at F/64 for the "full plate".

For the 20 MP μFT sensor, the pixel pitch is about 3.8 μm — almost exactly half of the Airy radius at F/11 (7.4 μm). For a full-frame sensor we reach the same pixel pitch by quadrupling the pixel count (80 MP!), an obvious advantage. Still, even for μFT there is some room left until we reach the limit at F/8 or wider apertures.

As it happens, on a "full frame" camera, the combined effect of diffraction and other lens flaws (disregarding any pixelization) is at minimum at about F/11. On μFT this will happen around F/6.0.

Then, for a compact camera's 1/1.7" sensor, the sweet spot will be about F/3.0 — stopping the lens down beyond that value does not improve its performance. For even smaller frames, the lens becomes diffraction-limited: performing best wide-open; no gain at all when closing the aperture.
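
The crop-factor arithmetic behind these equivalences fits in a few lines. The sketch below assumes the usual approximate crop factors (2.0 for μFT, 4.6 for a 1/1.7" sensor) and simply scales the full-frame F/11 reference; the results land close to, though not exactly at, the rounded values quoted above.

  # Crop factors are approximate and assumed here for illustration only.
  CROP_FACTORS = {
      "full frame (24x36 mm)": 1.0,
      "Micro Four Thirds":     2.0,
      "1/1.7-inch compact":    4.6,
  }

  FULL_FRAME_SWEET_SPOT = 11.0   # the F/11 reference aperture quoted above

  for sensor, crop in CROP_FACTORS.items():
      # The same diffraction blur in the final, equally magnified image:
      print(f"{sensor:24s} -> roughly F/{FULL_FRAME_SWEET_SPOT / crop:.1f}")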

Understandably, being able to correct diffraction computationally would be a blessing for designers of cameras with smaller sensors, starting from APS-C or μFT, but especially smartphones. And this may be the real reason for using a powerful processor to run a camera's firmware.

Well, we could always use some general-purpose sharpening/deblurring filter to get rid (partly, at least) of some image blur, regardless of its source. This is, however, not the same. One, there will be no restoration of diffraction-affected detail; two, we may not want to sharpen the areas which are slightly out of focus.

Those filters are mostly limited to narrowing the transition zone between areas of different brightness or color. They won't restore a line lost to blur, but a line which is not lost will look "sharper", the transition across it now being more rapid. While this may often make the image look "better", it will not reveal more detail.
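
To make that distinction concrete, here is a minimal sketch of a classic unsharp-mask filter, the workhorse behind most "general-purpose sharpening" (the radius and amount values are arbitrary). It steepens a transition that is still there; a detail fully erased by the blur stays erased.

  import numpy as np
  from scipy.ndimage import gaussian_filter

  def unsharp_mask(image, radius=2.0, amount=1.0):
      # Add back the difference between the image and its blurred copy.
      blurred = gaussian_filter(image, sigma=radius)
      return image + amount * (image - blurred)

  # A one-dimensional "edge" already softened by blur: sharpening makes the
  # step more rapid, but reveals no detail that the blur has destroyed.
  edge = gaussian_filter(np.repeat([0.0, 1.0], 50), sigma=5)
  print(np.round(unsharp_mask(edge, radius=3, amount=1.5)[45:55], 2))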

Here is where signal deconvolution may help. A lot. Before, however, we go into deconvolution, we must know what convolution is.

Consider an ideal, no-flaws (whatever that means in the given case) image, P. When processed by some real-life imaging device, the signal from any point p of P will affect not just one point q of the resulting image Q, but some area around it (with the effect fading away from the original q):

ΔQ = f(p)

The function f is often referred to as the Point Spread Function (PSF), and in the Nineties I had the pleasure of learning about it — when some of my friends were trying to fix the Hubble Space Telescope mirror problem without actually fixing the mirror. (The attempt was a success.)

Obviously, the whole image is a superposition of inputs from all points p, with multiple ΔQ overlapping each other. This can be written as

Q = ΣΔQ = Σf(p)

where the sum (or, more accurately, the integral) goes over the whole image. This is more often shown as

Q = F(P)

or even Q = F⁎P — just a different notation; the meaning remains the same.
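
In discrete terms this is just the image array convolved with a sampled PSF. A minimal sketch, with a made-up Gaussian PSF standing in for the real one and a single bright pixel playing the subject point P:

  import numpy as np
  from scipy.signal import fftconvolve

  def toy_psf(size=15, sigma=2.0):
      # A stand-in point spread function; a real f would come from the lens model.
      ax = np.arange(size) - size // 2
      xx, yy = np.meshgrid(ax, ax)
      psf = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
      return psf / psf.sum()        # normalized, so total light is preserved

  P = np.zeros((64, 64))
  P[32, 32] = 1.0                   # one bright point of the subject
  Q = fftconvolve(P, toy_psf(), mode="same")   # the recorded, blurred image
  # Q now shows that point spread over an area: the ΔQ of the formula above.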

The process F of tangling these inputs from individual points of P into a composite Q is often referred to as convolution. By definition, deconvolution is the process F⁻¹ inverting that, i.e.

P = F⁻¹(Q)

Thus, if we knew F⁻¹, we could apply it to the recorded image to obtain a corrected one.

This is where a real mathematician gets up, says "Now you guys just work out the details" and leaves the room. And this is also where the real fun begins.

First, the good news. If we know the physical mechanisms of signal convolution (here: the lens construction and settings), we can compute the Point Spread Function, f, for any arguments as needed. This, in turn, makes computing F and, more importantly, F⁻¹ possible.

The most practical way to do that, at least at the moment, is to convert the whole problem into the terms of Fourier series and Fourier transforms, use one of the existing deconvolution procedures (preferably a software library) to find the solution, and then convert the result back to an RGB image.

While Fourier deconvolution dates back to Wiener's work of the 1940s, it gained popularity only after computers became more commonly used in related applications. Doing it in pencil is madness.
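
For the curious, this is roughly what a frequency-domain (Wiener-style) deconvolution with a known PSF looks like in Python; the balance term stands in for the noise-to-signal ratio and is a hand-tuned assumption here, not a measured quantity.

  import numpy as np

  def wiener_deconvolve(blurred, psf, balance=0.01):
      # Pad the PSF to the image size and move its center to the origin, so
      # that a product in frequency space corresponds to the convolution.
      padded = np.zeros_like(blurred, dtype=float)
      padded[:psf.shape[0], :psf.shape[1]] = psf
      padded = np.roll(padded, (-(psf.shape[0] // 2), -(psf.shape[1] // 2)), axis=(0, 1))

      H = np.fft.fft2(padded)       # transfer function of the blur
      G = np.fft.fft2(blurred)      # spectrum of the recorded image Q
      # Regularized inverse: balance keeps the division from blowing up where
      # H is nearly zero, i.e. where the blur has destroyed the information.
      F_hat = np.conj(H) * G / (np.abs(H) ** 2 + balance)
      return np.real(np.fft.ifft2(F_hat))

Fed the blurred point image and PSF from the previous sketch, this recovers something close to the original single bright pixel; on real, noisy images the choice of balance becomes a compromise between restored detail and amplified noise.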

I wish I had paid more attention to Fourier transforms during the advanced Mathematical Methods in Physics class, but this was the most boring of all the courses we had in five years of school. Thus, I remain illiterate in the subject, a Fourier virgin, so to say.

Now, the warts.

First of all, the relationship between P and Q has a stochastic component, usually referred to as noise (a sum of Poisson fluctuations of the photon count at a photosite, dark-current electrons, pick-up and amplification noise, and whatever else). The equation should rather be

Q = F(P) + E

The generally accepted rule of thumb assumes that the smaller the difference Q-F(P), the better the solution, P, and this is usually the case. Statistically, of course.

The same convoluted image Q can usually be deconvoluted not to just one original P, but to a continuous region (or sometimes regions) of such originals, each with a given likelihood (in the exact or a figurative meaning; it does not matter). Picking the one with the highest value results in the P which is most likely (but not guaranteed) to be the one convoluted into the recorded Q.

If we are lucky, there may be a region of P's (in the continuum of possible solutions), for which the likelihoods are consistently and clearly higher than elsewhere. Picking the "best" solution (or one close enough for practical purposes) will be simple.

It may happen, however, especially (but not only) when the noise is high, that very different solutions (or solution regions) show similar likelihood values; in rare cases the most likely solution may even be totally wrong.

Secondly, we do not always have a complete, physical convolution model, or PSF, at our disposal. This may require some knowledge of the lens construction and, possibly, exposure data. While providing, no doubt, the best results, this would be a burden on software (including firmware) makers. And this is where things start getting a bit ugly.

Instead of an actual physical model of convolution, the software may use just a descriptive one (very much like a polynomial fit in some other applications), with results of f depending on one or (usually) more free parameters. The deconvolution process can be performed repeatedly in search of the combination providing the best results, with "best" defined by some goal function (for example, a measure of contour gradient). This is a kind of multi-dimensional optimization.

Such a deconvolution process results in not just the most likely original P, but also the parameter combination for which it was found.

Any deconvolution in which some parameters of the model (of any kind, descriptive or physical) are not known up front and must be estimated during the process is called blind deconvolution. It is always less effective and more problem-prone than one supplied with full model data.
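
As a toy illustration of that search loop, and nothing more: assume the PSF is a Gaussian of unknown width, try a handful of widths, and keep the candidate maximizing a gradient-based sharpness score. Real blind deconvolution is far more sophisticated, and the goal function below is just one arbitrary choice.

  import numpy as np

  def gaussian_kernel(shape, sigma):
      # Candidate PSF: a Gaussian sampled on the full image grid, centered, normalized.
      yy, xx = np.indices(shape)
      yy, xx = yy - shape[0] // 2, xx - shape[1] // 2
      k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
      return k / k.sum()

  def deconvolve_gaussian(blurred, sigma, balance=0.01):
      # Wiener-style division by the candidate transfer function.
      H = np.fft.fft2(np.fft.ifftshift(gaussian_kernel(blurred.shape, sigma)))
      G = np.fft.fft2(blurred)
      return np.real(np.fft.ifft2(np.conj(H) * G / (np.abs(H) ** 2 + balance)))

  def sharpness(image):
      # Goal function: mean gradient magnitude (one arbitrary choice of many).
      gy, gx = np.gradient(image)
      return np.mean(np.hypot(gx, gy))

  def blind_search(blurred, sigmas=(0.5, 1.0, 1.5, 2.0, 3.0)):
      # Try each candidate PSF width; keep the estimate scoring best.
      scored = [(sharpness(deconvolve_gaussian(blurred, s)), s) for s in sigmas]
      best_score, best_sigma = max(scored)
      return best_sigma, deconvolve_gaussian(blurred, best_sigma)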

For a number of reasons, third-party postprocessing applications usually choose blind deconvolution: no need for individual lens models, "one size fits all". This, however, comes at a steep price of greatly reduced quality and robustness of correction. Software makers, obviously, are not eager to describe techniques they are using as "blind". Therefore it is safe to assume that any deconvolution application is of the "blind" kind, unless there is a clear statement (or other strong indications) to the contrary.

Sorry, somebody had to tell you.

Now, we may be lucky trying to use deconvolution to clean up lens diffraction: the PSF seems to have just one model parameter, the F-number. If it is known, we do not have to resort to blind deconvolution.

The devil is in the details, though. I have no clue which of the three possible definitions of the F-number should be used here (two of them may require more knowledge of the lens geometry), or what difference, if any, that may make. (The options are: the definition using the nominal focal length, the one using the actual focal length at the given focus setting, and the one using the effective distance from the aperture to the image plane.)

Another possible problem: color dependency. An Airy disc for deep red is almost twice the size of a violet one. Perhaps deconvolution before demosaicing? Fourier transforms don't know or care, and this may be worth a try.
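
A per-channel diffraction PSF is easy to sample once the F-number and pixel pitch are fixed. The sketch below uses the standard Airy-pattern formula, with rough, assumed wavelengths for the R, G and B channels and an arbitrary F/8 as the example aperture.

  import numpy as np
  from scipy.special import j1   # first-order Bessel function of the first kind

  def airy_psf(f_number, wavelength_um, pixel_pitch_um, size=25):
      # Airy diffraction pattern sampled on the sensor's pixel grid, normalized.
      ax = (np.arange(size) - size // 2) * pixel_pitch_um
      xx, yy = np.meshgrid(ax, ax)
      r = np.hypot(xx, yy)
      v = np.pi * r / (wavelength_um * f_number)   # dimensionless radius
      v[r == 0] = 1e-12                            # avoid 0/0 at the center
      psf = (2 * j1(v) / v) ** 2
      return psf / psf.sum()

  # One PSF per channel; deep red spreads noticeably wider than blue/violet.
  for name, wl in (("R", 0.64), ("G", 0.53), ("B", 0.46)):
      psf = airy_psf(f_number=8.0, wavelength_um=wl, pixel_pitch_um=3.8)
      print(name, f"Airy radius ~{1.22 * wl * 8.0:.2f} um, PSF peak weight {psf.max():.3f}")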

Back to Panasonic. Before the GX9, two other Lumix models had this feature. Surprisingly, that went largely unnoticed in the μFT community. The only other similar effort I'm aware of was by Pentax, in a 2014 firmware upgrade for their K-3.

Still working on it...

While at this point the article (or perhaps the first part of it) is reasonably complete, after some time to catch my breath I will resume working on it, as there is more worth knowing on the subject.


Posted 2018/10/19; last updated 2018/10/29. Copyright © 2018 by J. Andrzej Wrotniak