The Misadventures of Quinxy: truths, lies, and everything in between!


Lytro Alternative: Automatic, Intelligent Focus Bracketing

Now in my early post-Lytro days I've been wondering how I could achieve the same effect with better results, not wanting to wait the years it might take for them to come up with a suitable next-generation model.  Lytro's only real selling point at the moment is its ability to take "living pictures" (their parlance), which really just means an interactive photo in which you can shift focus to different items in the picture by tapping them.  The technology may be capable of quite a bit more, but that's all it currently delivers, and it delivers that with poor resolution, graininess, and restrictive requirements on lighting and action.

Living Pictures without Lytro

Why couldn't I achieve the exact same effect with far better results using my existing digital camera? I could, and did! Here's my "living picture" proof, using just an ordinary digital camera and a bit of human assistance.

No Lytro was required for this "living picture", just an ordinary digital camera (in this case a Sony NEX 5).  Click on different objects in this scene to change focus depth!

...and now for Lytro's version...

Lytro's "living picture"!  Click on different objects in this scene to change focus depth!

It doesn't take an expert in photo analysis to see that the non-Lytro picture looks much better: sharper, higher resolution, less grainy, and more realistic colors.

Faking Lytro Manually

The simple fact is, if you can capture a succession of photos, each at a different focal setting, and then view them with a JavaScript or Flash program that responds to a click by displaying the photo in which the clicked region is in focus, you can create the pleasing "living picture" effect without any fancy light-field camera. And that is just what I did for the demo above. I took a series of pictures focused on the different major elements in the scene and then made the regions of the photo clickable by way of a simple image map which specifies which image in my sequence corresponds to which area. While the approach I took for the demo was manual, it's not hard to see how any camera, in combination with some very simple software, could do this automatically, producing final results better than the Lytro camera can manage.
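The viewer side really is that simple. Here's a rough Python sketch that generates the kind of HTML image map I used; the region coordinates and file names are made-up examples, not my actual demo's:

```python
# Sketch: generate the HTML/JS for a click-to-focus "living picture".
# Region coordinates and file names below are hypothetical examples.

def living_picture_html(regions):
    """regions: list of (x1, y1, x2, y2, image_filename) tuples,
    one per clickable area, each pointing at the frame focused there."""
    areas = "\n".join(
        f'  <area shape="rect" coords="{x1},{y1},{x2},{y2}" href="#" '
        f"onclick=\"document.getElementById('photo').src='{img}';return false;\">"
        for (x1, y1, x2, y2, img) in regions
    )
    first = regions[0][4]  # start with the first frame showing
    return (
        f'<img id="photo" src="{first}" usemap="#focus">\n'
        f'<map name="focus">\n{areas}\n</map>'
    )

html = living_picture_html([
    (0, 0, 300, 400, "focus_foreground.jpg"),    # tap the foreground object
    (300, 0, 600, 400, "focus_background.jpg"),  # tap the background object
])
print(html)
```

Clicking an area just swaps the displayed image for the frame whose focus matches that region.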

Making Cameras Support Lytro-Like Effects

Many cameras these days have a feature called "exposure bracketing," which reacts to a shutter press by taking a series of pictures at different exposure settings; you then review the photos later and pick the one that looks best.  Why, then, could you not have a "focus bracketing" feature which does the same thing with focus?  The simplest approach would be to have the camera automatically walk the focus back from infinity to macro, taking as many photos as necessary to achieve a desirable effect; perhaps as few as 5 or 10 would be needed, since with the aperture appropriately set, any given picture's depth of field is wide enough to cover a significant range sharply.  You would then need some mechanism for assigning clickable regions to the photo frames which happened to be in focus in those regions; this would likely be a fairly trivial software problem to solve.  All of this could be done with minimal camera intelligence, since it would just be varying focus distance in a fixed pattern.

A far better but slightly more complicated solution would be automatic, intelligent focus bracketing using the camera's built-in autofocus system. Many cameras (particularly in phones) let you select a region of the scene which should be in focus. It would thus be easy for the camera to break the scene into a search grid and scan it for objects to focus on, taking a single picture at each distinct focus depth (one photo per range, according to the depth of field). The camera could record which grid location contained an object at which focal distance, and that record could later relate clicks on the image to the frame focused there. The advantage of this approach is that it would be far quicker and more efficient, needing only as many frames as the depths of the scene's objects require. A scene with two people hugging in the foreground and a church in the background would probably require just two photos to make a "living picture"; people around a birthday table with a cake in the center and a bounce house in the background might require 5 or 6.
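The grid bookkeeping this needs is easy to sketch. This toy Python version treats frames as plain 2D grayscale lists and uses a crude max-minus-min contrast measure to decide which frame each grid cell is sharpest in; a real in-camera implementation would obviously differ, and all names here are illustrative:

```python
# Sketch of the grid-based bookkeeping described above: given one frame
# per focus depth, record which grid cell is sharpest in which frame.
# Frames are plain 2D lists of grayscale values (toy data, not camera output).

def cell_contrast(frame, r0, c0, size):
    """Local contrast of one grid cell: max minus min grayscale value.
    A cell is presumed in focus in the frame where its contrast peaks."""
    vals = [frame[r][c] for r in range(r0, r0 + size)
                        for c in range(c0, c0 + size)]
    return max(vals) - min(vals)

def build_focus_map(frames, cell):
    """Map each grid cell (row, col) to the index of its sharpest frame,
    i.e. the frame to display when that cell is tapped."""
    rows, cols = len(frames[0]), len(frames[0][0])
    focus_map = {}
    for r0 in range(0, rows, cell):
        for c0 in range(0, cols, cell):
            best = max(range(len(frames)),
                       key=lambda i: cell_contrast(frames[i], r0, c0, cell))
            focus_map[(r0 // cell, c0 // cell)] = best
    return focus_map

frame_near = [[0, 255, 10, 12],
              [255, 0, 11, 10]]   # left half sharp: focused on foreground
frame_far  = [[10, 12, 0, 255],
              [11, 10, 255, 0]]   # right half sharp: focused on background
focus_map = build_focus_map([frame_near, frame_far], 2)
# → {(0, 0): 0, (0, 1): 1}: tap left, show frame 0; tap right, frame 1
```

The resulting map is exactly the small database the viewer needs to pick the right image for a click.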

Working with Motion

These approaches share one significant weakness: using multiple sequentially taken images rules out capturing any action-oriented scenes.  While modern digital cameras take rapid-fire photos, and those with exposure bracketing manage three or so shots in half a second, that's still slow enough that any significant movement within the scene becomes noticeable when switching between frames.  Still, since action easily blurs on the first-generation Lytro as well, this hardly seems any sort of argument against the alternative approach.  An interesting solution to this problem, and to the difficulty of altering most existing cameras' firmware, would be a replacement lens that splits a single digital frame into multiple differently focused reproductions of the scene.  Just as I use a Loreo 3-D lens to merge the images captured by two lenses onto one digital frame, so too could one produce a system that uses four or perhaps nine small lenses, each focused at a slightly different depth, to capture one instant onto one digital frame.  Software could then easily split the single digital image into its component frames and do a simple focus analysis to determine which regions of each were in focus, with viewer software showing the appropriate frame in response to a touch.  The limitations of this approach would be the increased lighting requirements (or decreased tolerance for action) resulting from the smaller lenses, the expected poorer quality of each lens (related to cost, and to this being more a novelty manufacture than something embraced by the lens giants), and the reduced resolution (your effective megapixel count would be the original value divided by the number of lenses).  Many stereo photography setups coordinate two cameras to take their photos in concert, which sidesteps all these issues and could be done here too, though I can imagine nothing more cumbersome.
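The software side of splitting such a multi-lens frame apart is straightforward. Here's a toy Python sketch for a hypothetical 2x2 lens grid (frames as 2D lists; a real implementation would slice the raw sensor bitmap):

```python
# Sketch of the software side of the multi-lens idea: slice the single
# sensor frame into its per-lens sub-images. Toy 2D-list data; the lens
# count and layout here are illustrative assumptions.

def split_frame(frame, n):
    """Split one frame into an n x n grid of sub-frames, one per lens,
    each a standalone image of the scene at that lens's focus depth."""
    rows, cols = len(frame), len(frame[0])
    h, w = rows // n, cols // n
    subs = []
    for i in range(n):
        for j in range(n):
            subs.append([row[j * w:(j + 1) * w]
                         for row in frame[i * h:(i + 1) * h]])
    return subs

sensor = [[ 1,  2,  3,  4],
          [ 5,  6,  7,  8],
          [ 9, 10, 11, 12],
          [13, 14, 15, 16]]
subs = split_frame(sensor, 2)
# → 4 sub-frames; subs[0] is [[1, 2], [5, 6]]
```

Note how the resolution cost falls straight out of the slicing: each sub-frame has the sensor's pixel count divided by the number of lenses, exactly the trade-off described above.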

The Lomo Oktomat, as seen on Lomography, has eight lenses which it uses to capture 2.5 seconds of motion across a single analog film frame divided into 8 regions. The same multi-lens approach, but fired simultaneously with each lens focused slightly differently, could capture motion with Lytro-like aesthetics.

Focus Bracketing leads to Focus Stacking (Hyperfocus)

As I began to look into the practicality of these approaches, I was pleased to discover that "focus bracketing" was already being done manually, though with an intriguingly different goal.  Rather than produce a living picture where you can focus on different elements in a scene, a process called focus stacking takes photos at different focus settings and then (using software) merges them to produce a single image in which everything in the scene is in focus.  The software involved analyses each photograph in the stack, each of the identical scene with only the focus varied, and uses the in-focus regions of each photo to produce the combined image.  This approach produces very impressive results.  The only limitations of this system are the requirement for a still scene, and the strong recommendation (if not requirement) that you use a tripod when taking your shots, so that as little as possible varies between them.
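The core of the merge is the same per-region sharpness comparison as before, just used to build one composite image rather than a click map. A toy Python sketch (real stacking software also aligns frames and blends seams; this shows only the core idea):

```python
# Sketch of focus stacking as described: for each tile, copy pixels from
# whichever bracketed frame has the highest local contrast there, so the
# merged image is sharp everywhere. Pure-Python toy on 2D grayscale lists.

def tile_contrast(frame, r0, c0, size):
    """Crude sharpness measure: max minus min grayscale value in the tile."""
    vals = [frame[r][c] for r in range(r0, r0 + size)
                        for c in range(c0, c0 + size)]
    return max(vals) - min(vals)

def focus_stack(frames, tile):
    """Merge the stack by taking each tile from its sharpest frame."""
    rows, cols = len(frames[0]), len(frames[0][0])
    out = [[0] * cols for _ in range(rows)]
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            best = max(frames,
                       key=lambda f: tile_contrast(f, r0, c0, tile))
            for r in range(r0, r0 + tile):
                for c in range(c0, c0 + tile):
                    out[r][c] = best[r][c]
    return out

near = [[0, 255, 5, 5], [255, 0, 5, 5]]   # left half in focus
far  = [[5, 5, 0, 255], [5, 5, 255, 0]]   # right half in focus
stacked = focus_stack([near, far], 2)
# → [[0, 255, 0, 255], [255, 0, 255, 0]]: sharp halves from each frame
```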

Series of images demonstrating a 6-image focus bracket of a Tachinid fly. The first two images illustrate the typical DOF of a single image at f/10, while the third is the composite of 6 images. From the Focus Stacking entry on Wikipedia.

The aesthetic of a photo in which most things are in focus is quite different from one in which only the things you select are in focus, but from a technical standpoint they are quite similar, since both require possessing in-focus data for every element in the scene.  And a viewer could (and likely would) be given the option of viewing such a photo however he or she wished.  Do they want to see the photo traditionally (one fixed, non-interactive focus point), as a "living picture" where they choose the object in focus, or as a photo in which everything is in focus?

Focus Bracketing Available Today on your Canon PowerShot

A little further research led me to a rather intriguing way to add automatic focus bracketing to an entire range of camera models, via the Canon Hack Development Kit (CHDK).  CHDK allows you to safely and temporarily run a highly configurable and extensible alternative firmware on your Canon PowerShot.  Users have used this to add focus bracketing for the purposes of focus stacking, and have included detailed instructions on just how you can do it, too.

Coming Soon as an iPhone & Android Camera App

This integration of camera and software is a natural fit for an iPhone and Android app, where the app can control the capturing of the image and intelligent variation of focus and then do the simple post-processing to make the image click-focus-able. While I haven't seen such an app, I'm sure it'll come soon. I'd write it myself if I had the time.

Until the Future Comes

The point is that until Lytro demonstrates just what can be done with a light-field camera, beyond merely creating a low-resolution "living picture," there's really no technical justification for placing the technology in people's hands when the same problem can be solved as effectively with traditional digital cameras.  If demand for this image experience exists (and perhaps it will come), no light fields need apply.  Hopefully traditional digital camera companies will see the aesthetic value and support intelligent focus bracketing in their firmware and coordinated desktop software, app developers will launch good living-picture camera apps, and Lytro will demonstrate the additional merits of capturing and reproducing images from light fields.

^ Quinxy

Comments (10)
  1. “You would then need some mechanism for assigning clickable regions to the photo frames which happened to be in focus in that region. This would likely be a fairly trivial software problem to solve.”

    I wouldn’t agree with that, but maybe you’re an expert in image processing.


  2. Daniel, well, I’m no image processing expert, but from what I gather the algorithms are pretty basic and well known; a form of them is used for the autofocus in most cameras. You know a region is in focus when that region is at its highest contrast. So I’m just suggesting the same approach could be used to sort through a series of images and build a small database of which regions are in focus in which images, then use that to pick the right image when the user clicks on a region.
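A toy sketch of the contrast measure I mean, scoring a region by its pixel-to-pixel differences (illustrative only, not production autofocus code):

```python
# Sketch of contrast-detection focus measurement: a region scores high
# when neighboring pixels differ sharply, which is roughly how
# contrast-detect autofocus ranks candidate focus settings.

def focus_score(region):
    """Sum of squared horizontal and vertical pixel differences; an
    in-focus region yields a much higher score than a blurred one."""
    score = 0
    for r in range(len(region) - 1):
        for c in range(len(region[0]) - 1):
            score += (region[r][c] - region[r][c + 1]) ** 2
            score += (region[r][c] - region[r + 1][c]) ** 2
    return score

sharp = [[0, 255, 0], [255, 0, 255], [0, 255, 0]]          # hard edges
blurred = [[120, 130, 120], [130, 120, 130], [120, 130, 120]]
assert focus_score(sharp) > focus_score(blurred)
```

Run the same scorer over the same region in every bracketed image, and the frame with the top score is the one to show when that region is clicked.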

  3. I suspect that multiple focused images is exactly what Lytro is doing, just all at the same time with multiple lenses – thereby losing quality on all. After all, there is no such thing as a “ray of light”, except in the poet’s mind.

  4. Mike,
    I’ve never looked into how Lytro actually works, but thinking about it on a drive one day I did realize one very primitive way you could achieve the same thing… Capturing a “ray of light” is just measuring the amount, color, and direction of light hitting a particular point. So all you need to do is put a “tube” in front of that point and rotate the tube through all reasonable angles (hinged at the base, against the point you’re measuring) and record the amount and color of light for every angled position. The tube serves only to block indirect light. And then do this for every point on your frame. Obviously this tube system might have been appropriate to the late 1800s, but surely there’s some alternative system they can do today. Maybe it’s as simple as putting a trillion tube containing filtering screen on top of the image sensor and having it quickly move through a pattern as the shot is taken, or maybe it’s using a holographic disk which replaces the tubes with infinitely small mirrors and knowing which mirror is pointing where at any given moment. Or… Obviously I should just read about how Lytro actually works, but thinking about how it could be done or even effectively faked, seems like more fun. 🙂

  5. Thanks for the thoughts. Yes, the present-day way to do it is with many lenses, not tubes. That is what Lytro does according to their patent. Toshiba has demonstrated one with “tens of thousands” of lenses, each corresponding to about 20 pixels, to capture the same scene at slightly different angles. While this is technically feasible and probably would produce the result that Lytro claims, why would anyone want to lose all the quality? This still is not measuring “light rays” as Lytro claims, it’s just thousands of tiny, normal images at slightly different angles (limited by the size of the main lens), which can be processed and combined later (apparently with a huge cost in processor time).
    And another company, DigitalOptics, has demonstrated a lens module that will “snap a series of photos with varying depth of fields in quick succession” and combine them in processing into a single image. Sounds like what you did. Didn’t you patent it?
    Every photographer knows that, with a lens that small, everything is in focus anyway. If the lens is getting that small, it’s called a “pinhole”. The old pinhole cameras didn’t need any focusing.
    I’d like to see Lytro focus on the background visible between the very small openings in the wicker chair in your test. Of course, you could, by creating many more entries in the image map. I’d also like to see a Lytro photo of a tape measure stretched from the lens straight out, so we can see if it can truly focus after-the-fact on any point along that tape measure. I doubt it. Oh yes, I heard that it takes several minutes to process each picture after one has chosen a focal point, so it would take a long time to test my theory.
    I believe the Lytro patent says they are really taking a series of photos at different focal lengths. If each of the multiple photos has overlapping “depths of field” and the post-processing simply selects the best and then blurs everything at a distance other than the selected focal length, it would appear as if one were really selecting the focal length. The multiple images from different angles would allow the post processing to determine the distance to each small area of the image, in order to know how much to blur it, to give the appearance of selecting the focal length.

  6. Mike,

    Thanks for sharing with me how Lytro and the other solutions work. Very interesting. The loss of quality is what killed the deal for me. I couldn’t believe how bad my Lytro pictures were. I thought the “living”-ness of the pictures would make up for their low resolution, but wow, they sure didn’t for me. The photos which get demoed on their site are amazing, and clearly picking the right scene makes all the difference. I just don’t have the energy, time, or eye for it. Any photo I take that isn’t interesting is just lousy, which makes me avoid it.

    Of course since we’re talking about interesting camera designs… makes me long for some sort of quasi-infinite mega-pixel analog camera, where they find a way to slow down photons through some yet-to-be-invented medium so that they’re essentially frozen until you are ready to transfer the photos to a fixed resolution. I always think of the light from distant galaxies which has taken millions of years to reach us and wish I could take a photo and have the image bouncing between two mirrors until I was ready to view it (somehow correcting for or preventing (magically?) distortion). The quasi-infinite is the beauty of analog.

  7. Since the Samsung NX300 has an open-source firmware based on the Linux-based Tizen OS, it would be interesting if someone could implement focus bracketing.

  8. Hi- what is the javascript you used for the Lytro effect? BTW did you see Lytro is planning to release a new camera model aimed at a more pro market? What I like best about the Lytro is simply seeing the 3 images slowly changing focus in a high rez video of some sort- ( or perhaps some sort of slideshow ). Of course I would have no problem creating this with my SLR and my 70-200 lens…

  9. It’s actually just a crude little script I wrote using the HTML area/map tags to load the right image. If I’d spent more time on it I could have added a nice fade in/out using opacity or one of the CSS filter effects. I did read about the Lytro pro model, sounds neat. My latest phone, the HTC One M8, has their “Duo Camera” which uses one lens for the picture and one for depth information so you can do a bokeh / Lytro-like effect. With some pictures the effect is extremely impressive and stunningly Lytro-like, but with others the software gets confused and fails to de-focus all the right areas, leaving these weird patches which are sharply focused in an area which is clearly supposed to be out of focus (areas which definitely cause trouble seem to be those with thin, wispy elements, like grass or mesh). I saw even the Google Camera app now includes bokeh/Lytro-like effects: you take a picture, it instructs you to angle the camera down for it to take another, and then in software it figures out depth information. Pretty cool.

  10. Wow- I have not needed to know my multiplication tables this much in years! re: 6 x eight – ____
    See this is the problem with computers – a bot does not even know how to answer a simple multiplication question! Quinxy- would it be possible to post the JavaScript you wrote… in the meantime I am starting to experiment with Lytro-like effects – bracketing focus and placing these in a short movie with transitions –
    BTW- has anyone else played with SEENE for Android/iPhone- a bit hard to explain, but it uses an image stack acquired by waving the phone in an arc to provide a 3-D-like effect…
