FOVO: A new 3D rendering technique based on human vision

FOVO is a new CG rendering technique that creates a more natural view of 3D space. It could potentially transform the way games are designed and played. We explain what FOVO is and why it works. By Robert Pepperell and Alistair Burleigh of Fovotec.

May 27, 2020

Here’s a question for your next virtual pub quiz: what do linear perspective projection, the first commercial photographic process, the first photographic negative, digital image transmission, and the bullet-time effect used in The Matrix all have in common? Answer: they were all invented by artists:

Image technologies invented by artists: The technique of linear perspective — the basic geometry that underpins all computer graphics engines — was discovered by the painter and architect Brunelleschi and his associates in Florence in the early 1400s, and later refined by Leonardo da Vinci. The first commercial photography process was invented by the scene painter Louis Daguerre as a way of speeding up production of his virtual-reality panorama displays that were a hugely popular form of entertainment in early nineteenth-century Paris. Around the same time, Henry Fox Talbot developed the photographic negative at his country house near Bath in England out of frustration at not being able to paint his plant specimens accurately enough. Samuel Morse, another painter and inventor of the eponymous Code, contributed to the commercial development of telegraphy, which quickly led to telegraphic image transmission. And Tim Macmillan was an art student when he invented the original ‘Time-Slice’ process, often dubbed ‘Bullet-Time,’ to create photographs that looked like Cubist paintings.

We now hope to extend this list with FOVO, a new form of image rendering based on the structure of visual perception and the secret techniques painters have used for hundreds of years to make their paintings look more real. FOVO stands for ‘Field of View Opened’, because one of its major properties is to enable 3D engines to render the ultra-wide fields of view we experience in everyday vision far more naturally than conventional linear perspective techniques do. To see what we mean, look at the comparison between two renders of the same scene, each at a 160-degree horizontal field of view, in the animated GIF below. The one that looks like the space is being sucked into a vortex is linear perspective. The FOVO version, we think you’ll agree, looks more natural:

Two renders of a 3D scene, each with the same horizontal field of view (160°). One is rendered in standard linear perspective and one in FOVO. Which do you think looks more natural? Which would you prefer to see a game played in?
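Why does linear perspective break down so badly at wide angles? On a flat image plane, a point at angle θ off the view axis lands at a distance proportional to tan(θ) from the centre, so the local stretch grows as 1/cos²(θ) and blows up as the half-angle approaches 90°. Here’s a minimal Python sketch of that blow-up (a sketch of the standard projection maths only, not of FOVO):

```python
import math

def edge_stretch(fov_deg: float) -> float:
    """Horizontal stretch at the edge of a planar (linear perspective)
    image, relative to the centre. A point at angle theta off the view
    axis projects to x = tan(theta), so the local magnification is
    d/d(theta) tan(theta) = 1 / cos(theta)^2, which equals 1 at the centre."""
    half_angle = math.radians(fov_deg / 2)
    return 1.0 / math.cos(half_angle) ** 2

for fov in (60, 90, 120, 160, 170):
    print(f"{fov:3d} deg FOV -> edge stretched ~{edge_stretch(fov):6.1f}x the centre")
```

At a 90° field of view the edges are stretched only twice as much as the centre, which we barely notice; at 160° the factor is roughly 33, which is the ‘vortex’ effect in the comparison above.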

Photographs capture light, not how we see

FOVO was invented by artists Robert Pepperell and Alistair Burleigh, who share a deep interest in imaging technology and the science of visual perception. It started as a research project at Cardiff School of Art in the UK about ten years ago, when we began to realise that conventional images created with cameras and computer graphics engines weren’t doing a very good job of representing how we see. Maybe you can relate to the following example: imagine looking at the full moon on a clear night. You whip out your smartphone to preserve the moment and are hugely disappointed by the results. The textured, luminous orb dominating the night sky is reduced to an insignificant smudge on your screen. Photography, as the philosopher Nelson Goodman quipped, is apt to make a molehill out of a mountain. When Paul Cézanne painted Mont Sainte-Victoire, as he did many times, he made it look big!

The reason for the discrepancy between how we see and what we snap is that, despite what is commonly thought, cameras are not really designed to emulate human vision but to capture rays of light and print them onto a flat plane. They do this job amazingly well, and are getting ever better. Most 3D rendering processes, including those used widely in gaming, emulate the optics of cameras using very smart maths and, again, they do it very well and are getting ever better. The problem is that people are not cameras, and while we do capture light with our eyes and focus it with lenses, the similarity ends there. Our light sensors are not arrayed on flat planes but on hemispheres, and most of us have two, each made up of millions of intricately designed neurons that interact with each other in astronomically complex ways. And that’s just the start. Once neural signals from the eyes get into the many areas of the brain devoted to vision, things get very convoluted indeed.
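For the technically minded, the camera-emulating maths at the heart of almost every engine boils down to a projection onto a flat plane followed by a divide by depth. Here is a minimal sketch (the function and parameter names are ours; real pipelines add matrices, clipping, depth buffering and much more):

```python
import numpy as np

def project_pinhole(points, fov_deg=90.0, aspect=16 / 9):
    """Project camera-space points onto a flat image plane, the way a
    standard linear perspective renderer does. The camera looks down -z;
    the result is in normalised device coordinates, where [-1, 1] spans
    the screen."""
    f = 1.0 / np.tan(np.radians(fov_deg) / 2)  # 'focal length' from the FOV
    x, y, z = points.T
    # The perspective divide: screen position is proportional to tan(angle).
    return np.stack([(f / aspect) * x / -z, f * y / -z], axis=1)

# A point straight ahead and one off to the side, both 10 m away in depth:
print(project_pinhole(np.array([[0.0, 0.0, -10.0], [5.0, 0.0, -10.0]])))
```

Everything downstream of that divide treats vision as rays of light striking a flat sensor, which is exactly the assumption we are questioning.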

The net result of all the visual processing going on in our eyes and brains is the rich and expansive experience we have when we look at the world. And that is not a photograph. It’s not flat, monoscopic, or surrounded by a rectangular frame, nor is it actually very high resolution. The optical quality of most of the visual field is so poor that if we saw only that we’d be almost blind. In fact, most of what we see is derived from a tiny region in the centre of the retina, and the rest is made up by the brain! Yes, most of what we see is not actually there; it’s what the brain thinks is most likely to be there, based on what we’ve seen before. And the reason the moon or the mountain seems so big to us is that our visual systems massively magnify any object we are paying attention to while, conversely, shrinking any that we are not. With enough concentration you can observe this in your own visual field, as demonstrated here:

These two circles are the same size. Look at the right-hand circle, focusing on the central cross. Now pay attention to the left-hand one, without moving your eyes. After a short time you may notice the left-hand circle appears smaller, and may even change shape. The effect gets stronger with practice.


Natural Rendering

What does all this mean for computer graphics in general, and for games graphics in particular? Well, if you want to create an image on a flat screen that looks and feels like the deep world we see in natural vision, it is not enough to just capture the paths of light and project them into pixels. Or, more accurately, it is just not practical with current technology. Even VR headsets have severe limitations in this regard. To achieve the widest, deepest and most immersive image on a flat, rectangular pixel grid, it turns out (as we have discovered after years of experimentation) that you have to throw away the linear perspective rulebook (originally invented by artists) and start again. You have to begin by modelling not the incoming light but the structure of visual perception and, much like artists do, create your images accordingly.

This is what we’ve been doing in the dark in a lab in an art school in Cardiff, closely studying how humans actually see and how the visual system works, understanding how artists have translated visual experience onto the flat plane for centuries (and by the way, they have almost never strictly applied the rules of linear perspective), and rethinking 3D computer graphics from the bottom up. The result is FOVO, a new method of rendering 3D space that emulates the human visual system, not cameras. We call it ‘Natural Rendering’, and here’s another example:

This comparison between FOVO and linear perspective shows the ‘molehill out of a mountain’ problem with conventional imaging technologies. The FOVO version is the one that looks like what you’d expect to see when standing in the landscape and looking at the distant mountain range. 


You can see the difference: the linear perspective render (which is mathematically and optically correct) looks unnaturally stretched, and the mountains in the middle of the image, usually the part people are most interested in, are tiny. In the perception-based FOVO render, by contrast, there is relatively little distortion of the space, and the mountain range in the distance is much larger. When we test this in the lab, we find that people are (unsurprisingly) much better at judging the virtual size and distance of objects in the FOVO render than in the linear perspective render. They also consistently tell us that FOVO images feel more immersive and that they feel more involved in the space. The immersive effect gets stronger the longer you look, and the larger the screen the image is viewed on.
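To put a rough number on the ‘molehill’ problem: with a planar mapping, distance from the centre of the screen grows as tan(θ), so at wide fields of view nearly all the pixels are spent on the periphery. Any mapping that grows more slowly than tan hands screen area back to the centre. The sketch below uses a simple angle-proportional (equidistant) mapping purely for comparison; it is not the FOVO algorithm:

```python
import math

def centre_share(fov_deg, centre_deg, mapping):
    """Fraction of the half-screen occupied by the central +/- centre_deg/2
    of the scene, under a radial mapping r(theta)."""
    return mapping(math.radians(centre_deg / 2)) / mapping(math.radians(fov_deg / 2))

mappings = [
    ("linear perspective", math.tan),     # r = tan(theta): planar projection
    ("angle-proportional", lambda t: t),  # r = theta: an equidistant mapping
]
for name, m in mappings:
    share = centre_share(160, 20, m)
    print(f"{name:18s}: central 20 deg fills {share:.1%} of the half-screen")
```

At a 160° field of view, linear perspective gives the central 20° of the scene (the mountains) barely 3% of the half-screen, while the angle-proportional mapping gives it 12.5%, roughly four times as much.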

Having spent much of the last decade developing this process, and getting it working in real-time 3D engines like Unity3D and Unreal Engine, we’re excited to see how it can be used in applications like gaming. Gamers often complain about the constricted field of view they have for exploring game worlds, usually limited to between 90 and 120 degrees, and this is certainly something that can be improved with Natural Rendering. But there are other aspects of the FOVO technology that could enhance the way games are designed and played. One unexpected by-product of the modifications we made to the standard rendering maths is that we created a highly flexible 3D geometry manipulator. We were forced to do this (rather reluctantly!) because it was not mathematically possible to create the natural effects we wanted without binning most of the beautifully simple linear perspective maths that standard 3D engines use. The result is that 3D spaces can be controlled in ways that are not possible with standard projection techniques, allowing designers to create attention-grabbing spatial manipulations, as demoed below:


These subtle adjustments of the 3D space are being applied volumetrically, as can be seen from the way the occlusion paths in the scene are changing. This demonstrates that the FOVO process is not simply ‘warping’ a 2D render in screen space but is applying nonlinear transformations to the entire 3D geometry.  
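For a concrete feel of the difference, here is a toy numerical sketch (our own illustrative maths, not the FOVO algorithm). A screen-space warp moves everything that lands on a given pixel together, so it can never change what occludes what. A depth-dependent remap of the 3D positions can:

```python
import numpy as np

def pinhole(points):
    """Standard perspective projection: camera looks down -z."""
    return points[:, :2] / -points[:, 2:3]

def volumetric_remap(points, k=0.3):
    """A hypothetical depth-dependent angular compression applied to the
    camera-space geometry itself (for illustration only). Because the
    strength of the bend depends on distance, two points that used to
    share a line of sight no longer do, which is what changes the
    occlusion paths in a scene."""
    x, y, z = points.T
    r_t = np.hypot(x, y)                           # transverse radius
    theta = np.arctan2(r_t, -z)                    # angle off the view axis
    dist = np.linalg.norm(points, axis=1)
    theta_new = theta * (1.0 - k / (1.0 + dist))   # nearer points bend more
    r_new = -z * np.tan(theta_new)                 # same depth, new direction
    s = np.divide(r_new, r_t, out=np.ones_like(r_t), where=r_t > 0)
    return np.stack([x * s, y * s, z], axis=1)

# Two points on the same line of sight, one five times farther than the other:
pts = np.array([[1.0, 0.0, -1.0], [5.0, 0.0, -5.0]])
print(pinhole(pts))                    # same screen position: near occludes far
print(pinhole(volumetric_remap(pts)))  # different positions: occlusion changed
```

A post-process warp applied to the first render would have moved both points to the same new pixel; the volumetric remap separates them.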

 


Confronting a killer robot might be a lot scarier in FOVO mode than in linear perspective mode when rendered at the same horizontal field of view. Note how the occlusion paths in the robot change with respect to the background. Such changes cannot be generated using standard projection techniques.


What does it cost?

The two things we consistently get asked by games devs, and others in the 3D graphics industry, are: will it make my game run slower, and will it break my workflow? The technical answer in both cases is: it depends. On performance, ultra-wide FOVO renders at 170° are generally more taxing than standard 90° viewports optimised for linear perspective. But FOVO can actually be faster, and (sometimes as importantly) higher quality, than currently available methods of generating ultra-wide fields of view. The more precise technical answer is that it depends on your budget for polygons and how they are distributed in the scene. On workflow, again, it depends on how much your project relies on the standard features and tools built into packages like Unity3D and Unreal Engine, and how much customisation you want to do. FOVO currently runs on a range of platforms and modes, and we are continually adding more. But given the many different engines and rendering techniques out there, we anticipate a certain amount of custom integration work will be needed for the foreseeable future, and we are working with devs to make integration as smooth and fast as possible.

Unity’s First Person Shooter demo running in HDRP mode, showing the same scene at an ultra-wide field of view in standard linear perspective and in FOVO.
 

The future of 3D gaming

As ever, there are lots of exciting developments in 3D graphics and imaging technology. Unity has its scriptable and high definition render pipelines for impressive high-quality real-time rendering (see the example with FOVO implemented above). Epic recently announced Nanite, and showed a compelling demo running on the PS5 with a scene made of so many triangles that you could barely see them (https://www.youtube.com/watch?v=qC5KtatMcUw). Apple is supporting 3D imaging with the new generation of iPads, and graphics hardware is accelerating in performance almost exponentially. Effective eye tracking, spatial tracking, volumetric displays, lightfield capture, and many other new toys are already on the horizon. We’re not going to predict the future, but it seems a reasonable bet that we’re going to be playing with gaming tech that is more immersive, higher quality, more responsive to our behaviour, and, we believe, more naturalistic in appearance. We think FOVO will be part of that curve, and we are happy to talk to anyone in the games industry who might help us find out how.

Biogs

Robert Pepperell is a co-founder of Fovotec. He studied at the Slade School of Art at UCL and spent most of the 1990s trying to create mind-blowing animations, music, and art on hopelessly underpowered microcomputers. ([email protected])

Alistair Burleigh is a co-founder of Fovotec. He studied a mixture of Art and Media Technology at the University of Wales, Newport in the 2000s. He has worked on all manner of creative media technology projects for major clients around the world. ([email protected])

For games tech related inquiries please contact Kristofer Rose ([email protected])

www.fovotec.com

 
