3D Scanning Is Dead, All Hail the NeRFs

28 January 2023

Neural Radiance Fields (NeRFs) are the next big thing for visualizing the world around us, more powerful than lidar, laser, or aerial scanning on its own. A NeRF can synthesize new views, fill in the gaps from photos or videos, and create a much cleaner and more visually appealing output than anything being used today.

Current State of Things

A 360-degree camera is the most basic form of site documentation. It is mounted on a stick or stand, and a photo is taken. That photo captures the entire room, but you cannot move around within it; you are stuck at the point where the camera stood. This type of documentation works at all stages of construction, but it only goes so far.

That is where photogrammetry and 3D scanning come in: using either an iPad with a lidar scanner or a dedicated laser scanner, we can create a 3D model of a room. An iPad isn't that expensive and gives an acceptable output in terms of a 3D model; it is functional and works, but the measurements are not reliable. A laser scanner is very expensive but gives a great output and is precise and reliable for measurements. However, both of them are slow to capture: between carefully moving an iPad around or carefully setting up a laser scanner, this kind of documentation takes time. On top of that, the data requirements of photogrammetry are large; each room can be hundreds of photos, and each model can have hundreds of thousands, if not millions, of vertices. The output is usually not perfect either and requires cleanup to be actually workable. Nobody has time for this, which is why it is not used as often as 360-degree photos.

The New Thing

Neural Radiance Fields (NeRFs) are the next big thing, and many of the apps you use will update to them instead of photogrammetry or lidar. But what are they? Let's start by defining photogrammetry and lidar so the difference can be better understood.

Photogrammetry does its magic by taking a series of overlapping photos, comparing them to each other, finding matching points of interest, and using those matches to work out where each photo was taken from. Once the computer knows where the cameras were, it can work backwards and triangulate the depth of objects in the photos, producing a point cloud and, from that, a mesh.
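To make that more concrete, here is a minimal two-view sketch of the idea in Python with OpenCV. It is not a full photogrammetry pipeline (real tools match many photos and refine everything with bundle adjustment), and the image filenames and the intrinsics matrix `K` are placeholders you would supply yourself.

```python
import cv2
import numpy as np

# Placeholder inputs: two overlapping photos and an assumed camera intrinsics matrix K.
img1 = cv2.imread("photo_1.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("photo_2.jpg", cv2.IMREAD_GRAYSCALE)
K = np.array([[1000.0, 0.0, 960.0],
              [0.0, 1000.0, 540.0],
              [0.0, 0.0, 1.0]])  # focal length and principal point (made up)

# 1. Find points of interest in each photo and match them between the two.
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)
matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]  # ratio test
pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
pts2 = np.float32([kp2[m.trainIdx].pt for m in good])

# 2. Use the matches to work out where the second camera is relative to the first.
E, _ = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)

# 3. With both camera poses known, triangulate the matched points into 3D.
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])  # first camera at the origin
P2 = K @ np.hstack([R, t])                         # second camera's pose
pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
point_cloud = (pts4d[:3] / pts4d[3]).T             # homogeneous -> N x 3 points

print(f"Recovered {len(point_cloud)} 3D points from {len(good)} matches")
```

Real photogrammetry software repeats this matching and triangulation across every photo pair and then meshes the resulting point cloud, but the core "match, solve for the cameras, triangulate" loop is the same.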

Lidar works by using a Light Detection and Ranging (lidar) module to do exactly what the name says: it emits pulses of infrared light and measures how long they take to bounce back, which gives the distance to whatever they hit. As the device moves around, the computer combines those depth readings into precise 3D geometry. This is an advantage over plain photogrammetry in most cases, since it uses hardware specifically designed for depth sensing. However, the lidar scanners on iPad and iPhone are limited in that their range is only around 10 ft.
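As a rough illustration of how those raw measurements become geometry: time of flight gives a distance per pulse, and each distance reading at a known pixel can be back-projected into a 3D point using the camera intrinsics. The sketch below uses Python/NumPy with made-up depth values and intrinsics, not an actual sensor API.

```python
import numpy as np

SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def tof_distance(round_trip_seconds: float) -> float:
    """Distance from a time-of-flight reading: the light travels out and back."""
    return SPEED_OF_LIGHT * round_trip_seconds / 2.0

def depth_map_to_points(depth: np.ndarray, fx: float, fy: float,
                        cx: float, cy: float) -> np.ndarray:
    """Back-project a depth image (metres per pixel) into an N x 3 point cloud,
    assuming a simple pinhole camera model."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

# Example with fake data: a flat wall 3 metres away seen by a 256 x 192 depth
# sensor (resolution and intrinsics are assumptions, not real device specs).
depth = np.full((192, 256), 3.0)
cloud = depth_map_to_points(depth, fx=210.0, fy=210.0, cx=128.0, cy=96.0)
print(cloud.shape)          # (49152, 3)
print(tof_distance(20e-9))  # a 20 ns round trip is about 3 m
```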

Laser scanning works on the same principle as lidar, using survey-grade lasers (instead of compact lidar modules) together with cameras to capture depth data very precisely. Laser scanners are very expensive, but they can cover much larger areas than consumer lidar products.

NeRFs do not use lidar or laser scanning modules. Instead, they use photos like photogrammetry, but the way they process them is different. A NeRF still starts by solving for the camera positions from the photos, but instead of triangulating points and fusing them into a 3D mesh, it trains a neural network to represent the whole scene: for any point in space and any viewing direction, the network predicts a color and a density, and new images are rendered by combining those predictions along the rays of a virtual camera. Because the color can change with the viewing direction, NeRFs are able to reproduce reflections, shadows, and transparency. A NeRF can still be converted to a mesh, but it loses those advantages in the process.
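For the curious, here is a heavily simplified sketch of that idea in PyTorch: a small network maps a 3D position and viewing direction to a color and a density, and a ray is rendered by sampling points along it and compositing them. This is only a toy illustration of the core concept, not a real implementation; actual NeRF pipelines add hierarchical sampling, training against the photos and their recovered camera poses, and much more.

```python
import torch
import torch.nn as nn

def positional_encoding(x, num_freqs=6):
    # Map each coordinate to [x, sin(2^k x), cos(2^k x)] so the network can
    # represent fine detail.
    feats = [x]
    for k in range(num_freqs):
        feats.append(torch.sin((2.0 ** k) * x))
        feats.append(torch.cos((2.0 ** k) * x))
    return torch.cat(feats, dim=-1)

class TinyNeRF(nn.Module):
    def __init__(self, num_freqs=6, hidden=128):
        super().__init__()
        self.num_freqs = num_freqs
        in_dim = (3 + 3) * (1 + 2 * num_freqs)  # encoded position + view direction
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # outputs: RGB + density
        )

    def forward(self, xyz, view_dir):
        h = torch.cat([positional_encoding(xyz, self.num_freqs),
                       positional_encoding(view_dir, self.num_freqs)], dim=-1)
        out = self.mlp(h)
        rgb = torch.sigmoid(out[..., :3])  # color in [0, 1]
        sigma = torch.relu(out[..., 3])    # non-negative density
        return rgb, sigma

def render_ray(model, origin, direction, near=0.1, far=5.0, n_samples=64):
    """Volume-render one ray: sample points along it and alpha-composite them."""
    t = torch.linspace(near, far, n_samples)
    pts = origin + t[:, None] * direction       # sample positions along the ray
    dirs = direction.expand(n_samples, 3)
    rgb, sigma = model(pts, dirs)
    delta = t[1] - t[0]                         # spacing between samples
    alpha = 1.0 - torch.exp(-sigma * delta)     # opacity of each segment
    # Transmittance: how much light survives to reach each sample.
    trans = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alpha + 1e-10])[:-1], dim=0)
    weights = alpha * trans
    return (weights[:, None] * rgb).sum(dim=0)  # composited RGB for this ray

# Untrained example: render one ray from the origin looking down the z-axis.
model = TinyNeRF()
color = render_ray(model, torch.zeros(3), torch.tensor([0.0, 0.0, 1.0]))
print(color)
```

The key detail is that the viewing direction is an input to the network, which is exactly what lets the predicted color change with the viewer's perspective and gives NeRFs their reflections and transparency.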

If you did not understand that, don't worry; it is not a simple thing to grasp. It is much better to observe the differences and see for yourself why NeRFs beat any other 3D capture method out there. Linked here, and also embedded below, is a great video by Corridor Digital showing how NeRFs will be amazing for the VFX and movie industry. While their use cases might not line up exactly with architectural design and engineering, the concepts still carry over; after all, VFX and filmmaking are the art of telling a story and communicating effectively, something that applies in every industry.

Another thing to consider, if you did not watch that video (which I very much recommend, as it showcases some great examples), is that NeRFs are still in the early stages of growth. That means established workflows for them do not yet exist. But I do have a few predictions. Polycam, a great phone- and tablet-based photogrammetry app, recently started working on implementing NeRFs. Another startup, Luma AI, has built its business around making NeRFs accessible and provides an app and API to do exactly that. Since capturing a NeRF and capturing a photogrammetry model or point cloud involve a similar process, I can see many other apps following suit, so keep an eye out for updates from the developers of the apps you use.

The benefits going forward are clear, not just for construction but also for showing off virtual scenes from Twinmotion or other rendering tools in a lightweight way. Reflections and transparency add a great deal to a person's understanding of a scene, since the result is much more realistic than a static 3D model.

Please click this link and scroll for a bit; just think about how something like this could be of use to you or your project.

Also, here are some examples from Luma AI. Note the three different modes at the bottom: the first is a true 3D model with a shader that simulates light reflections, the second is a true NeRF that uses the point cloud directly for lighting, and the third is a rendered video using the NeRF.