# The art of computing a building's volume

As part of our ongoing analysis of the 3D structure of buildings using open lidar datasets, we ran into the task of **volume computation**. Computing the volume of a building may sound easy once a 3D vector model is available, but there are a few pitfalls to avoid.

Computing the correct volume starts with tackling a seemingly benign question:

### What is the height of a building?

Let’s start from the raw data: a polygon corresponding to the shape of the building on the ground and a point cloud collected with aerial lidar, classified into ground and non-ground points.

Each point has a height measured in meters above sea level, but what we want to know is the height of the building above the ground. There are various off-the-shelf *Height Above Ground* algorithms (see for example the PDAL documentation) which assign to every point its height difference to the closest ground point. This leads to a first approach to computing the height of a building: take the point with the greatest height above the ground (or, for example, the 90th percentile, to avoid outliers like chimneys or trees). This is the approach taken in the 3D GRB of Informatie Vlaanderen.
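To make the percentile idea concrete, here is a minimal sketch in plain Python. It assumes the height above ground of every point over the building has already been computed (for example by a Height Above Ground filter); the function names `percentile` and `building_height_from_points` are ours, chosen for illustration only.

```python
import math


def percentile(values, q):
    """Return the q-th percentile (0-100) of values, with linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("at least one value is required")
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def building_height_from_points(heights_above_ground, q=90.0):
    """Estimate the building height as a high percentile of the point heights.

    heights_above_ground: one height (in meters) per point, assumed to be
    precomputed relative to the closest ground point.
    """
    return percentile(heights_above_ground, q)


# Nineteen roof points at 6 m and one point on an overhanging tree at 15 m.
heights = [6.0] * 19 + [15.0]
print(max(heights))                          # sensitive to the tree
print(building_height_from_points(heights))  # ignores the single outlier
```

With a single outlier the percentile does its job, but if more than 10% of the points belong to a tree, the 90th percentile lands on the tree as well.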

This method is nice for its simplicity, but raises a few questions:

- Are we sure that taking the 90th percentile gets rid of outliers? (For example large trees over the roof)
- **What if the ground is not flat around the building?** How do we choose a reference ground height?

We approach this problem in a different way. First, we get rid of trees, chimneys and other outliers by building a 3D model of the building using flat roof panes. We then take the *ground height* to be the median height of the ground points within 5 meters of the building. Finally, we define the building height as the difference between the highest point of the 3D model and the ground height. In most cases, this works well and does what we expect: it disregards underground garage entrances, nearby ponds, …
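The reference ground height can be sketched as follows. This is an illustration under our own assumptions, not the exact implementation: the footprint is a list of `(x, y)` vertices in a metric coordinate system, ground points are `(x, y, z)` tuples, and "within 5 meters of the building" is interpreted as "within 5 meters of the footprint outline". All function names are hypothetical.

```python
import math
from statistics import median


def distance_to_segment(p, a, b):
    """Shortest 2D distance from point p to the segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_footprint(p, footprint):
    """Shortest 2D distance from point p to the outline of the footprint."""
    n = len(footprint)
    return min(
        distance_to_segment(p, footprint[i], footprint[(i + 1) % n])
        for i in range(n)
    )


def reference_ground_height(ground_points, footprint, buffer_m=5.0):
    """Median z of the ground points within buffer_m of the footprint outline."""
    nearby = [
        z for x, y, z in ground_points
        if distance_to_footprint((x, y), footprint) <= buffer_m
    ]
    if not nearby:
        raise ValueError("no ground points near the building")
    return median(nearby)


def building_height(model_top_z, ground_points, footprint, buffer_m=5.0):
    """Highest point of the 3D model minus the reference ground height."""
    return model_top_z - reference_ground_height(ground_points, footprint, buffer_m)
```

Because the median is used, a handful of low points on a garage ramp or in a pond next to the house do not pull the reference height down.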

### Computing the volume using the raw point cloud

Just like for a building’s height, it is possible to approximate its volume without having a 3D model.

The easiest way is to multiply the average height above the ground of all points by the area of the ground polygon. This approach has a few downsides:

- The point cloud density is not the same everywhere, so taking the average point height might give more weight to certain areas.
- Some houses have very high and large trees extending above their roof; failing to remove those can have a significant impact.
- Non-flat ground around a building may produce unexpected results (e.g. a garage entrance).
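For reference, this point-cloud approximation fits in a few lines. The sketch below assumes the footprint is a simple polygon given as `(x, y)` vertices in meters and that the heights above ground have already been computed; the function names are ours.

```python
def polygon_area(footprint):
    """Area of a simple polygon given as (x, y) vertices (shoelace formula)."""
    n = len(footprint)
    total = 0.0
    for i in range(n):
        x1, y1 = footprint[i]
        x2, y2 = footprint[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def volume_from_points(heights_above_ground, footprint):
    """Approximate volume: mean point height times footprint area."""
    if not heights_above_ground:
        raise ValueError("at least one point is required")
    mean_height = sum(heights_above_ground) / len(heights_above_ground)
    return mean_height * polygon_area(footprint)


footprint = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]

# Half of the roof is at 3 m and half at 9 m, sampled evenly.
even = [3.0, 3.0, 9.0, 9.0]
# Same roof, but the low half was scanned three times as densely.
uneven = [3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 9.0, 9.0]

print(volume_from_points(even, footprint))
print(volume_from_points(uneven, footprint))
```

The two calls describe the same building, yet the second returns a smaller volume because the densely scanned low part dominates the average. This is the density issue from the first bullet above.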

### Computing the volume using 3D models

Just as for height, our preferred approach is to set the ground height to the median height of the ground points around the building, and then use our 3D model to sum up the volume below each roof pane.
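One way to sum the volume below the roof panes is sketched here, under assumptions of our own: each pane is a planar, convex polygon given as `(x, y, z)` vertices, panes do not overlap when seen from above, and walls are vertical. Each pane is split into triangles; for a planar triangle, the volume between it and the ground plane is its projected area times the mean height of its three corners above the ground. The function names are hypothetical.

```python
def triangle_prism_volume(a, b, c, ground_z):
    """Volume between a planar 3D triangle and the horizontal plane z = ground_z."""
    projected_area = abs(
        (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
    ) / 2.0
    # For a planar surface, the mean height over the triangle equals the
    # mean of the heights at its three corners.
    mean_height = (a[2] + b[2] + c[2]) / 3.0 - ground_z
    return projected_area * mean_height


def pane_volume(pane, ground_z):
    """Volume below one roof pane, using a fan triangulation.

    Assumes the pane is planar and convex (an assumption of this sketch).
    """
    first = pane[0]
    return sum(
        triangle_prism_volume(first, pane[i], pane[i + 1], ground_z)
        for i in range(1, len(pane) - 1)
    )


def building_volume(panes, ground_z):
    """Sum of the volumes below all roof panes of a building model."""
    return sum(pane_volume(pane, ground_z) for pane in panes)


# Gable roof on a 10 m x 10 m footprint: eaves 3 m and ridge 5 m above
# a reference ground height of 20 m.
ground_z = 20.0
panes = [
    [(0, 0, 23), (5, 0, 25), (5, 10, 25), (0, 10, 23)],
    [(5, 0, 25), (10, 0, 23), (10, 10, 23), (5, 10, 25)],
]
print(building_volume(panes, ground_z))
```

For this example the result can be checked by hand: a 10 × 10 × 3 box (300 m³) plus a triangular prism of 0.5 × 10 × 2 × 10 (100 m³).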

So why do we use the **median** ground height as reference instead of using the **minimum** ground height? After all, an underground garage entrance signals the presence of a basement. Shouldn’t that be included in the volume computation as well?

Our reasoning is that many houses have a basement which is not visible at all from the outside. We seek a volume computation that is **as coherent as possible** across all houses. While we are able to detect the presence of certain (semi-)underground structures, we do not take into account (detectable) underground garages or (undetectable) underground basements when making volume computations. By doing so, we obtain a more coherent and comparable set of building volumes.