Opened 3 years ago

Closed 3 years ago

## #11928 closed Bugs (fixed)

# surveyor area method is inaccurate

Reported by: | Owned by: | Barend Gehrels | |
---|---|---|---|

Milestone: | Boost 1.61.0 | Component: | geometry |

Version: | Boost 1.58.0 | Severity: | Problem |

Keywords: | area | Cc: | charles.karney@… |

### Description

The following code illustrates a problem with the "surveyor" strategy
for computing planar areas. It computes the area of a square of area
2 and the same square translated by 10^{8} in x and y. bg::area
incorrectly computes the area in the second case as 0.

#include <iostream> #include <boost/geometry.hpp> namespace bg = boost::geometry; typedef bg::model::point<double, 2, bg::cs::cartesian> pt; int main() { bg::model::polygon<pt, false, false> poly; // A simple square of area 2 bg::read_wkt("POLYGON((1 0,0 1,-1 0,0 -1))", poly); // Computes the correct area (= 2) std::cout << "area " << bg::area(poly) << "\n"; // The same square shifted by (10^8 10^8) bg::read_wkt("POLYGON((100000001 100000000, \ 100000000 100000001, \ 99999999 100000000, \ 100000000 99999999))", poly); // Should give 2 but instead gives 0 std::cout << "area " << bg::area(poly) << "\n"; }

Running this code gives

area 2 area 0

Discussion:

- Surely it's a bad idea to embed the algorithm name "surveyor" into the API. Computing 2d areas is really straightforward. Users shouldn't need to worry about the underlying method. Maybe "planar_area" is a better name.
- In any case "surveyor" is a rather silly name of the type beloved of Wikipedia editors. There's no evidence that surveyors actually used this method.
- It's more expensive (at 2*n multiplies and 2*n additions) than applying the trapezium rule (otherwise known as "doing the integral"), i.e., summing (x[i+1] - x[i]) * (y[i+1] + y[i]) / 2, which clocks in at n multiplies and 3*n additions.
- Finally, as this example illustrates, it's subject to catastrophic loss of accuracy, while the trapezium rule will give the exact result. (Kahan contrasts evaluating x*x - y*y compared with (x - y) * (x + y) with x = 1e15+1, y = 1e15-1. The former result is wrong by 1.5% while the latter is exact.)

### Change History (4)

### comment:1 Changed 3 years ago by

### comment:2 Changed 3 years ago by

Thanks for the reference to Abel Flint (a great name!). You will see on
pp. 59-63 that he describes the trapezium rule and *not* the so-called
surveyor's method! Most of his contemporaries who were mathematicians
would have recognized this as such and not, therefore, have seen it as
some special technique invented by a surveyor.

The analysis of why

sum x

_{i}y_{i+1}- x_{i+1}y_{i}

is typically subject to worse round off errors than

sum (x

_{i+1}- x_{i}) (y_{i+1}+ y_{i})

goes as follows. Each formula contains

- products
- differences in forming each term in the sum
- differences in forming the sum itself

In the surveyors formula 2 follows 1 and so the mild round off errors
from the products are sometimes catastrophically magnified. In the
trapezium rule 2 precedes 1 and there is *no* round off error if x_{i+1}
and x_{i} are within a factor of 2 of each other (which is a common
situation).

Both formulas are subject to potentially dangerous round off with 3.
Centering the numbers might ameliorate this problem. Alternatively,
merely switching to the other trapezium rule involving (y_{i+1} - y_{i})
would help, e.g., in the case of UTM coordinates where often the
northings are much larger than the eastings.

### comment:3 Changed 3 years ago by

By the way, in GeographicLib, I use an Accumulator class to perform the summation. This effectively sums at double the normal precision.

### comment:4 Changed 3 years ago by

Milestone: | To Be Determined → Boost 1.61.0 |
---|---|

Resolution: | → fixed |

Status: | new → closed |

I tweaked the formula as you suggested. Thanks.

**Note:**See TracTickets for help on using tickets.

The name of the formula seems to be older than Wikipedia [1][2].

I haven't measured it but I'd rather guess that nowadays the most expensive part in both cases is data access.

Furthermore I guess that in the case of the trapezium rule the result is also not precise in general when coordinates are big. So I'd rather fight with this issue by moving all of the coordinates closer to the origin of the coordinate system before passing them into the strategy, so doing it on the fly while calculating the area with any strategy. Similar technique was used before to improve the precision of

`centroid()`

. This solution should improve the precision of a function no matter what strategy was used.With that said I'm not against changing the formula. But should trapezium rule be better in all cases or is there a reason why the old one could be left as it was?

[1] Flint, Abel. "A System of Geometry and Trigonometry: together with a treatise on surveying: teaching various ways of taking the survey of a field; also to protract the same and find the area. Likewise, Rectangular Surveying; Or, an accurate method of calculating the area of any field arithmetically, without the necessity of plotting it.", 1808

[2] Braden, Bart. "The surveyor’s area formula.", 1986