News & Commentary: 2008-08-10

Fast Distance Calculation in Two Dimensions

This is not really new, but for me it is new because I hadn't seen it before. The idea is that you have two points in a 2D space and you want to calculate the distance between them.

Ideal Method (but Slow)

Pytharogas had the answer a long time ago, and it looks like this:

dx = x1 - x2;
dy = y1 - y2;
dist = sqrt( dx * dx + dy * dy );

Now for the bad news, the sqrt() function can be slow to calculate. Low-end hardware does not support this operation directly and some people like to work in integer-only maths. Even modern PC style hardware (that does implement sqrt() in the CPU) still runs a little slower than simple linear integer maths. So we want a fast calculation that avoids the sqrt() function but gets the same result (and something that can easily be converted to integers).

Above is a plot of the Pythagoras formula. This one gives the correct answer, so compare all other methods to this one to get a feel for their accuracy.

Fast Method #1

The first fast method is a very simple octagonal formula that is easy to remember, moderately accurate and when implemented in integers requires no multiplication (because shift right will divide by 2). It always guesses the distance either correctly or a bit too large. This method is popular for "first cut" pathfinding because it is very fast (and with pathfinding, nearly all the results are rapidly thrown away). This example function is pure integer and returns a fixed-point result offset by 10 bits. Note that this method does not lose accuracy even when you greatly reduce the number of fractional bits in the fixed-point arithmetic, so it is also an excellent choice for small register widths.

static unsigned int distance( int dx, int dy )
{
	if( dx < 0 ) { dx = -dx; }
	if( dy < 0 ) { dy = -dy; }
	if( dx < dy )
	{
		return(( dy << 10 ) + ( dx << 9 ));
	}
	else
	{
		return(( dx << 10 ) + ( dy << 9 ));
	}
}

See below for a plot of this method.

Fast Method #2

If you don't mind doing real multiplies and want to tweak a little bit more accuracy out of the octagonal system, then adjusting the constants will allow a tradeoff between over-estimating some at angles while under-estimating at other angles. This gives a more accurate result at the penalty of multiply operations (most hardware has accelerated multiply in this day and age, but still it is slower than simple add). See below for the plot (with the magic constants in the formula). Converting this to integer calculations is relatively easy if you have reasonably wide registers with enough headroom to multiply up and shift down again. As above, we return a result as fixed-point with a 10 bit fractional part.

static unsigned int distance( int dx, int dy )
{
	if( dx < 0 ) { dx = -dx; }
	if( dy < 0 ) { dy = -dy; }
	if( dx < dy )
	{
		return( 970 * dy + 402 * dx );
	}
	else
	{
		return( 970 * dx + 402 * dy );
	}
}

Fast Method #3

Extending the previous method a little bit, we check the gradient and divide our program flow into two different directions. Then use even better tuned coefficients, each one tuned for a particular region of the space. This two-way split allows a better result than the octagon for only one additional branch and very slightly larger code. This runs a fraction slower but gets substantially lower error.

static unsigned int distance( int dx, int dy )
{
	if( dx < 0 ) { dx = -dx; }
	if( dy < 0 ) { dy = -dy; }
	if( dx < dy )
	{
		if(( dx + dx ) < dy )
		{
			return( 1006 * dy + 237 * dx );
		}
		else
		{
			return( 834 * dy + 602 * dx );
		}
	}
	else
	{
		if(( dy + dy ) < dx )
		{
			return( 1006 * dx + 237 * dy );
		}
		else
		{
			return( 834 * dx + 602 * dy );
		}
	}
}

Fast Method #4

So how to go one better? Using a quadratic provides considerably better accuracy but also requires a divide operation. The divide operation introduces a special case when the distance is zero (easy to handle the special case, if MAX is zero then distance is zero) so the special case test must be done on every cycle. Some hardware can accelerate divide operations, and extra multiplies are required here -- so the additional accuracy comes with some penalty. Should usually still be faster than a sqrt() function in most situations. Here is some example code, again returning a fixed-point result offset by 10 bits.

static unsigned int distance( int dx, int dy )
{
	unsigned int R = 0;

	if( dx < 0 ) { dx = -dx; }
	if( dy < 0 ) { dy = -dy; }
	if( dx < dy )
	{
		if( dy > 0 ) { R = dx; R <<= 10; R /= dy; }
		dx = dy; /* NB: dx always becomes MAX */
	}
	else
	{
		if( dx > 0 ) { R = dy; R <<= 10; R /= dx; }
	}
	return( dx * (((((( 375 * R ) >> 10 ) + 63 ) * R ) >> 10 ) + 1019 ));
}

See below for a plot of this method, pretty good huh?

This gives substantially better accuracy and distributes the error with a little on the over-estimate side and a little on the under-estimate side.

Compare with the ideal calculation at the top of the page, almost impossible to see the difference.

Score Card

Let's see how they stack up against each other. The test platform was a Celeron class machine with fsqrt assembler operation (i.e. sqrt() function is implemented by the CPU hardware). The error values use an RMS to get a typical percentage error. All integers are 32 bit.

MethodTypical ErrorSpeedup Factor
fsqrt assembler operation0%1x
Method #19.3%19x
Method #22.4%15x
Method #30.64%12x
Method #40.23%1.7x

Conclusion

The most accurate (Method #4, with the quardatic) is probably only useful in cases where your sqrt() implementation is hopelessly slow (e.g. an iterative sqrt() using Newton's method, implmented in software). Certainly it will beat a badly implemented sqrt(). However, with hardware acceleration, Method #4 is only marginally faster than the ideal answer, so might as well use sqrt() and get the answer correct.

For most other circumstances, I would recommend Method #3 which is better than 1 part in 100 and still substantially faster than using Pythagoras. Consider using Method #1 if you are working on very primitive hardware, short registers and speed is much more important than an accurate answer.

See Also

Creative Commons License
This work is licensed under a Creative Commons License.

Back to News Commentary Index