Comparing Floating Point Numbers in R

Floating Trap

Today we will discuss an important topic in R programming, especially when you are dealing with numerical methods: comparing numbers with many decimals. The problem comes from the difference between how machines and humans understand real numbers.

Before we get started, let’s try to answer this question: what’s the result of “.3/.1 == 3”? If you think the answer is “TRUE”, then you should read this tutorial. The correct answer is “FALSE”. This is what it looks like in R:

(.3/.1) == 3

## [1] FALSE
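To see where the mismatch comes from, it can help to print more digits than R shows by default. The following is a minimal sketch using only base R; the variable name x is just for illustration.

```r
# Store the quotient and print it with 17 significant digits
x <- .3 / .1
sprintf("%.17g", x)
## [1] "2.9999999999999996"

# The gap to 3 is tiny, but it is not zero, so == returns FALSE
x - 3
## [1] -4.440892e-16
x == 3
## [1] FALSE
```

The default print method rounds to 7 significant digits, which is why the quotient looks like a plain 3 on the console.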

What Is a Floating-Point Number

Most programming languages have two kinds of numbers: integers and floating-point numbers. Floating point is used to represent fractional values (values with decimals). Floating-point formats include single precision and double precision:

1. Single-precision occupies 32 bits in computer memory.

2. Double-precision occupies 64 bits in computer memory.

For more information about the concepts, you can check out the Wikipedia page. As suggested there, double precision may be chosen when the range or precision of single precision would be insufficient. But R doesn’t have a single-precision type. That’s why R returns “double” when you check the type of a number using “typeof()”. For example,

typeof(0.5)

## [1] "double"
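As a short illustration of the two numeric types in base R, the sketch below contrasts an integer literal with a double and prints two properties of the double format from .Machine.

```r
# The L suffix requests an integer; a plain numeric literal is a double
typeof(1L)
## [1] "integer"
typeof(1)
## [1] "double"

# Number of bits in the significand of a double
.Machine$double.digits
## [1] 53

# Smallest positive x for which 1 + x != 1
.Machine$double.eps
## [1] 2.220446e-16
```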

Note that floats use about half as much memory as doubles. The price is accuracy, which many statisticians are unwilling to give up. If your data can work well without that level of accuracy, floats could be a suitable choice.

We will not get involved in the debate over which one is better. I just want to say that, as a business (information systems specialization) researcher, I use the default (double) in R all the time to deal with floating-point numbers. What I focus on in this tutorial is how to compare floating-point numbers (by default, double precision) in R, as this is a common source of errors in numerical methods. It’s listed as the first topic, called the floating trap, in The R Inferno, which is a good book by the way.

Work with Float in R

If you want to work with floats in R, I recommend you read the introduction to the package “float”, which has nice instructions for getting started. Please remember to install single-precision BLAS/LAPACK routines before you use “float” (the instructions suggest Microsoft R Open). Otherwise, you may not obtain the speed improvement.

Comparing Floats

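The issue of comparing floats results from the binary representation of decimal numbers: values such as .3 and .1 have no exact binary representation, so tiny rounding errors creep into the result. One option is to use the “all.equal” function.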

all.equal(.3/.1, 3, tolerance = sqrt(.Machine$double.eps))

## [1] TRUE

You see that, within a very tiny tolerance, the two values are nearly equal. The tolerance is set by “tolerance = sqrt(.Machine$double.eps)”.
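One thing to watch for when using all.equal() inside a condition: when the values differ, it returns a character string describing the difference rather than FALSE. A common pattern, sketched here with base R only, is to wrap the call in isTRUE().

```r
x <- .3 / .1

# When the values differ, all.equal() returns a description, not FALSE
res <- all.equal(x, 3.1)
is.character(res)
## [1] TRUE

# isTRUE() turns the result into a single logical that is safe inside if ()
if (isTRUE(all.equal(x, 3))) message("equal within tolerance")
isTRUE(all.equal(x, 3.1))
## [1] FALSE
```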

The other option is the “fpCompare” package, which offers relational operators to compare floats with a set tolerance. Let’s install the package and load it into the environment.

# install from CRAN
# install.packages("fpCompare")
library(fpCompare)

Let’s compare floats using relational operators. Here is a list:

1. b %<=% a (b <= a)

2. b %<<% a (b < a)

3. b %>=% a (b >= a)

4. b %>>% a (b > a)

5. b %==% a (b == a)

6. b %!=% a (b != a)

Now you get the expected result: the two values are nearly equal, although not exactly equal in machine arithmetic.

(.3/.1) %==% 3
## [1] TRUE

Notice that we can change the tolerance value based on our needs by setting fpCompare.tolerance in options. By default, it uses the same tolerance (.Machine$double.eps^.5, which is 0.00000001490116) that all.equal() uses for numeric comparisons.

tol <- .Machine$double.eps^.5 # default value
options(fpCompare.tolerance = tol)

# run the comparison again with tolerance
(.3/.1) %==% 3
## [1] TRUE
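If you prefer not to add a dependency, the same idea can be sketched in base R. The helper nearly_equal() below is hypothetical and written only for illustration; it is not part of fpCompare and does not claim to reproduce how that package implements its operators. It assumes an absolute tolerance is appropriate for your data.

```r
# Hypothetical helper (not from fpCompare): equality within an absolute tolerance
nearly_equal <- function(x, y, tol = sqrt(.Machine$double.eps)) {
  abs(x - y) < tol
}

nearly_equal(.3 / .1, 3)
## [1] TRUE

# Works element-wise on vectors as well
nearly_equal(c(0.1 + 0.2, 1), c(0.3, 1.1))
## [1]  TRUE FALSE
```

An absolute tolerance is simple, but it may be too loose for very small values and too strict for very large ones, so choose tol with the scale of your numbers in mind.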

Last tip: sometimes you see scientific notation in R and want to avoid it. You can use the scipen option in options(). It is a penalty applied when R decides between fixed and scientific notation: positive values bias R toward fixed notation, and smaller or negative values make R switch to scientific notation more readily. The default is scipen = 0.

options(scipen = 4)
print(.Machine$double.eps^.5)

## [1] 0.00000001490116

options(scipen = 3)
print(.Machine$double.eps^.5)

## [1] 1.490116e-08
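Because options() changes the setting for the whole session, it may be worth either formatting a single value or restoring the previous setting afterwards. The sketch below uses base R only; the value 100 for scipen is an arbitrary large penalty chosen for illustration.

```r
x <- .Machine$double.eps^.5

# Format one value in fixed notation without touching global options
format(x, scientific = FALSE)

# options() returns the previous values, so they can be restored later
old <- options(scipen = 100)
print(x)
options(old)
```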

References

1. fpCompare: https://fpcompare.predictiveecology.org/

2. The R Inferno: https://www.burns-stat.com/pages/Tutor/R_inferno.pdf

3. What every computer scientist should know about floating-point arithmetic (Goldberg, David. 1991): https://dl.acm.org/doi/10.1145/103162.103163
