In R's data.table, when should one choose between %between% and %inrange% for subsetting operations? I've read the help page for ?between and I'm still scratching my head as to the differences.
    library(data.table)
    X = data.table(a=1:5, b=6:10, c=c(5:1))

    > X[b %between% c(7,9)]
       a b c
    1: 2 7 4
    2: 3 8 3
    3: 4 9 2
    > X[b %inrange% c(7,9)]
       a b c
    1: 2 7 4
    2: 3 8 3
    3: 4 9 2
They look the same to me. Could someone please explain why there exist both operations?
asked Mar 6, 2017 at 18:33
user7613376
X[b %inrange% list(lower = c(6,9), upper = c(7,10))] -- Example of what Kristoferson said.
Commented Mar 6, 2017 at 18:42
Compare X[a %between% list(c, b)] vs X[a %inrange% list(c, b)] and then read the docs again.
Commented Mar 6, 2017 at 18:43
    > X
       a  b c
    1: 1  6 5
    2: 2  7 4
    3: 3  8 3
    4: 4  9 2
    5: 5 10 1
Using the example in the comments:
    > X[a %between% list(c, b)]
       a  b c
    1: 3  8 3
    2: 4  9 2
    3: 5 10 1
    > X[a %inrange% list(c, b)]
       a  b c
    1: 1  6 5
    2: 2  7 4
    3: 3  8 3
    4: 4  9 2
    5: 5 10 1
It seems between looks at each row individually and checks whether the value in a satisfies c <= a <= b for that same row. Only rows 3, 4 and 5 pass: in row 1, for example, a = 1 is below c = 5.
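A minimal sketch of that row-wise reading of between, assuming the same X as in the question: with vector bounds, %between% should select the same rows as an elementwise comparison written in base R. The names rowwise and res_between are only illustrative.

```r
library(data.table)

X <- data.table(a = 1:5, b = 6:10, c = 5:1)

# Row-wise check: each a[i] is compared only with its own c[i] and b[i]
rowwise <- X$a >= X$c & X$a <= X$b

# %between% with vector bounds (bounds are inclusive by default)
res_between <- X[a %between% list(c, b)]

# TRUE if both approaches keep the same values of a (rows 3, 4 and 5 here)
identical(res_between$a, X$a[rowwise])
```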
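inrange, by contrast, treats every (c, b) pair as an interval and checks, for each value of a, whether it falls in any of those intervals, not only the one from its own row. Here the intervals [5, 6], [4, 7], [3, 8], [2, 9] and [1, 10] are nested, so their union is simply [cmin, bmax] = [1, 10], where cmin is the smallest value in c and bmax is the largest value in b, and every value of a lies inside it. That shortcut only holds when the intervals overlap: with disjoint intervals, such as X[b %inrange% list(lower = c(6,9), upper = c(7,10))] from the comments, values in the gap (b = 8) are excluded.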
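One way to probe inrange is with intervals that do not overlap, as in the first comment. The sketch below assumes the same X as in the question; in_any and by_hand are illustrative names. As documented in ?between, inrange tests each value against any of the supplied intervals, so a single overall range [6, 10] would keep b = 8 while inrange drops it.

```r
library(data.table)

X <- data.table(a = 1:5, b = 6:10, c = 5:1)

# Two disjoint intervals: [6, 7] and [9, 10]
lower <- c(6, 9)
upper <- c(7, 10)

# inrange: TRUE when a value falls in ANY of the intervals
in_any <- X$b %inrange% list(lower, upper)

# The same test written out by hand in base R, for comparison
by_hand <- vapply(X$b, function(v) any(v >= lower & v <= upper), logical(1))

# Keeps b = 6, 7, 9, 10; b = 8 sits in the gap between the intervals
X[in_any]
```

Which operator to use then depends on whether the bounds describe one interval per row (between) or a set of intervals to match every value against (inrange).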