Filtering Vectors

I saw a YouTube short demonstrating how to use filter() in Python, along the lines of:

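nums = range(1, 30)

def is_prime(num):
    # Brute force: try every candidate divisor from 2 up to num - 1.
    # For num = 1 or 2 the range is empty, so both fall through to True.
    for x in range(2, num):
        if num % x == 0:
            return False
    return True

primes = list(filter(is_prime, nums))
print(primes)
# [1, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29]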

As always, I like to think about how I’d do this in other languages (e.g. R).
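A side note on the Python snippet, as a minimal sketch: in Python 3, filter() returns a lazy iterator, which is why the result is wrapped in list(). A list comprehension gives the same answer. The is_prime() definition is repeated here only so the block runs on its own; the variable names lazy, via_filter and via_comprehension are just for illustration.

```python
def is_prime(num):
    # Same brute-force check as above; 1 passes because the loop body never runs.
    for x in range(2, num):
        if num % x == 0:
            return False
    return True

nums = range(1, 30)

lazy = filter(is_prime, nums)            # an iterator; nothing has been computed yet
via_filter = list(lazy)                  # consuming it calls is_prime on each element
via_comprehension = [n for n in nums if is_prime(n)]

print(via_filter == via_comprehension)   # True
print(list(lazy))                        # [] because the iterator is already exhausted
```

The second print is the part that tends to surprise people: a filter object can only be walked once.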

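The (deliberately brute-force) is_prime() function translates easily enough; the early return for 1 and 2 is needed because 2:(x-1) counts down in R, rather than producing an empty sequence, when x is 1 or 2: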

is_prime <- function(x) {
  if (x %in% 1:2) return(TRUE)
  for (i in 2:(x-1)) {
    if (x %% i == 0) return(FALSE)
  }
  TRUE
}

The most common way to run this over a vector of inputs is with something like sapply(), which returns a logical vector that can be used for square-bracket subsetting:

x <- 1:30
primes <- x[sapply(x, is_prime)]
primes
[1]  1  2  3  5  7 11 13 17 19 23 29
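For comparison with the Python snippet, the same mask-then-subset pattern can be sketched on the Python side. This assumes itertools.compress as a rough stand-in for R's logical subsetting; that pairing is an analogy chosen for illustration, not something shown in the video.

```python
from itertools import compress

def is_prime(num):
    # Same brute-force check as the Python version above (1 counts as prime here).
    for n in range(2, num):
        if num % n == 0:
            return False
    return True

x = range(1, 31)                     # 1 to 30, matching x <- 1:30
mask = [is_prime(n) for n in x]      # plays the role of sapply(x, is_prime)
primes = list(compress(x, mask))     # plays the role of x[mask]

print(primes)
```

As in the R version, x has to be mentioned twice: once to build the mask and once to apply it.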

We can hide the sapply() call by vectorising the function, which is done easily with Vectorize() (itself a wrapper around mapply(), so this is a convenience rather than a speed-up):

is_prime_v <- Vectorize(is_prime)
primes <- x[is_prime_v(x)]
primes
[1]  1  2  3  5  7 11 13 17 19 23 29

It’s usually at this point that I feel bad about having to write x twice and lament that, while {dplyr} is great for data.frames, we don’t have an equally easy way to filter vectors… except we do (as I usually end up remembering). Filter() is a base function that works on vectors and is the equivalent of what we saw in Python (per the docs, “Filter corresponds to filter in Haskell”):

primes <- Filter(is_prime, x)
primes
[1]  1  2  3  5  7 11 13 17 19 23 29

In fairness, printing Filter shows that it expands to much the same as the above (an lapply() call followed by subsetting)

Filter
function (f, x) 
{
    f <- match.fun(f)
    ind <- as.logical(unlist(lapply(x, f)))
    x[which(ind)]
}
<bytecode: 0x12a9db8b0>
<environment: namespace:base>

but it’s a nice interface.
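To make the three steps in that function body concrete, here is a rough Python sketch that mirrors them one by one. The helper name r_style_filter is hypothetical, and None is used as a stand-in for R's NA (an assumption made for the sake of the analogy). The point of which() in the original is that it keeps only positions that are TRUE, so both FALSE and NA results are dropped.

```python
def r_style_filter(f, xs):
    # Hypothetical helper mirroring the body of R's Filter(); not a real library function.
    results = [f(v) for v in xs]                                   # lapply(x, f)
    ind = [None if r is None else bool(r) for r in results]        # as.logical(unlist(...))
    keep = [i for i, flag in enumerate(ind) if flag is True]       # which(ind): TRUE positions only
    return [xs[i] for i in keep]                                   # x[which(ind)]

def is_prime(num):
    # Same brute-force check as the Python version above (1 counts as prime here).
    for n in range(2, num):
        if num % n == 0:
            return False
    return True

print(r_style_filter(is_prime, range(1, 31)))

# A predicate that sometimes returns None (standing in for NA): those elements are dropped.
odd_or_unknown = lambda v: None if v == 3 else v % 2 == 1
print(r_style_filter(odd_or_unknown, [1, 2, 3, 4, 5]))
```

Subsetting by position instead of by the raw logical vector is what keeps missing predicate results from leaking NA entries into the output.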

I need to remember to use that (as well as the rest of the gold mine in the ‘funprog’ toolbox, like Map() and Reduce()) more often.
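As a reminder of what the other two do, here is a minimal sketch using Python's built-in map() and functools.reduce(), which play roughly the same roles as R's Map() and Reduce(). The analogy is loose: R's Map() returns a list, whereas Python's map() is lazy and needs list() around it. The variable names are illustrative only.

```python
from functools import reduce
from operator import mul

primes = [2, 3, 5, 7]

# map: apply a function to each element
squares = list(map(lambda p: p * p, primes))

# map over two sequences in parallel, pairing elements by position
pair_sums = list(map(lambda a, b: a + b, primes, squares))

# reduce: fold the sequence into one value, starting from 1
product = reduce(mul, primes, 1)

print(squares)    # [4, 9, 25, 49]
print(pair_sums)  # [6, 12, 30, 56]
print(product)    # 210
```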