This was just going to be a short toot, but it kept growing, so I moved it over here.

I thought I was about to have the shortest “cheat” solution to an #Exercism problem - Roman Numerals - where we need to convert a number to Roman numerals. R already has as.roman(), though it returns an object of class ‘roman’ and the tests expect a character result. So, fine: as.character(as.roman(arabic)).
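Written out as a function, the one-liner looks roughly like this. The function name roman is an assumption about what the exercise's test suite calls; only the argument name arabic comes from the expression above.

```r
# Hypothetical wrapper: the name `roman` is assumed, not taken from the exercise.
roman <- function(arabic) {
  # as.roman() returns an object of class "roman";
  # as.character() turns it into the plain string the tests expect.
  as.character(as.roman(arabic))
}

roman(14)
roman(2024)
```

Both calls stay well inside the range as.roman() accepts, so they return ordinary character strings.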

It works great - passes 25 tests with just that. But I get one test failure: as.roman(3999) produces NA, which is documented:

Only numbers between 1 and 3899 have a unique representation as roman numbers, and hence others result in as.roman(NA).

The Exercism problem states we can assume input is only up to 3999 which makes sense, because according to Wikipedia that’s the largest number that has a unique representation. So, why then does R stop at 3899???

I thought the limit was just enforced in the code, since utils::as.roman() calls the unexported (and undocumented) utils:::.as.roman(), which takes a check.range = TRUE argument. Sure enough, within that function is:

if (check.range)
    x[x <= 0L | x >= 3900L] <- NA
class(x) <- "roman"
x

but if I set that argument to FALSE I still get NA:

> utils:::.as.roman(3999)
[1] <NA>
> utils:::.as.roman(3999, check.range = FALSE)
[1] <NA>

so what’s the point of that?

I ran through this with my favourite debug tool, debugonce(), and the only lines that actually do anything are the ones immediately following this check:

class(x) <- "roman"
x

Up until then, x is still the numeric input value. Setting the class on it seems to perform the conversion, but I don’t quite see how that is invoked. as.roman() isn’t being called - I set a debug flag on that and it wasn’t invoked. I set another on utils:::.as.character.roman() and it was invoked, but I don’t quite see the path to that yet.
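One plausible explanation, offered as a guess rather than a confirmed trace through utils: assigning the class does not convert anything. The stored value stays numeric, and the Roman text is only produced when the object is printed, because printing dispatches to an S3 method chosen by class. A minimal sketch with a made-up class (shouty is purely illustrative and not part of utils), followed by the same check on the exported as.roman():

```r
# Toy S3 class, invented for illustration only.
print.shouty <- function(x, ...) {
  # The "conversion" happens here, at print time.
  cat(toupper(unclass(x)), "\n")
  invisible(x)
}

x <- "hello"
class(x) <- "shouty"   # only attaches a class attribute
unclass(x)             # the underlying value is unchanged
print(x)               # dispatches to print.shouty()

# The same idea with the exported as.roman():
r <- as.roman(12)
unclass(r)             # still a number underneath
as.character(r)        # the conversion to text happens on demand
```

If that reading is right, it would explain why the debugger shows a plain number right up to the point where the result is displayed.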

I also see utils:::.numeric2roman() which seems to do the conversion but sets the value to NA if it’s larger than 3899, and has no way to avoid that check.
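For the exercise itself, one way around the limit is to skip as.roman() and do the conversion by hand. This is a minimal sketch of the usual greedy subtraction approach, not how utils does it; the name to_roman is hypothetical and the 1 to 3999 range is taken from the problem statement.

```r
# Hypothetical helper: greedy conversion for a single integer in 1..3999.
to_roman <- function(n) {
  stopifnot(length(n) == 1, n >= 1, n <= 3999)
  values  <- c(1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1)
  symbols <- c("M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I")
  out <- character(0)
  for (i in seq_along(values)) {
    k <- n %/% values[i]              # how many times this symbol fits
    out <- c(out, rep(symbols[i], k))
    n <- n %% values[i]               # carry the remainder forward
  }
  paste(out, collapse = "")
}

to_roman(3999)
to_roman(3899)
```

Because the subtractive pairs (900, 400, 90, 40, 9, 4) are in the table, the loop never needs special cases for them.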

So… bug report? As far as I can tell, there isn’t an existing one in Bugzilla about the fact that the values 3900-3999 should be representable.

I figured I’d make a post about it first and see if anyone has any further insights, otherwise I might work up the courage to email r-devel.