Resolving Namespace Conflicts in R

If you’ve used R for any length of time, you’ve probably run into naming conflicts. Maybe you’ve noticed them, but maybe you just ignored them. Usually, I’m the same way. I see a chunk of red text, and maybe I skim over it—but probably I just ignore it unless something goes wrong. The problem is, some of the results can be rather subtle, leading to puzzling errors. For example, consider this innocent-looking piece of code that uses dplyr:

iris %>%
  group_by(Species) %>%
  summarize(Mean.Sepal.Length=mean(Sepal.Length))

What does it produce? Well, it should produce this:

Source: local data frame [3 x 2]

     Species Mean.Sepal.Length
1     setosa             5.006
2 versicolor             5.936
3  virginica             6.588

But it might produce this:

Error in summarize(., Mean.Sepal.Length = mean(Sepal.Length)): argument "by" is missing, with no default

But why? Well, if you loaded dplyr and then, say, Hmisc, you might have noticed this message in your console:

The following objects are masked from 'package:dplyr':

    combine, src, summarize

R is telling you that summarize, along with some other functions, no longer refers to the function in dplyr, but instead refers to the one in Hmisc. You don’t even need to have loaded Hmisc explicitly! Sometimes another package or function will load it for you. For example, maybe you used stat_summary from ggplot2. You can confirm which version of summarize is active by checking its environment:

environment(summarize)
## <environment: namespace:Hmisc>
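To see the masking mechanism in isolation, here is a minimal sketch that needs neither dplyr nor Hmisc. It simulates two packages as plain environments attached to the search path. The names pkg_a and pkg_b and both summarize definitions are made up for illustration. Real packages are attached by library() rather than attach(), but the lookup order works the same way: the most recently attached entry wins.

```r
# Two hypothetical "packages", simulated as plain environments.
# Neither is a real package; pkg_a and pkg_b are invented names.
pkg_a <- new.env()
pkg_a$summarize <- function(x) paste("pkg_a saw", length(x), "values")

pkg_b <- new.env()
pkg_b$summarize <- function(x, by) {
  force(by)  # this version insists on a `by` argument
  paste("pkg_b saw", length(x), "values")
}

# Attach in order; warn.conflicts = FALSE only silences the masking notice.
attach(pkg_a, name = "pkg_a", warn.conflicts = FALSE)
attach(pkg_b, name = "pkg_b", warn.conflicts = FALSE)  # attached last, so it wins

# find() lists every search-path entry that defines the name, in lookup order.
where_found <- find("summarize")

# The bare name now resolves to pkg_b's version, so a pkg_a-style call fails.
outcome <- tryCatch(summarize(1:3), error = function(e) conditionMessage(e))

# Clean up the search path again.
detach("pkg_b", character.only = TRUE)
detach("pkg_a", character.only = TRUE)

print(where_found)
print(outcome)
```

In a fresh session, find() should list pkg_b ahead of pkg_a, and the call should fail with the same kind of "missing argument" message shown earlier.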

You can also see the masking by running conflicts(), which lists every name that is defined in more than one place on the search path:

conflicts()
##  [1] "combine"      "src"          "summarize"    "filter"      
##  [5] "body<-"       "format.pval"  "intersect"    "kronecker"   
##  [9] "round.POSIXt" "setdiff"      "setequal"     "trunc.POSIXt"
## [13] "union"        "units"

Notice that summarize is listed.
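The plain conflicts() output tells you that a name is contested, but not by whom. A small sketch of one way to find out is below. The helper masked_by is a hypothetical name of my own, not a base R function. It relies on conflicts(detail = TRUE), which returns the conflicting names grouped by search-path entry.

```r
# Hypothetical helper (not part of base R): report which attached
# search-path entries define `name`, provided the name is in conflict.
masked_by <- function(name) {
  # One list element per search-path entry, each holding the
  # conflicting names found in that entry.
  detail <- conflicts(detail = TRUE)
  names(Filter(function(nms) name %in% nms, detail))
}

print(masked_by("summarize"))
```

Entries come back in search-path order, so the first one listed is the version a bare call to summarize would use. If nothing is in conflict under that name, the result is empty.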

Because summarize from Hmisc operates differently than summarize in dplyr (it expects a by argument, which the dplyr-style call never supplies), you may get the error above.

So how do you fix this? One way is to pay close attention to the order in which packages are loaded. You could load Hmisc before dplyr:

library(Hmisc)
library(dplyr)

If you do that, you’ll see this message:

The following objects are masked from 'package:Hmisc':

    combine, src, summarize

This may be nothing to worry about, but there’s another way to deal with the conflict, which is to just specify which version of summarize you want, using the “double colon operator”:

iris %>%
  group_by(Species) %>%
  dplyr::summarize(Mean.Sepal.Length=mean(Sepal.Length))

This might be a little tiresome, but it’ll solve the problem, and it’s not too different from how you might access a namespaced function in some other language.
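To check that a qualified name really does bypass the search path, here is a small sketch that uses only base R. It shadows median with a throwaway definition (an assumption made purely for illustration, standing in for a masking package) and then reaches past it with stats::median:

```r
# A throwaway definition that shadows stats::median on the search path.
median <- function(x) "not the real median"

shadowed  <- median(c(1, 5, 9))         # bare name: finds the shadowing definition
qualified <- stats::median(c(1, 5, 9))  # qualified name: always the one exported by stats

rm(median)  # remove the shadowing definition again

print(shadowed)
print(qualified)
```

If qualifying every call gets tedious, R 3.6.0 and later also let you leave a name off the search path when attaching a package, for example library(Hmisc, exclude = "summarize"). That should leave dplyr's summarize in place regardless of load order.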