Statistics in Rust via R

Ever wish you could call some of R’s functions from Rust? Yeah, me neither, but it’s actually pretty simple and might even be useful in some situation or another. If ever you do need to perform some statistical tasks, why reinvent the wheel? Why not leverage functionality that R already provides?

R contains a standalone math library called, well, Rmath, containing functions like rnorm (which, for non-R folks, generates a random variable from a normal distribution with a given μ and σ). You can call these functions from Rust, with just a little bit of extra code.

Unfortunately, R doesn’t actually ship with the standalone library; you’ll have to obtain it elsewhere or build it yourself—which is trivial. If you’re on Ubuntu, there’s an associated package called r-mathlib. Just install it and you’re good to go. Otherwise, if you can’t find a precompiled library, you can just build Rmath from source. This page provides some guidance there, but it should just involve going into src/nmath/standalone, and running make shared followed by make install.

Otherwise, this is just a story of Rust-C interoperability, which the Rust book covers. The first step is just to use the libc crate (along with a feature attribute for now), which contains some functions and types for working with C code:

#![feature(libc)]
extern crate libc;
use libc::{c_int, c_double};

Specifically, for these examples, we want c_int and c_double, which will map to the corresponding Rust types. Using these, we can define the signatures for the functions we want to use from Rmath. For example, set_seed and rnorm:

#[link(name="Rmath")]
extern {
  fn set_seed(i1: c_int, i2: c_int);
  fn rnorm(mu: c_double, sigma: c_double) -> c_double;
}

Note also the link attribute.

From here, we can simply call the functions as needed, though they need to be wrapped in unsafe:

unsafe {
  set_seed(1357 as c_int, 2468 as c_int);
  let x = rnorm(0.0 as c_double, 1.0 as c_double);
  println!("Random from standard normal: {}", x);
}

Of course, it’d be nice to just wrap the Rmath functions up all nicely in Rust functions that we can use without concern for what’s happening internally. Easy enough:

fn rmath_set_seed(i1: i32, i2: i32) {
  unsafe {
    set_seed(i1 as c_int, i2 as c_int);
  }
}

fn rmath_rnorm(mu: f64, sigma: f64) -> f64 {
  unsafe {
    rnorm(mu as c_double, sigma as c_double) as f64
  }
}

So there you have it. For fun, here’s a Rust version of Darren Wilkinson’s Gibbs sampler example:

#![feature(core)]
#![feature(libc)]

extern crate libc;

use libc::{c_int, c_double};
use std::iter::AdditiveIterator;
use std::num::Float;

#[link(name = "Rmath")]
extern {
  fn set_seed(i1: c_int, i2: c_int);
  fn rnorm(mu: c_double, sigma: c_double) -> c_double;
  fn rgamma(a: c_double, scale: c_double) -> c_double;
}

fn rmath_set_seed(i1: i32, i2: i32) {
  unsafe {
    set_seed(i1 as c_int, i2 as c_int);
  }
}

fn rmath_rnorm(mu: f64, sigma: f64) -> f64 {
  unsafe {
    rnorm(mu as c_double, sigma as c_double) as f64
  }
}

fn rmath_rgamma(a: f64, scale: f64) -> f64 {
  unsafe {
    rgamma(a as c_double, scale as c_double) as f64
  }
}

fn gibbs(n: usize, thin: i32) -> (Vec<f64>, Vec<f64>) {
  let mut xs: Vec<f64> = Vec::with_capacity(n);
  let mut ys: Vec<f64> = Vec::with_capacity(n);
  let mut x: f64 = 0.0;
  let mut y: f64 = 0.0;
  for _ in 0..n {
    for _ in 0..thin {
      x = rmath_rgamma(3.0, 1.0 / (y.powi(2) + 4.0));
      y = rmath_rnorm(1.0 / (x + 1.0), 1.0 / (x + 1.0).sqrt());
    }
    xs.push(x);
    ys.push(y);
  }
  (xs, ys)
}

fn mean(x: Vec<f64>) -> f64 {
  x.iter().map(|&a| a).sum() / (x.len() as f64)
}

fn main() {
  rmath_set_seed(1123, 5813);
  let (xs, ys) = gibbs(20000, 500);
  println!("mean of x: {}", mean(xs));
  println!("mean of y: {}", mean(ys));
}