Sunday, October 12, 2014

The Reciprocal Normal Distribution

A recent question on OR-Exchange dealt with the reciprocal normal distribution. Specifically, if $k$ is a constant and $X$ is a Gaussian random variable, the distribution of $Y=k/X$ is reciprocal normal. The poster had questions about approximating the distribution of $Y$ with a Gaussian (normal) distribution.

This gave me a reason (excuse?) to tackle something on my to-do list: learning to use Shiny to create an interactive document containing statistical analysis (or at least statistical mumbo-jumbo). I won't repeat the full discussion here, but instead will link the Shiny document I created. It lets you tweak settings for an example of a reciprocal normal variable and judge for yourself how well various normal approximations fit. I'll just make a few short observations here:
  • No way does $Y$ actually have a normal distribution.
  • Dividing by $X$ suggests that you probably should be using a distribution with finite tails (e.g., a truncated normal distribution) for $X$. In particular, the original question had $X$ being speed of something, $k$ being (fixed) distance to travel and $Y$ being travel time. Unless the driver is fond of randomly jamming the gear shift into reverse, chances are $X$ should be nonnegative; and unless this vehicle wants to break all laws of physics, $X$ probably should have a finite upper bound (check local posted speed limits for suggestions). That said, I yield to the tendency of academics to prefer tractable/well-known approximations (e.g., normal) over realistic ones.
  • The coefficient of variation of $X$ will be a key factor in determining whether approximating the distribution of $Y$ with a normal distribution is "good enough for government work". The smaller the coefficient of variation, the less likely it is that $X$ wanders near zero, where bad things happen. In particular, the less likely it is that $X$ gets anywhere near zero, the less skewness $Y$ suffers.
  • There is no one obvious way to pick parameters (mean and standard deviation) for a normal approximation to $Y$. I've suggested a few in the Shiny application, and you can try them to see their effect.
I'd also like to give a shout-out to the tools I used to generate the interactive document, and to the folks at RStudio.com for providing free hosting at ShinyApps.io. The tool chain was:
  • R (version 3.1.1) to do the computations;
  • R Studio as the IDE for development (highly recommended);
  • R Markdown as the "language" for the document;
  • Shiny to handle the interactive parts;
  • various R packages/tools to generate the final product.
It's obvious that a lot of loving effort (and probably no small amount of swearing) has gone into the development of all those tools.

No comments:

Post a Comment

Due to intermittent spamming, comments are being moderated. If this is your first time commenting on the blog, please read the Ground Rules for Comments. In particular, if you want to ask an operations research-related question not relevant to this post, consider asking it on Operations Research Stack Exchange.