- Mixed-effects models are useful tools in statistics because they can capture both fixed effects and random effects. Jared Knowles, a PhD student at the University of Wisconsin-Madison, created a tutorial with real-world examples that explains how to run mixed models in R.
- Revolution Analytics compiles a list of industry news on R and statistics, including coverage of Domino, a San Francisco startup focused on collaborative data science, an R visualization tutorial, and some news on Quandl.
- Andrew Gelman discusses the concept of randomization and how it is misused in an interesting blog post titled *Three unblinded mice*.
- For the finance and forecasting folks, a simple tutorial on how to create dygraphs using rCharts (don’t know what dygraphs is? It’s a fast, flexible, open source JavaScript charting library).
- How to analyze your Facebook friends network with R? A new package called Rfacebook can help you.
- And lastly, Derek Jones explains why he believes OLS is dead and software engineers like himself should use other tools.

## stats

2

Dec 13

## The week in stats (Dec. 2nd edition)

11

Nov 13

## The week in stats (Nov. 11th edition)

- Tableau has become a star in the Business Intelligence/Analytics world for its data visualizations. Yet, you can get even more out of Tableau if you integrate it with R. If you also use SQL, here is a tutorial for you on SQL, R and text analysis.
- Bad breaks, then flatlines. Good holds steady.
- Andrew Gelman offers his thoughts on the term "marginally significant", which is commonly used but often misleading.
- A list of finance data sources which can be accessed directly using R. This is a must for quants, financial analysts and traders.
- Professor Vivek H. Patil of Gonzaga University describes some R visualization techniques using base R, ggplot2, and rCharts.
- Christian Robert of Université Paris-Dauphine, aka Xi’an, discusses his views on an article from The Economist about statistical significance and why many published research papers are unreproducible.

21

Oct 13

## The week in stats (Oct. 21st edition)

- Spreadsheets are user friendly, but they can also be dangerous. Patrick Burns explains why you should avoid spreadsheets and work with R instead.
- How’s your fantasy team doing? Revolution Analytics compiles a series of Fantasy Football modelling articles by Boris Chen of The New York Times.
- Rexer Analytics has been conducting regular polls of data miners and analytics professionals on their software choices since 2007. They presented the results of the 2013 Rexer Analytics Data Miner Survey at last month’s Predictive Analytics World conference in Boston.
- Everyone understands the p-value, except for those who don’t. Here is an example that once again shows the p-value – that workhorse of modern science – continues to be misinterpreted in even the top tiers of the scientific literature.
- Despite all the hype surrounding big data and analytics, Louis Columbus of Forbes argues that the majority of business analysts lack access to the data and tools they need. Columbus explains why and how this should be changed.
- Six Decades of the Most Popular Names for Girls, State-by-State, represented all in one interactive map.

14

Oct 13

## The week in stats (Oct. 14th edition)

- The *R is my friend* blog publishes a series of four articles on neural networks. This is probably one of the most comprehensive introductions to neural networks in R. If you are in love with neural nets and want to learn even more, here is another tutorial by Saptarsi Goswami.
- State-by-state media preferences as revealed by bit.ly.
- Andrew Gelman, Professor of Statistics and Political Science at Columbia University, discusses why Bing is preferred to Google by people who aren’t like him.
- Have you heard of Simpson’s Paradox? Here is an interactive visual (using the 1973 Berkeley sex discrimination lawsuit as an example) that explains the paradox in 60 seconds.
- Dan Delany does a visual breakdown of furloughed employees due to the U.S. government shutdown. The main view shows furloughed proportions by department, and there are real time tickers for duration, estimated unpaid salary, and estimated food vouchers unpaid.
- If there is an 82% chance that an event will occur within your lifetime (and assuming that you live for 70 years), what is the probability that this event will occur on any given day?
- Tableau, the popular interactive data visualization tool, is coming out with a new 8.1 update, and it will include integration with the R language. Learn how to integrate the two in just 30 seconds.
- A short (but not trivial) lesson on data smoothing using R.
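
The lifetime-probability question in the list above can be worked out under two simplifying assumptions that the question itself does not state: the event has the same probability `p` on every day, and days are independent. Then the chance of at least one occurrence in `n` days is `1 - (1 - p)^n`, and setting that equal to 0.82 gives `p = 1 - 0.18^(1/n)`. The Python sketch below is only an illustration; the function name and the 365-day year are choices made for this example.

```python
import math


def daily_probability(lifetime_probability: float, years: float,
                      days_per_year: float = 365.0) -> float:
    """Return the per-day probability p with 1 - (1 - p)**n == lifetime_probability.

    Assumes (hypothetically) a constant daily probability and independent days.
    """
    n_days = years * days_per_year
    # (1 - p)**n = 1 - P  =>  p = 1 - exp(log(1 - P) / n).
    # log1p and expm1 keep precision when p is tiny.
    return -math.expm1(math.log1p(-lifetime_probability) / n_days)


p = daily_probability(0.82, 70)
print(f"daily probability: {p:.3e}")
print(f"roughly 1 in {round(1 / p):,} days")
```

Under these assumptions the daily probability comes out near 6.7e-5, or roughly 1 in 15,000. If the event's daily risk changes over a lifetime, this constant-rate answer is only an average.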

7

Oct 13

## The week in stats (Oct. 7th edition)

- The picture above is a very well-known mathematical construction called the fractal cat. Brian Lee Yung Rowe shows how to construct fractal artworks using R.
- Arthur Charpentier of Freakonometrics explains how to construct ROC (~~rate of change~~ Receiver Operating Characteristic) curves in R, as well as how to interpret and plot them. This is useful for those in fields that frequently encounter longitudinal data, such as finance, engineering or biostatistics.
- There are many kinds of intervals in statistics. To name a few of the common ones: confidence intervals, prediction intervals, credible intervals, and tolerance intervals. Each is useful and serves its own purpose. You should not only know their names, but also when to use them and why.
- A map of the most visited website for every country in the world (source: Alexa.com), as well as the internet population of each country.
- Suppose that you drop 5 blue marbles and 5 red marbles randomly (and uniformly) on the interval [0,1]. What is the probability that the marbles will interleave each other?
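
One way to approach the marble question above, assuming "interleave" means the colours strictly alternate from left to right: since all ten positions are independent and uniform, every left-to-right colour ordering is equally likely, so the answer is the number of alternating orderings divided by the number of ways to choose which positions are blue. The Python sketch below counts these by brute force; the function name is made up for this example.

```python
from itertools import combinations
from math import comb


def interleave_probability(n_per_color: int) -> float:
    """Probability that n blue and n red points, dropped uniformly at random,
    appear in strictly alternating colour order (the assumed meaning of 'interleave')."""
    total = 2 * n_per_color
    alternating = 0
    # Each choice of blue positions is one equally likely colour ordering.
    for blue_positions in combinations(range(total), n_per_color):
        blue = set(blue_positions)
        colors = ["B" if i in blue else "R" for i in range(total)]
        # Alternating means no two neighbours share a colour.
        if all(colors[i] != colors[i + 1] for i in range(total - 1)):
            alternating += 1
    return alternating / comb(total, n_per_color)


print(interleave_probability(5))
```

Only two orderings alternate (starting with blue or starting with red) out of C(10, 5) = 252, so under this reading the probability is 2/252 = 1/126, a little under 0.8%.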