December, 2013

The week in stats (Dec. 23rd edition)

The week in stats (Dec. 16th edition)

Prize for statistics students?

In order to promote work on statistical simulations, as well as thinking about deeper issues in data analysis, I’m considering starting a prize for students.

Here are my ideas:

* One prize would be for the most innovative use of Monte Carlo methods to model a problem in pure or applied statistics. This prize would be offered in two divisions: undergraduate and graduate.

* One prize would be for an essay that explores the foundations of probability theory or statistics with an emphasis on epistemological issues. This would be open to all students.

* Prizes would be in the $3,000 – $6,000 range.

* The judging committee would be drawn from professors, students and industry.

What are your thoughts? Specifically:

* If you’re a student, is this something you’d apply for?

* If you’re a professor or instructor, do you think your students would be interested in this? Would you pass along the information to them?

* If you represent a company, could you see advantages to sponsoring one of the prizes?

* What changes or suggestions do you have?

The week in stats (Dec. 9th edition)

  • The problems with using a p-value as a fixed cutoff for hypothesis testing are well known. Probabilities and P-Values is another article that discusses the weaknesses of the p-value. However, like every other author who claims the p-value is horrible, this one fails to produce a satisfactory substitute.
  • PirateGrunt is currently producing a series of 24 articles called 24 Days of R. In every post, he shares a few neat R tricks and explains how you can use them. You can find his first post here and the subsequent ones on his blog.
  • Coursera – an online education startup – has rapidly expanded its curriculum of statistics and data analysis courses. There are now 33 modules directly linked to the field, excluding the courses where statistics and data science are used as a supporting tool (e.g. finance). These courses make use of several languages and software packages, such as Python, MATLAB and, of course, R. Here’s the complete list of Coursera courses using R, ranked by “popularity”.
  • For those interested in machine learning, a preview of Data Mining Applications with R by Yanchang Zhao and Yonghua Cen is available here.
  • A tutorial on the R package Plotly, and how to make beautiful visuals and graphs with it.
  • A recent article by Matt Asay claims that “Python is displacing R as the language for data science.” David Smith of Revolution Analytics shares his thoughts on the competition between R and Python.
  • Consider n points uniformly distributed on a sphere. What is the probability that all points lie in the same hemisphere (not necessarily the northern or southern hemisphere)? Arthur Charpentier of Freakonometrics presents a simulation-based solution, along with some very nice visuals.
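
The hemisphere question in the last item lends itself to a short Monte Carlo experiment. The sketch below is not Charpentier's code; it is a minimal Python illustration under a few assumptions of my own. Points are drawn uniformly on the sphere by normalising Gaussian vectors, and a set of points is declared to fit in one hemisphere if some great circle through two of the points leaves every other point on one side. The function names (`in_common_hemisphere`, `estimate_probability`, `wendel_probability`) are made up for this example. For comparison, the closed form (n² − n + 2) / 2ⁿ is the three-dimensional case of a result usually attributed to Wendel (1962).

```python
import itertools
import math
import random


def random_unit_vector(rng):
    """Draw a point uniformly on the unit sphere by normalising a 3-D Gaussian."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:
            return [c / norm for c in v]


def cross(a, b):
    """Cross product of two 3-D vectors."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def dot(a, b):
    """Dot product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def in_common_hemisphere(points, tol=1e-12):
    """Return True if all points fit in one closed hemisphere.

    Assumption: it is enough to test great circles through pairs of points,
    which holds for points in general position (true with probability 1
    for random draws).
    """
    if len(points) <= 2:
        return True
    for a, b in itertools.combinations(points, 2):
        normal = cross(a, b)
        if dot(normal, normal) < tol:
            continue  # (anti)parallel pair: no unique great circle
        dots = [dot(normal, p) for p in points]
        if all(d >= -tol for d in dots) or all(d <= tol for d in dots):
            return True
    return False


def estimate_probability(n, trials=20000, seed=0):
    """Monte Carlo estimate of P(all n points share a hemisphere)."""
    rng = random.Random(seed)
    hits = sum(
        in_common_hemisphere([random_unit_vector(rng) for _ in range(n)])
        for _ in range(trials)
    )
    return hits / trials


def wendel_probability(n):
    """Closed form for the sphere in 3-D: (n^2 - n + 2) / 2^n."""
    return (n * n - n + 2) / 2 ** n


if __name__ == "__main__":
    for n in range(3, 9):
        print(n, estimate_probability(n, trials=5000), wendel_probability(n))
```

Running the script prints the simulated and exact probabilities side by side for n = 3 to 8. The pairwise check costs O(n³) per trial, which is fine for small n but would need a smarter test (for example, a linear-programming feasibility check) for large point sets.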

The week in stats (Dec. 2nd edition)