First, apologies for the delay in posting; I know that your life has been dimmed by my absence, but I did finish a paper for AoAS. Now, enough hyperbole — more computing.

I came across an older (2006) blog post via this HN post arguing that C/C++ are poor tools for numerical and scientific programming. It’s a bit dated, but I agree with many of Mark’s points. In particular, writing efficient low-level numerical code is much harder in C than in, for example, Fortran or functional languages like OCaml.

However, I’m not ready to ditch C/C++ and start building my next project in one of those languages. For these decisions, I think it’s important to distinguish between low- and high-level numerical programming. This point is being hashed out somewhat in the HN thread, but it seems that relatively few of the commenters there are building analytics software for scientific users. From my perspective, I want three (conflicting) traits in my programming language for numerical development:


I recently went over this paper on the maximal information coefficient (MIC), which has garnered a lot of attention in the statistics blogosphere and among some of my applied collaborators. My initial reaction is that it looks like a nice addition to the range of exploratory techniques, but it has some major limitations. Also, if you are reading this paper, I strongly recommend going through the supplement (PDF) as well; it has most of the important technical details about their methods.


Welcome to the blog. I intend to update this at least weekly with comments, questions, and musings on the union of statistics, computing, and massive data (intersection is so restrictive). It is my hope that you find my ramblings informative, thought-provoking, and/or rage-inducing — really, I’ll take any or all of the three. Happy holidays, and welcome again!

© 2012 Alexander W Blocker