In this post I will show you how to add Tez options to your Hive ODBC connection, and thus to your RODBC queries in R. Only a few years ago Hive was a rare occurrence in most corporate data warehouses, but these days Hive, Spark, and Tez, among other open source data platforms, are all the buzz in the corporate world, and data analysts need to adapt to this changing world.

Intro: Today I will discuss how to install Apache Spark on a Windows machine. I have just walked through the process a second time at work because of a laptop swap, and since it takes me some time to remember all the steps to get the install right, I thought I would document the process. Step #1: Download and Installation. Install Spark: First, you will need to download Spark, which comes with the package for SparkR.

As R turns 25 years old this year, I thought it only appropriate to thank the creators of R, Ross Ihaka and Robert Gentleman, as well as the global R community, for changing my life. I started learning R in 2014 because I was tired of how SAS had come to act like a curmudgeonly old monopoly rather than a true innovator. But I guess that is what happens to companies with a first-mover advantage; they get complacent and feel invincible.

Alfredo G Marquez

Data and R Aficionado

Sr. Data Analyst