Choosing hardware for big data analysis is difficult because of the many options and variables involved. The problem becomes even more complicated when you need a full cluster for big data analytics. This session will cover the basic guidelines and architectural choices involved in choosing analytics hardware for Spark and Hadoop. I will cover processor core and memory ratios, disk subsystems, and network architecture. This is a practical, advice-oriented session that will focus on the performance and cost tradeoffs of many different options.
Mike Pittaro has over 25 years of experience in the high technology industry, specializing in high performance computing, data warehousing, and distributed systems. He has held engineering and support positions at Alliant Computer, Kendall Square Research, Informatica, and SnapLogic. Mike is currently the principal architect for big data on Dell’s Cloud Software Solutions team, where he focuses on delivering big data solutions. He is a member of the ACM, the Free Software Foundation, and the OpenStack Foundation.