Big Data Analytics: Asking the Right Questions

(Part 4 of 4)

According to IDC’s 2011 Extracting Value from Chaos study – the 5th consecutive report of its kind – last year the amount of data created and replicated burst through the zettabyte barrier for the first time. That’s more than one trillion gigabytes of data. Even if you don’t know how a zettabyte compares to a gigabyte, you know that’s Big Data. In her final post of a 4-part series for the SHARE President’s Corner, veteran tech journalist Renee Boucher Ferguson explores how organizations are gleaning Big Analytics from Big Data.

From Mainframe to the Cloud – What the Future Holds

While conventional wisdom has long held mainframe computing to be the province of big enterprise and government workloads (the U.S. government uses mainframe computers to process census data), new pricing structures and old-world processing power are combining to bring the mainframe into the world of big data analytics, according to researchers.

“No other platform can offer predictable response time, uptime and security to tens of thousands of concurrent users like a mainframe,” said Neil Raden, VP and Principal Analyst at Constellation Research Group, who focuses on analytics and business intelligence. “When you consider the fact that big data represents a sort of brutal convergence of analytical and operational processing, there will be a multitude of situations where nothing else will suffice. Big data could prove to be a big renaissance for the mainframe. Given the prevailing idea that tons of cheap hardware in a cluster solves all big data problems, this will come as quite a surprise to many.”

In an October 2011 research report by Clabby Analytics and Enterprise Computing Advisors, the authors estimated that 40 percent of the current IBM System z customer base (read: mainframe customers) is either piloting or has already deployed new workloads on its System z machines—including batch/transactional workloads and Java/Linux and analytics workloads.

According to the report, “Choosing IBM zEnterprise for Next Gen Business Analytics Applications,” there are two main reasons customers are deploying new workloads on System z. The first is superior economics, as a single System z can cost less to acquire and operate than a collection of distributed servers. The second is the machine’s processing strength.

“We both (Enterprise Computing Advisors and Clabby Analytics) were conducting research into big data, analytics and mainframes to discover the intersection point,” said Joe Clabby, president of Clabby Analytics. “And we both concluded that if you have very large databases and you have to do a lot of heavy I/O—and you’re looking for security and resiliency and high availability—and you start looking at x86 architectures and midrange architectures versus mainframes, what you’re going to find is mainframes are very well suited for very large databases, and very large user populations.”

IBM, in fact, deployed its massive Blue Insights private cloud on System z (using Cognos 8 BI software).

While it doesn’t have quite the history of mainframe computing, cloud computing has come into its own over the last decade. Cloud providers have made Software-as-a-Service (SaaS) a household term, and even the more esoteric Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) concepts have taken hold. Analytic clouds, however, are still gaining traction.

In its big data analytics report, The Data Warehousing Institute (TDWI) found that BI professionals consistently prefer private clouds over public clouds, especially for BI, data warehousing and analytics.

“This helps explain why public clouds have the weakest commitment,” writes TDWI’s director of research, Philip Russom. “The preference for private clouds is mostly due to paranoia over data security and governance. Even so, some organizations experiment with analytic tools and databases on a public cloud, then move them onto a private cloud once they decide analytics is mission critical.”

What To Do

With all the methodologies and technologies available to help companies process and understand big data analytics—and ask the right questions—where should organizations start?

IBM’s Jeff Jonas, chief scientist of the IBM Entity Analytics group and an IBM Distinguished Engineer, believes that one thing companies miss as they search big data for patterns, outliers and anomalies is the relatively straightforward task of counting.

“If you think you have three facts about three different things or three different people, and it’s really three facts about the same person—if you can’t count that, you can’t see the vector. You can’t see how fast it’s going,” said Jonas. “This is being missed by a lot of organizations because they’re bringing their data together in piles, but they’re not counting it.”
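Jonas’s point can be sketched in a few lines of code. The example below is purely illustrative (the records, names and the phone-number matching rule are invented for this sketch, not drawn from IBM’s Entity Analytics technology): three “facts” that appear to belong to three different people collapse into one entity once the records are resolved and counted.

```python
from collections import defaultdict

# Hypothetical records: three facts that look like three different people.
records = [
    {"name": "Jon Smith",      "phone": "555-0100", "fact": "opened account"},
    {"name": "J. Smith",       "phone": "555-0100", "fact": "changed address"},
    {"name": "Jonathan Smith", "phone": "555-0100", "fact": "large withdrawal"},
]

def resolve_key(record):
    # Deliberately naive resolution rule for this sketch: records that
    # share a phone number are treated as the same entity. Real entity
    # resolution weighs many identifiers, not just one.
    return record["phone"]

def count_by_entity(records):
    # Group facts under each resolved entity so they can be counted.
    counts = defaultdict(list)
    for record in records:
        counts[resolve_key(record)].append(record["fact"])
    return dict(counts)

entities = count_by_entity(records)
# One entity with three facts - the "vector" Jonas describes - rather
# than three unrelated piles of one fact each.
```

The design point is Jonas’s: piling data together without resolving and counting it hides how many facts actually concern the same person, and therefore how fast something is moving.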

Clabby Analytics Research Analyst Jane Clabby suggests that IT organizations should start with a small pilot project and learn from there. Importantly, there must be buy-in from business, IT and executives for projects to be successful. There also must be an organizational structure that enables the use of big data.

“You have to have a plan,” said Jane Clabby. “You start small with a project – we’re having an issue with fraudulent claims – and get started with that small problem. Tackle that and then branch out into other areas.”

Advice that brings the complexity of Big Data Analytics down to its simplest level.

For more than a decade, Renee Boucher Ferguson has been a senior writer for Ziff Davis Enterprise. She wrote the “Big Data Analytics: Asking the Right Questions” series on special assignment for the SHARE President’s Corner blog.
