Big Data Analytics: Asking the Right Questions

(Part 3 of 4)

According to IDC’s 2011 Extracting Value from Chaos study – the 5th consecutive report of its kind – last year the amount of data created and replicated burst through the zettabyte barrier for the first time. That’s more than one trillion gigabytes of data. Even if you don’t know the scale between zettas and gigas, you know that’s Big Data. In her third post of a 4-part series for the SHARE President’s Corner, veteran tech journalist Renee Boucher Ferguson explores how organizations are gleaning Big Analytics from Big Data.

There is no shortage of major technology providers — from hardware to software vendors — with some sort of play in big data analytics. But at least one analyst draws a distinction among them.

“As I look at the portfolio of companies in this space – that’s the combination of being able to handle vast amounts of data and then being able to analyze it, such as EMC and HP and Oracle and Microsoft – the one that really jumps out at me is IBM,” said Joe Clabby, president of Clabby Analytics. “If you look at their portfolio, they are banging this drum pretty hard.”

IBM has invested heavily in big data analytics, spending more than $14 billion on more than two dozen analytics-related acquisitions over the past five years. The company expects to generate $16 billion in revenue by 2015, up from $10 billion in 2010, according to news reports.

Within its broad portfolio, IBM has several major products that enable big data analytics: IBM InfoSphere BigInsights, InfoSphere Streams and Netezza. Acquired by IBM in 2010, Netezza brought a data warehouse appliance that became the basis for IBM's current analytic database platforms. InfoSphere BigInsights sits on top of Hadoop to provide text-based analytics, a spreadsheet-based data discovery and exploration tool, and administration tools. InfoSphere Streams is a platform for real-time analytic processing of structured and unstructured data.

In addition to its expert entity counting engine that helps companies turn “puzzle pieces into puzzles,” as IBM chief scientist Jeff Jonas puts it, the IBM Entity Analytics group is developing a new capability. It is integrating SPSS, IBM’s predictive analytics pattern-discovery system, into a new analytics engine, codenamed G2, that will make SPSS entity analytics capabilities available to a much wider group of customers, according to Jonas.

“In the IBM Big Data group we’ve got a handful of really, really cool things,” Jonas explained. “InfoSphere Stream[s] is truly unique. It’s meant for [streaming] a million things a second and moving big data around. We have the expert counting engine for turning puzzle pieces into puzzles that I’m doing, and then we have the Hadoop platform for deep discovery and deep introspection after the fact. It’s going to be the combination of these things together – deeply integrated – that is going to give IBM a pretty unique story.”

And IBM hasn’t skimped on big data internally. In 2009 the company launched Blue Insight, an internal analytics cloud that gathers information from nearly 100 different information warehouses and data stores and provides analytics on more than a petabyte of data. More than 200,000 IBMers had access to the new system upon launch, including sales, product development and manufacturing process engineers. Blue Insight served as a template for IBM Smart Analytics Cloud, the private analytics cloud offering for enterprises.

Big Data Analytics and SHARE

SHARE is an independent, volunteer-run association for IT professionals with big interests in big data. That focus goes back to SHARE’s roots. The group was formed in 1955, two years after the release of IBM’s first commercial computer, when a group of IT professionals came together to create the user group. More than five decades later, SHARE has over 20,000 members representing more than 2,000 of IBM’s top enterprise computing customers. (While independent, SHARE maintains a close partnership with IBM.) Members hail from a number of vertical markets, including finance, insurance, manufacturing, retail, higher education, government and consulting.

Helping IBM and other companies understand what users need to exploit new technologies like big data is a major focus for SHARE—from determining the right strategy to asking the right questions. Throughout its history, whenever new technologies have been introduced, SHARE has sponsored conferences and published educational materials and position papers. SHARE’s efforts in 2012 will also include the development of task forces and discussion groups, providing a platform where users of information technology can gather to explore these issues and share experiences on Big Data and Big Analytics. SHARE’s annual conferences, including the March 2012 SHARE in Atlanta Conference, are a major forum for discussion on these and other key enterprise IT topics.

In the fourth and final installment in this series, Renee Boucher Ferguson asks experts in the field this question: If Big Data Analytics are the future, whither the trusty mainframe?
