The Big Deal About Big Data – Part 2 of 3

By Pedro Pereira

The amount of data the digital world generates is expected to grow at a 44 percent rate over the coming decade. In the second chapter of a three-part series for SHARE President’s Corner, veteran tech writer Pedro Pereira explores the Big Deal about Big Data…

A Fine Balance

Another issue big data creates centers on privacy. “Personal data such as health and financial records are often those that can offer the most significant human benefits, such as helping to pinpoint the right medical treatment or the most appropriate financial product,” according to the McKinsey report.

It will take a fine balance, however, to ensure data from medical, financial, human resources and legal records isn’t exposed. Much of that data travels over the public Internet, which means securing it from a technology standpoint is critical. But that’s only part of the challenge: Actually accessing private data for analysis can itself be problematic, and requires thoughtful policy-making.

“Policy makers need to recognize the potential of harnessing big data to unleash the next wave of growth in their economies. They need to provide the institutional framework to allow companies to easily create value out of data while protecting the privacy of citizens and providing data security,” says McKinsey.

Needles in Haystacks

Extracting actionable information from the growing morass of unstructured data is like finding needles in haystacks, Bhambhri says. It is not easy to sift nuggets from data collected by laptops, databases, medical devices, smartphones, RFID tags and GPS devices – to name a few sources – both for real-time insights and to spot historical patterns for long-term benefits.

Rosen says enterprises are just beginning to understand the magnitude of the big data challenge. And even though agencies such as NASA have worked on it for decades, he says, “we are still in the early stages.”

IBM is working with enterprises in various industries and governments through its Smarter Planet initiative to collect, analyze and make data actionable. Utilities are using analytics on data collected from sensors to prevent malfunctions; credit card companies are analyzing use patterns to spot signs of fraud; marketers are collecting social media data to target their promotions, says Bhambhri.

In Dublin, Ireland, an IBM InfoSphere Streams project has been collecting traffic data from buses and sensors at intersections and traffic lights. With 4,000 detectors in place in a road system with 700 intersections, the Streams project is receiving 20,000 data records per minute, a pace of more than 300 per second. A thousand Dublin buses engaged in the project are sending 3,000 GPS readings per minute, a rate of 50 per second. On average, each bus sends location data every 20 seconds.
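As a quick sanity check, the per-minute figures quoted above reduce cleanly to the per-second rates the article reports. A minimal sketch of that arithmetic (illustrative only; the numbers come from the article, not from any IBM source):

```python
# Sanity-check the throughput figures reported for the Dublin
# Streams deployment. Arithmetic illustration only; the per-minute
# figures come from the article, not from any IBM source.

def per_second(per_minute):
    """Convert a per-minute event rate to a per-second rate."""
    return per_minute / 60

sensor_rate = per_second(20_000)  # intersection detector records
gps_rate = per_second(3_000)      # GPS readings from the bus fleet

print(f"Sensor records/sec: {sensor_rate:.0f}")  # just over 300
print(f"GPS readings/sec:   {gps_rate:.0f}")     # 50

# Cross-check: 1,000 buses each reporting every 20 seconds
# also implies 1,000 / 20 = 50 readings per second.
print(f"Implied GPS rate:   {1_000 / 20:.0f}/sec")
```

The two independent figures, 3,000 readings per minute and one reading per bus every 20 seconds across 1,000 buses, agree at 50 readings per second, which is a useful consistency check when sizing a stream-processing pipeline.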

Why so much info from the streets of Dublin? One benefit the Streams project aims to deliver is a system that triggers traffic lights to give any bus that approaches an intersection a green signal. In addition to trimming operational costs for diesel fuel and electricity consumed while idling, the guaranteed green could boost ridership, as citizens opt for timely bus routes to avoid traffic jams.
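The bus-priority mechanism described above can be sketched as a simple proximity check: when a bus’s reported GPS position falls within some approach distance of an intersection, a green phase is requested. The function name and the 150-metre radius below are hypothetical illustrations, not details of the actual Dublin system:

```python
from math import hypot

# Toy sketch of transit signal priority. The approach radius and
# function name are assumptions for illustration; the real Dublin
# deployment's trigger logic is not described in the article.

APPROACH_RADIUS_M = 150  # assumed trigger distance in metres

def should_request_green(bus_xy, intersection_xy, radius=APPROACH_RADIUS_M):
    """Return True if the bus is within the approach radius
    of the intersection (positions in local metric coordinates)."""
    dx = bus_xy[0] - intersection_xy[0]
    dy = bus_xy[1] - intersection_xy[1]
    return hypot(dx, dy) <= radius

print(should_request_green((100, 50), (0, 0)))   # ~112 m away -> True
print(should_request_green((400, 300), (0, 0)))  # 500 m away -> False
```

A production system would also weigh schedule adherence and cross-traffic before granting priority, but the core trigger is a geofence check like this one.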

Meanwhile in Canada, Project Artemis, a collaboration of IBM, the University of Ontario Institute of Technology and Toronto’s Hospital for Sick Children, is collecting data from bedside devices and from doctors’ and nurses’ notes to help newborns. The goal is to use that data to spot potential signs of life-threatening infection 24 hours in advance.

“Close to 200 pieces of information get generated per second for every baby,” Bhambhri says. It is humanly impossible to analyze all of that information properly without resorting to technology. Project Artemis uses IBM InfoSphere Streams, a new processing architecture that employs targeted algorithms to give doctors information in near-real time so they can make potentially life-saving decisions.
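The general pattern behind this kind of near-real-time monitoring is sliding-window analysis over a stream of readings. The class name, window size and thresholds below are a toy sketch assuming a heart-rate feed; they are not the actual Artemis algorithms, which the article does not describe:

```python
from collections import deque

# Toy sketch of sliding-window alerting over a stream of vital-sign
# readings. The class name, window size and bpm thresholds are
# hypothetical illustrations, not the Artemis algorithms.

class HeartRateMonitor:
    def __init__(self, window=10, low=100, high=180):
        self.readings = deque(maxlen=window)  # keeps most recent samples only
        self.low, self.high = low, high       # illustrative limits (bpm)

    def add(self, bpm):
        """Ingest one reading; return True when the window average
        drifts outside the configured range."""
        self.readings.append(bpm)
        avg = sum(self.readings) / len(self.readings)
        return not (self.low <= avg <= self.high)

monitor = HeartRateMonitor()
for bpm in [140, 142, 139, 95, 90, 85, 80, 75, 70, 65]:
    if monitor.add(bpm):
        # The window average dips below the floor near the end.
        print(f"alert: average out of range after reading {bpm} bpm")
```

The point of the windowed average is that a single noisy reading does not trigger an alert, while a sustained drift does, which is the kind of early-warning signal a clinician could act on.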

InfoSphere Streams is part of the IBM BigInsights Enterprise Edition analytics platform, which enables rapid large-scale analysis of diverse data. Built on the open-source Apache Hadoop platform, BigInsights supports unstructured and structured data.

IBM is taking a leadership position in big data with its Smarter Computing initiative. Other companies, such as SAP, Oracle and Google, also have big data initiatives. Microsoft, meanwhile, wants users to view its upcoming SQL Server 2012 release as a platform to help them unlock insights from big data.

In the next installment of The Big Deal About Big Data, Pedro Pereira looks at what SHARE is doing to help with Big Data and how the cloud could make matters easier or more challenging – depending on your perspective.
