Until your business users are able to answer at least some of these questions on a continuous basis, more advanced analytics and exercises in optimization may simply be allowing you to do the wrong things faster and more efficiently.
Jeff Hassemer does a great job focusing on the more tactical uses of small data to target customers in his excellent Advertising Age article.
Getting Small Data right is hard.
Although the questions above are straightforward, it is often difficult to get at the underlying information needed to even begin building intuition around them. Scale isn’t necessarily the biggest problem in integrating data and making it available to a user base. Small Data presents its own set of challenges that are further complicated when Big Data is added to the mix.
My experience has been that data analysis is often project-oriented: a number of different people, spanning lines of business and IT, might need to get involved to pull together the answer to what seems like a simple question. An analyst might spend days joining and aggregating data through Excel wizardry. This slows down the decision process and burns valuable time that could be used to take action rather than waiting for analysis.
So, what are some of the things preventing organizations from successfully getting value out of their small data? Based on my observations and informal discussions with peers and customers, there are a few recurring themes that are worth sharing:
Even small data sets present many of the same challenges as large data sets – The hard part tends to be getting the business meaning of the data right, linking it to reference data, and handling the exceptions. The scale of data doesn’t have as much impact on how hard it is to integrate as one might think.
It can be difficult for business users to explore data because of technical challenges – IT organizations may not have the right specialized expertise to wring every drop of performance out of their data-oriented systems, or they are so preoccupied with fire-fighting that they can’t focus on decision systems that are competing with core business applications for attention. This means that handling performance and usability problems is often deferred.
The business has trouble accessing the data because of the way the data is organized – The semantics of data often aren’t clear and it becomes challenging for the business to use the information effectively. This article from HBR does a great job of addressing both the role of small data and the need for agreement on business rules (semantics).
The tools provided to the business are still too hard to use – Self-service analytics tools often still rely on technical skills, or knowledge of database structure, that business users aren’t going to have. This leads to the emergence of a new class of specialists who are charged with pulling data from the “self-service” tool and handing it off to others for analysis in Excel and presentation in PowerPoint.
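To make the first theme concrete, here is a minimal sketch of what "linking it to reference data, and handling the exceptions" can look like even with a handful of records. Everything in it is hypothetical and invented for illustration: the `CUSTOMER_REGION` lookup, the `ORDERS` records, and the `normalize_key` and `revenue_by_region` helpers do not come from any particular system. It is a sketch of the idea, not a recommended implementation.

```python
from collections import defaultdict

# Hypothetical reference data: customer id -> sales region.
CUSTOMER_REGION = {"C001": "East", "C002": "West"}

# Hypothetical order records, including the kind of mess real data contains.
ORDERS = [
    {"order_id": 1, "customer_id": "C001", "amount": 120.0},
    {"order_id": 2, "customer_id": "C002", "amount": 75.5},
    {"order_id": 3, "customer_id": "c001 ", "amount": 30.0},  # inconsistent key format
    {"order_id": 4, "customer_id": "C999", "amount": 42.0},   # not in reference data
]


def normalize_key(raw):
    """Apply an assumed business rule: keys match ignoring case and whitespace."""
    return raw.strip().upper()


def revenue_by_region(orders, customer_region):
    """Total order amounts by region and collect orders that cannot be linked."""
    totals = defaultdict(float)
    exceptions = []
    for order in orders:
        region = customer_region.get(normalize_key(order["customer_id"]))
        if region is None:
            # Surface the exception instead of silently dropping the record.
            exceptions.append(order)
            continue
        totals[region] += order["amount"]
    return dict(totals), exceptions


totals, exceptions = revenue_by_region(ORDERS, CUSTOMER_REGION)
print("Revenue by region:", totals)
print("Orders needing review:", [o["order_id"] for o in exceptions])
```

Note that none of the difficulty here comes from volume. The work is in agreeing on the matching rule and deciding what happens to the records that don’t match, which is the same whether there are four orders or four billion.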
What can we do about it?
There are a few key things that can help simplify getting more value out of Small Data and addressing some of the challenges identified above.
Think globally and create a vision – I am not suggesting that you should pursue the holy grail of an “Enterprise Data Architecture”, but there should be a cohesive sense of where data will reside, how it will get there, the technology that will support it, and the semantics associated with that data. This vision gives incremental efforts something to aim for and can be adjusted as the organization gets smarter about its data.
Start small and deliver value fast – Ensure that you are able to quickly deliver useful, impactful business insight. What does this mean? Within a few weeks of starting, have new capabilities in users’ hands. Do not try to build a complete enterprise data architecture and attendant processes before going to work with real data, solving real business problems.
Don’t focus on sunk costs – Transitioning away from a legacy environment can be daunting, but the tools and technology available have progressed rapidly. Don’t be afraid to walk away from past investments that are no longer working for the organization. Building a cloud-based infrastructure could be surprisingly low cost.
Don’t forget about organization and roles – Think about the separation of concerns across organizations. Don’t expect business analysts to also be data scientists. Don’t expect data scientists to be software engineers. Be realistic about the skills you need and about where and how those skills can be sourced.
From a more technical perspective, there are a number of established technologies that can help support making better use of small data. There are a few areas in particular that warrant attention:
Look to the cloud first – This is probably obvious at this point, but I would suggest that any organization that is looking to build new infrastructure in the next 12 months should look to the public cloud first. The economics have become very attractive at all but the very largest scales, and unless you have highly specialized regulatory or security requirements, there are few functional barriers.
Look for tools that work well at the scale you already have – Don’t create complexity in anticipation of requirements that may never materialize. The cost and complexity of tools that operate well at Small Data scales have been dropping precipitously. Big Data tools like those included in the Hadoop stack can be useful at smaller scales, but not necessarily enough to offset the added complexity until scale increases into the multi-terabyte range.
Partner for scarce skills – Most organizations will not be able to attract the specialized skills that they need to succeed in fully realizing the value of their data. Look to external providers to fill the gaps. Specialists should be favored over broad-based integrators/consultants who operate on a high-leverage model that could saddle you with expensive novices.
Look forward to scaling up when you need to – Don’t completely ignore the potential needs of the future. Identify small pilot projects that can help your organization build expertise and confidence without committing to an infrastructure that you just don’t need yet. The cloud makes it very cost-effective to set up and tear down infrastructure for experimentation without a substantial, long-term capital outlay. Find opportunities to start leveraging a Big Data platform that can guide future investments.
While this post challenges the prevailing emphasis on Big Data and advanced analytics, both are important to today’s enterprise and are becoming more so.
However, it's clear that starting with core enterprise information and delivering key analytics to the business, without over-engineering the process and infrastructure, is an important step on the journey to a more comprehensive approach that encompasses both “Big” and “Small” Data.
There is a lot of great material out there that touches on this and related topics. Here are just a few of the items that I have run across that expand on some of the ideas in this post:
Demetrios Kotsikopoulos is the founder of Silectis. You can find him on LinkedIn and Twitter.