Master Data Management meets Big Data
Data quality, data governance, and material master data management are what I generally talk about in these posts, because that is what I know (duh!). However, in the past year, one data-related term has become extremely popular and has entered our everyday lexicon. It is being touted as the next big thing and the solution to almost every problem faced by organizations. Of course, the term I am talking about is ‘Big Data’.
What is Big Data?
If you have not lived under a rock for the past couple of years, you must have heard someone or the other talk about Big Data and how it was going to revolutionize … well, everything. But do you know what it means and why it is being hailed as a panacea? Let’s learn a bit about it first.
Big Data is not a new type of data; rather, the term refers to the scale and accessibility of the data available in the world today. For most of human history, the generation and distribution of data of all sorts was restricted: sometimes by a lack of literacy (people being unfamiliar with Greek and Latin), sometimes by a lack of distribution resources (books had to be copied by hand before the printing press was invented), and sometimes by gatekeepers who regulated how much and what data could be released (newspapers and news channels editing content). However, with the internet, and especially social media, both the rate at which content is generated and its reach have scaled new heights. Just to give a few examples, every minute –
- Almost 100,000 new tweets go out on Twitter
- Almost 700,000 status updates are posted on Facebook
- Almost 700,000 searches are performed on Google
- Almost 168 million e-mails are sent
The amount of data being generated and distributed is staggering, and these examples do not include the enormous amount of data we are producing outside of social media. All this is termed Big Data, and the term is commonly defined using three V’s.
- Volume – The amount of data being produced and distributed. Both production and distribution have grown enormously in the last few years and continue to grow.
- Velocity – The rate at which this data is produced is also a defining feature of Big Data. Gone are the days when a monk took months copying a book by hand so that his seminary could have a copy. These days, Walmart is handling over 2.5 petabytes of data – 167 times the information contained in all books in the US Library of Congress.
- Variety – All this data is not just text. It includes images, videos, graphs, and any other type of data you can think of. Just to give an example, Facebook is estimated to handle more than 50 billion photos – and it is not a photo-sharing website!
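To get a feel for these numbers, here is a quick back-of-the-envelope sketch in Python. It simply scales the per-minute figures quoted above to a full day and works out what the Walmart comparison implies about the size of the Library of Congress book collection. The variable names are my own, the inputs are the rounded "almost" figures from this post, and a petabyte is assumed to be 10^15 bytes, so treat the results as rough orders of magnitude rather than measurements.

```python
# Rough arithmetic on the figures quoted in this post (all inputs are rounded estimates).
MINUTES_PER_DAY = 24 * 60

# Per-minute figures as quoted above; the dictionary keys are illustrative names only.
per_minute = {
    "tweets": 100_000,
    "facebook_status_updates": 700_000,
    "google_searches": 700_000,
    "emails": 168_000_000,
}

# Scale each per-minute rate to a 24-hour day.
per_day = {name: rate * MINUTES_PER_DAY for name, rate in per_minute.items()}

# Assumption: decimal units (1 PB = 10^15 bytes, 1 TB = 10^12 bytes).
PETABYTE = 10**15
TERABYTE = 10**12

# If 2.5 PB is 167 times the Library of Congress book collection,
# the collection itself works out to 2.5 PB / 167.
walmart_bytes = 2.5 * PETABYTE
implied_loc_terabytes = walmart_bytes / 167 / TERABYTE

for name, total in per_day.items():
    print(f"{name}: {total:,} per day")
print(f"Implied Library of Congress book collection: {implied_loc_terabytes:.1f} TB")
```

In other words, "almost 168 million e-mails a minute" works out to roughly 240 billion a day, and the Walmart comparison implies a book collection of about 15 terabytes.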
Where does MDM come in?
Master Data Management can provide the fourth V to Big Data. It is essential to ensure the quality of all this data and, more importantly, its credibility. Thus ‘Veracity’ can become the fourth V, and it can be ensured by using a robust data quality system. As a large part of this data is user generated, credibility remains an issue. However, if we are able to incorporate data quality (and even data governance) into this deluge of data, we can make Big Data more reliable, more useful, and more effective for organizations around the world.
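To make the idea of veracity a little more concrete, here is a minimal Python sketch of the kind of rule-based check a data quality system might run over material master records. Everything in it is hypothetical and for illustration only: the `MaterialRecord` fields, the `VALID_UNITS` list, and the three rules (missing description, unknown unit of measure, possible duplicate) are my assumptions, not the workings of any particular MDM product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaterialRecord:
    # Hypothetical, simplified material master record.
    material_id: str
    description: str
    unit_of_measure: str


# Assumed list of accepted units; a real system would use its own reference data.
VALID_UNITS = {"EA", "KG", "M"}


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so near-identical descriptions match."""
    return " ".join(text.lower().split())


def check_records(records):
    """Return {material_id: [problems]} for every record that fails a rule."""
    issues = {}
    seen = {}  # normalized description -> first material_id that used it
    for rec in records:
        problems = []
        if not rec.description.strip():
            problems.append("missing description")
        if rec.unit_of_measure not in VALID_UNITS:
            problems.append("unknown unit of measure")
        key = normalize(rec.description)
        if key and key in seen:
            problems.append(f"possible duplicate of {seen[key]}")
        elif key:
            seen[key] = rec.material_id
        if problems:
            issues[rec.material_id] = problems
    return issues


def veracity_score(records, issues):
    """Share of records that passed every rule (0.0 for an empty input)."""
    if not records:
        return 0.0
    return 1 - len(issues) / len(records)


sample = [
    MaterialRecord("M1", "Hex Bolt M8", "EA"),
    MaterialRecord("M2", "hex  bolt m8", "EA"),   # same item, different spelling
    MaterialRecord("M3", "", "KG"),               # no description
    MaterialRecord("M4", "Copper Wire", "BOX"),   # unit not in the assumed list
]
sample_issues = check_records(sample)
print(sample_issues)
print(f"Veracity score: {veracity_score(sample, sample_issues):.2f}")
```

Only one of the four sample records passes every rule, so the score comes out at 0.25. The rules themselves are deliberately simple; the point is that veracity stops being an abstract V once it is expressed as checks that can be run, counted, and governed.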
What do you think about Big Data and MDM coming together? Share your ideas in the comment section below.
Further Reading –
Blog – Master Data Governance is Re-shaping the Enterprises
Case Study – Leading Electric Utility taps Verdantis for help executing Massive MDM and ERP Upgrade Initiative
White Paper – Improve Enterprise Data Quality through Master Data Management
Image Courtesy - liliendahl.files.wordpress.com