As I have been building analytics competencies and platforms with customers, I keep running into the same types of questions and desires, and they center on a critical but poorly understood topic: Master Data Management, or MDM. I thought it was time to get a couple of thoughts on this topic out so others might benefit from what I have seen so far.
What is Master Data Management Really?
First and foremost, MDM, like DevOps, is not a thing. You cannot buy MDM (yet - unlike DevOps, you might be able to get 95% automation or more for MDM). Currently, MDM falls largely into two broad buckets: DIY, and completely ignored.
In the completely ignored world, data is like the junk-drawer in your kitchen; every once in a while you rummage through it looking for something you thought you saw in there once last week and thinking how you should clean it out, but then you get distracted by the TV or the dog or whatever just flew by the window and you close the drawer and ignore it again. Pretty soon though that drawer is overflowing with stuff; some of it is undoubtedly useful - even VERY useful, like the wine opener, flashlights, batteries, and the like. Unfortunately, as in the world of data, most of it is just stuff. It may have a use, but by sitting in the dark in the drawer it doesn't really do anything for you but take up space.
The DIY world of MDM happens when an organization realizes that there is stuff in the junk drawer they need and actually takes steps to clean it up. The reason you cannot "buy" MDM is that there isn't yet a good, comprehensive, and, most critically, AUTOMATED way to classify and tag data. MDM is at its heart a database of data about your data that is, in itself, also useful data for analysis and discovery.
The “nuts and bolts” mechanism for MDM is stupidly simple really: you look at the data and tag it (usually, but not always, in some kind of repository), and then use a simple interface to this repository (usually a search tool) to discover and retrieve the data that is relevant to your particular need. Where DIY MDM breaks down for pretty much everyone is that keeping this catalog of data current and relevant is a full-time job (often a bunch of full-time jobs). This is where automation is critical, since new data is added continuously to every possible type of data source. Just imagine your typical corporate environment and you can get an idea of how truly awful this job really is: managing and maintaining a catalog of files, SharePoint data, databases, external sources, applications, and every other possible data set is just not a simple task.
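To make the tag-and-search mechanism concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular MDM product works: the `CatalogEntry` and `DataCatalog` names, the source names, and the location strings are all hypothetical, and a real catalog would live in a database or a dedicated tool rather than in memory.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One record of 'data about your data': where a source lives and its tags."""
    name: str
    location: str  # hypothetical location string, for illustration only
    tags: set[str] = field(default_factory=set)


class DataCatalog:
    """A toy in-memory catalog of data sources (hypothetical, not a real product API)."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}

    def register(self, name: str, location: str) -> CatalogEntry:
        """Add a data source to the catalog."""
        entry = CatalogEntry(name=name, location=location)
        self._entries[name] = entry
        return entry

    def tag(self, name: str, *tags: str) -> None:
        """Attach tags to a source; tags are stored in lower case."""
        self._entries[name].tags.update(t.lower() for t in tags)

    def search(self, *tags: str) -> list[str]:
        """Return the names of sources that carry every requested tag."""
        wanted = {t.lower() for t in tags}
        return sorted(e.name for e in self._entries.values() if wanted <= e.tags)


# Made-up sources standing in for the files, SharePoint data, and databases above.
catalog = DataCatalog()
catalog.register("crm_contacts", "db://crm/contacts")
catalog.tag("crm_contacts", "customer", "PII")
catalog.register("web_orders", "files://exports/orders.csv")
catalog.tag("web_orders", "customer", "sales")
catalog.register("hr_roster", "sharepoint://hr/roster.xlsx")
catalog.tag("hr_roster", "employee", "pii")

print(catalog.search("customer"))  # ['crm_contacts', 'web_orders']
print(catalog.search("pii"))       # ['crm_contacts', 'hr_roster']
```

The code itself is the easy part. Every `register` and `tag` call here is something a person has to do by hand and then keep current, which is exactly the workload that automation needs to take over.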
Tools to Harness Your Data
There are definitely tools that greatly simplify this process, and they generally enable either better discovery of data sources or simplified management of the resulting catalog. An ideal solution would encompass both capabilities, along with the ability to then build a data pipeline that passes data into the analytics and application work stream seamlessly, without the constant manual intervention of a data engineer or application developer. This solution is on the horizon as the ecosystem around analytics grows, and in the meantime, it is possible to greatly simplify the overall MDM workload and improve time-to-relevance and results with the tools that are already available. The time when doing nothing to manage data was an option has passed, and with the continued explosion of data across the enterprise, getting a handle on this data is more than ever a critical element of every organization’s data-driven strategy.
The key here though is that, like all aspects of digital transformation, the tools are only one element of the transition. The other is driving people and the entire corporate culture towards recognizing the need for data management and accepting tighter rules of engagement around the use and management of data. This is a significantly more difficult part of the overall MDM discussion, but it can be done, and there are great examples of companies that have successfully completed this transition and continue to build competency in the Data Management space.
The digital transformation wave is only just beginning to really gain mainstream momentum, and becoming data driven is a part of this overall change that is both critical to get right and extremely difficult to achieve without the right dedication and tools. With a data-delivery platform like Talend, which provides the best of data integration, governance, quality, and MDM, a tool to support this data-driven imperative exists today. The connection between master data management and data-driven cultures lets organizations take a “new” approach to the 360-degree view of a customer. Looking to get started? Check out this guide from Enterprise Management Associates, “Master Data Management for Data-Driven Organizations”.