Demystifying IBM InfoSphere MDM Products: How the Products Work

Physical, Virtual, Hybrid, and Collaborative MDM—what are these things, and what do they mean to my business? 

MDM. Master Data Management. If you’re part of a large enterprise, you’re either thinking about implementing MDM, you’re already implementing it, it’s already established, or you don’t have it and aren’t planning to get it because it’s expensive, it takes work to set it up, and your boss can’t justify making that sort of investment. Or maybe nobody in your enterprise has even heard of it. (In the latter two cases, it may be time to polish the ol’ resume, if you know what I mean.)

The idea of MDM is more than a software product. The last M is for Management, so think of MDM as a combination of technologies, policies, and practices that enable an organization to control and manage change to a critical asset: its master data.

Physical MDM (IBM MDM Advanced Edition) and Virtual MDM (IBM MDM Standard Edition and IBM MDM Advanced Edition) manage similar types of data: data regarding individuals (patients, customers, citizens), organizations (customers, vendors, other businesses), and various others. Physical MDM additionally covers Product and Account-specific domains. Both products also offer Custom domain capabilities, though the implementation process is quite different. Conceptually, both products consolidate the data from various source systems into a single MDM repository to create a “golden record.” However, Virtual MDM and Physical MDM do this in slightly different ways.

Virtual MDM brings together member records from various source systems and stores them as individual members within the MDM hub. Key attributes will always be included with the member record, including the original source system of the record and its identifier within that source. 

The member records are probabilistically associated with other member records in the hub and are linked together into discrete entities either as a result of a data steward’s decision or, more commonly, based upon the likelihood that the records actually represent the same individual (or customer, or organization, or vehicle, or suspect, etc.). The linking occurs by simply making each member record a part of an entity—effectively associating an entity ID with that member. Two records sharing the same entity ID are considered linked and deemed to represent the same individual (or business, or organization, etc.). 
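To make the linking idea concrete, here is a toy sketch. It is not IBM's matching algorithm or API: the `Member` class, the field weights, the threshold, and the `score` and `link_members` functions are all hypothetical, and a simple weighted-agreement score stands in for real probabilistic matching. It only shows how giving two member records the same entity ID links them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Member:
    source: str                      # original source system of the record
    source_id: str                   # identifier within that source
    attrs: dict                      # e.g. {"name": ..., "phone": ...}
    entity_id: Optional[int] = None  # assigned when the member is linked


# Hypothetical weights and threshold, chosen only for illustration.
WEIGHTS = {"name": 4.0, "phone": 3.0, "address": 2.0}
LINK_THRESHOLD = 5.0


def score(a: Member, b: Member) -> float:
    """Sum the weights of the fields on which both members agree."""
    total = 0.0
    for field, weight in WEIGHTS.items():
        va, vb = a.attrs.get(field), b.attrs.get(field)
        if va is not None and va == vb:
            total += weight
    return total


def link_members(members: list) -> list:
    """Greedy linking: join the first earlier member that scores high enough,
    otherwise start a new entity. (No transitive merging in this sketch.)"""
    next_entity = 1
    for i, member in enumerate(members):
        for other in members[:i]:
            if score(member, other) >= LINK_THRESHOLD:
                member.entity_id = other.entity_id
                break
        if member.entity_id is None:
            member.entity_id = next_entity
            next_entity += 1
    return members
```

In this sketch, a CRM record and a billing record that agree on name and phone would end up with the same entity ID, while an unrelated record would get an entity of its own.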

Each member record may contribute similar attributes (name, address, phone, etc.), with some sources potentially providing attributes unknown to some of the other systems. Since each member contains pointers back to the source record, the hub is also effectively an index or reference. When the entity—the golden record or 360-degree view—is retrieved, the information presented is assembled on the fly from the members that comprise that entity. Thus the golden record does not exist as a true record: it is virtual.
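The "assembled on the fly" behavior can be sketched in a few lines. Again, this is an illustration under assumptions rather than how the product implements it: members are plain dictionaries, and `assemble_entity_view` is a hypothetical helper that gathers every distinct attribute value contributed by the members sharing an entity ID. Nothing is stored for the entity itself; the view is rebuilt on each call.

```python
from collections import defaultdict


def assemble_entity_view(members, entity_id):
    """Build a composite view of one entity from its member records."""
    attributes = defaultdict(list)
    sources = []
    for member in members:
        if member["entity_id"] != entity_id:
            continue
        # Keep the pointer back to the source record (the "index" role of the hub).
        sources.append((member["source"], member["source_id"]))
        for field, value in member["attrs"].items():
            if value not in attributes[field]:
                attributes[field].append(value)
    return {
        "entity_id": entity_id,
        "sources": sources,
        "attributes": dict(attributes),
    }
```

Because the view is computed from the members each time, unlinking a member or updating a source record changes the next view without any separate golden record having to be rewritten.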

Physical MDM also receives member records from various source systems. It will compare those members against data already stored, either using the same probabilistic methods just described or using deterministic rules, to determine whether the new information matches information already present. Unlike Virtual MDM, however, when a match is found, the new data is typically combined with the existing record in the MDM system. In this case, there ends up being a single physical copy that provides a golden record or 360-degree view.

If an incoming record is not matched with an existing record initially, it can still be identified as a duplicate later. This happens either manually (using the provided data stewardship user interface [DSUI] or the provided MDM services) or because a change to the critical data used in the matching process subsequently deems the records duplicates. In either case, a collapse process can take place, creating a net new physical record that is a compilation of the original records, taking into account the established survivorship rules.
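A collapse with survivorship can be sketched as follows. The rule used here, "the most recently updated non-empty value wins," is an assumption chosen for illustration; real survivorship rules are configurable and may differ per attribute. The record layout and the `collapse` function are likewise hypothetical.

```python
from datetime import date


def collapse(records, new_id):
    """Create a net new record from duplicates.

    Assumed survivorship rule: for each attribute, the most recently
    updated non-empty value survives.
    """
    # Process oldest first so that newer values overwrite older ones.
    ordered = sorted(records, key=lambda r: r["last_updated"])
    survivor = {
        "id": new_id,
        "collapsed_from": [r["id"] for r in records],
        "attrs": {},
    }
    for record in ordered:
        for field, value in record["attrs"].items():
            if value:  # empty values never overwrite existing data
                survivor["attrs"][field] = value
    return survivor


older = {"id": "P1", "last_updated": date(2020, 1, 1),
         "attrs": {"name": "A. Lee", "phone": "555-0100", "email": ""}}
newer = {"id": "P2", "last_updated": date(2022, 6, 1),
         "attrs": {"name": "Ann Lee", "phone": "", "email": "ann@example.com"}}

golden = collapse([older, newer], new_id="P3")
```

Here the surviving record takes the name and email from the newer record but keeps the phone number from the older one, because the newer record has no phone value to contribute.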

Physical MDM has a predefined schema that includes common types of data that enterprises might want to include in their master view. Robust tools are provided to add new tables to the schema or extend existing ones. With Virtual MDM, there is no predefined layout. There’s more to the picture, of course, but these are the key differentiators.

For more information, see here.

Hybrid MDM uses both the Physical MDM and Virtual MDM servers. In a Hybrid MDM implementation, source systems feed data into the Virtual MDM system, where the data can be matched and linked with records from other sources. The virtual golden record is then persisted using Physical MDM. Enhanced data, such as privacy preferences and opt-in/opt-out choices, can then be associated with the physical golden record. At present, this only applies to Party data (individuals, businesses), but you are effectively getting the best of both worlds.

At this point, you may be asking yourself, “How do I know whether to pick Virtual MDM, Physical MDM, or Hybrid MDM?” That’s a pretty big discussion and will be the topic of a future blog post, so stay tuned.

Collaboration Server is a very different animal. The data stored and managed by Collaboration Server is product data, which is typically far more complex than the data discussed so far. Product data needs to share information from product families or templates; it may come from various suppliers; it may be represented as part of multiple catalogs or a hierarchy of catalogs; attributes may differ based upon product type; and products may be bundled or related for cross-sell/up-sell. And as the name implies, with the Collaboration Server there is a huge emphasis on collaboration: people from across the enterprise (or even people from outside the enterprise) can readily collaborate on the creation and maintenance of the definition of products. Even the concept of “product” is quite ambiguous, since a chemical company has very different products and volumes than a consumer electronics company or a financial institution. Within the Collaboration Server, the creation of products is controlled by one or more workflows, each of which may have multiple steps, various validations, approvals by inside or outside parties, etc.
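The workflow idea (ordered steps, validations, and approvals) can be pictured with a small sketch. This is not the Collaboration Server's workflow engine or scripting API; the step names, the product fields, and `run_workflow` are hypothetical, and human approvals are reduced to simple flags on the product.

```python
def run_workflow(product, steps):
    """Run a product definition through ordered workflow steps.

    Each step is a (name, check) pair; check returns True to pass.
    Returns the completed step names and the name of the step that
    stopped the workflow (None if every step passed).
    """
    completed = []
    for name, check in steps:
        if not check(product):
            return completed, name
        completed.append(name)
    return completed, None


# Hypothetical steps: one validation followed by two approvals.
steps = [
    ("validate_required_fields",
     lambda p: all(p.get(f) for f in ("sku", "name", "category"))),
    ("supplier_signoff", lambda p: p.get("supplier_approved", False)),
    ("catalog_manager_approval", lambda p: p.get("manager_approved", False)),
]
```

A product that has its required fields and a supplier sign-off, but no manager approval yet, would complete the first two steps and stop at the third.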

For a more comprehensive understanding of the capabilities of the Collaboration Server, see here.

There is a certain degree of overlap between Collaboration Server and Physical MDM. Physical MDM is capable of storing and operationalizing product information, as is Collaboration Server. Physical MDM has data stewardship and product matching capabilities; however, the workflow/approval functionality is more robust in Collaboration Server. At the time of this writing, there is no “Hybrid for Product.”


Paul Mickelson
Solution Implementation Manager/Analyst

Paul Mickelson is a Solution Implementation Manager with over 25 years of technical expertise, including 15 years of customer-facing experience in Professional Services, Pre-Sales, and Consulting roles. He has led enterprise-scale deployment projects of IBM InfoSphere MDM solutions for a wide variety of major clients. Paul brings experience in all project phases, from pre-project planning and requirements gathering through deployment and transition. Additionally, Paul has hands-on experience in all phases of the product lifecycle, from sales through to deployment and ongoing operations. Paul has developed standard project documentation, training materials, sales collateral, and web content. He has led training sessions both in person and remotely. He has experience leading, mentoring, and working with offshore teams in Europe, China, and India.