Data Federation and the CMDB
This is the first in a series of posts looking into the fit between data federation and the Configuration Management Database (CMDB). For those unfamiliar with CMDBs, here's a quick definition from Wikipedia:
A configuration management database (CMDB) is a repository of information related to all the components of an information system. Although repositories similar to CMDBs have been used by IT departments for many years, the term CMDB stems from ITIL. In the ITIL context, a CMDB represents the authorized configuration of the significant components of the IT environment. A CMDB helps an organization understand the relationships between these components and track their configuration.
What got me interested in data federation as it relates to CMDBs was the increasing frequency of the word "federation" in the marketing collateral and communications of several CMDB vendors (you don't have to call me twice to supper...). It made perfect sense: with organizations so large and assets spread all over the globe, there seemed to be no way to store it all in a single database. Plus, it looked to me a lot like reporting, which is a common use for data federation.
On my path to enlightenment, I came across Hank Marquis' post "Enterprise CMDB", which helped me understand the core problem federation is aiming to solve. It turns out that it's not just that one needs to gather data from lots of distributed systems to create a federated CMDB; the data model itself cries out for federation:
But a CMDB does not store most of its data; it references data stored externally in other, perhaps relational, databases. And a CMDB is used to provide contextual awareness over non-obviously related bits of data. For example, a common inquiry posed to configuration management might be: "How many users, in sales, use SAP during the last week of the month?"
This kind of query is not well suited to a centralized relational database with pre-built SQL queries. There are just too many possible combinations of data. This type of query has to pull information from many systems, and the data it needs is probably not all nicely lined up in rows and columns ready to query. No, an enterprise CMDB has to be dimensional, a technology that represents data as different dimensions or planes.
And OLAP is not the answer:
Don't get too excited—simply having OLAP does not give you a CMDB for a couple of very special reasons. First, most CMDB data resides outside of the CMDB system. In order to pull this data from many sources requires federation—a new CMDB buzzword you will begin to hear about more and more.
By way of an example of federation and what it requires, let's consider an IT service for project management. Composing this service are human resource information residing in SAP, project management data in Microsoft Project Server, IT asset information in CA Unicenter, and networking hardware resource data discovered and stored in CiscoWorks.
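To make the idea concrete, here is a minimal Python sketch of the kind of federated lookup Hank describes, using his earlier question about sales users of SAP. Everything in it is an assumption for illustration: the two in-memory lists stand in for an HR system and an application usage log, the field names and function names are made up, and "last week of the month" is taken to mean the final seven calendar days. No real product API is involved.

```python
import calendar
from datetime import date, timedelta

# Hypothetical stand-ins for two separate systems. In a real federation
# layer these would be live queries against each source, not local lists.
HR_SOURCE = [
    {"employee_id": "E1", "login": "alice", "department": "sales"},
    {"employee_id": "E2", "login": "bob", "department": "sales"},
    {"employee_id": "E3", "login": "carol", "department": "engineering"},
]

USAGE_SOURCE = [
    {"login": "alice", "application": "SAP", "day": date(2024, 1, 29)},
    {"login": "alice", "application": "SAP", "day": date(2024, 1, 30)},
    {"login": "bob", "application": "SAP", "day": date(2024, 1, 10)},
    {"login": "carol", "application": "SAP", "day": date(2024, 1, 31)},
]


def last_week_of_month(year, month):
    """Return (start, end) dates covering the final seven days of the month.

    Assumption: "last week" means the last seven calendar days.
    """
    last_day = date(year, month, calendar.monthrange(year, month)[1])
    return last_day - timedelta(days=6), last_day


def count_department_users(hr_rows, usage_rows, department, application, year, month):
    """Join the two sources on login and count distinct matching users."""
    start, end = last_week_of_month(year, month)
    # Who belongs to the department, according to the HR source.
    in_department = {r["login"] for r in hr_rows if r["department"] == department}
    # Who used the application in the date window, according to the usage source.
    active = {
        r["login"]
        for r in usage_rows
        if r["application"] == application and start <= r["day"] <= end
    }
    # The answer only exists once both sources are combined.
    return len(in_department & active)


result = count_department_users(HR_SOURCE, USAGE_SOURCE, "sales", "SAP", 2024, 1)
print(result)
```

Neither source can answer the question alone: the HR data knows who is in sales, and the usage data knows who touched SAP and when. The sketch also assumes both systems share a clean `login` key, which is the easy case.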
There are, of course, challenges with things like reconciliation and synchronization. That's where a product like Ipedo's comes in. We already have many of these features built into our system, providing great tools for someone trying to build a federated CMDB.
Here's how Hank puts it:
An enterprise CMDB is not a singular database, it is a complex system that has to federate other data stores, reconcile alternate views of the same data, detect unauthorized changes, synchronize approved changes with its own metadata store, and be able to dynamically represent configurations graphically on demand. This is no small task, and also the reason there are so few true CMDB solutions available today.
His last sentence really caught my attention. Federation is a hard problem, which is why there are specialist vendors like us. I think CMDB federation represents a huge upside for data federation.
That's it for now. I'll be digging into more specifics around CMDBs in future posts.

