Ipedo Blogs

January 02, 2008

Data Federation and the CMDB

This is the first in a series of posts looking into the fit between data federation and the Configuration Management Database (CMDB).  For those unfamiliar with CMDBs, here's a quick Wikipedia update:

A configuration management database (CMDB) is a repository of information related to all the components of an information system. Although repositories similar to CMDBs have been used by IT departments for many years, the term CMDB stems from ITIL. In the ITIL context, a CMDB represents the authorized configuration of the significant components of the IT environment. A CMDB helps an organization understand the relationships between these components and track their configuration.

What got me interested in data federation as it relates to CMDBs was the increasing frequency of the word "federation" in the marketing collateral and communications of several CMDB vendors (You don't have to call me twice to supper...).  It made perfect sense: with organizations so large and assets all over the globe, there seemed no way one could just have a database to store it all in.  Plus, it seemed to me a lot like reporting, which is a common use for data federation.

On my path to enlightenment, I came across Hank Marquis' post Enterprise CMDB, which helped me understand the core problem federation is aiming to solve.  Turns out it's not just that one needs to gather data from lots of distributed systems to create a federated CMDB, but it's the data model itself that cries out for federation:

But a CMDB does not store most of its data; it references data stored externally in other, perhaps relational, databases. And a CMDB is used to provide contextual awareness over non-obviously related bits of data. For example, a common inquiry posed to configuration management might be: "How many users, in sales, use SAP during the last week of the month?"

This kind of query is not well suited to a centralized relational database with pre-built SQL queries. There are just too many possible combinations of data. This type of query has to pull information from many systems, and the data it needs is probably not all nicely lined up in rows and columns ready to query. No, an enterprise CMDB has to be dimensional—a technology that represents data as different dimensions or plains.

And OLAP is not the answer:

Don't get too excited—simply having OLAP does not give you a CMDB for a couple of very special reasons. First, most CMDB data resides outside of the CMDB system. In order to pull this data from many sources requires federation—a new CMDB buzzword you will begin to hear about more and more.

By way of an example of federation and what it requires, lets consider an IT service for project management. Composing this service are human resource information residing in SAP, project management data in Microsoft Project Server, IT asset information in CA Unicenter, and networking hardware resource data discovered and stored in CiscoWorks.
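To make Hank's sample question concrete, here is a minimal Python sketch of answering "how many users in sales used SAP in the last week of the month" across two separate stores. Everything in it is an assumption for illustration: the two in-memory SQLite databases stand in for an HR system and an application usage log, and the table names, column names, and sample rows are hypothetical. It is not how any particular CMDB or federation product works; it only shows the shape of the problem, where each source is queried on its own and the results are combined in a federation layer.

```python
import sqlite3


def make_hr_source():
    """Stand-in for an HR system (hypothetical schema and data)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE employees (user_id TEXT, department TEXT)")
    db.executemany(
        "INSERT INTO employees VALUES (?, ?)",
        [("alice", "sales"), ("bob", "sales"), ("carol", "finance")],
    )
    return db


def make_usage_source():
    """Stand-in for an application usage log (hypothetical schema and data)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE logins (user_id TEXT, app TEXT, login_date TEXT)")
    db.executemany(
        "INSERT INTO logins VALUES (?, ?, ?)",
        [
            ("alice", "SAP", "2007-12-28"),
            ("alice", "SAP", "2007-12-29"),
            ("bob", "SAP", "2007-12-10"),
            ("carol", "SAP", "2007-12-30"),
        ],
    )
    return db


def count_department_users_of_app(hr, usage, department, app, start, end):
    """Query each source separately, then join the results in the federation layer."""
    dept_users = {
        row[0]
        for row in hr.execute(
            "SELECT user_id FROM employees WHERE department = ?", (department,)
        )
    }
    app_users = {
        row[0]
        for row in usage.execute(
            "SELECT DISTINCT user_id FROM logins "
            "WHERE app = ? AND login_date BETWEEN ? AND ?",
            (app, start, end),
        )
    }
    # The "join" happens here, outside either source system.
    return len(dept_users & app_users)


hr_db = make_hr_source()
usage_db = make_usage_source()
sales_sap_users = count_department_users_of_app(
    hr_db, usage_db, "sales", "SAP", "2007-12-25", "2007-12-31"
)
print(sales_sap_users)
```

Note that neither source knows about the other. The question only becomes answerable once something sits above both of them, which is the role federation plays.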

There are, of course, challenges with things like reconciliation and synchronization.  That's where a product like Ipedo's comes in.  We already have many of these features built into our system, providing great tools for anyone trying to build a federated CMDB.

Here's how Hank puts it:

An enterprise CMDB is not a singular database, it is a complex system that has to federate other data stores, reconcile alternate views of the same data, detect unauthorized changes, synchronize approved changes with its own metadata store, and be able to dynamically represent configurations graphically on demand. This is no small task, and also the reason there are so few true CMDB solutions available today.

His last sentence really caught my attention.  Federation is a hard problem, which is why there are specialist vendors like us.  I think CMDB federation represents a huge upside for data federation.

That's it for now.  I'll be digging into some more specifics in and around CMDB in future posts.

December 19, 2007

BI and Enterprise 2.0

Came across an interesting article from Colin White, who has been chronicling changes in the BI and integration worlds for some time now.  It's called Using Enterprise 2.0 for Business Intelligence, and it's worth the read.

Here are some of the points that stood out for me:

Enterprise 2.0 does not replace current approaches – it simply provides innovative ways of quickly building some urgently needed business user capabilities.

This is a great opening point.  Business users and BI analysts should agree that bringing in some more modern Web features to BI is a good thing.  CIOs might like the fact that they can do it on the cheap.  Existing BI vendors might not like this (more on that later).

Most business intelligence applications focus on creating reports and analytics that aid executives and analysts in developing and optimizing strategic and tactical business plans and initiatives. As I have discussed in previous Business Intelligence Network articles, companies now want to take business intelligence to the next level by using it to drive daily business operations and to expand its use to a broader set of users, both inside and outside of the organization.

Another good point, something many of us have been calling "BI for the masses."  Some have derisively termed this "BI for the unwashed masses," feeling that not everyone can understand what they are seeing.  This is certainly an unenlightened view, and a dangerous one if your competition puts information into their customers' and employees' hands.

And what does the technology look like?  Here are three of my favorites that fall in line with what we've been talking about here at Ipedo for some time:

  1. Information Exploration and Analysis: Employs technologies such as federated queries, enterprise search, and content analytics to explore and analyze unstructured and semi-structured business content, including that produced by the information collaboration component.
     
  2. Information Integration: Extends traditional enterprise data capture and transformation technologies with the ability to process unstructured and semi-structured business content, including that produced by the information collaboration component.   
     
  3. Information Syndication and Delivery: Uses syndication protocols (RSS and ATOM, for example), syndication servers, and data mashups to publish and deliver all types of structured information and unstructured business content to enterprise applications and users.

And he closes with this synopsis of what the current BI vendors are doing:

Enterprise 2.0 approaches have started to gain momentum outside of the business intelligence environment, but BI vendors have been slow to consider, or adopt, new Web 2.0 technologies. This has opened up an opportunity for smaller vendors outside of the traditional BI marketplace to compete with established BI vendors for a slice of the Enterprise 2.0 BI environment. These new vendors will be able to move quickly into the BI marketplace with lower cost and more usable solutions that may not have the functionality of the big BI platforms, but nevertheless will be attractive for addressing certain types of business problems. Meanwhile, the large BI platform vendors will find it difficult to move rapidly to compete because of their poorly integrated and older product architectures. It will be interesting to see how this battle will emerge over the next two to three years.

This is certainly true.  Why, one large BI vendor who shall remain nameless is still built on CORBA.  CORBA!  How many times have you seen that on the agenda at a Web 2.0 conference?  The point is that many product suites are showing their age.  In most cases, a more modern approach can yield just the results you need.

December 04, 2007

New ROI Guide

Over the past two years, one of our consistently most popular pieces of collateral has been our ROI guide.  So I'm happy to announce that we've released a revised and refreshed version, with a renewed focus on information virtualization and some tweaks from customers and partners who have used it.

The guide contains five different quantitative models for calculating returns, so organizations can choose the one that most closely matches their technical or business situation. The guide also includes qualitative methods that show how information virtualization can provide upside impact to a business.

To get your very own copy, just go here.

November 27, 2007

Fusing Together Oracle Products with Data and Service Virtualization

We did a demo recently that I thought was worth sharing.  With all the talk of Oracle Fusion in the press recently, I thought it was worth showing how simply and easily you can create custom combinations of data using Ipedo's data and service virtualization techniques.

In the demo we fuse together data from some Oracle 10g databases, Salesforce.com (its CEO is ex-Oracle, so we threw it in), and data moved from SQL Server by Oracle DI.  We put it all into Oracle BI Publisher, run Ipedo on Oracle AS, use another Oracle 10g database as a cache, and lastly move it into Oracle BPEL.  Whew!  Our professional services guys are now almost Oracle experts.

Here's the demo.  It's 15 minutes long, but broken into sections so you can navigate around quickly.

September 26, 2007

New Release of Ipedo's Virtual Integration Server

I'm very pleased to announce the new version of Ipedo XIP.  In 4.3, we've added a bunch of features to make the product even stronger.   I'm especially excited about the new incremental caching capabilities.

If you missed the announcement, here are the highlights:

The new enhancements are designed to increase both the performance and flexibility for large deployments, as well as accommodate the growing use of open source databases in the enterprise.  Whether creating data services in an SOA, or new reports in business intelligence applications, customers can access combinations of business information using a simple templated approach, with performance levels that are transparently maintained.  New features include:

 

-- Incremental Caching -- Ipedo XIP now allows for its cache to be
    refreshed incrementally, significantly reducing network traffic when a
    cache refresh is required.  Working automatically and in the background,
    the new incremental caching capability watches remote data sources for
    changes and automatically keeps the cache in sync.

-- Parameterized Views -- Parameterized views allow developers to
    encapsulate complex queries as parameterized templates, thereby avoiding
    construction of complex ad-hoc query strings at execution time. They are
    defined using a syntax common to prepared statements at design time, and
    parameter values are passed into the view at run time. Ipedo XIP performs
    all the optimizations on the resulting composite query tree to ensure
    efficient execution of the user query.

-- MySQL Enterprise Server Support -- Ipedo XIP can now work seamlessly
    with MySQL Enterprise Server, both as a data source and as a results cache.
    Extensive testing on data type interoperability and performance was done to
    ensure optimal deployment.
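For readers wondering what incremental caching looks like in principle, here is a small Python sketch. It is a generic "high-water mark" approach under stated assumptions, not Ipedo XIP's actual implementation: the RemoteSource and IncrementalCache classes, their methods, and the sample data are all hypothetical, and the source is assumed to stamp each row with a change counter.

```python
class RemoteSource:
    """Stand-in for a remote data source that stamps each row with a change counter."""

    def __init__(self):
        self._rows = {}    # key -> (value, version at last change)
        self._version = 0

    def upsert(self, key, value):
        self._version += 1
        self._rows[key] = (value, self._version)

    def changes_since(self, version):
        """Return rows modified after the given version, plus the current version."""
        changed = {k: v for k, (v, ver) in self._rows.items() if ver > version}
        return changed, self._version


class IncrementalCache:
    """Keeps a local copy and pulls only rows changed since the last refresh."""

    def __init__(self, source):
        self._source = source
        self._data = {}
        self._high_water_mark = 0

    def refresh(self):
        """Fetch only changed rows; return how many rows were transferred."""
        changed, version = self._source.changes_since(self._high_water_mark)
        self._data.update(changed)
        self._high_water_mark = version
        return len(changed)

    def get(self, key):
        return self._data.get(key)


source = RemoteSource()
source.upsert("router-1", "10.0.0.1")
source.upsert("router-2", "10.0.0.2")

cache = IncrementalCache(source)
first_transfer = cache.refresh()     # initial load pulls every row

source.upsert("router-2", "10.0.0.20")
second_transfer = cache.refresh()    # only the changed row is fetched
print(first_transfer, second_transfer, cache.get("router-2"))
```

The sketch leaves out deletes and background scheduling. A real system would need both, but the saving in network traffic comes from the same idea: remember how far you have read, and ask only for what changed after that point.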

The new parameterized views are quite powerful.  MySQL support will be a hit in several of our accounts.  You can see the full release here.
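To illustrate the general idea of a parameterized view, here is a minimal Python sketch using the standard sqlite3 module. It is not Ipedo XIP syntax: the ParameterizedView class, the view name, and the orders table are hypothetical, made up for this example. What it does show is the pattern described in the release, a query template defined once with prepared-statement-style placeholders, and values passed in at run time instead of being spliced into an ad-hoc query string.

```python
import sqlite3


class ParameterizedView:
    """A named query template whose parameters are bound at run time (hypothetical helper)."""

    def __init__(self, name, sql):
        self.name = name
        self.sql = sql  # written once at design time, using :named placeholders

    def run(self, connection, **params):
        # Values are bound by the driver, never concatenated into the SQL text.
        return connection.execute(self.sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("Acme", "west", 1200.0),
        ("Acme", "west", 300.0),
        ("Globex", "east", 950.0),
        ("Initech", "west", 80.0),
    ],
)

big_customers = ParameterizedView(
    "big_customers_by_region",
    "SELECT customer, SUM(amount) AS total FROM orders "
    "WHERE region = :region GROUP BY customer "
    "HAVING SUM(amount) >= :min_total ORDER BY customer",
)

# The same template serves different callers with different parameter values.
west_rows = big_customers.run(conn, region="west", min_total=1000)
east_rows = big_customers.run(conn, region="east", min_total=500)
print(west_rows, east_rows)
```

Because the query text never changes, it can be reviewed and optimized once, and callers only ever supply values.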

 

September 21, 2007

Enterprise Information Virtualization and the "Information Fabric"

I have to say, I am liking the new coinages from Mike Gilpin and Noel Yuhanna at Forrester: Enterprise Information Virtualization and the Information Fabric.

What's interesting is that last year they called it Enterprise Data Virtualization, but have correctly broadened the definition to include content, thus the switch to "information."  Here's how they described it in last year's report, which is the only publicly available summary:

Enterprises are facing the growing challenges of using disparate sources of data managed by different applications, including problems with data integration, security, performance, availability, and quality. Business users want fast, real-time, and reliable information to make business decisions, while IT wants to lower costs, minimize complexity, and improve operational efficiency. New technology is emerging that Forrester has coined "information fabric," a term defined as a virtualized data layer that integrates heterogeneous data and content repositories in real time. This technology is provided via middleware components that deliver quality information — "the truth" — when and where it's needed. (Here's a link to this year's EIV report - subscr. req.)

I like the idea of Enterprise Information Virtualization being the category, with Information Fabric being the catchy rubric for the concept.  That leaves the technologies for implementing EIV: query federation, virtual views, metadata and distributed caching, which fall in place nicely.  Forrester adds enterprise search to the list of technologies, which could make sense, depending on the application.

I've been talking about virtualization of data sources for quite a while in this blog, so it's nice to see Forrester's work. 

I think we still have a few loose ends to take care of, however.  If EII and EIV are essentially the same thing (this is my take, not Forrester's), then what do we do with the term EII? Forrester also has Information as a Service (IaaS) in the mix, which I get, but to me it sounds more like a hosted integration service.  Gartner talks about "data services" (which is about more than virtualization, and lacks content, hence the 'data') and the newer "Information Infrastructure."

Stepping back, I think it can be said that, despite the overabundance of terms, the analysts are converging on the idea that a new kind of infrastructure is needed that can deliver the right combinations of data and content to the right users on demand.  This infrastructure must, by its nature, be distributed.  Centralization is not an option.  It's also not about sticking information on a bus - this is for applications, not people.  Sounds like the perfect fit for a combination of query federation (structured and unstructured), virtual views, metadata and caching, which is exactly what Ipedo's got - no matter what you call it.

August 30, 2007

Data(base) Virtualization in the Mix

Came across a convenient summary of virtualization strategies today.  Though data and database virtualization have been discussed here and by analysts that cover integration, this is one of the first articles I've come across that discusses data and database virtualization as full-fledged peers of the more familiar application and storage virtualization technologies.

A bit brief on the specifics, but a useful summary nonetheless.

Here's the link to the article.

August 16, 2007

EII Presentation from Mark Madsen

Came across a nice presentation on EII today by Mark Madsen of Third Nature.  It's called "Enterprise Information Integration Technology: Architectures, Uses and Abuses."

I highly recommend it to those looking to learn about EII and federation.  It gives a very detailed view and advises on where and when to consider EII.   I have to say, I couldn't have explained it better myself.

Here's a link to Mark's blog entry with the embedded presentation.

July 19, 2007

Data Transformations in EII

From time to time I get asked about Ipedo's data transformation capabilities.  I was asked again a few weeks ago by a prospect, so I decided to put two of our developers to work.   I asked them to write a few posts that explain what Ipedo XIP can do in the area of data transformation.

Now we have two great entries that describe both halves of Ipedo's dual-core query engine: one on transforms using SQL, and one on transforms using XQuery.

Both of these just scratch the surface.  Just like the old Chinese expression "A long journey starts with the first step," these two entries only begin to expose the rich set of transformation capabilities available.  But they do cover some common cases in EII, which are often different from ETL needs. 

Let us know what you think.  If you want more examples, all you need to do is ask.  In the words of another Chinese proverb, "Dig a well before you are thirsty."

May 18, 2007

Fusing SQL and XQuery to Combine XML and Relational Data for Reporting

There is a lot of talk these days about harnessing content for use in business intelligence.  This is something we've been enabling for some time at Ipedo, but it seems like more and more people either have the need or are becoming aware of the possibilities.

Generally speaking, there are two types of content people are interested in - search results from textual documents and access to documents represented as XML.  I know, I know, there are more than that.  But if you look at what people are actually using, I contend they are looking to utilize search results on Web and/or Microsoft Office documents; or they want to search and extract information from semi-structured XML documents, which are usually forms, industry schemas (MISMO, FpML, ACORD et al), or transformed Microsoft Word or Excel.

So I want to show you something really cool.  Take a look at the picture below, which shows how you can use SQL to query semi-structured XML documents.  Perfect for BI.

[Figure: Ipedo XML Tables]

The magic is our Dual-Core Query Engine that marries SQL and XQuery.  We have something called XML Tables that make XML documents look like relational tables.  So you can query a bunch of XML documents using SQL and manipulate the results in a reporting tool like Crystal Reports.  Or you can query a combination of relational and XML sources using SQL and manipulate the results in Crystal et al.
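If you want a feel for the concept without the product, here is a rough Python sketch of the "XML documents as relational tables" idea using only the standard library. To be clear about the assumptions: this is not how XML Tables are implemented, the loan documents, element names, and branches table are invented for the example, and the sketch copies the XML data into an in-memory SQLite table rather than querying it virtually. The point is only to show what becomes possible once XML content has a row-and-column shape: plain SQL, including joins against ordinary relational data.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical semi-structured documents, one loan per XML document.
LOAN_DOCUMENTS = [
    "<loan id='L-1'><borrower>Ann Lee</borrower><amount>250000</amount><branch>SF</branch></loan>",
    "<loan id='L-2'><borrower>Raul Diaz</borrower><amount>100000</amount><branch>SF</branch></loan>",
    "<loan id='L-3'><borrower>Mei Chen</borrower><amount>400000</amount><branch>NY</branch></loan>",
]


def xml_table_rows(documents, fields):
    """Project each XML document to one flat row: the id attribute plus one column per field."""
    for doc in documents:
        root = ET.fromstring(doc)
        yield (root.get("id"),) + tuple(root.findtext(f) for f in fields)


conn = sqlite3.connect(":memory:")

# An ordinary relational table (hypothetical data).
conn.execute("CREATE TABLE branches (code TEXT, manager TEXT)")
conn.executemany("INSERT INTO branches VALUES (?, ?)", [("SF", "Dana"), ("NY", "Raj")])

# The relational face of the XML documents.
conn.execute(
    "CREATE TABLE loans_xml (loan_id TEXT, borrower TEXT, amount REAL, branch TEXT)"
)
conn.executemany(
    "INSERT INTO loans_xml VALUES (?, ?, ?, ?)",
    xml_table_rows(LOAN_DOCUMENTS, ["borrower", "amount", "branch"]),
)

# A reporting-style query that joins XML-derived rows with relational rows.
report = conn.execute(
    "SELECT b.manager, COUNT(*) AS loans, SUM(l.amount) AS total "
    "FROM loans_xml l JOIN branches b ON b.code = l.branch "
    "GROUP BY b.manager ORDER BY b.manager"
).fetchall()
print(report)
```

A reporting tool pointed at a view like this never needs to know that half of its data started life as XML.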

If you want to see something live, you can also see how this same technology works against combinations of relational and Web Services data here.