Colin White has an interesting article over at the Business Intelligence Network called Using Unstructured Business Content in Business Intelligence. Colin does his usual thorough job, and I have to say the term "unstructured business content" is pretty nice. It really helps me separate the useful stuff from what most of us have just been calling "unstructured content." I think "unstructured business content" does a better job of capturing what's inside industry-specific XML schemas like XBRL, ACORD and others.
And here's a great summary of how this type of content might be used:
A review of semi-structured information shows that a high
percentage of this type of information is in an XML format. Tags in XML
files provide some semantic information about the contents of the file.
There are also an increasing number of industry XML vocabularies, or
metamodels, that add additional semantics to XML documents. An example
here is XBRL for reporting financial information. XML is becoming the
standard approach for exchanging information between systems and
between companies.
Examples of applications that can analyze unstructured and
semi-structured content and thus enhance BI processing include customer
and market intelligence, pricing optimization, customer sentiment and
complaint analysis, product safety and quality analysis, regulatory
compliance, legal discovery, fraud detection, financial analysis, and
IP protection.
One thing I would like to throw into the mix here is the power of XQuery, especially when it can be easily combined with SQL. Colin has an extensive section in the article covering the different ways of processing business content, but it comes up a bit short on specifics about what you can do with XQuery.
First, as opposed to simple search (which he mentions), XQuery can pinpoint, extract and transform relevant pieces of unstructured business content. Second - and this is the big one - you can do selects and joins with XQuery. Last, but not least, as we do in Ipedo's product, you can seamlessly combine SQL with XQuery - so creating Oracle + XBRL reports is a snap. Imagine GROUP BY functions across XML policy reports and your CRM system, all in SQL (with XQuery invoked for you), and pluggable right into Crystal Reports.
It's all here today. In production.
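To make the select, join and GROUP BY idea above a little more concrete, here is a rough sketch in Python. It is not XQuery and it is not how Ipedo's product works internally; it only mimics the pattern using the standard library (ElementTree for the extraction step, an in-memory SQLite table standing in for the CRM system). The XML layout, the element and attribute names, the customers table and the function names are all made up for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical XML policy report; the element and attribute names are invented.
POLICY_XML = """
<policies>
  <policy customerId="1"><type>auto</type><premium>1200.00</premium></policy>
  <policy customerId="1"><type>home</type><premium>800.00</premium></policy>
  <policy customerId="2"><type>auto</type><premium>950.00</premium></policy>
</policies>
"""


def extract_policies(xml_text):
    """Pinpoint and extract the relevant pieces of each <policy> element."""
    root = ET.fromstring(xml_text)
    return [
        (int(p.get("customerId")), p.findtext("type"), float(p.findtext("premium")))
        for p in root.findall("./policy")
    ]


def premium_by_region(xml_text):
    """Join the extracted XML values with relational data and group by region."""
    conn = sqlite3.connect(":memory:")
    # Stand-in for a CRM table (hypothetical schema and rows).
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "West"), (2, "East")])
    # Expose the XML-derived rows to SQL so they can be joined like any table.
    conn.execute("CREATE TABLE policies (customer_id INTEGER, type TEXT, premium REAL)")
    conn.executemany("INSERT INTO policies VALUES (?, ?, ?)", extract_policies(xml_text))
    rows = conn.execute(
        "SELECT c.region, SUM(p.premium) "
        "FROM customers c JOIN policies p ON p.customer_id = c.id "
        "GROUP BY c.region ORDER BY c.region"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for region, total in premium_by_region(POLICY_XML):
        print(region, total)
```

The point of the sketch is the shape of the work: pull structured values out of the XML, then treat them as just another table in a SQL join. With XQuery combined with SQL, the extraction step and the join can live in a single query instead of two separate stages.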