Source: Connotate Blog Post
Web Scraping, Clipping, Encapsulation, Ripping, Surfacing… Call it What You Want – The Value of Web Data is Undeniable
Blog Post (by Gina Cerami) – November 10, 2008
Having just returned from the TDWI event in New Orleans, I can say it’s undeniable that many of you want to consume unstructured web data and use it for your business intelligence initiatives.
According to the “Capturing Web Data and Content for BI” session presented by Mark Madsen of Third Nature, 45% of people report that web pages are being accessed for data now, and that number will increase to 80% in three years. They’re looking for public census data, competitive pricing information, industry data published by industry associations and government sites, production data, industry benchmark data, sentiment data and more.
The session was quite informative, but it presented making sense of unstructured web data for the BI team as a difficult task, one that requires intimate knowledge of web integration tools, web application frameworks and application architectures. While it’s important for the IT team to understand and appreciate these nuances, all the business user should be concerned with is:
It was great to see and hear how more and more people are relying on web data for their business intelligence needs, as it only reinforces the value web data provides. But where were the vendors and success stories focusing on scalable solutions that could handle all that unstructured web data? Where were the solutions that are enterprise-class but don’t require a programmer’s thought process and skills to harness the value of the web? Think I’ll reach out to some BI departments and present them with a scalable, enterprise-class solution!