Connotate Web Scraping for Data Monitoring, Extraction & Collection

Supporting Large-Scale Content Ingestion Through Automated Data Collection

Connotate helps the Associated Press (AP) automate data collection for its large-scale content ingestion and aggregation pipeline.

The AP is one of the largest and most trusted sources of independent news gathering in the world. Its news pipeline ingests, enriches and distributes real-time text, audio, imagery and video news 24/7 across the globe.

Several years ago, the AP sought to automate content aggregation from member sources that lacked support for direct news feeds. Today, Connotate is a key component of AP Exchange, a content management system used by AP members worldwide to exchange news in real time.

Connotate is currently working with the AP to transition its on-premises deployment to a maintenance-free, hosted solution, preserving the continuity of the AP’s business processes while reducing operational expenditures.

Project Highlights

  • Connotate aggregates content from hundreds of AP member sources, delivering it in a usable format
  • Connotate collects Web data and metadata, applying contextual information and delivering the package to a database where it is normalized and indexed
  • Connotate also delivers accurate, timely content to the AP Mobile application suite (for Apple iOS, Android and other mobile and tablet platforms)
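The flow described in the highlights above (collect Web data and metadata, attach contextual information, deliver to a database, normalize and index) can be pictured with a minimal sketch. This is not Connotate's implementation or API. It is an illustration using only the Python standard library, and every name in it (`ArticleParser`, `collect`, `normalize`, `store`, the `articles` table, the sample URL and source) is a hypothetical assumption.

```python
import sqlite3
from html.parser import HTMLParser


class ArticleParser(HTMLParser):
    """Hypothetical extractor: gathers <title> text and <meta name/content> pairs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content") is not None:
            self.meta[attrs["name"]] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def collect(html, url, source):
    """Extract data and metadata, then attach context (where the page came from)."""
    parser = ArticleParser()
    parser.feed(html)
    return {"url": url, "source": source, "title": parser.title, "meta": parser.meta}


def normalize(record):
    """Collapse whitespace and lowercase metadata keys so records are comparable."""
    meta = {key.lower(): value for key, value in record["meta"].items()}
    return {
        "url": record["url"].strip(),
        "source": record["source"].strip(),
        "title": " ".join(record["title"].split()),
        "author": meta.get("author", "").strip(),
    }


def store(conn, record):
    """Deliver the record to a database table that is indexed by source."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS articles "
        "(url TEXT PRIMARY KEY, source TEXT, title TEXT, author TEXT)"
    )
    conn.execute("CREATE INDEX IF NOT EXISTS idx_articles_source ON articles (source)")
    conn.execute(
        "INSERT OR REPLACE INTO articles VALUES (:url, :source, :title, :author)",
        record,
    )
    conn.commit()


# Assumed sample input; a real member site would be fetched over HTTP.
sample_html = (
    "<html><head><title>  City council\n approves budget </title>"
    '<meta name="Author" content="Jane Doe"></head></html>'
)
conn = sqlite3.connect(":memory:")
store(conn, normalize(collect(sample_html, "https://example.com/story", "Example Gazette")))
```

Keeping collection, normalization and storage as separate steps means a new member source would only need a different extractor, while the downstream format stays the same.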

“Connotate has allowed us to automate a tedious, time-consuming process with an enterprise-class solution for content detection, extraction and acquisition. This has dramatically increased the reach, timeliness and accuracy of our news ingestion operation.” – Lorraine Chichowski, Senior Vice President, Technology, Associated Press

Discover how Connotate can help your organization automate news and content aggregation to reduce costs and support revenue-generating products and services. Contact Us Today!

Market:
News and content distribution.

Challenge:
Extend the use of automation to support large-scale content ingestion, aggregation and enhancement, while requiring minimal effort on the part of AP members who use the distribution system.

Results:
Connotate automates data collection from 350 AP member sites. On any given day, 3.5 billion people see news distributed by the Associated Press.
