REVIEWER: Keith P. DeWeese, director, information and semantics management at The Tribune Co.

BACKGROUND: The Tribune Co. is one of the country’s leading multimedia companies, operating businesses in news publishing, digital and broadcasting. Since late 2007, the Tribune has been using SAS Ontology Management and SAS Enterprise Content Categorization. These applications, both part of the SAS Text Analytics suite, are used as part of the Tribune’s natural language processing initiative, semi-automatically indexing and categorizing Tribune and third-party content.

PLATFORMS: Dell PowerEdge 2950 servers, dual quad-core Intel Xeon Processors, 16GB memory, Windows 7, MySQL.

PROBLEM SOLVED: Time is of the essence when breaking news hits, and websites must post a journalist’s story quickly. We needed a way to quickly identify the topics covered in news content – beyond the headline – in order to relate articles and drive traffic to all the other relevant portions of our website. Also, we needed a way to index the content so our site visitors could find and access the articles they’re most interested in. The Tribune uses SAS Text Analytics to quickly and accurately define descriptive metadata, and automatically index, tag and organize this electronic content as it is added by both consumers and editorial staff, all with minimal use of internal resources.

PRODUCT FUNCTIONALITY: SAS Text Analytics automates the organization of electronic documents within our network of news and broadcast television websites, spanning markets across the U.S. For example, documents in each of the Tribune’s Topic Galleries are identified by parsing the content using terminology and definitions, or rules, managed using SAS Ontology Management and SAS Enterprise Content Categorization. The process evaluates the content and associates it with other relevant items. Based on the content of the articles, SAS applies meaningful descriptive metadata to news items and also supports defining the associations that might apply across different content items – both of which are then input into our search and retrieval technologies. We apply this process to news feeds from our markets, and we also tag user-generated content such as blogs, photos and comments. All of this significantly improves the customer experience. SAS Text Analytics significantly increased Tribune page views, exceeding our initial implementation goals by 300 percent. We improved the Tribune’s search-and-query retrieval. We can now extract greater value from our information assets as new products using news content are developed. The automation SAS provides makes us much more efficient.
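
To make the general idea of rule-based tagging concrete, here is a minimal Python sketch. It is not SAS code and does not reflect how SAS Enterprise Content Categorization or SAS Ontology Management work internally; the taxonomy, the function names (tag_article, related_articles) and the sample headlines are all hypothetical. It only illustrates matching content against managed terms to produce topic metadata, then using shared topics to associate items.

```python
import re

# Hypothetical mini-taxonomy (an assumption for illustration only):
# each topic maps to a set of terms that signal it.
TAXONOMY = {
    "Politics": {"election", "senate", "governor"},
    "Sports": {"playoffs", "quarterback", "inning"},
    "Weather": {"storm", "tornado", "forecast"},
}


def tag_article(text, min_hits=1):
    """Return the sorted topics whose terms appear at least min_hits times
    (counted as distinct terms) in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(
        topic for topic, terms in TAXONOMY.items()
        if len(words & terms) >= min_hits
    )


def related_articles(articles):
    """Map each article id to the ids of other articles that share
    at least one topic tag with it."""
    tags = {aid: set(tag_article(text)) for aid, text in articles.items()}
    return {
        aid: sorted(
            other for other in tags
            if other != aid and tags[aid] & tags[other]
        )
        for aid in tags
    }


# Invented sample headlines, not Tribune content.
articles = {
    "a1": "Governor declares emergency as storm nears",
    "a2": "Tornado forecast worries residents",
    "a3": "Quarterback injured before playoffs",
}
```

In this sketch, the tags returned for each item play the role of descriptive metadata, and the mapping from related_articles stands in for the associations that would be passed to search and retrieval. A production system would use far richer rules than simple term matching.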

STRENGTHS: SAS software’s advanced browsing and search-and-replace capabilities make ontology exploration and modification consistent and comprehensive. The ability to collaborate in developing multiple ontology projects is simplified with an interactive GUI for editing and refining semantic terms and relationships. SAS allows us to develop ontologies and use them immediately for classifying content, thanks to SAS Ontology Management’s integration into SAS Enterprise Content Categorization. SAS provides prebuilt, industry-specific taxonomy starter kits and prebuilt rules for defining concepts and their attribute values to jump-start taxonomy development projects. SAS Ontology Management, as a terminology management application, has many uses and is a must-have for us. 

WEAKNESSES: We’d prefer that SAS Ontology Management were Web-based and that it had a dashboard with customizable components. Some tasks, such as relating terms, are multistep processes.

SELECTION CRITERIA: The Tribune Co. selected SAS Enterprise Content Categorization and SAS Ontology Management because of their widespread use in the news industry, their ability to integrate with our infrastructure, their cost, SAS support and the available add-ons, such as the IPTC NewsCodes.

DELIVERABLES: The primary output for the Tribune is descriptive metadata that can be leveraged by other applications supporting various initiatives – for example, search, product aggregation and content push.

VENDOR SUPPORT: SAS consistently and comprehensively supported the pre-implementation of the applications and met all deadlines. Likewise, now that the applications are in production, support is available to the Tribune 24x7, and it is always efficient and solution-focused. SAS’ customer support has been exceptional.

DOCUMENTATION: The documentation is thorough, though somewhat daunting in coverage. Of course, this is because the applications are highly sophisticated workhorses, so there’s no quick-start approach. Additionally, because the documentation covers so much ground, its utility depends on one’s background and interaction or development with the applications. For example, some parts are useful to IT developers and other parts are useful to linguists.
