A New Tool for Your Content Arsenal
Editor’s Note: For years content strategists and marketers (us included) have struggled with the laborious nature of executing content inventories and audits. The recent launch of CAT (Content Analysis Tool) immediately caught our attention, so we wanted to get Paula Land, co-creator of CAT, on here to give you a proper introduction.
The New CAT on the Block
Whether you are managing your own website or in the business of creating and delivering sites for clients, you’ve probably conducted a content inventory and audit to assess the current state of your content. And you may have struggled with the available tools and the sheer volume of data associated with even a moderately sized site. It is typical to spend days and even weeks on the data gathering and organizing process. And that’s before you even begin doing the work of the actual content audit.
Although there’s no getting around the fact that inventorying and auditing involve wrangling a lot of data, we’ve built a new tool that helps you accelerate the process of gathering all that information and starting to make sense of it.
The Content Analysis Tool (CAT), from Content Insight, has just launched. CAT is designed to speed up the process of creating inventories and audits and present the data in a way that’s easier to interact with. CAT consists of a crawling engine, accessible from a Job Setup interface, from which you configure your crawl. You can include or exclude patterns of links, which allows for fine-grained control of the results. For example, if your site includes a number of sub-domains, you may want them included along with your base URL so you get the complete picture. On the other hand, if you’re auditing a specific section of the site and want to focus in on just those pages, you can restrict the crawl to that particular section’s URL structure. If you happen to be a whiz with regular expressions, you can get even more crafty with the configuration.
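To make the include/exclude idea concrete, here is a minimal Python sketch of how pattern-based scoping can work in general. It is not CAT's implementation or configuration syntax; the function name `in_scope`, the `example.com` URLs, and the patterns themselves are assumptions chosen for illustration.

```python
import re

def in_scope(url, include_patterns, exclude_patterns):
    """Keep a URL if it matches at least one include pattern and no exclude pattern."""
    included = any(re.search(p, url) for p in include_patterns)
    excluded = any(re.search(p, url) for p in exclude_patterns)
    return included and not excluded

# Hypothetical rules: cover the base domain plus any sub-domains, skip print views.
include = [r"^https?://([a-z0-9-]+\.)*example\.com/"]
exclude = [r"/print/"]

urls = [
    "http://www.example.com/products/widget",
    "http://blog.example.com/2013/launch",
    "http://www.example.com/print/products/widget",
    "http://other-site.org/page",
]

crawl_list = [u for u in urls if in_scope(u, include, exclude)]
```

To restrict the same sketch to one section of a site, the include pattern could be narrowed to that section's URL structure, for example `r"^https?://www\.example\.com/products/"`.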
When a crawl has completed, the results are presented in a Dashboard view (shown above), which provides a summary of file counts by type and a complete list of all the files found. You can filter the results by file type or status, or sort and view them by URL, type, level, date, or title. In this view you can export the results (a .csv file is generated) or drill down into a detailed view of each page’s data (shown below). The detailed view includes the metadata for the page (title, description, keywords); lists of all the images, media files, and documents associated with the page (click to open and view any of the files in the list); lists of all the links in and out of each page; and even a screenshot of the page (an option that you choose in job setup).
From Data Analysis to Content Analysis
Content inventories are usually considered the quantitative step in assessing site content. They are often just a list of files, which is then organized and supplemented to create the audit, the qualitative analysis. Using CAT, the line between the quantitative and qualitative starts to blur. It allows you to move more quickly and seamlessly from the data analysis part of the project to the actual content analysis.
Among the insights that can be gained about the content using CAT data: URLs and metadata allow you to evaluate whether the site is search engine friendly. Page titles allow you to identify duplicate page titles, which can be a search issue as well as an indication of duplicate content. Links in and out show the connections between content and help surface discoverability issues. Lists of documents help you see if the site relies heavily on non-indexable content, such as PDFs.
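As one example of this kind of analysis, the duplicate-title check can be run against an exported inventory with a few lines of Python. This is a sketch under stated assumptions: the column names `URL` and `Title` and the sample rows are hypothetical stand-ins, so adjust them to match the headers in your actual export.

```python
import csv
import io
from collections import defaultdict

# Stand-in for an exported .csv file; the "URL" and "Title" headers are assumed.
export = io.StringIO(
    "URL,Title\n"
    "http://www.example.com/,Home\n"
    "http://www.example.com/about,About Us\n"
    "http://www.example.com/about-us,About Us\n"
)

def duplicate_titles(rows):
    """Group URLs by normalized title and keep titles used by more than one page."""
    by_title = defaultdict(list)
    for row in rows:
        by_title[row["Title"].strip().lower()].append(row["URL"])
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}

dupes = duplicate_titles(csv.DictReader(export))
```

Titles are lowercased and trimmed before grouping so that differences in case or stray whitespace do not hide a duplicate.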
Using a tool like CAT to automate the inventory not only speeds up the actual data gathering process, but by including information about the pages themselves, it helps you move more quickly into the audit process and on to strategy development.
If you are working on a migration project, the inventory also plays a key role. Many sites are constructed from a variety of sources—a content management system (or two!), databases, ecommerce engines, and so on. So the only way to have a complete list of all the pages that are currently served up to end users is to scan the live site.
Using a CAT-created inventory, you can track all the pages, as well as all of the images, documents, and media files that need to be migrated, helping ensure that you don’t miss key files during the migration. The lists of links in and out help you track what links may be broken if pages are moved or removed, so you can accurately configure your redirects.
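The link-tracking step can be sketched in Python as follows. This assumes you have already arranged the inventory's links-out data as a dictionary from each page to the URLs it links to; that structure, the function name, and the sample URLs are illustrative assumptions, not a format CAT produces.

```python
def links_needing_redirects(links_out, removed):
    """Map each removed URL to the surviving pages that still link to it."""
    affected = {}
    for page, targets in links_out.items():
        if page in removed:
            continue  # links on pages that are going away do not matter
        for target in targets:
            if target in removed:
                affected.setdefault(target, []).append(page)
    return {target: sorted(pages) for target, pages in affected.items()}

# Hypothetical links-out data: page URL -> URLs that page links to.
links_out = {
    "/": ["/products", "/old-pricing"],
    "/products": ["/old-pricing", "/contact"],
    "/old-pricing": ["/contact"],
}
removed = {"/old-pricing"}

redirect_candidates = links_needing_redirects(links_out, removed)
```

Each key in the result is a page that needs either a redirect or updated inbound links, and each value lists the pages where those links live.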
CAT also has a job comparison feature, which allows you to rerun and compare two jobs, listing all the files that have been added, deleted, or changed from crawl to crawl. This information is helpful during the migration process as well, particularly since sites continue to change during the course of the project and you won’t want to either migrate old content or miss new content.
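The general idea behind comparing two crawls can be illustrated with a short Python sketch. It assumes each crawl is reduced to a dictionary from URL to some fingerprint of the page, such as a last-modified date or a content hash; how CAT itself detects changes is not described here, so treat the fingerprint and the function name as assumptions.

```python
def compare_crawls(old, new):
    """Return sorted lists of URLs that were added, deleted, or changed between crawls."""
    added = sorted(new.keys() - old.keys())
    deleted = sorted(old.keys() - new.keys())
    changed = sorted(url for url in old.keys() & new.keys() if old[url] != new[url])
    return added, deleted, changed

# Hypothetical crawl results: URL -> fingerprint (here, a last-modified date).
first_crawl = {"/": "2013-01-05", "/about": "2013-01-05", "/news/old": "2012-11-20"}
second_crawl = {"/": "2013-02-10", "/about": "2013-01-05", "/news/new": "2013-02-09"}

added, deleted, changed = compare_crawls(first_crawl, second_crawl)
```

During a migration, the added and changed lists point to content that still has to be moved, and the deleted list points to content that can be dropped from the plan.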
Give it a (Free) Try
If you want to give CAT a try, click here to register for a free trial account.
P.S. – CAT is brand new, but we already have a great roadmap of new features we would like to add. If there are ways we can improve or enhance it that would help you in your work, please let us know in the comments section below.