Content taxonomy is the art, science and general pain-in-the-@ss process of organizing an existing site to fit with a new site. Sounds easy enough, right?
Imagine being tasked to take 10,000 pages from a site, and find pages that are down (404), pages that are currently being redirected (301 or 302), images, music and audio files, and everything else that comes along with a large website. That part’s easy actually, because there are awesome, free tools that pull all of the information for you. But now that you have all of that information, it’s time to organize all of those line items into a more readable format for a new site.
Prior to doing a content taxonomy, I’d highly recommend doing other upfront strategy work to ensure that the new mapping takes into consideration easy-to-understand navigation and new content opportunities. Some of those upfront strategies could be keyword research, competitive analysis, user interviews, an analytics review and an audit of the current site. With the information gathered from these strategies, it’s easier to see where the current strengths and weaknesses are, so you know where you want to go with the future site.
The majority of content taxonomy is time consuming because a lot of URLs are like this:
If you can’t understand what that URL above is, then neither can your readers/visitors or the search engines, which means you’ll have to go to each individual page to be able to put it in its new home. Then, you need to give it an SEO-friendly URL that people and search engine spiders understand:
And the end result of a content taxonomy is a very, very large spreadsheet that makes content migration easier. It helps your developers know where 301 redirects need to happen, and it helps your content team know exactly what content goes with what page – beautiful!
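To make the developer hand-off concrete, here’s a rough sketch of how that spreadsheet could be turned into a list of 301 redirects. It assumes the sheet has been exported to CSV with two columns that I’m calling old_url and new_url; those column names, the build_redirects helper and the sample rows are all made up for illustration, so swap in whatever your own spreadsheet uses.

```python
import csv
import io


def build_redirects(csv_text):
    """Return (old, new) pairs for every row whose URL actually changes.

    Assumes hypothetical column names 'old_url' and 'new_url'.
    """
    redirects = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        old = row["old_url"].strip()
        new = row["new_url"].strip()
        # Skip blank rows and pages that keep the same URL on the new site.
        if old and new and old != new:
            redirects.append((old, new))
    return redirects


# Illustrative sample data only; not taken from a real site.
sample = """old_url,new_url
/page.php?id=123,/services/web-design/
/about/,/about/
/old-contact.html,/contact/
"""

for old, new in build_redirects(sample):
    print(f"301 {old} -> {new}")
```

The output is just a plain list; how the redirects actually get implemented (server config, CMS plugin, etc.) is up to your developers.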
What if the tool missed pages?
I’d love to say that these tools are ultra-reliable, just like I’d like to say that all of your URLs are linked in such a way that allows search engine spiders to reach all of those pages, but from what I’ve seen, 9 times out of 10, that’s simply not true.
So you may need to double-check your work; yes, that means making sure that you have every URL on the site, even the new ones that were created between the time the content taxonomy was completed and the time the site goes live.
What I typically do is take the original scan of URLs that I have and compare it to a new scan. I then de-dupe pages in Excel and see if there are any new finds. I also like to use Google Analytics (or whatever analytics program is used) to look at the period from the date the original content taxonomy was completed up to now, which helps find pages that the crawler didn’t find.
You should actually pull a list of all of the URLs from your analytics program when you do the initial analysis; it’ll just help ensure that you truly have a complete list of URLs… nothing’s worse than having to start over because the tools’ crawlers were unable to see some pages.
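If the lists get too big to de-dupe comfortably in a spreadsheet, the same comparison can be scripted. This is only a sketch of the idea: normalize and find_unmapped are names I made up, the example.com URLs are placeholders, and the normalization rules (ignore http vs. https, lowercase the host, drop trailing slashes and fragments) are assumptions you may want to loosen or tighten for your site. It also assumes every list uses the same form, either all full URLs or all paths.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to host + path + query so trivial variants de-dupe."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    query = f"?{parts.query}" if parts.query else ""
    # Scheme and fragment are dropped on purpose; only the host is lowercased.
    return f"{parts.netloc.lower()}{path}{query}"


def find_unmapped(original_crawl, *later_sources):
    """Return URLs seen in later crawls/analytics but not in the original crawl."""
    known = {normalize(u) for u in original_crawl}
    found = set()
    for source in later_sources:
        found.update(normalize(u) for u in source)
    return sorted(found - known)


# Placeholder data standing in for crawler and analytics exports.
original = ["http://example.com/", "http://example.com/about/"]
new_crawl = ["http://example.com/about", "http://example.com/blog/new-post/"]
analytics = ["http://EXAMPLE.com/landing?src=email", "http://example.com/blog/new-post"]

for url in find_unmapped(original, new_crawl, analytics):
    print(url)
```

Anything this prints is a candidate for a new row in the taxonomy spreadsheet; it still needs a human to decide where it belongs.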
What if I didn’t map everything?
Such is life; no one’s perfect. You’re probably going to miss some pages, but the key is to make sure you got as many of them as possible, especially the most-trafficked and most-linked-to pages. If you’re working on your own site, this will be much easier because you’ll probably already have an idea of what those pages are, but if you’re working with a client, it’s nice to send over the content strategy ahead of time so they can give it a glance over as well (glance over is an understatement; it’ll probably take them a week to go through the spreadsheet).
But the safest bet is to make sure you’re eyeing your webmaster tools (Google and Bing) like a hawk after the new site launches:
- What pages are they reporting with errors?
- Run the site through Moz Campaigns; is there an influx of error pages there?
- Try to use as many tools as possible to monitor your site closely after a new launch. You need to make sure that any problems are caught immediately and fixed as soon as possible to help the search engine spiders better understand what is going on with your site.
Don’t be surprised if you see a dip in organic traffic following a big site change. Even after you submit the site to be re-crawled, it can sometimes take the spiders days or weeks to catch up to all of the changes. A dip in traffic lasting a month or two wouldn’t be surprising, as long as you’re taking the preventative measures necessary for redirecting pages that were once live. However, if you’re seeing a dip that’s 25%+, you may want to call in a professional to help you review.
Lastly, there are other things that need to be optimized on a site before it goes live; it’s not as easy as just mapping content. Pages need to be optimized and SEO needs to be top of mind from wireframing to launch. I once had a site owner (who didn’t have an SEO person help them through the site redesign) call me freaking out because the new site lost all of its organic traffic – the good news for him was that the developer had simply forgotten to take off the noindex meta tag, so it was a quick fix… but there are many other issues like this that can cause a severe loss in search engine traffic.
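That particular mistake is easy to test for before launch. Here’s a minimal sketch that scans a page’s HTML for a robots meta tag containing noindex, using only Python’s standard library. RobotsMetaParser and has_noindex are names I made up, and the sketch assumes you’ve already fetched the page’s HTML some other way. It only looks at meta tags, so it won’t catch a noindex sent in an X-Robots-Tag HTTP header or a page blocked in robots.txt.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from <meta name="robots"> / "googlebot" tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html):
    """Return True if the page's robots meta tags include 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Made-up page snippet that was never switched over from staging settings.
staging_page = '<html><head><meta name="robots" content="NOINDEX, nofollow"></head></html>'
print(has_noindex(staging_page))
```

Running a check like this over your most important URLs right after launch is a cheap sanity check, not a replacement for a full SEO review.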