---
title: How to Conduct a Content Audit for Your Website
date: 2020-07-28T05:00:00-04:00
author: Sean Smith
canonical_url: "https://website101podcast.com/episodes/season-03/episode-10/content-audits/"
section: Podcast
---
Season 03 Episode 10 – Jul 28, 2020   
23:23 [Show Notes](#show-notes)

## How to Conduct a Content Audit for Your Website

﻿


[](//dts.podtrac.com/redirect.mp3/website101podcast.com/uploads/mp3/season-03/10-Content-Audits-Master.mp3)

Learn why you should conduct a content audit on your website to improve SEO, optimize content, and streamline your site structure. Get expert advice and practical tips in this episode.

<a name="show-notes"></a>

### Show Notes

A discussion of what is included in a content audit, including orphaned links. A content audit is a good opportunity to prune outdated content.

Mike explains how to use a spreadsheet and common columns to include when doing your content audit.

### Show Links

- [Airtable](https://airtable.com/)
- [How to Perform a Website Content Audit to Guide Your Content Marketing Strategy](https://www.semrush.com/blog/content-audit-for-content-marketing-strategy)
- [Screaming Frog](https://www.screamingfrog.co.uk/seo-spider/)
- [WordPress All Export](https://en-ca.wordpress.org/plugins/wp-all-export/)
- [How to conduct a website content audit for your nonprofit](https://nonprofitmarcommunity.com/how-to-conduct-a-website-content-audit-for-your-nonprofit/)

Transcript: accuracy of the transcript is dependent on AI technology.

**\[00:00\]** **Sean:** Hi, and welcome to the ninth episode of season three of the Website 101 Podcast. I'm Sean Smith, your co-host, and as usual, Mike Mella is joining me. Mike, how's things going in isolation?

**\[00:12\]** **Mike:** Yeah, isolation's going okay. Not doing so. We've talked about before that you and I tend to be isolated anyway, just the nature of our work. So, we have more people around us than usual, actually, because our families are around. You may actually hear my family in the background, but you know, pay no attention to the people behind the curtain.

**\[00:31\]** **Sean:** Well, hopefully the self-isolation has ended by the time this episode gets released. For the record, we're recording in the middle of April, and we just started releasing episode one just this week, so we're on episode nine for recording. And today, we are going to talk about content audits: what they are, why you should have one, and how to go about it. And for the most part, this is Mike's area of specialty, so Mike will be doing most of the talking. Mike, what exactly is a content audit?

**\[01:09\]** **Mike:** Okay, well, a content audit is a document that you produce which records all of the content that makes up your website. So anytime you add text or, you know, let's say a staff member or whatever, just a new blog post, anything like that. Anything that really has its own URL, typically, you would document it in what's called a content audit. And usually they take the form of a spreadsheet, so an Excel spreadsheet or whatever, where you have row upon row, and each row contains a link to a page and various other information about that content, which we can get to later.

But that's essentially what it is. It's a document that keeps a record of all the content that makes up your website.

**\[02:01\]** **Sean:** So would this include like the copy and the images and everything on the page or just a link to it and like a little summary about it?

**\[02:09\]** **Mike:** Well, I guess content audits could be different for different people and how they do them and what it is they want to catalog. But in my experience, given that we're doing it for websites, it's typically really documenting the pages of your site. So you wouldn't be capturing every single paragraph that's on your site. It would be more like, OK, well, what are all the pages that we have? What are all the files? You might include PDFs, things like that. So any link that someone can visit and see something, it's cataloging that link and some various characteristics about it.

**\[02:44\]** **Sean:** OK, so this would include content that might be hidden behind like a password wall or a pay wall or something like that as well, correct?

**\[02:54\]** **Mike:** Very much so. Yeah. In fact, I'm going through a content audit with a client right now.

And they have various sort of affinity groups, we call them, where you know, you have different types of audience members. And they each have logins. And when they log in, they see different content than the other groups of users. So you would have to catalog, you know, what is the information that they see that's different from someone else.

Also, very often, people tend to have links, which are sometimes called orphaned links, where you might have a page that you promote through, say, a newsletter or something, but it's not necessarily accessible through your navigation menu of your website.

**\[03:36\]** **Sean:** Right. So like a marketing page or something like that.

**\[03:40\]** **Mike:** Landing page, yeah, they have a bunch of different names. But you need to make sure you record those as well, because obviously if you just do sort of a crawl of your site, it may not show up if it doesn't appear in the navigation normally.
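
Editor's sketch: the point about orphaned pages can be illustrated as a set difference between the URLs you know exist (for example, from a CMS export) and the URLs a crawl actually reached. This is a minimal example, not something described in the episode; the function name `find_orphans` and the sample URLs are hypothetical.

```python
def find_orphans(known_urls, crawled_urls):
    """Return known URLs that a crawl of the site never reached.

    Trailing slashes are ignored so "/about/" and "/about" count as the same page.
    """
    crawled = {url.rstrip("/") for url in crawled_urls}
    return sorted(url for url in known_urls if url.rstrip("/") not in crawled)


# Hypothetical data: a landing page promoted only through a newsletter.
known = ["/about/", "/staff", "/spring-campaign"]
crawled = ["/about", "/staff"]
print(find_orphans(known, crawled))
```

Anything this returns is a candidate for a manual row in the audit, since a crawler alone would have missed it.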

**\[03:52\]** **Sean:** Oh, okay. Well, that's really useful to know. So why would I want to do this?

**\[04:00\]** **Mike:** Well, they're very, very important when you're doing a redesign, which is the case with the client that I'm working with now. We're redesigning their website, which I built for them some time ago, five or six years ago. A lot has changed since then. A lot of content has been added, removed, and so on. And if you're doing a redesign, you need to make sure you are aware of all the content that you're going to be moving over, and that you don't accidentally move something that's outdated, because that's another purpose of it: it allows you to sort of do a spring cleaning of all the content on your site and determine, you know, what should be here and what shouldn't, you know?

**\[04:40\]** **Sean:** Yeah, I regularly recommend that my clients do that if we're doing a rebuild or a redesign or something like that. Hey, this is a good time to prune your content.

**\[04:49\]** **Mike:** Absolutely. Yeah, because very often people can be really good at adding stuff to their site, but it's not often you find a client or a website owner that regularly goes through and, as you said, prunes content that's outdated or whatever.

**\[05:08\]** **Sean:** Have you taken that advice yourself and pruned your own content? Because I know I have a couple of times.

**\[05:11\]** **Mike:** I did do that a few times, yeah, although I'm one of those people that doesn't even update my own work as often as I probably should, so maybe I should start there and do the pruning.

**\[05:23\]** **Sean:** Well, the last time I redesigned my site was about a year and a half ago, and I migrated all my content over because I didn't want to lose any of it, but then I turned off the content that I didn't feel was worthy of still displaying because it was dated or wasn't relevant to what I'm doing at this point.

**\[05:43\]** **Mike:** Yeah, and especially with you and I, we write about the web and it moves so fast that, you know, any given article could go out of date pretty quickly if you're not, you know, keeping it current.

**\[05:56\]** **Sean:** Right, so like I used to have a lot of articles about ExpressionEngine, and they were all written during the ExpressionEngine heyday, which was like 2008 to 2012, during version two. And now version six is coming out later this year. And I don't use it that much. And so a lot of the content in there is not relevant. So I've turned it off. And that was part of my content audit.

**\[06:29\]** **Mike:** Yeah, that's right. I mean, an audit like this is useful even if you're not making any big changes to your site. If you're not doing a redesign, if you're not moving anywhere, whatever, yeah, just to sort of do, as I said before, like a spring cleaning of your website, it can be very rewarding.

**\[06:45\]** **Sean:** OK, so you kind of touched on this a little bit. But what aspects of content should be documented? When we first started off, you said it's the URL, and you want to get all your orphaned content. But what about the exact copy, which you said is not really necessary? There are things like images, and then maybe links to resources such as PDFs or Word docs, or off-site links that you want to document because maybe it's a third-party service that you're using?

**\[07:22\]** **Mike:** Right. Yeah, and again, this is depending on how thorough you want to be and what exactly you want to get out of your audit. But I've got one in front of me here that I'm working on right now, and I'll just go through some of the sort of columns. So just to describe how this looks, picture like an Excel spreadsheet. It doesn't have to be Excel. Google, whatever they call it.

Google, what's it called? Google Sheets, Google Sheets. Anything like that. I use Airtable. But you'd basically have the first column be the page title.

So you might record, you know, say "Who We Are" or "Staff" or whatever the title is. You could have another column for the new page title, if you're doing a redesign and you plan to change it. Maybe it's not going to be called "Get Involved."

Maybe now it's going to be called "Take Action." And if you plan to make that change, you should record that in this. So you might have new page title as one of the columns. And so in that same row, you would edit the information as needed.

You could also have the current URL. You could have a new URL if you're going to be reorganizing the way your addresses sort of work. You could have a notes field, if you have some special notes you need to record about it.

And then you can get more and more granular depending on the CMS you're using. So in my case, I have page type because the CMS I'm using offers different types of pages. You could have the parent page if it's part of a sub menu, like, does it have a parent page? If so, what is that?

And there's a few things like that. And most importantly, or in my experience, one of the most important things to record about each one is, does it need to be updated? Does it need to be removed? Or should it stay as it is?

So if you're doing a redesign, you're going to need to know that about all of your pages. You're going to need to know, OK, this content can move over exactly as it is. Or maybe this is way out of date, we've got to take this down from the site. Or maybe it's, well, this is still relevant, but we've got to contact, you know, Karen in marketing to get the content updated because it's not accurate anymore. That's important.

**\[09:38\]** **Sean:** So like an action column. Yeah. Like keep it as it is, update it, or remove it.

**\[09:44\]** **Mike:** Exactly, and the spreadsheet program that I'm using, like I said, it's called Airtable, allows me to have a column that basically lets me select one of those three options. So for every row, I just select whichever one it is.
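
Editor's sketch: the columns Mike lists (page title, new page title, current URL, new URL, notes, page type, parent page, and a three-way action) could be modelled roughly as below and written out as CSV for any spreadsheet tool. The names `AuditRow`, `Action`, and `rows_to_csv` are assumptions made for illustration, not the layout of Mike's actual Airtable base.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from enum import Enum


class Action(Enum):
    """The three choices discussed for each page."""
    KEEP = "keep"
    UPDATE = "update"
    REMOVE = "remove"


@dataclass
class AuditRow:
    page_title: str
    current_url: str
    action: Action
    new_page_title: str = ""
    new_url: str = ""
    page_type: str = ""
    parent_page: str = ""
    notes: str = ""


def rows_to_csv(rows):
    """Serialize audit rows to CSV text, one header line plus one line per page."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(AuditRow)])
    writer.writeheader()
    for row in rows:
        record = asdict(row)
        record["action"] = row.action.value  # store the plain word, not the Enum
        writer.writerow(record)
    return buffer.getvalue()


# Hypothetical row based on the "Get Involved" to "Take Action" example.
row = AuditRow("Get Involved", "/get-involved", Action.UPDATE,
               new_page_title="Take Action", new_url="/take-action")
print(rows_to_csv([row]))
```

Restricting the action to a fixed set of values mirrors the single-select column described above and keeps typos out of the sheet.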

**\[09:59\]** **Sean:** All right, so one of the things you said there was like maybe changing the URL of your page. So it could be changing from about to about us or about to our company or whatever you want to call it. How do you handle the redirects for that? Do you put this sort of thing as a note into your content audit, or do you just like hope to remember that you need to redirect it and not lose your Google juice?

**\[10:31\]** **Mike:** Yeah, I mean, that's the kind of thing. I think we have other episodes where we talk about something called an .htaccess file, or there's various tools that are sort of like that. But basically, they're like a sort of a thing running in the background where you can record in that file, oh well, when someone tries to visit the old link, send them over here instead. And also, if it's a search engine, make sure the search engine learns that the new URL is the correct one from now on.
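
Editor's sketch: if the audit records both the current URL and the new URL, permanent-redirect lines for an Apache .htaccess file can be generated from it. This assumes an Apache server with `mod_alias` and its `Redirect` directive; the helper name `redirect_rules` and the sample paths are hypothetical.

```python
def redirect_rules(url_pairs):
    """Build one `Redirect 301 old new` line per page whose URL actually changes.

    url_pairs is an iterable of (current_url, new_url) tuples taken from the audit.
    Rows with no new URL, or an unchanged one, produce no rule.
    """
    rules = []
    for old_url, new_url in url_pairs:
        if new_url and new_url != old_url:
            rules.append(f"Redirect 301 {old_url} {new_url}")
    return rules


# Hypothetical audit rows: only the first page is moving.
pairs = [("/about", "/about-us"), ("/staff", "/staff"), ("/blog", "")]
print("\n".join(redirect_rules(pairs)))
```

The 301 status is what signals to search engines that the move is permanent. How paths are matched depends on the server, so the generated lines are worth testing before launch.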

And you can do all that kind of stuff, of course, or your web developer can do it for you. And yeah, those kind of things, if you plan to adjust the URLs on a new build, you definitely want to take note of whether that's been done. And that brings me to another column that I haven't mentioned yet, which is also very important, which is "Ready?", which is how I write it. So it's basically a column that's a check box.

And for every page that I have recorded, it says ready. Meaning, is it ready to be moved over? Has Karen in marketing updated the content? Do we have the new URL?

And whenever it's all done, you just check that box and then you're ready. Okay, you know, we can just move this over into the new thing and all that's gonna be good.

**\[11:41\]** **Sean:** All right, so I wanna go back a little bit to the redirects. And I know this is a little bit off topic, but why would you want to change the URL of your entry from about to about us or about to our company or whatever the page is? Is there an advantage to changing that?

**\[12:04\]** **Mike:** Well, I can think of a couple of reasons. For one, if you decide to move a page into a different section of your site. Very often, if you have, say, an about section where you have staff and board members or whatever, your URL might say yourwebsite.com/about/ and then the page. And if you decide to move it out of the about section and now it's going to belong under, you know, who we are or something else, then that might change the URL. That would be one reason.

Well, what reasons can you think of there, Sean, you know any ideas for why someone might want to do that?

**\[12:41\]** **Sean:** It was actually an honest question. I haven't really thought about why you would want to change it.

**\[12:48\]** **Mike:** Yeah, moving them is one.

**\[12:49\]** **Sean:** A couple of times I wanted to change something on my own site because I didn't like the word that I'd used previously, but I just never changed it. Yeah. Because I don't want to take a hit on whatever Google juice I'm getting. And I realize that the hit is temporary, but I just decided that I'd rather just stick with my old URL for whatever reason.

**\[13:10\]** **Mike:** Hi, hope you're enjoying this episode. We're always looking for topic suggestions from listeners, so if there's anything you'd like us to discuss in the future, please let us know.

**\[13:20\]** **Sean:** We're also looking for guests. If you know somebody who would make a great guest, or if you think you would be a good guest, please let us know. You can reach out to us at website101podcast.com/contact.

**\[13:36\]** **Mike:** I can think of another reason why, and I've done this before. So you mentioned ExpressionEngine earlier. It's a great CMS, but we typically use newer ones than that these days. But one of the things that was common in ExpressionEngine back in the day was, let's say you have a blog, and it's in a section called blog (they call them channels over there). Your entries would, by default, be something like yourwebsite.com/blog/entry/ and then the URL title. And very often people don't want the word entry in there because it's like, of course it's an entry, you know, you don't need to add that in, because everyone knows shorter URLs are sexier URLs.

So usually you want to get rid of that. Yeah. So I can remember when I moved to Craft, it's very easy to, you know, it doesn't sort of have that by default. So you can take things like that.

Also, I think a lot of people who use WordPress have it set up so that there is no segment in the URL except the page identifier. So it would be yoursite.com/ and then whatever the page title is. It could be a blog entry, it could be a staff member. It's not a crawlback type of architecture, so that's another reason if you want to migrate to a system like that.

**\[14:59\]** **Sean:** Yeah, I've heard that first-level URLs do better for SEO. I don't have any evidence on that, but I can see that as being a good reason. One thing I'd like to bring up is, before we started recording, I did a search and I found an article on semrush.com. That's an SEO website that's like very authoritative.

And the article's called content audit for content marketing strategy. And in this long, long post, which we'll link to in the show notes, there's a screenshot of an Excel sheet. And it's very, very similar to what Mike described. I would strongly recommend that you go and take a look at this.

So their first column is URL, and then they've got sections of columns set up as basic info, category, metadata and metrics. And each of those has sub-columns within it. And I think this kind of illustrates what Mike was talking about earlier.

**\[16:02\]** **Mike:** Yeah, you can basically get as detailed with your, you know, spreadsheet as you need to for your project. If you want to record like, what level is this page in the navigation? Is it like top level, is it a sub menu, is it a sub-sub menu? You could record all that in case any of that's going to change from the current site to the redesigned site. You know, it's very helpful to get all this stuff cataloged in one place.

**\[16:28\]** **Sean:** Excellent, excellent. Let's go on and talk about, you've got an existing site. How do you find all of the pages? You've had your site for a while, and you've got hundreds of pages in your blog, and your services section, or whatever. How do you find all of these pages to catalog?

**\[16:55\]** **Mike:** Yeah, you can, I mean, obviously whatever CMS you're using may let you sort of spit out content in a certain way. So for example, what I said earlier about, if you have a site where there's different member logins, and they see different things depending on which group they belong to, your CMS might be able to output all of the content for a given member group and just like, you can get the information out that way. So that's helpful, but also there are tools available online, many of them free to an extent, that sort of scrub through your page and allow you to sort of, you know, record, well, yeah, generates a list basically, like a site map list of whatever pages are in your site.

**\[17:38\]** **Sean:** So like a site scraper.

**\[17:40\]** **Mike:** Yeah.
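
Editor's sketch: the core step of a site scraper like the one described above is pulling the internal links out of each page so they can be visited and listed. The example below does that for a single HTML string using only the Python standard library; the class name `LinkCollector`, the function `internal_links`, and the `example.com` URLs are hypothetical. A real crawler would also fetch pages, follow the links it finds, and respect robots.txt.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects absolute URLs of <a href> links that stay on the same host."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        absolute = urljoin(self.base_url, href)
        # Keep only links on the same host as the page being scanned.
        if urlparse(absolute).netloc == urlparse(self.base_url).netloc:
            self.links.add(absolute.split("#")[0])  # drop in-page anchors


def internal_links(html, base_url):
    """Return a sorted list of same-site links found in one page of HTML."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return sorted(collector.links)


page = ('<a href="/about">About</a>'
        '<a href="https://other.example/x">Elsewhere</a>'
        '<a href="staff.pdf#page=2">Staff list</a>')
print(internal_links(page, "https://example.com/"))
```

Each URL found this way would become one row in the audit. Pages that no link points to will not appear, which is why orphaned pages need to be added by hand.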

**\[17:41\]** **Sean:** Yeah. Actually one of the more popular ones is Screaming Frog. And we'll include a link to that. And it has paid and free levels to it.

It's very comprehensive and will do a really, really good job of getting all of your URLs and the information that you need from it. I used it once about a year and a half ago. And I would recommend it. There are other tools available.

And then other options might be that your CMS would give you an ability to export your content as a CSV, which you could open in Google Sheets or Microsoft Excel. So the latest version of Craft now gives you the option right from each channel. In the edit listing, you can just export all of your entries. And it will give you CSV or XML, or maybe it gives you the option.

I tested it. It's really, really good.

**\[18:39\]** **Mike:** Yeah, I think I just stumbled upon that feature just yesterday. I didn't know it was there.

**\[18:43\]** **Sean:** But I think that came in 3.4, which was released a few weeks ago. OK, again, like I kind of stumbled across it. It wasn't there previously as far as I remember.

**\[18:53\]** **Mike:** That's a great feature that makes me wish that I had that system running on my old sites that I'm doing this for. But of course, that's why I'm moving them because the system they're running is a little outdated.

**\[19:05\]** **Sean:** Well, even if your CMS doesn't have that option natively, there will probably be a plug-in that you could use that would accomplish that. Yeah. Or if there isn't, you could get your developer to write some code to do the same thing. I've been writing a lot of XML templates with older ExpressionEngine sites because I need to spit out that content and import it into Craft for a new site rebuild.

And the import module that I'm using in Craft uses JSON or XML. I just happen to like XML, I find it easier to write. So that's what I'm using. And you could do the same thing.

You could write something that would export it as a CSV file.

**\[19:54\]** **Mike:** Yeah, I've done that same thing myself before too, where you create a little template. I mean, again, this is a little bit more technical than some of our audience may want, but you could get your dev or me or Sean to help you out with it if you need to. But yeah, you create a little template of XML code and then you put in the tags that your system has to spit out content, the way you would show blog entries on the front end or whatever. And then it sort of builds a site map based on that.
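
Editor's sketch: the hosts build their XML exports with CMS templates. As a rough stand-in for the same idea, the example below turns a list of entries into a small XML document using Python's standard library. The element names (`entries`, `entry`, `title`, `url`) and the function name `entries_to_xml` are assumptions for illustration, not the format any particular CMS or import module expects.

```python
import xml.etree.ElementTree as ET


def entries_to_xml(entries):
    """Build an XML string with one <entry> per page, holding its title and URL."""
    root = ET.Element("entries")
    for entry in entries:
        node = ET.SubElement(root, "entry")
        ET.SubElement(node, "title").text = entry["title"]
        ET.SubElement(node, "url").text = entry["url"]
    # encoding="unicode" returns a str; special characters are escaped for us.
    return ET.tostring(root, encoding="unicode")


# Hypothetical entry; the ampersand shows why escaping matters in XML.
print(entries_to_xml([{"title": "Staff & Board", "url": "/about/staff"}]))
```

Letting a library do the escaping avoids the most common hand-written template problem, where a stray ampersand or angle bracket in a title breaks the import.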

**\[20:20\]** **Sean:** Right, actually, I'm going to open up a site, a WordPress site that I was working in. I can't remember my user name, oh crap, okay, well I don't have my user name in my head and I don't have time to go look it up, but anyways, I'll put it in the show notes later. I was working in a WordPress site that we were importing into a new Craft site and we needed a plugin that would export all of the content into an XML file, as I mentioned earlier, and this plugin worked perfectly. It was really, really good.

I just needed to massage the data a little bit, but it was really simple to use. So if you're on a WordPress site, which I believe a lot of our listeners are, there is a plugin which I just can't remember the name off the top of my head right now, and I will include that link in the show notes.

**\[21:18\]** **Mike:** That's great. That's the one awesome thing about WordPress: people have made plug-ins for every single thing you need done.

**\[21:25\]** **Sean:** Absolutely. And Mike, you've written an article about content audits, haven't you?

**\[21:36\]** **Mike:** Yeah, I did a guest post for a marketing blog called the Nonprofit MarCommunity some time ago. It's quite a few years old. Actually, we were looking at it earlier. How old is this?

**\[21:47\]** **Sean:** October 2, 2014. The content seems timeless to me.

**\[21:54\]** **Mike:** Yeah, it's an old post, but I discuss, you know, basically what we discussed here today. And there's also a template that I have available there that's basically a spreadsheet with some of those fields that I mentioned. Some of those columns are already written out, and you can sort of use that as a starting point and put your content in there. So we'll link to that in the show notes and you should check out that blog post.

**\[22:17\]** **Sean:** But oh nice, I just opened it up. It's a Google doc so you could just add it to your own thing. Yeah, that's really cool.

**\[22:22\]** **Mike:** Nice. Yeah, and there's some references there to some of those tools we mentioned that can help you create site maps and whatnot, although I'm not sure, I can't guarantee all those are still online after six years, we'll see, but anyway.

**\[22:37\]** **Sean:** Anyways, so that's content audits: what they are, why you should do one, and some resources to help you do that. Hey, thank you so much for listening to this episode. My name is Sean Smith, your co-host, and you can find me at my website, caffeinecreations.ca, and on Twitter at caffeinecre8ion, that's spelled C-A-F-F-E-I-N-E-C-R-E-8-I-O-N. And also, I'm on LinkedIn as Caffeine Creations.

**\[23:11\]** **Mike:** And I'm Mike Mella and you can find me online at belikewater.ca. And I'm also on LinkedIn and Twitter. My username is mikemella. That's M-I-K-E-M-E-L-L-A.


Have a question for Sean, Mike, and Amanda? [Send us an email](/contact).

[![Listen on Google Play Music](/assets/images/google_podcasts_badge@2x.png)](https://www.google.com/podcasts?feed=aHR0cHM6Ly93ZWJzaXRlMTAxcG9kY2FzdC5jb20vZmVlZC5yc3M%3D)[![itunes badge](/assets/images/itunes-badge.png)](https://itunes.apple.com/ca/podcast/website-101-podcast/id1449510012)[![itunes badge](/assets/images/spotify-logo.png)](https://open.spotify.com/show/3rmSM1R9t6q1U8DmYWJRSO?si=NrYPMgDaRV6Dd56PjEaPow)

### Season 03

- 1 [ Do You Really Need a Website](https://website101podcast.com/episodes/season-03/episode-1/do-you-really-need-a-website/)
- 2 [ Wordpress](https://website101podcast.com/episodes/season-03/episode-2/wordpress/)
- 3 [ How to Adapt During an Emergency: A Special Website 101 Podcast](https://website101podcast.com/episodes/season-03/episode-3/adapting-during-an-emergency/)
- 4 [ Video Marketing: Boosting Business with Video Content](https://website101podcast.com/episodes/season-03/episode-4/using-video/)
- 5 [ Vacations and Website Maintenance: Navigating the Challenges of Time Off](https://website101podcast.com/episodes/season-03/episode-5/vacations/)
- 6 [ There's a plugin for that](https://website101podcast.com/episodes/season-03/episode-6/theres-a-plugin-for-that/)
- 7 [ Backups: Why You Need Them and How to Implement Them](https://website101podcast.com/episodes/season-03/episode-7/backups/)
- 8 [ Using Custom Email Addresses: A Professional Touch for Your Business](https://website101podcast.com/episodes/season-03/episode-8/email/)
- 9 [ The Importance of Website Maintenance Plans and Retainers](https://website101podcast.com/episodes/season-03/episode-9/maintenance-plans/)
- 10 [ How to Conduct a Content Audit for Your Website](https://website101podcast.com/episodes/season-03/episode-10/content-audits/)
- 11 [ Own Your Content](https://website101podcast.com/episodes/season-03/episode-11/own-your-content/)

### All Seasons

- [Season 01](https://website101podcast.com/season/01/)
- [Season 02](https://website101podcast.com/season/02/)
- [Season 03](https://website101podcast.com/season/03/)
- [Season 04](https://website101podcast.com/season/04/)
- [Season 05](https://website101podcast.com/season/05/)
- [Season 06](https://website101podcast.com/season/06/)
- [Season 07](https://website101podcast.com/season/07/)
- [Season 08](https://website101podcast.com/season/08/)
- [Season 09](https://website101podcast.com/season/09/)

