I had published 287 blog posts over 4 years. Which ones were working? Which needed updating? No clue. An agency quoted $2,500-4,000 for a manual audit. Doing it myself would have taken 40-60 hours. Meanwhile, traffic had plateaued.
Then I built an AI content audit system.
Results in 12 minutes:
- 287 posts analyzed
- 43 quick-win opportunities identified
- 18 posts flagged for deletion
- 67 posts needing updates
- $0 in agency fees
After fixes (60 days):
- 156% traffic increase on updated posts
- 12 posts now ranking #1 (up from 0)
- 87% less content cannibalization
Let me show you how to audit your content with AI.
Why Content Audits Matter
Your old content is probably:
- Outdated
- Underoptimized
- Cannibalizing (several posts competing for the same keyword)
- Underperforming
- Broken
- Inconsistent
Updating one post can deliver as much traffic as 5 new posts, and improving existing content is easier than creating new content.
My AI Content Audit System
What it does:
- Analyzes all published content
- Pulls performance metrics
- Identifies SEO issues
- Finds quick-win opportunities
- Detects content cannibalization
- Recommends actions (update/merge/delete)
- Generates improvement prompts
Time: 12 minutes for 287 posts (vs 40-60 hours manual)
Component 1: Content Inventory
import requests
import pandas as pd
from datetime import datetime, timedelta
from urllib.parse import urlparse

from google.analytics.data_v1beta import BetaAnalyticsDataClient
from google.analytics.data_v1beta.types import RunReportRequest
from googleapiclient.discovery import build

# Placeholders: replace with your own GA4 property ID and Google credentials.
GA4_PROPERTY_ID = "YOUR_GA4_PROPERTY_ID"
CREDENTIALS = None  # a Google credentials object with Search Console access


class ContentAuditor:
    def __init__(self, site_url):
        self.site_url = site_url
        self.posts = []

    def collect_all_posts(self):
        """Pull all published posts and attach analytics and SEO data."""
        # For WordPress
        posts = self.get_wordpress_posts()
        # Enrich with analytics
        for post in posts:
            post['analytics'] = self.get_analytics_data(post['url'])
            post['seo_metrics'] = self.get_seo_metrics(post['url'])
        self.posts = posts
        return posts

    def get_wordpress_posts(self):
        """Pull from WordPress REST API, 100 posts per page."""
        posts = []
        page = 1
        while True:
            response = requests.get(
                f"{self.site_url}/wp-json/wp/v2/posts",
                params={'per_page': 100, 'page': page},
                timeout=30
            )
            # WordPress answers with a non-200 status once `page` runs past the last page
            if response.status_code != 200:
                break
            batch = response.json()
            if not batch:
                break
            for post in batch:
                posts.append({
                    'id': post['id'],
                    'title': post['title']['rendered'],
                    'url': post['link'],
                    'date_published': post['date'],
                    'date_modified': post['modified'],
                    # Rough count: the rendered content still contains HTML tags
                    'word_count': len(post['content']['rendered'].split()),
                    'excerpt': post['excerpt']['rendered']
                })
            page += 1
        return posts

    def get_analytics_data(self, url):
        """Pull from Google Analytics."""
        # Google Analytics Data API
        client = BetaAnalyticsDataClient()
        request = RunReportRequest(
            property=f"properties/{GA4_PROPERTY_ID}",
            dimensions=[{"name": "pagePath"}],
            metrics=[
                {"name": "screenPageViews"},
                {"name": "averageSessionDuration"},
                {"name": "bounceRate"},
                {"name": "engagementRate"}
            ],
            dimension_filter={
                "filter": {
                    "field_name": "pagePath",
                    "string_filter": {
                        "match_type": "EXACT",
                        # pagePath is the URL path only, e.g. "/my-post/"
                        "value": urlparse(url).path
                    }
                }
            },
            date_ranges=[{"start_date": "30daysAgo", "end_date": "today"}]
        )
        response = client.run_report(request)
        if response.rows:
            row = response.rows[0]
            return {
                'pageviews_30d': int(row.metric_values[0].value),
                'avg_session_duration': float(row.metric_values[1].value),
                'bounce_rate': float(row.metric_values[2].value),
                'engagement_rate': float(row.metric_values[3].value)
            }
        return {'pageviews_30d': 0}

    def get_seo_metrics(self, url):
        """Pull SEO data."""
        # Use Google Search Console API
        service = build('searchconsole', 'v1', credentials=CREDENTIALS)
        request = {
            'startDate': (datetime.now() - timedelta(days=30)).strftime('%Y-%m-%d'),
            'endDate': datetime.now().strftime('%Y-%m-%d'),
            'dimensions': ['query'],
            'dimensionFilterGroups': [{
                'filters': [{
                    'dimension': 'page',
                    'expression': url
                }]
            }]
        }
        # siteUrl must match the property exactly as it is registered in Search Console
        response = service.searchanalytics().query(
            siteUrl=self.site_url,
            body=request
        ).execute()
        if 'rows' in response:
            rows = response['rows']
            # Rows come back sorted by clicks, so the first row is the top keyword
            top_keyword = rows[0]
            total_clicks = sum(row['clicks'] for row in rows)
            total_impressions = sum(row['impressions'] for row in rows)
            # Simple (unweighted) mean of the per-query positions
            avg_position = sum(row['position'] for row in rows) / len(rows)
            return {
                'top_keyword': top_keyword['keys'][0],
                'clicks_30d': total_clicks,
                'impressions_30d': total_impressions,
                'avg_position': round(avg_position, 1),
                'ctr': round((total_clicks / total_impressions * 100), 2) if total_impressions > 0 else 0
            }
        return {}
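One caveat about the position number: get_seo_metrics averages position across queries without weighting, so a query with a handful of impressions counts as much as your main keyword. Here is a minimal sketch of an impression-weighted alternative. The function name and the sample rows are my own illustration, not part of the system above; the rows only mimic the shape Search Console returns.

```python
def weighted_avg_position(rows):
    """Average position weighted by impressions for Search Console style rows."""
    total_impressions = sum(row["impressions"] for row in rows)
    if total_impressions == 0:
        return None
    weighted_sum = sum(row["position"] * row["impressions"] for row in rows)
    return round(weighted_sum / total_impressions, 1)


# Hypothetical rows: one strong keyword, one long-tail query with few impressions
sample_rows = [
    {"keys": ["content audit"], "clicks": 40, "impressions": 900, "position": 4.0},
    {"keys": ["audit blog posts"], "clicks": 1, "impressions": 100, "position": 34.0},
]

simple_mean = sum(r["position"] for r in sample_rows) / len(sample_rows)
print(simple_mean)                         # 19.0
print(weighted_avg_position(sample_rows))  # 7.0
```

With the simple mean this post looks like a page-2 result. Weighted by impressions it sits at position 7, which is exactly the kind of post the priority scoring later treats as a big opportunity.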
Component 2: AI Performance Analysis
Analyze what's working and what's not.
import json
import openai


def analyze_content_performance(posts_data):
    """AI analyzes all posts and categorizes."""
    # Convert to dataframe for easier analysis
    df = pd.DataFrame(posts_data)
    prompt = f"""
    Analyze this content inventory:
    {df.to_json(orient='records', indent=2)}
    Categorize each post into one of these buckets:
    1. **STARS** (Keep as-is):
    - High traffic (top 20%)
    - Good engagement
    - Up-to-date
    - Well-optimized
    2. **QUICK WINS** (Easy improvements):
    - Decent traffic but underperforming for keyword
    - Missing optimization opportunities
    - Could rank higher with minor updates
    - Good topic, needs refresh
    3. **NEEDS UPDATE** (Major refresh required):
    - Outdated information
    - Low traffic despite good topic
    - Poor engagement
    - Ranking declined
    4. **MERGE CANDIDATES** (Content cannibalization):
    - Multiple posts targeting same keyword
    - Similar topics that should be one comprehensive post
    - Competing with each other in search
    5. **DELETE** (Not worth keeping):
    - Zero traffic for 6+ months
    - Irrelevant topic now
    - Thin content (<500 words)
    - Broken or unusable
    For each category, provide:
    - Post IDs
    - Why it's in this category
    - Recommended action
    - Estimated impact (High/Medium/Low)
    Return as structured JSON.
    """
    # JSON mode (response_format) needs a model that supports it;
    # the original "gpt-4" model rejects this parameter.
    analysis = openai.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"}
    ).choices[0].message.content
    return json.loads(analysis)
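A full inventory dumped into one prompt can get large, and depending on the model it may not fit in the context window. One way around that is to send the posts in batches and merge the answers. The sketch below assumes each batch comes back as a JSON object mapping a bucket name to a list of entries; that shape, the batch size of 50, and both helper names are my assumptions, not something the API guarantees.

```python
def chunk(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def merge_bucket_results(results):
    """Combine per-batch results shaped like {"STARS": [...], "DELETE": [...]}."""
    merged = {}
    for result in results:
        for category, entries in result.items():
            merged.setdefault(category, []).extend(entries)
    return merged


# 287 posts in batches of 50 gives 5 full batches plus one of 37
batches = list(chunk(list(range(287)), 50))
print(len(batches), len(batches[-1]))  # 6 37

# Hypothetical stand-ins for the model's per-batch JSON output
batch_results = [
    {"STARS": [{"id": 1}], "QUICK WINS": [{"id": 2}]},
    {"STARS": [{"id": 101}], "DELETE": [{"id": 102}]},
]
merged = merge_bucket_results(batch_results)
print([entry["id"] for entry in merged["STARS"]])  # [1, 101]
```

The trade-off: the model only sees one batch at a time, so it cannot spot merge candidates that sit in different batches. That is one reason cannibalization gets its own pass in Component 4.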
Component 3: Quick-Win Identification
Find easy optimizations with big impact.
import openai


def identify_quick_wins(post_data, seo_data=None):
    """Find specific opportunities for each post.

    `seo_data` is currently unused: the SEO metrics are read from
    post_data['seo_metrics'].
    """
    prompt = f"""
    Analyze this post for quick-win SEO opportunities:
    POST DATA:
    Title: {post_data['title']}
    URL: {post_data['url']}
    Word count: {post_data['word_count']}
    Published: {post_data['date_published']}
    PERFORMANCE:
    Pageviews (30d): {post_data['analytics']['pageviews_30d']}
    Top keyword: {post_data['seo_metrics'].get('top_keyword', 'Unknown')}
    Average position: {post_data['seo_metrics'].get('avg_position', 'N/A')}
    CTR: {post_data['seo_metrics'].get('ctr', 0)}%
    Identify quick wins (changes that take <30 min but drive results):
    1. **TITLE OPTIMIZATION**:
    - Is keyword in title?
    - Is it compelling for CTR?
    - Suggested improvement?
    2. **META DESCRIPTION**:
    - Does it exist and include keyword?
    - Is it compelling?
    - Suggested rewrite?
    3. **HEADER STRUCTURE**:
    - Logical H2/H3 hierarchy?
    - Keywords in headers?
    - Missing sections?
    4. **INTERNAL LINKING**:
    - Opportunities to link from this post
    - Opportunities to link TO this post
    5. **WORD COUNT**:
    - Too short for topic depth?
    - Suggested target word count?
    6. **CONTENT GAPS**:
    - What's missing that competitors cover?
    - Quick sections to add?
    7. **FRESHNESS**:
    - Needs date update?
    - Outdated stats/examples?
    Return specific, actionable recommendations with estimated impact.
    """
    recommendations = openai.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}]
    ).choices[0].message.content
    return recommendations
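A couple of the questions in that prompt don't need a model at all. "Is keyword in title?" and "too short?" can be answered with plain string checks before spending any API calls. A minimal sketch, assuming a post dict with the same 'title' and 'word_count' keys as the inventory; the function name is hypothetical and the 500-word threshold is borrowed from the thin-content rule in Component 2.

```python
def quick_checks(post, keyword, min_words=500):
    """Cheap local checks that mirror two questions from the quick-win prompt."""
    return {
        # Case-insensitive exact-phrase match of the keyword in the title
        "keyword_in_title": keyword.lower() in post["title"].lower(),
        # Same thin-content threshold the categorization prompt uses
        "thin_content": post["word_count"] < min_words,
    }


# Hypothetical post: the words "audit" and "content" appear, but not as the phrase
sample_post = {"title": "How to Audit Your Blog Content", "word_count": 420}
print(quick_checks(sample_post, "content audit"))
# {'keyword_in_title': False, 'thin_content': True}
```

The phrase match is deliberately strict. A title can contain every word of the keyword and still miss the exact phrase, which is worth knowing before you ask the model for rewrites.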
Component 4: Content Cannibalization Detection
Find posts competing with each other.
import json
import openai


def detect_cannibalization(posts):
    """Identify posts targeting same keywords."""
    # Group posts by keywords
    keyword_groups = {}
    for post in posts:
        keyword = post['seo_metrics'].get('top_keyword', '').lower()
        if keyword:
            keyword_groups.setdefault(keyword, []).append(post)
    # Find groups with 2+ posts (potential cannibalization)
    cannibalization_issues = []
    for keyword, posts_list in keyword_groups.items():
        if len(posts_list) < 2:
            continue
        # Build the post summary first so the prompt stays readable
        posts_summary = json.dumps([{
            'title': p['title'],
            'url': p['url'],
            'pageviews': p['analytics']['pageviews_30d'],
            'position': p['seo_metrics'].get('avg_position', 'N/A')
        } for p in posts_list], indent=2)
        prompt = f"""
        These posts may be cannibalizing each other:
        Keyword: {keyword}
        Posts:
        {posts_summary}
        Determine:
        1. Is this true cannibalization? (targeting same intent)
        2. Which post is strongest?
        3. Recommendation:
        - Merge into one comprehensive post?
        - Differentiate by adjusting focus/keywords?
        - Redirect weaker to stronger?
        - Keep separate (different intent)?
        Provide specific action plan.
        """
        recommendation = openai.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}]
        ).choices[0].message.content
        cannibalization_issues.append({
            'keyword': keyword,
            'posts': posts_list,
            'recommendation': recommendation
        })
    return cannibalization_issues
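To see what the grouping step catches before any AI call is made, here it is on its own with a few made-up posts. The URLs and keywords are hypothetical, and the helper is a stripped-down sketch of the grouping logic, not a replacement for the function above.

```python
from collections import defaultdict


def group_by_top_keyword(posts):
    """Return {keyword: [urls]} for keywords shared by two or more posts."""
    groups = defaultdict(list)
    for post in posts:
        keyword = post["seo_metrics"].get("top_keyword", "").strip().lower()
        if keyword:
            groups[keyword].append(post["url"])
    return {kw: urls for kw, urls in groups.items() if len(urls) >= 2}


# Hypothetical inventory: two posts share a top keyword, one has no SEO data
sample_posts = [
    {"url": "https://example.com/audit-guide", "seo_metrics": {"top_keyword": "Content Audit"}},
    {"url": "https://example.com/audit-checklist", "seo_metrics": {"top_keyword": "content audit"}},
    {"url": "https://example.com/seo-basics", "seo_metrics": {"top_keyword": "seo basics"}},
    {"url": "https://example.com/new-post", "seo_metrics": {}},
]
print(group_by_top_keyword(sample_posts))
# {'content audit': ['https://example.com/audit-guide', 'https://example.com/audit-checklist']}
```

Note the limit of this approach: it only matches identical top keywords. "content audit" and "content audits" would land in different groups, so near-duplicates still need a manual look or a fuzzier comparison.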
Component 5: Update Prompts Generation
AI writes prompts to improve each post.
def generate_update_prompts(post, recommendations):
    """Create specific prompts to update the post.

    Expects `recommendations` as a dict keyed by area ('title',
    'content_gaps', 'meta_description', 'internal_links'), not the
    free-text string returned by identify_quick_wins.
    """
    prompts = {}
    # Title optimization
    if 'title' in recommendations:
        prompts['new_title'] = f"""
        Current title: {post['title']}
        Rewrite to:
        - Include primary keyword: {post['seo_metrics'].get('top_keyword')}
        - Increase CTR (more compelling)
        - Stay under 60 characters
        - Match search intent
        Provide 5 options.
        """
    # Content expansion
    if 'content_gaps' in recommendations:
        prompts['add_sections'] = f"""
        Current post: {post['title']}
        Current word count: {post['word_count']}
        Add these missing sections:
        {recommendations['content_gaps']}
        For each section:
        - Write 200-300 words
        - Include examples
        - Match existing tone
        - Naturally incorporate keywords
        Write the complete new sections.
        """
    # Meta description
    if 'meta_description' in recommendations:
        prompts['meta_description'] = f"""
        Post: {post['title']}
        Keyword: {post['seo_metrics'].get('top_keyword')}
        Write a meta description (150-155 characters) that:
        - Includes keyword naturally
        - Creates curiosity or promises value
        - Increases click-through rate
        Provide 3 options.
        """
    # Internal linking
    if 'internal_links' in recommendations:
        prompts['internal_links'] = f"""
        Post: {post['title']}
        Suggest 5-7 internal links to add:
        - Where to add them (context in post)
        - Which posts to link to
        - Anchor text suggestions
        Format as actionable list.
        """
    return prompts
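The prompts ask for a fixed number of options within a character limit, but the model's reply may not respect that limit, so it is worth checking before anything goes live. A minimal sketch, assuming the reply arrives as a numbered list ("1. ..." or "1) ..."). That reply format, the helper names, and the sample reply are my assumptions.

```python
import re


def parse_numbered_options(reply):
    """Extract the text of lines that look like '1. option' or '1) option'."""
    options = []
    for line in reply.splitlines():
        match = re.match(r"\s*\d+[.)]\s+(.+)", line)
        if match:
            options.append(match.group(1).strip())
    return options


def within_length(options, max_chars):
    """Keep only the options that fit the character limit."""
    return [option for option in options if len(option) <= max_chars]


# Hypothetical model reply to the title prompt (limit: 60 characters)
reply = """Here are some options:
1. Content Audit Guide: Find Quick Wins Fast
2. The Complete, Exhaustive, Step-by-Step Content Audit Handbook for Busy Bloggers
3) How to Audit Blog Content in a Weekend
"""
titles = parse_numbered_options(reply)
print(len(titles))                # 3
print(within_length(titles, 60))
# ['Content Audit Guide: Find Quick Wins Fast', 'How to Audit Blog Content in a Weekend']
```

The same filter works for meta descriptions if you pass the upper bound from that prompt (155) and add your own lower-bound check.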
Component 6: Automated Execution
Apply fixes automatically or semi-automatically.
def auto_update_post(post_id, updates):
    """Update post via CMS API."""
    # For WordPress. SITE_URL and WP_API_TOKEN are placeholders you must define.
    wp_api_url = f"{SITE_URL}/wp-json/wp/v2/posts/{post_id}"
    update_data = {}
    # Update title if provided
    if 'new_title' in updates:
        update_data['title'] = updates['new_title']
    # Update content if provided
    if 'new_content' in updates:
        update_data['content'] = updates['new_content']
    # Update excerpt/meta description
    if 'meta_description' in updates:
        update_data['excerpt'] = updates['meta_description']
        # Assumption: Yoast keeps its description in the `_yoast_wpseo_metadesc`
        # post meta key, and the site has registered that key for the REST API.
        # Without that registration this field has no effect.
        update_data['meta'] = {
            '_yoast_wpseo_metadesc': updates['meta_description']
        }
    # No need to set `modified`: it is read-only in the REST API, and
    # WordPress refreshes it automatically when the post is saved.
    # Send update
    response = requests.post(
        wp_api_url,
        # Bearer tokens need a JWT/OAuth plugin; core WordPress uses application passwords
        headers={'Authorization': f'Bearer {WP_API_TOKEN}'},
        json=update_data,
        timeout=30
    )
    response.raise_for_status()
    return response.json()
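For the "semi-automatically" part, I'd suggest a dry run: build the payload, print what would change, and only send it once a human has looked. This is a minimal sketch with hypothetical helper names; it makes no network calls and only covers the three core fields (title, content, excerpt).

```python
def build_update_payload(updates):
    """Map audit output keys to WordPress post fields without sending anything."""
    field_map = {
        "new_title": "title",
        "new_content": "content",
        "meta_description": "excerpt",
    }
    return {wp_field: updates[key] for key, wp_field in field_map.items() if key in updates}


def preview_update(post_id, updates, max_chars=60):
    """Return a human-readable summary of the pending changes."""
    lines = [f"Post {post_id}:"]
    for field, value in build_update_payload(updates).items():
        lines.append(f"  {field} -> {value[:max_chars]}")
    return "\n".join(lines)


# Hypothetical update for post 42: a new title and meta description, content untouched
sample_updates = {
    "new_title": "Content Audit Guide: Find Quick Wins Fast",
    "meta_description": "Audit your blog in a weekend and fix the posts that matter most.",
}
print(build_update_payload(sample_updates))
print(preview_update(42, sample_updates))
```

Keys that are missing from the update are simply left out of the payload, so a post's content is never overwritten by accident.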
Component 7: Priority Scoring
Know what to fix first.
def calculate_priority_score(post, recommendations):
    """Score posts by update priority."""
    score = 0
    # High traffic = higher priority to optimize
    if post['analytics']['pageviews_30d'] > 100:
        score += 30
    # Currently ranking 4-10 = easy to push to top 3
    avg_position = post['seo_metrics'].get('avg_position', 100)
    if 4 <= avg_position <= 10:
        score += 40  # BIGGEST OPPORTUNITY
    elif 10 < avg_position <= 20:
        # avg_position is a decimal, so values like 10.5 must not fall through
        score += 20
    # Low CTR = title/meta needs fixing (easy win)
    ctr = post['seo_metrics'].get('ctr', 0)
    if ctr < 2 and post['seo_metrics'].get('impressions_30d', 0) > 100:
        score += 25
    # Old content (freshness update helps)
    days_old = (datetime.now() - datetime.fromisoformat(post['date_modified'])).days
    if days_old > 365:
        score += 15
    # Number of quick wins identified
    num_quick_wins = len(recommendations.get('quick_wins', []))
    score += num_quick_wins * 5
    return score


# Sort all posts by priority.
# Assumes `posts` is the enriched inventory and `recommendations` is a dict keyed by post ID.
posts_prioritized = sorted(
    posts,
    key=lambda p: calculate_priority_score(p, recommendations[p['id']]),
    reverse=True
)
# Work on top 20 first (80/20 rule)
top_priorities = posts_prioritized[:20]
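To make the scoring concrete, here is the same arithmetic worked through for one made-up post. This sketch flattens the post into a single dict and takes `now` as a parameter so the result is reproducible; both are my simplifications, and the numbers are hypothetical.

```python
from datetime import datetime


def score_breakdown(post, num_quick_wins, now):
    """Return each scoring component separately so the total is easy to audit."""
    parts = {}
    parts["traffic"] = 30 if post["pageviews_30d"] > 100 else 0
    if 4 <= post["avg_position"] <= 10:
        parts["position"] = 40
    elif 10 < post["avg_position"] <= 20:
        parts["position"] = 20
    else:
        parts["position"] = 0
    low_ctr = post["ctr"] < 2 and post["impressions_30d"] > 100
    parts["low_ctr"] = 25 if low_ctr else 0
    days_old = (now - datetime.fromisoformat(post["date_modified"])).days
    parts["stale"] = 15 if days_old > 365 else 0
    parts["quick_wins"] = num_quick_wins * 5
    return parts


# Hypothetical post: decent traffic, position 6.2, weak CTR, not touched in over a year
sample_post = {
    "pageviews_30d": 250,
    "avg_position": 6.2,
    "ctr": 1.4,
    "impressions_30d": 900,
    "date_modified": "2023-01-15T09:00:00",
}
parts = score_breakdown(sample_post, num_quick_wins=3, now=datetime(2024, 6, 1))
print(parts)
# {'traffic': 30, 'position': 40, 'low_ctr': 25, 'stale': 15, 'quick_wins': 15}
print(sum(parts.values()))  # 125
```

A post like this hits every rule: 30 + 40 + 25 + 15 + 15 = 125. Position is the single largest component, which is why posts ranking 4-10 float to the top of the list.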
My Audit Results
287 posts analyzed in 12 minutes:
- Stars (52 posts): Keep as-is, already performing well
- Quick Wins (43 posts): High-impact, low-effort fixes
- Needs Update (67 posts): Comprehensive refresh required
- Merge Candidates (18 posts): Consolidate 9 pairs
- Delete (18 posts): Remove or redirect
- New Topic Ideas (89): Gaps AI identified in my content
I focused on Quick Wins first. Time invested: 12 hours over 2 weeks (43 posts × 15-20 min each).
Actions taken:
- Rewrote 43 titles (better keywords + CTR)
- Added 2-3 sections to each (addressing content gaps)
- Updated meta descriptions
- Added internal links
- Refreshed stats/examples
- Updated publication dates
Results after 60 days:
- Quick-win posts: 156% traffic increase (avg)
- 12 posts moved to #1 rankings (from positions 4-8)
- 27 posts moved into top 3 (from positions 5-15)
- Overall site traffic: +47%
- Zero new content published (all from updates)
Dashboard Visualization
def create_audit_dashboard(audit_results):
    """Visualize audit findings."""
    import plotly.graph_objects as go
    import plotly.express as px
    # Category distribution pie chart
    categories = audit_results['category_breakdown']
    fig1 = go.Figure(data=[go.Pie(
        labels=list(categories.keys()),
        values=list(categories.values()),
        hole=.3
    )])
    fig1.update_layout(title="Content Audit Category Breakdown")
    # Priority scatter plot (Effort vs Impact)
    opportunities = audit_results['opportunities']
    fig2 = px.scatter(
        opportunities,
        x='effort_score',
        y='impact_score',
        size='pageviews_30d',
        color='category',
        hover_data=['title'],
        # Low effort (left) and high impact (top) is the best corner
        title="Quick Win Opportunities (Top Left = Best)"
    )
    # Traffic trend by category
    fig3 = px.line(
        audit_results['traffic_by_category'],
        x='date',
        y='pageviews',
        color='category',
        title="Traffic Trend by Content Category"
    )
    # Save dashboard. Embed each chart as a fragment (full_html=False) and load
    # plotly.js once, instead of nesting three complete HTML documents.
    dashboard_html = f"""
    <html>
    <head><title>Content Audit Dashboard</title></head>
    <body>
    <h1>Content Audit Results</h1>
    {fig1.to_html(full_html=False, include_plotlyjs='cdn')}
    {fig2.to_html(full_html=False, include_plotlyjs=False)}
    {fig3.to_html(full_html=False, include_plotlyjs=False)}
    </body>
    </html>
    """
    with open('audit_dashboard.html', 'w') as f:
        f.write(dashboard_html)
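The dashboard reads audit_results['category_breakdown'] as a mapping from category name to post count. One way to build it, assuming you have a list of posts that each carry a 'category' label (that list shape and the helper name are my assumptions):

```python
from collections import Counter


def category_breakdown(categorized_posts):
    """Count posts per category, in the order categories first appear."""
    return dict(Counter(post["category"] for post in categorized_posts))


# Hypothetical labeled posts
labeled = [
    {"id": 1, "category": "Stars"},
    {"id": 2, "category": "Quick Wins"},
    {"id": 3, "category": "Stars"},
    {"id": 4, "category": "Delete"},
]
print(category_breakdown(labeled))
# {'Stars': 2, 'Quick Wins': 1, 'Delete': 1}
```

The resulting dict can go straight into audit_results under the 'category_breakdown' key.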
Tools & Costs
Data collection: Google Analytics and Google Search Console (free)
AI analysis: ChatGPT API ($15-25/month)
Optional tools: Ahrefs/SEMrush ($99-119/month), Screaming Frog (free to $259/year)
Running cost: about $20/month (ChatGPT API only)
ROI: the 47% traffic increase brought in $2,800+ more monthly revenue (ads + affiliates)
Getting Started This Weekend
Saturday (4 hours):
- Pull all post data
- Run AI analysis on all posts
- Review top 20 quick wins
- Fix first 5 posts
Sunday (4 hours):
- Fix 10 more posts (15-20 min each)
Week 2: Fix remaining quick wins
Month 2: Tackle "Needs Update" category
Month 3: Merge and delete as needed
The Bottom Line
Old content is your biggest untapped asset.
Manual audits cost $2,500-4,000 and take weeks.
AI content audits can:
- Analyze hundreds of posts in minutes
- Identify specific optimization opportunities
- Detect cannibalization automatically
- Generate improvement prompts
- Prioritize by impact
My results:
- 287 posts audited in 12 minutes
- 43 quick wins identified
- 156% traffic increase on updated posts
- 47% overall site growth
Audit your content this weekend.
Fix your quick wins first.
Watch traffic grow from content you already have.
