Technical SEO Checklist for GEO and AEO Visibility
Most content teams focus on words. But if your technical foundation is broken (missing JSON-LD, poor schema markup, no speakable declarations), Google and AI engines like ChatGPT and Claude are far less likely to surface your content. Here's the checklist that fixes that.
Janet Jensen, CMO | 6 min read
Why Technical SEO Is Now a GEO and AEO Problem
You've probably heard the acronyms. SEO you know. GEO — Generative Engine Optimisation — is how you get your content cited by AI-powered search engines like ChatGPT, Perplexity and Claude. AEO — Answer Engine Optimisation — focuses on getting featured in direct-answer formats, from Google's AI Overviews to voice assistants.
Here's the uncomfortable truth: none of that works if your technical foundation is weak. AI engines don't just read your words. They parse your structured data, evaluate your schema markup and use your JSON-LD to understand what your page is actually about.
A 2025 study by Frase.io found that pages with comprehensive schema markup received 40% more AI citations than pages without it. Google's own documentation confirms that structured data determines eligibility for rich results and helps Google understand what a page is about, although it lists no special markup as a requirement for appearing in AI Overviews.
The Technical SEO Checklist for AI Visibility
Use this checklist to audit your site. Every item directly affects whether Google and AI engines can find, understand and cite your content.
1. JSON-LD Structured Data
JSON-LD (JavaScript Object Notation for Linked Data) is the format Google recommends for structured data. It sits in a <script type="application/ld+json"> element, usually in your page's <head>, and tells search engines exactly what your content represents.
- Implement the correct @type — use Article for blog posts, FAQPage for Q&A content, HowTo for tutorials, Organization for your company profile. Match the schema type to the content's actual purpose.
- Include the key properties — headline, author, datePublished, dateModified, publisher (with logo), description and image. Google treats most of these as recommended rather than strictly required for Article, but each one you omit gives search and AI engines less to work with.
- Validate with Google's Rich Results Test — run every page through Google's testing tool before publishing. Fix every error first, then work through the warnings.
- Use nested entities — link your Article's author to a Person schema, your publisher to an Organization schema. AI engines use these connections to build knowledge graphs.
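The points above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names, the example.com URLs and the sample values are all assumptions made for the example, and a real site would usually generate this markup from its CMS templates.

```python
import json


def build_article_jsonld(headline, description, image_url, author_name,
                         author_url, publisher_name, logo_url,
                         date_published, date_modified):
    """Return an Article JSON-LD dict with nested Person and Organization."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "description": description,
        "image": image_url,
        "datePublished": date_published,
        "dateModified": date_modified,
        # Nested entities: the author is a Person, the publisher an Organization
        "author": {"@type": "Person", "name": author_name, "url": author_url},
        "publisher": {
            "@type": "Organization",
            "name": publisher_name,
            "logo": {"@type": "ImageObject", "url": logo_url},
        },
    }


def to_script_tag(data):
    """Serialise a JSON-LD dict into a script element for the page."""
    # Escape "</" so a string value cannot close the script element early
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical values, for illustration only
article = build_article_jsonld(
    headline="Technical SEO Checklist for GEO and AEO Visibility",
    description="A checklist for structured data, crawlability and trust.",
    image_url="https://example.com/images/checklist.png",
    author_name="Janet Jensen",
    author_url="https://example.com/authors/janet-jensen",
    publisher_name="Example Co",
    logo_url="https://example.com/logo.png",
    date_published="2025-01-10T09:00:00+00:00",
    date_modified="2025-03-02T14:30:00+00:00",
)
tag = to_script_tag(article)
```

Paste the resulting tag into a test page and run it through Google's Rich Results Test before rolling the template out.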
2. Schema Markup Beyond the Basics
Most sites stop at Article schema. That's a mistake. AI engines reward depth.
- SpeakableSpecification — declare which parts of your page are suitable for voice readouts and AI assistant summaries. Use CSS selectors targeting your H1 and lead paragraph.
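- FAQPage schema — where a page contains genuine, visible question-and-answer sections, mark them up with FAQPage, even if the page isn't a traditional FAQ. This can improve your chances of appearing in AI-generated answers, although Google now shows FAQ rich results only for a limited set of authoritative government and health sites.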
- About topics — use the about property in your JSON-LD to list the specific topics your page covers. This helps LLMs disambiguate your content from competing pages.
- BreadcrumbList — implement breadcrumb schema to give AI engines context about where your page sits in your site hierarchy.
- Organization + WebSite schema — these should exist on every page (typically injected globally). Include your logo, social profiles, contact information and sameAs links.
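As a rough sketch of how these richer types fit together, the Python below builds speakable, about, BreadcrumbList and FAQPage markup as plain dicts. The helper names, the ".lead" CSS class, the example.com URLs and the sample question are assumptions made for illustration; the property names (speakable, cssSelector, about, itemListElement, mainEntity, acceptedAnswer) come from schema.org.

```python
import json


def speakable(css_selectors):
    """Mark the page sections suited to voice readouts and summaries."""
    return {"@type": "SpeakableSpecification",
            "cssSelector": list(css_selectors)}


def breadcrumb_list(trail):
    """trail: ordered (name, url) pairs from the site root to this page."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": i, "name": name, "item": url}
            for i, (name, url) in enumerate(trail, start=1)
        ],
    }


def faq_page(pairs):
    """pairs: (question, answer) tuples that are visible on the page."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {"@type": "Question", "name": question,
             "acceptedAnswer": {"@type": "Answer", "text": answer}}
            for question, answer in pairs
        ],
    }


# Hypothetical page: ".lead" is an assumed class on the lead paragraph
web_page = {
    "@context": "https://schema.org",
    "@type": "WebPage",
    "name": "Technical SEO Checklist for GEO and AEO Visibility",
    "speakable": speakable(["h1", ".lead"]),
    "about": [
        {"@type": "Thing", "name": "Structured data"},
        {"@type": "Thing", "name": "Generative Engine Optimisation"},
    ],
}
crumbs = breadcrumb_list([
    ("Home", "https://example.com/"),
    ("Blog", "https://example.com/blog/"),
    ("Technical SEO Checklist", "https://example.com/blog/seo-checklist"),
])
faq = faq_page([("What is JSON-LD?",
                 "A JSON-based format for structured data.")])

print(json.dumps(crumbs, indent=2))
```

Each dict can be serialised into its own script element of type application/ld+json, or several can be combined on one page.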
3. Crawlability and Indexing for AI Engines
Google isn't your only crawler anymore. LLM-powered engines send their own bots.
- robots.txt — check that you're not blocking AI crawlers. GPTBot (OpenAI), ClaudeBot (Anthropic) and PerplexityBot are all documented by their operators as respecting robots.txt. If you've blocked them, your content is far less likely to appear in AI search results.
- llms.txt — this emerging standard (similar to robots.txt but for LLMs) lets you provide a structured index of your most important content. Implement it if your CMS supports it.
- XML sitemap — keep it clean, current and submitted to Google Search Console. Include lastmod dates that actually reflect real content changes.
- Canonical tags — every page needs a self-referencing canonical. Duplicate content confuses both traditional and AI search engines.
- Page speed — Core Web Vitals still matter. Slow pages get crawled less frequently, which means AI engines see stale content.
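The robots.txt check is easy to automate. The sketch below uses Python's standard urllib.robotparser to report which of the three crawlers named above would be refused a given URL. The function name, the sample rules and the example.com URL are assumptions for illustration; in practice you would feed in the contents of your live robots.txt.

```python
from urllib.robotparser import RobotFileParser

AI_BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def blocked_ai_bots(robots_txt, url):
    """Return the AI crawlers that the given robots.txt text blocks for url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if not parser.can_fetch(bot, url)]


# Hypothetical robots.txt that blocks GPTBot but allows everyone else
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

blocked = blocked_ai_bots(SAMPLE_ROBOTS, "https://example.com/blog/post")
print(blocked)  # ['GPTBot']
```

Running a check like this after each deployment catches an accidental block before it affects crawling.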
4. Content Structure That AI Engines Can Parse
Structured data is only half the equation. Your HTML content structure matters just as much.
- Semantic HTML — use proper heading hierarchy (H1 → H2 → H3). AI engines use heading structure to understand content relationships.
- Answer-first formatting — place your key answer or definition in the first paragraph, immediately after the H1. AI engines frequently extract this position for citations.
- Descriptive anchor text — internal links with descriptive text help AI engines understand topic relationships across your site.
- Alt text on images — AI engines increasingly process multimodal content. Descriptive, keyword-relevant alt text improves both accessibility and AI visibility.
- Clean URL structure — use short, descriptive URL slugs that include your primary keyword. Both Google and AI engines use URL structure as a relevance signal.
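Two of these checks, heading hierarchy and image alt text, lend themselves to a quick automated audit. The following is a minimal sketch using Python's standard html.parser; the class name and the sample markup are assumptions for illustration, and it covers only those two bullets, not the full list.

```python
from html.parser import HTMLParser


class StructureAudit(HTMLParser):
    """Collect skipped heading levels and images without alt text."""

    def __init__(self):
        super().__init__()
        self.last_level = 0
        self.skipped_headings = []    # (previous_level, level) pairs
        self.images_missing_alt = []  # src values of the offending images

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            level = int(tag[1])
            # Jumping more than one level deeper (e.g. H1 to H3) is a skip
            if level > self.last_level + 1:
                self.skipped_headings.append((self.last_level, level))
            self.last_level = level
        elif tag == "img":
            attr_map = dict(attrs)
            # Note: purely decorative images may legitimately use alt="",
            # so treat these results as candidates for review
            if not (attr_map.get("alt") or "").strip():
                self.images_missing_alt.append(attr_map.get("src", ""))


# Hypothetical page fragment
SAMPLE_HTML = """
<h1>Technical SEO Checklist</h1>
<p>JSON-LD is the structured data format Google recommends.</p>
<h3>Nested entities</h3>
<img src="/a.png">
<img src="/b.png" alt="Chart of crawl frequency by page speed">
"""

audit = StructureAudit()
audit.feed(SAMPLE_HTML)
print(audit.skipped_headings)    # [(1, 3)]
print(audit.images_missing_alt)  # ['/a.png']
```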
5. E-E-A-T Signals for AI Trust
E-E-A-T (Experience, Expertise, Authoritativeness and Trustworthiness) is Google's framework for judging content quality, and the same signals shape whether AI engines consider your content reliable enough to cite.
- Author pages with Person schema — every article should link to an author page that includes the author's credentials, experience and social profiles, marked up with Person JSON-LD.
- HTTPS everywhere — this is table stakes, but still worth checking. Mixed content or expired certificates erode trust signals.
- Structured contact information — include your organisation's address, phone number and contact page in your Organization schema. AI engines use this to verify legitimacy.
- Citation sources — link to authoritative external sources. AI engines evaluate your outbound link quality as a trust signal.
- Date freshness — keep dateModified accurate in your JSON-LD. AI engines prioritise recent content, and incorrect dates can suppress your visibility.
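Date accuracy is one item here that a script can verify. The sketch below flags JSON-LD whose dateModified is missing, earlier than datePublished, or in the future. The function names and sample dates are assumptions for illustration, and treating timestamps without a timezone as UTC is a simplification you may want to change.

```python
from datetime import datetime, timezone


def _parse(value):
    """Parse an ISO 8601 string; assume UTC when no offset is given."""
    parsed = datetime.fromisoformat(value)
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def date_problems(jsonld, now=None):
    """Return a list of problems with datePublished and dateModified."""
    now = now or datetime.now(timezone.utc)
    try:
        published = _parse(jsonld["datePublished"])
        modified = _parse(jsonld["dateModified"])
    except (KeyError, TypeError, ValueError) as exc:
        return [f"missing or unparseable date: {exc}"]
    problems = []
    if modified < published:
        problems.append("dateModified is earlier than datePublished")
    if modified > now:
        problems.append("dateModified is in the future")
    return problems


# Hypothetical values; a fixed "now" keeps the example repeatable
fixed_now = datetime(2025, 6, 1, tzinfo=timezone.utc)
stale = {"datePublished": "2025-01-10T09:00:00+00:00",
         "dateModified": "2024-12-01"}
print(date_problems(stale, now=fixed_now))
# ['dateModified is earlier than datePublished']
```

Timestamps ending in "Z" are only accepted by datetime.fromisoformat from Python 3.11 onwards, so the example uses explicit "+00:00" offsets.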
6. Monitoring and Validation
Technical SEO for AI isn't a one-time project. It's an ongoing discipline.
- Google Search Console — monitor your structured data reports weekly. Watch for new errors after deployments.
- Schema validation tools — use Schema.org's validator alongside Google's Rich Results Test for comprehensive coverage.
- AI search monitoring — regularly search for your target queries in ChatGPT, Perplexity and Google AI Overviews. Track whether your content appears in citations.
- Log file analysis — check your server logs for GPTBot, ClaudeBot and PerplexityBot activity. If they're not crawling, they're not indexing.
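For the log file check, a simple count of requests per crawler is often enough to start with. The sketch below scans access log lines for the three bot names as case-insensitive substrings. The function name, the log lines and the shortened user-agent strings are illustrative assumptions, not the crawlers' exact strings. User agents can also be spoofed, so treat the counts as a first signal rather than proof.

```python
from collections import Counter

AI_BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def count_ai_bot_hits(log_lines):
    """Count requests per AI crawler by matching names in each log line."""
    hits = Counter()
    for line in log_lines:
        lowered = line.lower()
        for bot in AI_BOTS:
            if bot.lower() in lowered:
                hits[bot] += 1
                break  # count each request once
    return hits


# Hypothetical log lines in a combined-log-style format
SAMPLE_LOG = [
    '192.0.2.10 - - [02/Mar/2025:10:00:01 +0000] "GET /blog/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '192.0.2.11 - - [02/Mar/2025:10:00:05 +0000] "GET /about HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '192.0.2.12 - - [02/Mar/2025:10:00:09 +0000] "GET / HTTP/1.1" 200 4096 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
    '192.0.2.10 - - [02/Mar/2025:10:01:30 +0000] "GET /pricing HTTP/1.1" 200 3072 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
]

hits = count_ai_bot_hits(SAMPLE_LOG)
for bot in AI_BOTS:
    print(f"{bot}: {hits[bot]}")
```

A bot that shows zero hits over several weeks is worth investigating against your robots.txt and firewall rules.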
The Checklist at a Glance
Here's your quick-reference version. Print it, pin it, share it with your dev team.
- JSON-LD with correct @type, all required properties, nested entities
- SpeakableSpecification on key content sections
- FAQPage schema on relevant pages
- Topics declared with the about property in structured data
- BreadcrumbList schema implemented
- Organization + WebSite schema on every page
- robots.txt allows GPTBot, ClaudeBot, PerplexityBot
- llms.txt file published and maintained
- XML sitemap current with accurate lastmod
- Self-referencing canonical tags on all pages
- Core Web Vitals passing
- Semantic HTML heading hierarchy (H1 → H2 → H3)
- Answer-first content structure
- Descriptive anchor text on internal links
- Alt text on all images
- Clean, keyword-rich URL slugs
- Author pages with Person schema
- HTTPS with valid certificates
- Organization schema with contact details
- Accurate datePublished and dateModified in JSON-LD
- Weekly Search Console monitoring
- Regular AI search citation tracking
Where Most Organisations Fail
We audit dozens of enterprise sites every year. The pattern is consistent: content teams invest heavily in writing, but nobody owns the technical layer. The result? Brilliant content that AI engines can't parse, can't trust and won't cite.
The fix isn't complicated. It's systematic. Work through this checklist, involve your developers early and treat structured data as seriously as you treat your editorial calendar.