AI Readiness Report — Explained

This page explains how we generate your AI Readiness Report and how to interpret it. It outlines the data sources, the scoring, and what each section means for your brand’s AI visibility.

What we analyze

  • LLM visibility: how often models like ChatGPT, Gemini, Claude, and Perplexity cite your brand
  • Coverage: the breadth and depth of referenced pages across your site and key domains
  • Sentiment: recent tone of AI answers mentioning your brand
  • Competitor landscape: who else is cited for the same topics and pages

How scoring works

Scores combine frequency of mentions, quality of citations (authority and intent), and recency. We normalize results across models and time windows to provide comparable metrics.

Metrics — section by section

1. Crawler Access

We verify that AI crawlers can access and index your site by checking robots.txt rules, llms.txt presence, and page-level indexability.

robots & agents

  • robots.txt status: An HTTP 200 response confirms the file is reachable and can be parsed
  • User-agent rules: Checks for GPTBot, Claude-Web, Google-Extended, PerplexityBot, and other AI crawlers
  • Disallow patterns: Identifies if critical paths are blocked from AI indexing
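
The user-agent and disallow checks above can be approximated with Python's standard library. This is a minimal sketch, not our production crawler: the function name blocked_agents, the sample robots.txt, and the example.com URL are illustrative assumptions, and the agent list simply repeats the crawlers named above.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in this report; extend the list as needed.
AI_AGENTS = ["GPTBot", "Claude-Web", "Google-Extended", "PerplexityBot"]


def blocked_agents(robots_txt: str, url: str, agents=AI_AGENTS) -> list[str]:
    """Return the agents that the given robots.txt text disallows for url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


# Hypothetical robots.txt: GPTBot is blocked site-wide, everyone else
# is only kept out of /admin/.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

print(blocked_agents(SAMPLE, "https://example.com/pricing"))  # prints ['GPTBot']
```

Running the same function against each critical path (pricing, docs, product pages) gives a quick picture of which crawlers are locked out of which sections.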

llms.txt

  • File presence: Checks for /llms.txt or /.well-known/llms.txt
  • AI policy declaration: Structured guidance for LLMs on how to cite and reference your content
  • Brand context: Optional metadata helping models understand your brand positioning

indexable

  • Meta robots noindex: Scans for pages blocking indexing via meta tags
  • X-Robots-Tag headers: HTTP header-level indexing directives
  • Canonical conflicts: Pages marked noindex whose canonical URL points to a different page

2. AI Readiness (5 categories, 30+ individual checks)

A comprehensive five-part model evaluating how well your content is structured for AI comprehension and citation. Each category contains multiple weighted checks.

DOM Depth & Order (50% shallow structure + 50% semantic elements)

  • DOM depth: Measures nesting levels; shallow (<12 levels) = 50 points
  • Semantic structure: Counts header, nav, main, article, section, aside elements (12.5 points each, max 4 = 50 points)
  • Content hierarchy: Proper use of HTML5 landmarks for AI parsing
  • Navigation clarity: Clear site structure for crawler understanding
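
The two scored parts of this category can be sketched as follows. Several details are assumptions made for illustration: pages at 12 or more levels receive 0 for the depth half, semantic points are awarded per distinct element type rather than per occurrence, and the names DepthParser and dom_score are hypothetical.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change the depth.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}
SEMANTIC = {"header", "nav", "main", "article", "section", "aside"}


class DepthParser(HTMLParser):
    """Tracks the maximum nesting depth and which semantic elements appear."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.max_depth = 0
        self.semantic_seen = set()

    def handle_starttag(self, tag, attrs):
        if tag in SEMANTIC:
            self.semantic_seen.add(tag)
        if tag in VOID:
            return
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)

    def handle_endtag(self, tag):
        if tag not in VOID and self.depth > 0:
            self.depth -= 1


def dom_score(html: str) -> float:
    parser = DepthParser()
    parser.feed(html)
    shallow_points = 50.0 if parser.max_depth < 12 else 0.0    # assumption: all or nothing
    semantic_points = 12.5 * min(len(parser.semantic_seen), 4)  # capped at 4 types
    return shallow_points + semantic_points
```

A shallow page that uses header, main, article, and section would score 100 under these assumptions, while a shallow page built only from div elements would score 50.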

Extractability (50% paragraph count + 50% main content ratio)

  • Paragraph count: Minimum 5 paragraphs for full 50 points (scales linearly)
  • Text-to-HTML ratio: Measures content density vs markup bloat
  • Main content ratio: Content inside <main> or primary article tags vs total page content (60%+ = 50 points)
  • Boilerplate detection: Identifies and weights core content vs navigation/footer
  • Readability signals: Sentence structure, paragraph length, content flow
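
The arithmetic behind the two scored parts can be written out directly. The paragraph half follows the linear scaling described above; for the main content ratio, only the 60% threshold is documented, so the linear scaling below 60% is an assumption, as is the function name.

```python
def extractability_score(paragraphs: int, main_ratio: float) -> float:
    """Combine the paragraph-count and main-content-ratio halves (0 to 100)."""
    # 5 or more paragraphs earn the full 50 points; fewer scale linearly.
    paragraph_points = 50.0 * min(paragraphs / 5, 1.0)
    # 60% or more of the text inside <main>/<article> earns the full 50 points.
    # Assumption: ratios below 60% scale linearly toward zero.
    ratio_points = 50.0 * min(main_ratio / 0.60, 1.0)
    return round(paragraph_points + ratio_points, 1)


print(extractability_score(paragraphs=2, main_ratio=0.30))  # prints 45.0
```

Here two paragraphs contribute 20 points and a 30% main content ratio contributes 25, for a total of 45.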

Media Accessibility (50% alt coverage + 25% captions + 25% code/tables)

  • Alt text coverage: Percentage of images with descriptive alt attributes (100% coverage = 50 points)
  • Alt quality: Checks for meaningful descriptions vs generic "image" or empty strings
  • Video captions/transcripts: Presence of <track> elements, transcript links, or caption files (25 points)
  • Audio transcripts: Text alternatives for audio content
  • Code blocks: Proper <code> and <pre> formatting for technical content (12.5 points)
  • Data tables: Structured <table> elements with headers for tabular data (12.5 points)
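
A simplified version of this weighting might look like the sketch below. It is illustrative only: the list of generic alt strings is hypothetical, pages without images are assumed to earn 0 for the alt half, partial alt coverage is assumed to scale linearly, and captions are detected solely through <track> elements.

```python
from html.parser import HTMLParser

# Hypothetical list of alt values treated as non-descriptive.
GENERIC_ALTS = {"", "image", "photo", "picture", "img"}


class MediaParser(HTMLParser):
    """Counts described images and notes captions, code blocks, and table headers."""

    def __init__(self):
        super().__init__()
        self.images = 0
        self.described = 0
        self.has_track = False
        self.has_code = False
        self.has_table_headers = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "img":
            self.images += 1
            alt = (a.get("alt") or "").strip().lower()
            if alt not in GENERIC_ALTS:
                self.described += 1
        elif tag == "track":
            self.has_track = True
        elif tag in ("code", "pre"):
            self.has_code = True
        elif tag == "th":
            self.has_table_headers = True


def media_score(html: str) -> float:
    parser = MediaParser()
    parser.feed(html)
    coverage = parser.described / parser.images if parser.images else 0.0
    return (50.0 * coverage                      # alt text coverage
            + 25.0 * parser.has_track            # video captions
            + 12.5 * parser.has_code             # code blocks
            + 12.5 * parser.has_table_headers)   # data tables with headers
```

A page with two images, one of which has only alt="image", plus a formatted code block, would score 25 + 12.5 = 37.5 under these assumptions.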

Chunkability (50% structured Q&A + 50% anchorable headings)

  • FAQ schema: JSON-LD FAQPage structured data (16.7 points)
  • HowTo schema: Step-by-step instructions in structured format (16.7 points)
  • QAPage schema: Question-answer page markup (16.7 points)
  • Question headings: H2/H3 elements phrased as questions (e.g., "What is...", "How to...")
  • Anchorable headings: H2 and H3 elements with IDs for deep linking (100% coverage = 50 points)
  • Heading hierarchy: Proper H1→H2→H3 structure without skips
  • Section breaks: Clear content segmentation for AI chunking
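
To make the weighting concrete, the sketch below scores the two halves from raw HTML. It assumes top-level JSON-LD objects only (no @graph traversal), caps the three 16.7-point schema checks at 50, and gives 0 for the heading half when a page has no H2 or H3 elements. All names are hypothetical.

```python
import json
from html.parser import HTMLParser

QA_TYPES = ("FAQPage", "HowTo", "QAPage")


class ChunkParser(HTMLParser):
    """Counts H2/H3 headings with ids and collects top-level JSON-LD types."""

    def __init__(self):
        super().__init__()
        self.headings = 0
        self.anchored = 0
        self.in_jsonld = False
        self.buffer = []
        self.types = set()

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag in ("h2", "h3"):
            self.headings += 1
            if a.get("id"):
                self.anchored += 1
        elif tag == "script" and a.get("type") == "application/ld+json":
            self.in_jsonld = True
            self.buffer = []

    def handle_data(self, data):
        if self.in_jsonld:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self.in_jsonld:
            self.in_jsonld = False
            try:
                doc = json.loads("".join(self.buffer))
            except json.JSONDecodeError:
                return  # malformed JSON-LD earns no points
            for item in doc if isinstance(doc, list) else [doc]:
                if isinstance(item, dict) and isinstance(item.get("@type"), str):
                    self.types.add(item["@type"])


def chunkability_score(html: str) -> float:
    parser = ChunkParser()
    parser.feed(html)
    qa_points = min(50.0, 16.7 * sum(t in parser.types for t in QA_TYPES))
    anchor_points = 50.0 * parser.anchored / parser.headings if parser.headings else 0.0
    return round(qa_points + anchor_points, 1)
```

A page with FAQPage markup and ids on half of its H2/H3 headings would score 16.7 + 25 = 41.7.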

Clarity (50% language + 50% canonical)

  • Language declaration: <html lang="..."> attribute present (50 points)
  • Content language: Matches declared language for consistency
  • Canonical URL: Proper <link rel="canonical"> to avoid duplicate content (50 points)
  • Self-referencing canonical: Canonical points to current page or correct version
  • URL consistency: Canonical matches Open Graph and Twitter Card URLs

3. Page Speed (4 Google PageSpeed Insights metrics)

We run each page through Google's PageSpeed Insights API and report the four Lighthouse category scores. The Performance score drives the overall gauge.
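
For readers who want to reproduce these numbers, the sketch below builds a PageSpeed Insights v5 request and converts the category scores in a response to a 0-100 scale. The helper names are hypothetical, the mobile strategy is an assumption (this page does not say which strategy the report uses), and the example deliberately stops short of making the HTTP call or handling API keys.

```python
from urllib.parse import urlencode

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"
CATEGORIES = ["PERFORMANCE", "SEO", "ACCESSIBILITY", "BEST_PRACTICES"]


def psi_request_url(page_url: str, strategy: str = "mobile") -> str:
    """Build the request URL; the category parameter is repeated once per category."""
    params = [("url", page_url), ("strategy", strategy)]
    params += [("category", category) for category in CATEGORIES]
    return f"{PSI_ENDPOINT}?{urlencode(params)}"


def category_scores(response: dict) -> dict:
    """Convert Lighthouse's 0-1 category scores to whole numbers from 0 to 100."""
    categories = response["lighthouseResult"]["categories"]
    return {
        key: round(category["score"] * 100)
        for key, category in categories.items()
        if category.get("score") is not None
    }
```

Fetching the URL returned by psi_request_url with any HTTP client and passing the decoded JSON to category_scores yields a dictionary keyed by Lighthouse category id, such as performance or seo.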

Performance (0-100)

  • First Contentful Paint (FCP): Time until first text/image renders
  • Largest Contentful Paint (LCP): Time until main content is visible
  • Total Blocking Time (TBT): Total time the main thread is blocked by long tasks during page load
  • Cumulative Layout Shift (CLS): Visual stability during load
  • Speed Index: How quickly content is visually populated

SEO (0-100)

  • Meta tags: Title, description, viewport presence and quality
  • Crawlability: robots.txt, meta robots, canonical tags
  • Mobile-friendly: Viewport configuration, tap targets, font sizes
  • Structured data: Valid schema.org markup
  • Links: Descriptive link text, crawlable hrefs

Accessibility (0-100)

  • ARIA attributes: Proper roles, labels, and states
  • Color contrast: WCAG AA compliance for text readability
  • Form labels: Associated labels for all inputs
  • Alt text: Image descriptions for screen readers
  • Keyboard navigation: Focus indicators, tab order, skip links

Best Practices (0-100)

  • HTTPS: Secure connection and mixed content checks
  • Console errors: JavaScript errors and warnings
  • Image optimization: Modern formats (WebP, AVIF), proper sizing
  • Deprecated APIs: Use of outdated browser features
  • Security headers: CSP, X-Frame-Options, etc.

4. Structured Data (JSON-LD richness + structural completeness)

Evaluates both schema.org JSON-LD markup and fundamental HTML structure that helps AI models understand your content.

JSON-LD Types (20 points per type, capped at 100 points)

  • Organization: Company info, logo, social profiles
  • WebSite: Site-level metadata and search action
  • WebPage: Page-level context and breadcrumbs
  • Article/BlogPosting: Content metadata, author, dates
  • Product: Pricing, availability, reviews
  • FAQPage: Structured question-answer pairs
  • HowTo: Step-by-step instructions
  • BreadcrumbList: Navigation hierarchy
  • LocalBusiness: Address, hours, geo coordinates
  • Review/AggregateRating: Star ratings and testimonials
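
The per-type scoring can be illustrated with a short sketch. It assumes that each schema.org type name counts separately (so Article and BlogPosting would earn 20 points each), that nested and @graph entries count, and that unrecognized types earn nothing. The function names are hypothetical.

```python
import json

RECOGNIZED = {
    "Organization", "WebSite", "WebPage", "Article", "BlogPosting", "Product",
    "FAQPage", "HowTo", "BreadcrumbList", "LocalBusiness", "Review", "AggregateRating",
}


def collect_types(node, found=None) -> set:
    """Walk a decoded JSON-LD document and gather every @type value."""
    found = set() if found is None else found
    if isinstance(node, dict):
        value = node.get("@type")
        names = [value] if isinstance(value, str) else value if isinstance(value, list) else []
        found.update(name for name in names if isinstance(name, str))
        for child in node.values():
            collect_types(child, found)
    elif isinstance(node, list):
        for item in node:
            collect_types(item, found)
    return found


def jsonld_type_score(document: str) -> int:
    """20 points per recognized type, capped at 100."""
    types = collect_types(json.loads(document))
    return min(100, 20 * len(types & RECOGNIZED))
```

A document whose @graph contains Organization, WebSite, and Product entries would score 60, and any page with five or more recognized types reaches the cap.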

Structural Checks (15 boolean checks = 100%)

Each check contributes ~6.7% to the structural score:

  • Heading hierarchy: Single H1, proper H2-H6 nesting
  • Semantic elements: header, nav, main, article, section, footer, aside usage
  • Lists or structured data: ul/ol lists or schema.org markup present
  • Paragraph structure: Multiple <p> tags with substantive content
  • Navigation structure: <nav> element with links
  • Rich content: Images, videos, or interactive elements beyond plain text
  • Content variety: Mix of text, media, and structured data
  • Title tag: <title> present and non-empty
  • Meta description: <meta name="description"> present
  • JSON-LD present: At least one <script type="application/ld+json">
  • Open Graph tags: og:title, og:description, og:image
  • Valid HTML structure: Proper DOCTYPE, head, body structure
  • Mobile-friendly: Viewport meta tag configured
  • Robots and crawling: No blocking meta robots tags
  • Language declaration: lang attribute on <html>

5. Freshness (content recency distribution)

We extract and analyze date signals from your pages to understand content freshness. AI models tend to favor recent information.

Date Detection Methods

  • JSON-LD dates: datePublished, dateModified, dateCreated from schema.org
  • Meta tags: article:published_time, article:modified_time, og:updated_time
  • HTML time elements: <time datetime="..."> with valid ISO dates
  • Visible date patterns: Regex extraction from page content (e.g., "Updated: Jan 15, 2025")
  • Sitemap dates: lastmod from XML sitemaps as fallback

Freshness Categories

  • Fresh (green): Updated within the last 30 days
  • Stale (yellow): Updated more than 30 days ago
  • Unknown (gray): No date signals detected

Distribution Chart

The donut chart shows the percentage of pages in each category across your entire site audit.

Quick wins — by metric

LLM Visibility

  • Add a concise brand one‑liner to the top of your home page and your “What is X” page.
  • Create a short “Why [Brand]?” explainer section that models can quote.
  • Ensure your site name and product names are consistent across pages and metadata.

Coverage (Pages & Domains)

  • Publish or refresh high‑intent pages: pricing, “compare vs”, “integrations”, docs.
  • Fix canonicals; remove duplicates so models cite one clean URL per topic.
  • Earn citations on reputable third‑party domains (analyst, review, partner sites).

Sentiment

  • Add pros/benefits bullets near the top of key pages and product docs.
  • Address recurring negatives with a simple FAQ/objections section.
  • Keep a visible changelog/release notes page highlighting recent improvements.

Competitors

  • Publish “X vs [Brand]” pages with neutral, scannable comparison tables.
  • Create migration guides (“Move from X to [Brand]”) for top alternatives.
  • Link “Compare” pages from the navbar/footer to raise their authority.

Improving your score

  • Strengthen high-impact pages with clear, up-to-date content, and add publication or modification dates so freshness can be detected.
  • Earn citations on relevant third-party domains (reviews, analysts, comparisons).
  • Ensure consistent branding and canonical URLs.
  • Track changes weekly and iterate based on the opportunities surfaced.

Questions? Email support@llmscout.co.