AI Page Token Inspector
See your page through an LLM's eyes - token by token.
Articles built to be cited.
Sight AI structures every article for clean LLM extraction - clear claims, scannable subheads, semantic markup. 7 free articles to test it.
How it works
1. Enter any URL
We strip out nav, footer, scripts, and styles to leave just the visible text - what an LLM would actually consume.
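Conceptually, the stripping step looks like this. A minimal sketch using Python's standard-library HTML parser; the tag list and the tool's real extraction rules are assumptions for illustration:

```python
from html.parser import HTMLParser

# Tags whose contents an LLM never "sees" as page text.
# (Illustrative list -- the actual stripping rules may differ.)
SKIP_TAGS = {"script", "style", "nav", "footer", "noscript", "head"}

class VisibleTextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.depth = 0      # how many skip-tags we're currently inside
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        # Keep text only when we're outside every skip-tag.
        if self.depth == 0 and data.strip():
            self.chunks.append(data.strip())

def visible_text(html: str) -> str:
    p = VisibleTextExtractor()
    p.feed(html)
    return " ".join(p.chunks)

page = "<html><head><style>p{}</style></head><body><nav>Menu</nav><p>Hello world</p></body></html>"
print(visible_text(page))  # Hello world
```

The nav link text and inline CSS never reach the output, which is why a page's "visible" token count is usually far smaller than its raw HTML size.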
2. See the token estimate
We approximate tokens at ~4 characters per token. Useful for estimating cost or context fit.
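The estimate itself is a one-liner. The ~4 chars/token figure is a heuristic for English text, not an exact tokenizer:

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    # ~4 characters per token is the common English heuristic;
    # real BPE tokenizers vary by roughly 5-15% either way.
    return round(len(text) / chars_per_token)

print(estimate_tokens("A" * 12000))  # 3000
```

A 12,000-character article lands around 3,000 tokens, right at the top of the typical retrieval window.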
3. Compare to model windows
See what % of the GPT, Claude, Gemini, and Perplexity context windows your page would consume.
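A sketch of that comparison, with placeholder window sizes. Context limits change with model versions, so the numbers below are illustrative assumptions, not current limits:

```python
# Illustrative context-window sizes in tokens -- these shift with
# model releases, so treat them as placeholders.
CONTEXT_WINDOWS = {
    "GPT-4o": 128_000,
    "Claude": 200_000,
    "Gemini": 1_000_000,
}

def window_usage(page_tokens: int) -> dict:
    """Percentage of each model's context window the page would fill."""
    return {model: round(100 * page_tokens / size, 2)
            for model, size in CONTEXT_WINDOWS.items()}

print(window_usage(3000))
# {'GPT-4o': 2.34, 'Claude': 1.5, 'Gemini': 0.3}
```

Even a long article is a small slice of a modern window; the constraint that matters is the much smaller retrieval chunk discussed below.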
4. Spot focus drift
The top focus words show what an LLM would consider the page to be "about". If your target keyword isn't in the top 10, you have a problem.
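Focus words are essentially a stopword-filtered frequency count. A minimal sketch; the tiny stopword list here is an illustrative stand-in for a fuller one:

```python
import re
from collections import Counter

# Illustrative stopword list -- a real implementation uses a much
# larger set.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is",
             "it", "for", "on", "with", "that", "your", "you", "what"}

def focus_words(text: str, top_n: int = 10):
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words
                     if w not in STOPWORDS and len(w) > 2)
    return counts.most_common(top_n)

sample = ("Token counts matter. Token limits shape what an LLM reads, "
          "and token budgets decide citations.")
print(focus_words(sample, 3)[0])  # ('token', 3)
```

If your target keyword doesn't surface near the top of a count like this, the page is spending its words somewhere else.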
A small detail that compounds.
Modern AI assistants don't crawl the whole web on the fly - they retrieve a chunk of your page, fit it into a context window, and decide what to cite. Pages that don't fit cleanly get truncated or skipped.
When ChatGPT decides whether to cite your page or your competitor's, the input is usually 1,500–3,000 tokens of content. That window favors clearly structured, focused articles over sprawling, unfocused ones.
Citation-grade structure on every article.
Sight AI articles are built around a clear topic core: a one-sentence claim, then evidence, then supporting points. That's exactly the structure LLMs look for when deciding what to cite.
Combined with our schema markup, semantic HTML, and llms.txt support, your articles consistently land in the citation slot when AI assistants answer relevant queries.
- Clear lead-claim → evidence → supporting structure
- Semantic HTML and JSON-LD on every article
- llms.txt automation for clean LLM discovery
- Built-in tracking of which articles get cited and where
Common questions.
How accurate is the token count?
Within ~10% for English text. Each LLM uses a slightly different tokenizer (BPE variants), so exact counts vary by ±5–15%. The chars-per-token heuristic is the standard approximation.
Why does my page show way more tokens than I expect?
Hidden boilerplate. Footers, nav menus, JavaScript-injected text, and cookie banners all count as visible text once HTML is stripped. Inspect the preview to see what's actually getting included.
Should my article fit in a single context window?
Almost always yes. Most AI assistants only retrieve a portion of your page; if your key claim is on character 80,000 of a 100,000-char article, it won't make it into context.
How does this differ from Word's word count?
Word counts the full document including formatting markers. We count only the visible text content - what an LLM actually reads.
Get 7 free articles with Sight AI
Sight AI writes long-form, SEO-optimized articles for you and tracks how AI assistants like ChatGPT and Claude see your brand. Create a free account to claim your 7 starter articles.
7 articles, AI visibility tracking, and our full publishing suite included.
More free SEO tools
Keep optimizing - every tool is free and runs in your browser.
Crawler Simulator
See exactly what Googlebot, GPTBot, and Claude see on your page.
Query Fan-Out
Turn one query into 20+ AI-search variations.
LLMs.txt Generator
Generate an llms.txt file so AI models index your site correctly.
AI SEO Audit
Audit your site for AI search readiness in 30 seconds.
Brand Visibility Report
See how often AI assistants mention your brand.
AI Search Visibility
Check how visible your site is in AI search engines.