Title: Markdown for Agents and Statistics
Author: chancerylaneproject
Published: 22.4.2026
Last modified: 3.7.2026

---

# Markdown for Agents and Statistics

 [chancerylaneproject](https://profiles.wordpress.org/chancerylaneproject/)

[Download](https://downloads.wordpress.org/plugin/markdown-for-agents-and-statistics.1.5.1.zip)

## Description

Markdown for Agents and Statistics converts your WordPress content to Markdown and
serves it to AI agents and language model tools that request it via HTTP content
negotiation (`Accept: text/markdown`).

The Chancery Lane Project is a charity that helps organisations reduce emissions
using the power of legal documents and processes. We’ve published this plugin because
we believe that making content more legible for AI agents makes a meaningful difference
to their energy usage, not only by reducing the number of tokens required to consume
the content (by up to 90% compared with HTML), but also by minimising the server
resources required to render, process and display pages at source.

**How it works:**

 1. Posts and taxonomy archive pages are converted to Markdown and saved as static
    files on disk inside `wp-content/uploads/`.
 2. When a visitor (or AI agent) requests a page with `Accept: text/markdown` in
    the HTTP headers, WordPress serves the pre-generated `.md` file directly, with
    no page render required.
 3. A `<link rel="alternate" type="text/markdown">` tag is added to each page’s
    `<head>` so agents can discover Markdown versions automatically.
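Seen from the agent's side, steps 2 and 3 might look like the following minimal Python sketch. It assumes the `<link>` tag carries an `href` attribute pointing at the Markdown version (the attribute is not shown above), and the names `MarkdownLinkFinder`, `find_markdown_alternates` and `markdown_request` are illustrative only, not part of the plugin.

```python
from html.parser import HTMLParser
from urllib.request import Request


class MarkdownLinkFinder(HTMLParser):
    """Collects href values of <link rel="alternate" type="text/markdown"> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        media_type = (attr.get("type") or "").lower()
        # Assumption: the discovery tag exposes the Markdown URL in href.
        if "alternate" in rel_tokens and media_type == "text/markdown" and attr.get("href"):
            self.hrefs.append(attr["href"])


def find_markdown_alternates(html: str) -> list[str]:
    """Return every Markdown alternate URL advertised in the given HTML."""
    finder = MarkdownLinkFinder()
    finder.feed(html)
    return finder.hrefs


def markdown_request(url: str) -> Request:
    """Build (but do not send) a request that asks for Markdown via the Accept header."""
    return Request(url, headers={"Accept": "text/markdown"})
```

An agent could pass the result of `markdown_request(...)` to `urllib.request.urlopen`, or fall back to the URLs returned by `find_markdown_alternates` when the header route returns HTML.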

**Features:**

 * Content negotiation (`Accept: text/markdown`, `?output_format=md`, or known AI
   User-Agents)
 * **Taxonomy archive support** — category, tag, and custom taxonomy term pages 
   served as Markdown post listings
 * Automatic Markdown generation on post save; taxonomy archives auto-update when
   any post in the term changes
 * AJAX bulk generation with live progress counter — no page timeouts on large sites
 * Per-post-type field configuration — choose which meta/ACF fields go in frontmatter
   or body
 * ACF support with dot notation for nested group fields (e.g. `group.subfield`)
 * Content fields option — use ACF fields as the body content instead of post_content
 * Manifest generation with content hashes and change tracking per post type
 * Incremental export — only re-export changed documents (`--incremental`)
 * Delta file (`changes.json`) for RAG system sync
 * Access statistics — logs AI agent requests with a dedicated stats admin page
 * Access grouping by class of agent
 * **Optional frontmatter fields** — hierarchy (parent/ancestors/children IDs), 
   author display name, root-relative featured image paths
 * **Topics section** — appends a `## Topics` section with linked taxonomy terms
   to the Markdown body
 * **Export preview** — preview generated Markdown inline in the post editor without
   writing to disk
 * WP-CLI commands: `generate`, `generate-taxonomies`, `prune-stats`, `status`,
   `delete`
 * Fully unit-tested

## Installation

 1. Upload the plugin to `/wp-content/plugins/markdown-for-agents/`, or install via
    the WordPress Plugins screen.
 2. Activate the plugin through the Plugins screen in WordPress.
 3. Visit **Settings → Markdown for Agents** and choose which post types and taxonomies
    to generate.
 4. Enable **Auto-generate on save** so files stay in sync as you publish or edit
    content (optional).
 5. Click **Generate All** to create Markdown for your existing content. On large sites
    you can also run `wp markdown-agents generate` and
    `wp markdown-agents generate-taxonomies` from WP-CLI.
 6. Verify by appending `?output_format=md` to any post URL (or using an AI User-Agent)
    to confirm Markdown is served.

## FAQ

### Where are the Markdown files stored?

Inside `wp-content/uploads/{export_dir}/` (configurable in Settings). Post files
live under `{export_dir}/{post-type}/{slug}.md`. Taxonomy archive files live under
`{export_dir}/taxonomy/{taxonomy}/{term-slug}.md`. The directory is served by WordPress
when content negotiation is triggered.
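As a quick illustration of the layout just described, the sketch below builds the expected file locations in Python. The function names are hypothetical, and `wp-mfa-exports` is used only as an example value for `{export_dir}`; the actual directory depends on your settings.

```python
from pathlib import PurePosixPath

# Root under which the plugin writes its exports, per the description above.
UPLOADS = PurePosixPath("wp-content/uploads")


def post_path(export_dir: str, post_type: str, slug: str) -> PurePosixPath:
    """Location of a post's Markdown file: {export_dir}/{post-type}/{slug}.md"""
    return UPLOADS / export_dir / post_type / f"{slug}.md"


def term_path(export_dir: str, taxonomy: str, term_slug: str) -> PurePosixPath:
    """Location of a taxonomy archive file: {export_dir}/taxonomy/{taxonomy}/{term-slug}.md"""
    return UPLOADS / export_dir / "taxonomy" / taxonomy / f"{term_slug}.md"
```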

### Will this slow down my site?

No. Markdown files are generated ahead of time (on post save or via manual/CLI
bulk generation). Serving them is a simple file read, much faster than rendering
a full WordPress page.

### AI agents are getting HTML instead of Markdown. Why?

Almost always this is a CDN, firewall, or page cache sitting in front of WordPress,
not the plugin. On many hosts (for example Cloudflare in front of WP Engine) the
edge answers a request before it ever reaches the plugin: a full-page cache can
return the cached HTML, or a bot/WAF rule can block a known AI crawler with a 403/429.

The reliable route is the query parameter: append `?output_format=md` to any post
or archive URL. Because that is a distinct URL, caches store it separately and firewalls
treat it as an ordinary request, so it reaches the plugin even on a hardened stack.
The plugin advertises this URL automatically via a `<link rel="alternate" type="text/markdown">`
tag in each page’s `<head>`, so agents that read the page can discover and follow it.

The `Accept: text/markdown` header and User-Agent routes also work, but only if
your CDN/cache is configured to let them through (see the next question).

### How do I let my CDN or cache serve Markdown to agents?

This is host/CDN configuration, not a plugin setting. Two changes help:

 * **Page cache (WP Engine, LiteSpeed, Varnish, nginx):** exclude agent-shaped
   requests from the full-page cache: any request whose `Accept` header contains
   `text/markdown`, whose query string contains `output_format=md`, or whose
   User-Agent is a known AI bot. Do **not** add User-Agent to the cache _key_; that
   fragments the cache for every visitor. Exclude from caching, do not key on it.
 * **Firewall / bot rules (Cloudflare):** add a skip/allow rule for the AI
   User-Agents you want to serve (for example GPTBot, ClaudeBot, PerplexityBot,
   Google-Extended). Otherwise they receive a 403/429 and get nothing.

If you skip this, nothing breaks; agents simply use the `?output_format=md` URL
via discovery instead. The plugin already protects against the reverse problem:
Markdown responses are sent with `Cache-Control: private, no-store` and
`Vary: Accept, User-Agent`, so a shared cache cannot replay the Markdown to a human
browser on the same URL.
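The cache-exclusion rule described in this answer boils down to a three-way test. The real rule lives in your CDN or cache configuration, whose syntax varies by vendor; the Python sketch below only spells out the logic. The function name is hypothetical and the agent list simply repeats the examples given above.

```python
from urllib.parse import parse_qs

# Example AI User-Agent tokens taken from the list above; extend as needed.
AI_AGENTS = ("GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended")


def should_bypass_cache(accept: str, query_string: str, user_agent: str,
                        ai_agents: tuple = AI_AGENTS) -> bool:
    """Return True when a request is agent-shaped and should skip the page cache."""
    # 1. The Accept header asks for Markdown.
    if "text/markdown" in accept.lower():
        return True
    # 2. The query string carries output_format=md.
    if "md" in parse_qs(query_string).get("output_format", []):
        return True
    # 3. The User-Agent contains a known AI bot token.
    ua = user_agent.lower()
    return any(bot.lower() in ua for bot in ai_agents)
```

Note that the User-Agent is only used to decide whether to bypass the cache; it never becomes part of a cache key.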

### How can I check what an agent actually receives?

Request a page the way an agent would and inspect the response headers:

```
# Query-param route (the reliable one)
curl -sI 'https://example.com/your-post/?output_format=md'

# Accept-header route
curl -sI -H 'Accept: text/markdown' 'https://example.com/your-post/'
```

A genuine Markdown response from the plugin has `Content-Type: text/markdown` and
an `X-Markdown-Source: markdown-for-agents` header. If you instead see
`Content-Type: text/html`, the request was answered by a cache or firewall before
reaching the plugin (see the previous questions). Note that running these from your
own server may bypass your CDN; testing from an external network shows what real
agents experience.
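If you want to automate that check, a small helper along these lines can classify a set of response headers. It is a sketch only: the function name and the three labels it returns are made up for illustration, and it relies solely on the two headers named above.

```python
def classify_response(headers: dict[str, str]) -> str:
    """Classify response headers as plugin Markdown, HTML, or something else."""
    # Header names are case-insensitive, so normalise them first.
    h = {name.lower(): value.strip() for name, value in headers.items()}
    # Drop parameters such as "; charset=utf-8" from the media type.
    content_type = h.get("content-type", "").split(";")[0].strip().lower()
    source = h.get("x-markdown-source", "").lower()

    if content_type == "text/markdown" and source == "markdown-for-agents":
        return "markdown-from-plugin"
    if content_type == "text/html":
        # Most likely answered by a cache or firewall before reaching the plugin.
        return "html"
    return "other"
```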

### Should I publish an llms.txt file?

llms.txt is a proposed convention for a single Markdown index of your site at
`https://example.com/llms.txt`, aimed at AI tools that look for a site-level manifest.
It is an emerging community convention, not an official standard, and there is limited
evidence that the major AI crawlers consume it yet, so treat it as low-cost, optional,
and complementary to the per-page discovery this plugin already provides.

This plugin does not generate `llms.txt`. If you want one, publish a static file
at your web root listing your key pages with their `?output_format=md` URLs, and
keep it in sync with published and retired content or it will point agents at missing
pages.
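A hand-maintained file is perfectly adequate, but if you prefer to script it, something like the following Python sketch could produce the listing. It is not part of the plugin; the function names are hypothetical, the page list is supplied by you, and the output layout (a title heading followed by a link list) is only a loose reading of the llms.txt proposal.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def markdown_url(page_url: str) -> str:
    """Return page_url with output_format=md set, preserving other query parameters."""
    parts = urlsplit(page_url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "output_format"]
    query.append(("output_format", "md"))
    return urlunsplit(parts._replace(query=urlencode(query)))


def build_llms_txt(site_title: str, pages: list[tuple[str, str]]) -> str:
    """Render a simple Markdown index from (title, url) pairs."""
    lines = [f"# {site_title}", "", "## Pages", ""]
    lines += [f"- [{title}]({markdown_url(url)})" for title, url in pages]
    return "\n".join(lines) + "\n"
```

Regenerate the file whenever content is published or retired so that it does not drift out of date.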

### What are taxonomy archive files?

For every public taxonomy term (categories, tags, custom taxonomies) the plugin
generates a Markdown file listing all published posts in that term with links and
excerpts. These are served automatically when an AI agent requests a taxonomy archive
URL. This lets agents navigate your site structure by exploring term listings, not
just individual posts.

### What is the manifest.json file?

When you generate with `--with-manifest` or `--incremental`, a `manifest.json` is
created inside each post-type export folder (e.g. `wp-mfa-exports/post/manifest.json`).
It contains a registry of all exported documents with content hashes and
change tracking (new/modified/unchanged/deleted), enabling RAG systems to identify
what changed since the last export without reprocessing all documents.

### How does incremental export work?

Use `wp markdown-agents generate --incremental` to only re-export documents that
have changed since the last export. The plugin compares content hashes against the
previous `manifest.json` and skips unchanged posts. This also generates a
`changes.json` delta file listing new, modified, and deleted documents; your RAG
system can read this to know exactly what to re-embed.
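The hash-comparison idea can be illustrated with a short Python sketch. This is not the plugin's implementation: the actual schemas of `manifest.json` and `changes.json` are not documented here, so the example assumes a simplified mapping of document ID to content hash, and SHA-256 is an arbitrary choice of hash function.

```python
import hashlib


def content_hash(markdown: str) -> str:
    """Hash a document's Markdown (SHA-256 is an assumption, not the plugin's choice)."""
    return hashlib.sha256(markdown.encode("utf-8")).hexdigest()


def diff_manifest(previous: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Compare two {document_id: hash} mappings and bucket each document by status."""
    return {
        "new": sorted(doc for doc in current if doc not in previous),
        "modified": sorted(doc for doc in current
                           if doc in previous and current[doc] != previous[doc]),
        "unchanged": sorted(doc for doc in current
                            if doc in previous and current[doc] == previous[doc]),
        "deleted": sorted(doc for doc in previous if doc not in current),
    }
```

A RAG pipeline would then re-embed only the `new` and `modified` documents and drop the `deleted` ones from its index.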

### How do I configure fields per post type?

In **Settings → Markdown for Agents**, each enabled post type has its own "Field
Configuration" section with two textareas:

 * **Frontmatter fields**: meta or ACF fields added to the YAML frontmatter.
 * **Content fields**: meta or ACF fields used as the body content. When set,
   `post_content` is automatically excluded.

Use dot notation for ACF group fields (e.g. `clause_fields.clause_summary`). Plain
meta keys work too (e.g. `_yoast_wpseo_title`). ACF relationship fields are automatically
converted to a list of post titles.
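To make the dot notation concrete, here is a minimal Python sketch of how a path such as `clause_fields.clause_summary` can be resolved against nested field data. It illustrates the lookup idea only; the function name and the sample data are hypothetical, and the plugin's own resolution code may differ.

```python
def resolve_field(fields: dict, path: str, default=None):
    """Walk nested dictionaries following a dot-separated path."""
    value = fields
    for key in path.split("."):
        # Stop as soon as the path leaves the nested structure.
        if not isinstance(value, dict) or key not in value:
            return default
        value = value[key]
    return value


# Hypothetical field data for one post.
example_fields = {
    "clause_fields": {"clause_summary": "Short summary of the clause"},
    "_yoast_wpseo_title": "SEO title",
}
```

With this data, `resolve_field(example_fields, "clause_fields.clause_summary")` walks into the group, while a plain key such as `_yoast_wpseo_title` is looked up directly.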

### Can I customise the Markdown output?

Yes. Several filters and actions are available:

 * `markdown_for_agents_pre_convert` — filter HTML before conversion
 * `markdown_for_agents_post_convert` — filter Markdown after conversion
 * `markdown_for_agents_frontmatter` — modify frontmatter fields for a post
 * `markdown_for_agents_taxonomy_frontmatter` — modify frontmatter fields for a 
   taxonomy archive
 * `markdown_for_agents_serve_enabled` — enable/disable serving for a specific post
 * `markdown_for_agents_serve_taxonomies` — enable/disable serving for taxonomy 
   archive pages
 * `markdown_for_agents_cache_headers` — override the cache-related headers sent
   with the Markdown response
 * `markdown_for_agents_file_generated` — action fired after a file is written
 * `markdown_for_agents_file_deleted` — action fired after a file is deleted

### Can I let CDNs/full-page caches cache the Markdown responses?

By default the Markdown response is sent with `Cache-Control: private, no-store,
max-age=0` (plus `X-LiteSpeed-Cache-Control`, `X-Accel-Expires` and `Vary: Accept,
User-Agent`). This is deliberate: the Markdown is negotiated on the _same URL_ as
the HTML page, so a shared cache that ignores or normalises `Vary` could otherwise
store the Markdown variant and replay it to ordinary browsers expecting HTML.

If your CDN/cache layer honours `Vary` correctly (or you serve Markdown from distinct
URLs), you can relax this with the `markdown_for_agents_cache_headers` filter. Map
any header to an empty string to omit it entirely:

```
add_filter( 'markdown_for_agents_cache_headers', function ( array $headers, string $filepath ) {
    // Let shared caches store the Markdown response for five minutes.
    $headers['Cache-Control'] = 'public, max-age=300';
    // An empty string omits the header entirely.
    $headers['X-LiteSpeed-Cache-Control'] = '';
    $headers['X-Accel-Expires']           = '';
    return $headers;
}, 10, 2 );
```

This filter governs only the cache-related headers listed above. The `Content-Signal`
and `X-Markdown-Source` headers are sent separately and are unaffected
(`Content-Signal` has its own `markdown_for_agents_content_signal` filter).

Override with caution — incorrectly cached Markdown will be served to browsers.

### How do I generate taxonomy archives via WP-CLI?

```
wp markdown-agents generate-taxonomies
wp markdown-agents generate-taxonomies --taxonomy=category
wp markdown-agents generate-taxonomies --dry-run
```

## Contributors & Developers

“Markdown for Agents and Statistics” is open source software. The following people
have contributed to this plugin.

Contributors

 *   [ chancerylaneproject ](https://profiles.wordpress.org/chancerylaneproject/)

[Translate “Markdown for Agents and Statistics” into your language.](https://translate.wordpress.org/projects/wp-plugins/markdown-for-agents-and-statistics)

### Interested in development?

[Browse the code](https://plugins.trac.wordpress.org/browser/markdown-for-agents-and-statistics/),
check out the [SVN repository](https://plugins.svn.wordpress.org/markdown-for-agents-and-statistics/),
or subscribe to the [development log](https://plugins.trac.wordpress.org/log/markdown-for-agents-and-statistics/)
by [RSS](https://plugins.trac.wordpress.org/log/markdown-for-agents-and-statistics/?limit=100&mode=stop_on_copy&format=rss).

## Changelog

#### 1.5.1

 * Add `markdown_for_agents_cache_headers` filter so the cache-related headers on
   Markdown responses can be customised (e.g. to allow CDN caching where `Vary` 
   is honoured). Defaults are unchanged and remain cache-bypassing.

#### 1.5.0

 * Add a new 'skipped' grouping when generating MD files, showing files that were
   skipped for a good reason (password-protected, draft, etc.) rather than failed.
 * Add a new 'Agent Class' graph on the Agent Stats page, which mirrors the Known
   Agents classifications to help understand traffic patterns.
 * Better documentation for caching and generation logic

#### 1.4.5

 * Fix: Issues where memcache could cause problems on CLI invoked rebuilds on large
   sites. Also resolves minor issues with and outputs generated by post filters 
   appearing in MD output, while allowing for same in ` blocks where needed.`

#### 1.4.4

 * Fix: full-page caches (LiteSpeed, Varnish, nginx fastcgi_cache) could store the
   Markdown response under a page URL when an AI agent or `?output_format=md` request
   hit it first, then replay the `.md` body to subsequent HTML browser requests.
   Markdown responses now send `Cache-Control: private, no-store`,
   `X-LiteSpeed-Cache-Control: no-cache`, `X-Accel-Expires: 0`, and
   `Vary: Accept, User-Agent` unconditionally.

#### 1.4.3

 * Update to fix deleting posts on status change outside of auto-update flow

#### 1.4.2

 * Fixed an issue with private/draft posts being created as MD files and added a
   checkbox to post edit pages to exclude posts from MD generation. Also fixes a small
   issue with unusual taxonomy slugs producing incorrect URLs in the Topics section
   of the MD body. Adds Strauss namespacing to html-to-markdown/Composer includes to
   avoid collisions.

#### 1.4.1

 * Removed `llms.txt` index generation. The `LlmsTxtGenerator` class, its
   `--with-llmstxt` WP-CLI flag on `wp markdown-agents generate`, and the corresponding
   unit tests have been dropped.

#### 1.4.0

 * Add notices and copy around generating and regenerating content on install and
   updates to Settings
 * Add transient to store and note when content needs regenerating

#### 1.3.0

 * Optional hierarchy frontmatter fields (`parent`, `ancestors`, `children` IDs)
   for hierarchical post types (pages, etc.).
 * Optional author display name in frontmatter.
 * Optional root-relative paths for featured images (survives domain migrations).
 * Optional `## Topics` section appended to the Markdown body with linked taxonomy
   terms.
 * Export preview: a "Preview Markdown" button in the post meta box renders generated
   Markdown inline without writing to disk.
 * New WP-CLI command: `wp markdown-agents prune-stats [--days=<n>] [--yes]` — removes
   access stats older than N days.
 * Manifest hash now covers taxonomy term slugs — incremental export correctly detects
   posts whose terms changed.

#### 1.2.0

 * Taxonomy archive support — generates Markdown index files for all public taxonomy
   terms (categories, tags, custom taxonomies), served via content negotiation.
 * Taxonomy archives auto-regenerate when any post in the term is saved or deleted.
 * AJAX bulk generation for taxonomy archives on the Settings page with live progress
   counter.
 * New WP-CLI command:
   `wp markdown-agents generate-taxonomies [--taxonomy=<slug>] [--dry-run]`.
 * `<link rel="alternate" type="text/markdown">` tag now emitted on taxonomy archive
   pages.
 * New filter: `markdown_for_agents_serve_taxonomies` to enable/disable taxonomy
   archive serving globally.
 * New filter: `markdown_for_agents_taxonomy_frontmatter` to modify taxonomy archive
   frontmatter before serialisation.
 * Bulk generation buttons converted to AJAX with live counter — no more page timeouts
   on large sites.

#### 1.1.0

 * Per-post-type field configuration for frontmatter and content fields.
 * ACF support with dot notation for nested group fields.
 * Content fields option — use ACF/meta fields as body content instead of post_content.
 * ACF relationship fields automatically normalised to post titles.
 * Added manifest.json generation with content hashes and change tracking.
 * New `--with-manifest` flag for `wp markdown-agents generate`.
 * Manifest is generated per post-type folder for independent change tracking.
 * Incremental export via `--incremental` — skips unchanged documents.
 * Delta file (`changes.json`) generated for RAG system integration.
 * Access statistics — logs AI agent requests; dedicated stats admin page.
 * UA detection — configurable User-Agent strings force Markdown serving.

#### 1.0.0

 * Initial release.

## Meta

 *  Version **1.5.1**
 *  Last updated **22 hours ago**
 *  Active installations **70+**
 *  WordPress version **6.3 or higher**
 *  Tested up to **7.0**
 *  PHP version **8.1 or higher**
 *  Language: [English (US)](https://wordpress.org/plugins/markdown-for-agents-and-statistics/)
 *  Tags: [Agents](https://fi.wordpress.org/plugins/tags/agents/), [AI](https://fi.wordpress.org/plugins/tags/ai/),
    [content negotiation](https://fi.wordpress.org/plugins/tags/content-negotiation/),
    [LLM](https://fi.wordpress.org/plugins/tags/llm/), [markdown](https://fi.wordpress.org/plugins/tags/markdown/)
 *  [Advanced View](https://fi.wordpress.org/plugins/markdown-for-agents-and-statistics/advanced/)

## Support

Issues resolved in the last two months: 1 out of 2

[Support forum](https://wordpress.org/support/plugin/markdown-for-agents-and-statistics/)