.md
**What it does NOT do:**
* Expose drafts, private posts, or password-protected content
* Serve non-post/page content types by default
* Require any configuration. Activate and it works.
== Why Markdown? ==
HTML is expensive for AI systems to process. [Cloudflare measured](https://blog.cloudflare.com/markdown-for-agents/) an 80% reduction in token usage when converting a blog post from HTML to Markdown (16,180 tokens down to 3,150).
Cloudflare now offers [Markdown for Agents](https://developers.cloudflare.com/fundamentals/reference/markdown-for-agents/) at the CDN edge via the `Accept: text/markdown` header, available on Pro, Business, and Enterprise plans.
This plugin does the same thing at the origin, so it works on any host. It also adds `.md` suffix URLs, `?format=markdown` query parameters, YAML frontmatter, static file caching, and server-level offloading.
If you use Cloudflare, both share the same `Accept: text/markdown` header, `Content-Signal` headers, and `X-Markdown-Tokens` response headers.
Cloudflare currently defaults to `Content-Signal: ai-train=yes, search=yes, ai-input=yes` with no way to change it. Botkibble defaults to `ai-train=no` and lets you override the full signal per site via the `botkibble_content_signal` filter.
== Performance & Static Offloading ==
This plugin supports static file offloading by writing Markdown content to `/wp-content/uploads/botkibble/`.
=== Nginx Configuration ===
To bypass PHP entirely and have Nginx serve the files (including variants) directly:
`
# Variants
location ~* ^/(_v/[^/]+/.+)\.md$ {
default_type text/markdown;
try_files /wp-content/uploads/botkibble/$1.md /index.php?$args;
}
# Default
location ~* ^/(.+)\.md$ {
default_type text/markdown;
try_files /wp-content/uploads/botkibble/$1.md /index.php?$args;
}
`
=== Apache (.htaccess) ===
Add this to your `.htaccess` before the WordPress rules:
`
RewriteEngine On
# Variants
RewriteCond %{DOCUMENT_ROOT}/wp-content/uploads/botkibble/_v/$1/$2.md -f
RewriteRule ^_v/([^/]+)/(.+)\.md$ /wp-content/uploads/botkibble/_v/$1/$2.md [L,T=text/markdown]
# Default
RewriteCond %{DOCUMENT_ROOT}/wp-content/uploads/botkibble/$1.md -f
RewriteRule ^(.*)\.md$ /wp-content/uploads/botkibble/$1.md [L,T=text/markdown]
`
Even without these rules, the plugin uses a "Fast-Path" that serves cached files from PHP before the main database query is executed.
== Installation ==
1. Upload the `botkibble` directory to `wp-content/plugins/`.
2. Run `composer install` inside the plugin directory to install dependencies.
3. Activate the plugin through the **Plugins** menu in WordPress.
4. That's it. No settings page needed.
**Test it:**
curl https://gregr.org/great-hvac-meltdown/?format=markdown
curl https://gregr.org/great-hvac-meltdown.md
curl -H "Accept: text/markdown" https://gregr.org/great-hvac-meltdown/
== Frequently Asked Questions ==
= How do I add support for custom post types? =
The plugin only serves posts and pages by default. To add a custom post type, use the `botkibble_served_post_types` filter in your theme or a custom plugin:
add_filter( 'botkibble_served_post_types', function ( $types ) {
$types[] = 'docs';
return $types;
} );
Be careful. Only add post types that contain public content. Do not expose post types that may contain private or sensitive data (e.g. WooCommerce orders).
= What does the YAML frontmatter include? =
Every response starts with a YAML block containing:
* `title` — the post title
* `date` — publish date in ISO 8601 format
* `type` — post type (e.g. `post`, `page`)
* `word_count` — word count of the Markdown body
* `char_count` — character count of the Markdown body
* `tokens` — estimated token count (word_count × 1.3)
* `categories` — array of category names (posts only)
* `tags` — array of tag names (posts only, omitted if none)
Example:
---
title: My Post Title
date: '2025-06-01T12:00:00+00:00'
type: post
word_count: 842
char_count: 4981
tokens: 1095
categories:
- Technology
tags:
- wordpress
- markdown
---
= How do I add custom fields to the frontmatter? =
Use the `botkibble_frontmatter` filter:
add_filter( 'botkibble_frontmatter', function ( $data, $post ) {
$data['excerpt'] = get_the_excerpt( $post );
return $data;
}, 10, 2 );
= How do I change the Content-Signal header? =
Use the `botkibble_content_signal` filter:
add_filter( 'botkibble_content_signal', function ( $signal, $post ) {
return 'ai-train=no, search=yes, ai-input=yes';
}, 10, 2 );
Return an empty string to omit the header entirely.
= Can I change the token count estimation? =
Yes, use the `botkibble_token_multiplier` filter. The default multiplier of `1.3` (word count × 1.3) comes from [Cloudflare's implementation](https://developers.cloudflare.com/fundamentals/reference/markdown-for-agents/):
add_filter( 'botkibble_token_multiplier', function () {
return 1.5; // Adjusted for a different model's tokenizer
} );
= How do I adjust the rate limit? =
Cache misses (when a post needs to be converted) are limited to 20 per minute globally. You can change this with the `botkibble_regen_rate_limit` filter:
add_filter( 'botkibble_regen_rate_limit', function () {
return 50;
} );
= Can I add custom HTML cleanup rules? =
Yes, use the `botkibble_clean_html` filter. This runs after the default cleanup and before conversion:
add_filter( 'botkibble_clean_html', function ( $html ) {
// Remove a plugin's wrapper divs
$html = preg_replace( '/(.*?)<\/div>/s', '$1', $html );
return $html;
} );
= Can I strip script nodes during conversion? =
Yes. Botkibble keeps converter node removal disabled by default (for backward compatibility), but you can opt in with `botkibble_converter_remove_nodes`:
add_filter( 'botkibble_converter_remove_nodes', function ( $nodes ) {
$nodes = is_array( $nodes ) ? $nodes : [];
$nodes[] = 'script';
return $nodes;
} );
If you also need `application/ld+json`, extract it in `botkibble_clean_html` first, then let converter-level script removal clean up any remaining script tags.
= How do I modify the body before metrics are calculated? =
Use the `botkibble_body` filter. This is the best place to add content like ld+json that you want included in the word count and token estimation:
add_filter( 'botkibble_body', function ( $body, $post ) {
$json_ld = '';
return $body . "\n\n" . $json_ld;
}, 10, 2 );
= How do I modify the final Markdown output? =
Use the `botkibble_output` filter to append or modify the text after conversion:
add_filter( 'botkibble_output', function ( $markdown, $post ) {
return $markdown . "\n\n---\nServed by Botkibble";
}, 10, 2 );
= Can I cache multiple Markdown variants (e.g. a slim version)? =
Yes. Add `?botkibble_variant=slim` when requesting Markdown to generate and serve a separate cached file. To ensure your variants are invalidated on post updates, return them from the `botkibble_cache_variants` filter:
add_filter( 'botkibble_cache_variants', function ( $variants, $post ) {
$variants[] = 'slim';
return $variants;
}, 10, 2 );
= Can I disable the Accept header detection? =
Yes, if you only want to serve Markdown via explicit URLs (.md or ?format=markdown), use the `botkibble_enable_accept_header` filter:
add_filter( 'botkibble_enable_accept_header', '__return_false' );
= Does the .md suffix work with all permalink structures? =
It works with the most common structures (post name, page hierarchy). Complex date-based permalink structures may require the query parameter or Accept header method instead.
= What about password-protected posts? =
They return a `403 Forbidden` response. There's no point serving a password form to a bot.
= What are the response headers? =
* `Content-Type: text/markdown; charset=utf-8`
* `Vary: Accept` — tells caches that responses vary by Accept header
* `X-Markdown-Tokens: ` — estimated token count (word_count × 1.3)
* `X-Robots-Tag: noindex` — prevents search engines from indexing the Markdown version
* `Link: ; rel="canonical"` — points search engines to the original HTML post
* `Link: ; rel="alternate"` — advertises the Markdown version for discovery
* `Content-Signal: ai-train=no, search=yes, ai-input=yes` — see [contentsignals.org](https://contentsignals.org/)
== Credits ==
We thank Cristi Constantin (https://github.com/cristi-constantin) for contributing cache variants, URL and SEO improvements, and fixing important bugs.
== Changelog ==
= 1.3.0 =
* Changed default Content-Signal from ai-train=yes to ai-train=no (opt-out of AI training by default).
* Added botkibble_converter_remove_nodes filter for opt-in HTML node stripping during conversion.
= 1.2.1 =
* Changed cache directory from /uploads/botkibble-cache/ to /uploads/botkibble/ per plugin guidelines.
= 1.2.0 =
* Rebranded to Botkibble to avoid naming ambiguity.
* Prefixed all functions, filters, and constants with botkibble_ for better compatibility.
* Updated symfony/yaml to 7.4.1 (Requires PHP 8.2).
* Corrected all internal references and documentation.
= 1.1.2 =
* Fixed routing issues for posts by implementing a custom botkibble_path resolver.
* Disabled canonical redirects for .md URLs to prevent 301 trailing slash loops.
* Added automatic version-based rewrite rule flushing.
= 1.1.0 =
* Replaced manual YAML encoder with symfony/yaml for security.
* Replaced regex-based shortcode removal with native strip_shortcodes().
* Added token estimation based on 1.3 word count heuristic.
* Replaced transients with static file offloading in /uploads/.
* Added SEO protection with noindex and canonical headers.
* Added "Fast-Path" serving to bypass main DB queries for cached content.
* Added support for direct server offloading (Nginx/Apache).
= 1.0.0 =
* Initial release.
* HTML-to-Markdown conversion via league/html-to-markdown.
* .md suffix, query parameter, and Accept header support.
* YAML frontmatter with title, date, categories, tags.
* Static file caching with automatic invalidation.
* Content-Signal and X-Markdown-Tokens response headers.
* Discovery via alternate link tag.