=== Crawler Record ===
Contributors: dizzysoft
Tags: googlebot, bingbot, gptbot, seo, robots
Requires at least: 6.0
Tested up to: 6.9
Requires PHP: 7.4
Stable tag: 0.9.3
License: GPLv2 or later
License URI: https://www.gnu.org/licenses/gpl-2.0.html

Crawler Record shows you the last time each of the most common search and AI chat bots (Google, ChatGPT, etc.) visited your site, and which pages they looked at.

== Description ==

**Crawler Record** logs the last time specific user agents (for search and AI chat/LLMs) accessed your content, including:

- Google
- Bing
- ChatGPT (OpenAI)
- Claude (Anthropic)
- Perplexity
- DuckDuckGo
- Meta
- Apple

You can view this information in several places:

- **Admin page**: a grouped list of all crawlers, with last-seen time, last page, and robots status for each crawler.
- **Admin Bar (front-end)**: quick-glance status for the current singular or URL context (no dropdowns in wp-admin).
- **Per Page/Post**: from the edit screen of any page or post, you can see the last time each crawler visited that page.
- **Recent Pages**: from the admin page, you can select a crawler and get a list of the recent pages that user agent has visited.

**Robots-aware:** The plugin checks your **robots.txt** and evaluates **Allow/Disallow** rules for a given path. If **Settings → Reading → “Discourage search engines”** is enabled, all agents are shown as blocked, with a prominent warning.

**Performance-friendly by design:** Write throttling (default 10 minutes) and an auxiliary “last post ID per agent” record avoid heavy admin queries on large sites.

**Privacy-friendly:** Saves only bot visit timestamps and the last URLs crawled; no personal data.

Learn [**how to use this plugin**](https://www.dizzysoft.com/crawler-record-plugin-for-wordpress/).

= Highlights =

* Supports common user-agent variants for Google, Bing, OpenAI (ChatGPT), Anthropic (Claude), Perplexity, Meta, Apple, and DuckDuckGo.
* Robots status computed from the local robots.txt (physical or virtual), with no outbound requests.
* Clear UI with grouped sections, microsecond timestamps, and a small diagnostics toggle showing the matched robots group and rule.

= Known limitations =

* The plugin can only track crawlers from the time it is installed; it cannot look into the past.
* robots.txt **wildcards** (`*`) and the end-of-line marker (`$`) are **not** interpreted; matching is prefix-based only. Future versions may add full spec support.
* To reduce database load and avoid slowing the website, bot tracking is throttled, so not every crawler visit will necessarily be recorded.

== Installation ==

1. Upload the plugin folder to `/wp-content/plugins/`, or install via the admin Plugins screen.
2. Activate the plugin.
3. Visit **Crawler Record** under **Admin → Crawler Record** to review crawler activity.

== Frequently Asked Questions ==

= Why are there no crawler visits recorded? =

The plugin can only begin tracking crawler/bot visits from the time you install it, not before. It may take several days or weeks (depending on the popularity of your website) before any bots come for a visit.

= Why do I see “Blocked by WordPress setting”? =

**Settings → Reading → Discourage search engines** is enabled, so none of these systems are allowed to read the pages on your site.

== Changelog ==

= 0.9.2 =
* Added Recent Pages history for each crawler.
* Added support for additional crawler variants, including Google-Agent (mobile and desktop), Google-Extended, Bing (Search, Chat, and Copilot), Claude-SearchBot, Perplexity-User, DuckDuckGo AI, Applebot-Extended, and Meta-ExternalFetcher.
* Improved crawler tracking for non-post frontend URLs such as archives, taxonomy pages, and other site URLs.
* Improved site-wide reporting so the latest page visited by each crawler is more accurate.
* Improved admin bar reporting for both singular content and non-singular frontend URLs.
* Improved robots.txt reporting with clearer diagnostics showing the matched group and rule.
* Added informational robots status handling for agents that may ignore or bypass robots.txt, such as Google-Agent and Meta-ExternalFetcher.
* Improved handling of the WordPress “Discourage search engines” setting, with clearer blocked-status warnings in the admin interface.
* Improved storage of recent crawler activity, with a bounded recent-page history per agent.

= 0.9.1 =
* Fixed an error on WP archive pages (pages that list posts).

= 0.9.0 =
* Updated for WordPress 6.9.
* Now monitoring for Meta and Apple user agents.
* More accurate site-wide UA reporting.
* Ensured the video tutorial appears on all admin screens.
* Fixed small code errors.

= 0.8.0 =
* Google updated its user agents, so the matching strings were updated to account for these changes.

= 0.7.0 =
* The robots.txt checker wasn't actually working. It does now.
* In the admin-section report, the robots checker looks for a site-wide rule; on the back end of a page, it looks at that particular page.
* If a page is blocked by the robots.txt file, a link appears sending you to the robots.txt file.
* Added a video explaining how to use this plugin.

= 0.6.0 =
* Clarified the distinction between Googlebots.
* Better distinguishes Bingbots.
* Cosmetic changes to the admin-section page.
* Clearer documentation.

= 0.5.0 =
* First public release.

== Privacy ==

This plugin stores:

- **Timestamps** of crawler visits (float, with microseconds)
- **Last URL** seen per crawler (per-URL records)
- **Last post ID** per crawler (for admin performance)

It does **not** collect or store personal data about site visitors. No data is transmitted to third parties.

== License ==

GPLv2 or later. See the LICENSE file.
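
== Technical notes ==

For the curious, the prefix-only Allow/Disallow matching described under Known limitations can be sketched as follows. This is an illustrative Python model, not the plugin's actual PHP implementation; the `is_allowed` helper, its rule format, and the longest-match/Allow tie-break are assumptions made for the sketch.

```python
# Illustrative model of prefix-only robots.txt matching: wildcards (*) and
# the $ end marker are ignored, and rules match as plain path prefixes.

def is_allowed(path, rules):
    """rules: list of (directive, prefix) pairs, e.g. ("Disallow", "/wp-admin/")."""
    best_len = -1
    allowed = True  # a path with no matching rule is allowed
    for directive, prefix in rules:
        if not prefix:
            continue  # an empty Disallow value matches nothing
        if path.startswith(prefix):
            # Longest matching prefix wins; on a tie, Allow wins (common convention).
            if len(prefix) > best_len or (len(prefix) == best_len and directive == "Allow"):
                best_len = len(prefix)
                allowed = (directive == "Allow")
    return allowed

rules = [("Disallow", "/private/"), ("Allow", "/private/press/")]
print(is_allowed("/private/press/launch", rules))  # True: the longer Allow wins
print(is_allowed("/private/notes", rules))         # False: the Disallow applies
print(is_allowed("/blog/post", rules))             # True: no rule matches
```

Because matching is prefix-based, a rule like `Disallow: /*.pdf$` would be treated as the literal prefix `/*.pdf$` and would effectively match nothing.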