=== Substack Importer ===
Contributors: wordpressdotorg
Tags: importer, substack
Requires at least: 5.2
Tested up to: 6.9
Requires PHP: 7.4
Stable tag: 1.2.0
License: GPLv2 or later
License URI: https://www.gnu.org/licenses/gpl-2.0.html

The Substack Importer allows you to import content from a Substack newsletter into your WordPress site.

== Description ==

The Substack Importer will import content from an export file downloaded from your Substack newsletter.

The following content will be imported:

- Posts and images.
- Podcasts.
- Comments (only for publicly accessible posts).
- Author information.

In the future, we plan to improve the importer by:

- Supporting mailing lists.
- Enhancing the performance of processing export files with many posts and media.

== Installation ==

This plugin depends on the [WordPress Importer](https://wordpress.org/plugins/wordpress-importer) plugin, which needs to be installed first.

To install the Substack Importer:

1. Upload the `substack-importer` directory to the `/wp-content/plugins/` directory.
1. Activate the plugin through the 'Plugins' menu in WordPress.

== Development ==

For running unit tests and contributing to the plugin, see the [README on GitHub](https://github.com/wordpress/substack-importer#development).

Tests can be run with wp-env or with any local WordPress setup paired with a Docker MySQL container. Run `composer install` first, then `vendor/bin/phpunit`.

== Changelog ==

= 1.2.0 =

* Compatibility: the plugin now requires PHP 7.4 or higher.
* Enhancement: added new pre-import options for forcing Draft status, choosing publish date mode, setting the first image as Featured Image, and applying a global Category/Tag.
* Enhancement: improved handling of featured image assignment and post metadata processing during import.
* Enhancement: added `substack_importer_paywall_marker_text` filter to customize paywall marker text.
* Enhancement: added `substack_importer_paywall_content` filter to override paywall block conversion.
* Enhancement: added `substack_importer_post_content_after_conversion` filter to modify content after Gutenberg conversion.
* Enhancement: added `substack_importer_raw_content` filter to modify raw HTML before Gutenberg conversion.
* Enhancement: added `substack_importer_subtitle` filter to customize or skip the subtitle heading.
* Enhancement: added `substack_importer_post_meta` filter to modify post metadata before processing.
* Enhancement: added `substack_importer_converted_node` filter to customize individual block conversions.
* Enhancement: added `substack_importer_image_result` filter to modify image block attributes.
* Enhancement: added `substack_importer_embed_result` filter to modify embed block results after conversion.
* Enhancement: added `substack_importer_pre_embed_conversion` filter to short-circuit embed conversion before default handling.
* Enhancement: added `substack_importer_audio_block` filter to customize the podcast audio block.
* Enhancement: added `substack_importer_before_post` action that fires before each post is processed.
* Enhancement: added `substack_importer_after_post` action that fires after each post is added to the WXR.

= 1.1.2 =

* Enhancement: support captions for images.
* Enhancement: support TikTok embeds.
* Compatibility: the plugin now requires PHP 7.2 or higher.
* Fix: convert preformatted content to verse block.
* Fix: Twitter conversion bug.

= 1.1.1 =

* Tested up to WordPress 6.7.
* Fix: null checking.

= 1.1.0 =

* Update `wxr-generator` to latest version. Fixes a bug where imports could error out due to a malformed timezone identifier.

= 1.0.9 =

* Use subtitle as post excerpt if not empty.
* Testing the plugin up to WordPress 6.4.2.
* Fix PHPCS error and clean up composer.lock.

= 1.0.8 =

* Removed the subscription input from post content.

= 1.0.7 =

* Convert the paywall div to a paragraph.

= 1.0.6 =

* Testing the plugin up to WordPress 6.2.

= 1.0.5 =

* Add support for WordPress 6.1.

= 1.0.4 =

* Fix SoundCloud embeds.

= 1.0.3 =

* Identify authors for draft posts as "Draft Posts".

= 1.0.2 =

* Republishing to fix a CI error.

= 1.0.1 =

* Remove unnecessary load_meta_data line.
* Fix embeds not displaying properly on website.

= 1.0.0 =

* Add post meta for paid content.
* Convert Instagram embed to a link.
* Add the subtitle as an H2 at the beginning of the post.
* Set the correct comment_status for posts.

= 0.1.0 =

* Refactored the importer.
* Add support for authors.
* Add support for comments.
* Conversion of content to Gutenberg blocks.
* Convert the export to WXR and use the WordPress Importer plugin to import the WXR.
* Add progress indicator.
* Add support for attachments.

= 0.1 =

Early proof-of-concept version.

== Hooks ==

The Substack Importer provides filters and actions at key stages of the content conversion pipeline.

= Post-level Filters =

== substack_importer_post_meta ==

Filter the post metadata loaded from the Substack API before it is used for author, comments, and other post data.

Parameters:

* `$post_meta` (array|null) - The post metadata from the Substack API response.
* `$post` (array) - The raw Substack post data from the CSV.
* `$id` (int) - The Substack post ID.

== substack_importer_raw_content ==

Filter the raw HTML content before Gutenberg conversion. Runs after the subtitle has been prepended (if present). Useful for cleaning up Substack-specific HTML, adding custom elements, or stripping unwanted markup.

Parameters:

* `$html_body` (string) - The raw HTML content from the Substack export.
* `$post` (array) - The raw Substack post data from the CSV.
* `$post_meta` (array|null) - The post metadata from the Substack API response.

== substack_importer_subtitle ==

Filter the subtitle HTML before it is prepended to the post content. Return an empty string to skip the subtitle entirely.

Parameters:

* `$heading` (string) - The subtitle HTML (default: an h2 element).
* `$post` (array) - The raw Substack post data.

== substack_importer_post_content_after_conversion ==

Filter the post content after Gutenberg conversion but before it is added to the WXR. Useful for wrapping paywalled content in custom blocks (e.g., membership plugins).

Parameters:

* `$post_content` (string) - The converted Gutenberg block content.
* `$post` (array) - The original Substack post data.
* `$post_meta` (array|null) - Additional post metadata from Substack API.

== substack_importer_post_data ==

Filter the final post data array before it is added to the WXR.

Parameters:

* `$post_data` (array) - The post data.
* `$post` (array) - The original Substack post data.

= Content Conversion Filters =

== substack_importer_converted_node ==

Filter the result of a single node conversion to a Gutenberg block. Allows modification of the block name and attributes. Return a null block_name to skip the node.

Parameters:

* `$block_data` (array) - Array with 'block_name' and 'block_attributes' keys.
* `$node` (DOMElement) - The converted DOM node.
* `$node_name` (string) - The original HTML tag name (e.g. 'p', 'div', 'h2').

== substack_importer_image_result ==

Filter the image node conversion result. Useful for adjusting image sizes, captions, or link destinations.

Parameters:

* `$result` (array) - Array with 'block_attributes' and 'node' keys.
* `$image_data` (array|null) - The decoded image data from the Substack data-attrs attribute.

== substack_importer_pre_embed_conversion ==

Short-circuit the embed node conversion before default handling. Return a non-null array to skip the built-in switch statement entirely.
Useful for handling unsupported embed types or overriding the default conversion for a specific provider.

Parameters:

* `$pre_result` (array|null) - Return non-null to short-circuit. Expected keys: 'node', 'block_attributes', 'block_name'.
* `$node` (DOMElement) - The embed DOM node before conversion.
* `$parent` (DOMElement) - The parent DOM element.
* `$first_class` (string) - The CSS class identifying the embed type (e.g. 'youtube-wrap', 'tweet').

== substack_importer_embed_result ==

Filter the embed node conversion result after the default conversion. Useful for modifying embed URLs, adding custom attributes, or changing how embeds are represented.

Parameters:

* `$output` (array) - Array with 'block_name', 'block_attributes', and 'node' keys.
* `$first_class` (string) - The CSS class identifying the embed type.

== substack_importer_audio_block ==

Filter the Gutenberg audio block HTML for podcast posts.

Parameters:

* `$block` (string) - The Gutenberg audio block HTML.
* `$audio_url` (string) - The URL of the podcast audio file.

= Paywall Filters =

== substack_importer_paywall_marker_text ==

Filter the paywall marker text that appears in the imported content.

Parameters:

* `$marker_text` (string) - The default paywall marker text.
* `$node` (DOMElement) - The paywall node being converted.
* `$parent` (DOMElement) - The parent element.

== substack_importer_paywall_content ==

Filter the entire paywall conversion result. Return a non-null value to override the default conversion.

Parameters:

* `$result` (array|null) - The conversion result, null to use default.
* `$node` (DOMElement) - The paywall node being converted.
* `$parent` (DOMElement) - The parent element.

= Actions =

== substack_importer_before_post ==

Fires before a single Substack post is processed and converted. Useful for setting up state or performing actions before conversion begins.

Parameters:

* `$post` (array) - The raw Substack post data from the CSV.
* `$post_meta` (array|null) - The post metadata from the Substack API response.
* `$id` (int) - The Substack post ID.

== substack_importer_after_post ==

Fires after a single Substack post has been converted and added to the WXR. Useful for logging, progress tracking, or performing cleanup after each post.

Parameters:

* `$post_data` (array) - The final post data that was added to the WXR.
* `$post` (array) - The raw Substack post data from the CSV.
* `$post_meta` (array|null) - The post metadata from the Substack API response.
* `$id` (int) - The Substack post ID.

== Frequently Asked Questions ==

= After about 30 seconds, the import stops and I am seeing a blank screen. What happened? =

When trying to import a large number of posts and images, timeouts can occur. To solve this, you can try to run the import several times until all content has been imported.
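
= How can I use the hooks listed in the Hooks section? =

The hooks are registered with the standard WordPress `add_filter()` and `add_action()` functions, for example from a small custom plugin. The following is a minimal sketch, not an official recipe: the `myplugin_` function names and the replacement marker wording are hypothetical, and the argument counts are taken from the parameter lists in the Hooks section.

```php
<?php
// Hypothetical callbacks: the `myplugin_` prefix and the marker wording
// are illustrative and not part of the Substack Importer.

/**
 * Skip the subtitle heading entirely by returning an empty string.
 */
function myplugin_skip_subtitle( $heading, $post ) {
	return '';
}

/**
 * Replace the default paywall marker text with custom wording.
 */
function myplugin_paywall_marker_text( $marker_text, $node, $parent ) {
	return 'Subscriber-only content starts here';
}

/**
 * Write a log line after each post has been added to the WXR.
 */
function myplugin_log_imported_post( $post_data, $post, $post_meta, $id ) {
	error_log( sprintf( 'Substack post %d added to the WXR.', $id ) );
}

// Register only inside WordPress, where add_filter() and add_action() exist.
// The last argument is the number of parameters each callback accepts.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'substack_importer_subtitle', 'myplugin_skip_subtitle', 10, 2 );
	add_filter( 'substack_importer_paywall_marker_text', 'myplugin_paywall_marker_text', 10, 3 );
	add_action( 'substack_importer_after_post', 'myplugin_log_imported_post', 10, 4 );
}
```

The callbacks must be registered before the import starts, so the code needs to live in an active plugin or theme rather than in a one-off script.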