Mastodon.social is experiencing issues

Write-up

Incident Report: mastodon.social Outage Affecting Publishing and Community

Date: April 20, 2026
Duration: ~5.5 hours
Severity: Major

Summary

Customers with channels connected to the mastodon.social instance experienced significant disruption to publishing and Community comment features. Posting success rates on mastodon.social dropped sharply, and most requests to ingest comments failed during the incident window. The root cause was an upstream outage on mastodon.social caused by a reported DDoS attack. Service recovered once mastodon.social's team deployed countermeasures.

Root Cause

The incident was caused by a DDoS attack targeting the mastodon.social server specifically, as confirmed on the official mastodon.social status page. Mastodon is a federated network made up of many independent servers, and only mastodon.social was affected. Other Mastodon instances continued to operate normally throughout the incident. Because mastodon.social hosts roughly half of the Mastodon channels connected to Buffer (about 9.3K of 18.4K), the upstream outage had a meaningful impact on Buffer's Mastodon publishing and comment ingestion flows. There were no internal code changes or regressions involved.
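As a rough illustration of how exposure to a single instance can be estimated, the sketch below groups channels by their home server and reports each server's share. It assumes channels are identified by handles of the form user@instance; that representation, the function name, and the sample handles are hypothetical and do not describe Buffer's actual data model.

```python
from collections import Counter


def instance_share(handles):
    """Return {instance: fraction of channels} for handles like 'user@host'.

    Hypothetical helper: assumes the instance is the text after the last '@'.
    """
    # Normalize host names so 'Mastodon.social' and 'mastodon.social' match.
    hosts = [handle.rsplit("@", 1)[1].lower() for handle in handles]
    counts = Counter(hosts)
    total = len(hosts)
    return {host: count / total for host, count in counts.items()}


# Made-up sample handles, for illustration only.
sample = [
    "alice@mastodon.social",
    "bob@Mastodon.social",
    "carol@fosstodon.org",
    "dave@hachyderm.io",
]
shares = instance_share(sample)
```

Applied to the figures above, the same calculation gives 9.3K / 18.4K, or roughly half of connected Mastodon channels on the affected server.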

Customer Impact

A total of 360 posts across 161 unique organizations failed to publish during the incident. Customers with channels on other Mastodon instances were not affected. Publishing success rates on mastodon.social briefly recovered above 90% during the incident but dropped again before fully stabilizing, so affected customers experienced intermittent rather than continuous failures. The incident was communicated via Buffer's public status page and an in-app banner.

Steps to Resolution

The team correlated Buffer's internal publishing metrics with the mastodon.social status page within the first 20 minutes of the incident being declared, confirming the cause was an upstream DDoS attack. Because resolution depended on the mastodon.social team mitigating the attack, the response focused on customer communication and continuous monitoring rather than a code-level fix. Buffer's status page was updated to reflect the upstream cause, an in-app banner kept affected customers informed, and the team tracked publishing success rates throughout the incident window. Once mastodon.social deployed countermeasures and their status page moved to monitoring, Buffer's publishing success rates returned to above 99% and the incident was resolved.

Key Learnings

The federated nature of Mastodon means that an outage on one server only affects a subset of customers. Being able to communicate that clearly (which customers are affected, and which are not) helps customers understand whether they need to take action. We're looking at how to make this distinction even more explicit in future status page updates for federated platforms.
Our visibility into post publishing success rates is strong, but our visibility into comment publishing is less mature. Today we track comment replies through product analytics rather than infrastructure metrics, which makes it harder to assess real-time impact during an incident. We have a follow-up to build the same level of dashboarding for comments as we have for posts, so we can quantify impact across both surfaces from a single source.
Detection again relied on correlating internal metrics with the upstream provider's status page. Automated monitoring of partner status pages and per-platform error rate thresholds (so a platform-specific drop surfaces faster when other integrations are healthy) remain valuable investments and are consistent with learnings from recent third-party incidents.
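
The per-platform threshold idea in the last learning could look something like the sketch below. It is a minimal illustration, not a description of Buffer's monitoring: the class, the function name, the 95% threshold, the minimum-traffic cutoff, and the sample numbers are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class PlatformStats:
    attempts: int
    successes: int

    @property
    def success_rate(self) -> float:
        # Treat a window with no traffic as healthy to avoid false alarms.
        return self.successes / self.attempts if self.attempts else 1.0


def isolated_drops(stats, threshold=0.95, min_attempts=50):
    """Return degraded platforms, but only if at least one other platform
    with enough traffic is still healthy.

    An isolated drop hints at an upstream problem rather than a failure in
    shared infrastructure. Threshold values here are illustrative.
    """
    # Ignore platforms with too little traffic to judge in this window.
    eligible = {name: s for name, s in stats.items() if s.attempts >= min_attempts}
    degraded = sorted(
        name for name, s in eligible.items() if s.success_rate < threshold
    )
    healthy = [name for name in eligible if name not in degraded]
    return degraded if healthy else []


# Made-up numbers for one monitoring window.
window = {
    "mastodon.social": PlatformStats(attempts=200, successes=60),
    "other-mastodon": PlatformStats(attempts=180, successes=179),
    "bluesky": PlatformStats(attempts=500, successes=498),
}
alerts = isolated_drops(window)
```

Keying the statistics by instance rather than by platform family is the design choice that matters for federated networks: a single "mastodon" bucket would have blended the healthy instances with the failing one and softened the signal.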