April 1, 2025 - 7 min read

Managing Mastodon Storage

What you need to know about storage as a Mastodon instance owner.

By Paige Saunders

Running a Mastodon instance can be more resource-intensive than you initially anticipate. On a typical instance, almost all of your storage is used up by caches. These store the images and text of profiles that your users follow.

For instance owners, understanding these caches will save you money, improve performance, help you play better with other instances and even provide your users with better content.

Two Types of Cache

As far as your typical instance administrator is concerned, Mastodon has two different caches which do different things.

A typical Mastodon instance in the FediHost dashboard showing Media and Database usage. As the instance matures, most of the media storage is taken up by the media cache, while most of the database storage is taken up by the content cache.

The Media Cache

The media cache stores attached files such as images (PNGs, JPEGs) and videos. The majority of these are attached to posts, but it also includes the header images of profiles from other servers. When accounts on your server search for and follow profiles on other instances, a copy of the images they share is cached on your server for users to view.

Clearing the media cache on your server does not delete the original files. The media cache simply stores copies to distribute the burden of serving content. If 100 users on your server follow a popular account, your server only has to download the file once for all your users to see it. If the file is not in the cache your server will automatically refetch the file for you from the other server.
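The fetch-once behaviour described above can be sketched in a few lines of Python. This is a toy model, not Mastodon's implementation: MediaCache, fake_remote and the example URL are hypothetical names used only for illustration.

```python
from typing import Callable, Dict


class MediaCache:
    """Toy fetch-on-miss cache keyed by remote file URL."""

    def __init__(self, fetch_remote: Callable[[str], bytes]) -> None:
        self._fetch_remote = fetch_remote
        self._files: Dict[str, bytes] = {}
        self.downloads = 0  # how many times the remote server was contacted

    def get(self, url: str) -> bytes:
        # Download only when there is no local copy yet.
        if url not in self._files:
            self._files[url] = self._fetch_remote(url)
            self.downloads += 1
        return self._files[url]

    def clear(self) -> None:
        # Drops local copies only; the originals stay on the remote server.
        self._files.clear()


def fake_remote(url: str) -> bytes:
    # Hypothetical stand-in for an HTTP request to the other server.
    return f"bytes of {url}".encode()


cache = MediaCache(fake_remote)
url = "https://remote.example/media/header.png"

for _ in range(100):  # 100 local users view the same file
    cache.get(url)
print(cache.downloads)  # prints 1

cache.clear()
cache.get(url)  # the next view triggers a refetch
print(cache.downloads)  # prints 2
```

One hundred views cost a single download, and clearing the cache only means the next view pays for one more.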

The Content Cache 

The content cache stores the text part of a post, including metadata and information about the account that created the post.

This data is stored in a database, which is usually a more expensive form of storage than the media cache. The content cache stores remote posts, and deleting it can lead to broken interactions with other servers. This is because your server no longer knows what your own users' replies, favourites, and bookmarks relate to. With a cleared remote content cache, your server ends up with "one side of the conversation" and no longer has the information needed to piece together the interactions and connect the remote posts with what is in the media cache.

What Fills These Caches?

The biggest culprit for new instance administrators: relays. But let's go through them all.

Profiles

When a user on your server searches for and views an account on a remote server for the first time, the process of filling the cache begins. Profile avatars and headers, as well as the images and videos on pinned posts, go into the media cache.

Because no one on the Consultatron server has searched for and followed Eugen Rochko yet, the cache starts being populated as soon as his profile is searched for and opened. His name and description go into the content cache. His profile picture and header image go into the media cache.

Addressing Profile Caches

Profile data does not typically use a large amount of space. Clearing it from the cache requires you to use the CLI and specifically request the removal of profile items in the media cache.

In FediHost, tootctl media remove is periodically run with the --prune-profiles option, and this can be configured on your service's config page.

User Follows

On a typical mature server, the media and content caches are mostly filled by the day-to-day follows of your user base. We have found that an average follow uses between 50 MB and 100 MB of storage. The exact figure depends entirely on which accounts your users follow.
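As a rough back-of-the-envelope sketch, the 50 MB to 100 MB figure can be turned into a storage range. The function name and the example count of 400 followed accounts are illustrative assumptions, and the sketch assumes each distinct followed account is counted once, since cached data is shared between users.

```python
from typing import Tuple

# Figures from the article: an average follow uses roughly 50-100 MB.
MB_PER_FOLLOW_LOW = 50
MB_PER_FOLLOW_HIGH = 100


def estimate_follow_storage_gb(distinct_followed_accounts: int) -> Tuple[float, float]:
    """Return a (low, high) storage estimate in GB for remote follows."""
    low_mb = distinct_followed_accounts * MB_PER_FOLLOW_LOW
    high_mb = distinct_followed_accounts * MB_PER_FOLLOW_HIGH
    return low_mb / 1000, high_mb / 1000


# Hypothetical instance whose users follow 400 distinct remote accounts.
low, high = estimate_follow_storage_gb(400)
print(f"400 followed accounts: roughly {low:.0f}-{high:.0f} GB")
# prints: 400 followed accounts: roughly 20-40 GB
```

Treat the result as an order-of-magnitude guide only; a handful of very active or video-heavy accounts can move the real number well outside this range.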

Post Frequency

If your users follow accounts that rarely or never post, they will add a very small amount of content and media cache data. Only the basic profile information will stay in the cache. On the other hand, some accounts post dozens or hundreds of times a day, such as major news outlets and bot accounts.

Post Type

If your users follow accounts that mostly post text, they will add to the content cache but not to the media cache. If they follow lots of Pixelfed and bridged Instagram accounts with video clips, images and memes, or accounts that link to news sites with preview images, your media cache will fill up quickly.

Although URLs are text, accounts that post them often fill your media cache with the Open Graph images embedded in most websites' HTML.
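To make the Open Graph point concrete, here is a minimal sketch of how a preview image is found in a page's HTML using only Python's standard library. It is not Mastodon's own link-preview code; the class name, the sample page and the news.example URL are made up for illustration.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class OpenGraphImageParser(HTMLParser):
    """Records the content of the first <meta property="og:image"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.image_url: Optional[str] = None

    def handle_starttag(self, tag: str, attrs: List[Tuple[str, Optional[str]]]) -> None:
        # Only <meta> tags matter, and only the first og:image is kept.
        if tag != "meta" or self.image_url is not None:
            return
        attributes = dict(attrs)
        if attributes.get("property") == "og:image":
            self.image_url = attributes.get("content")


# Hypothetical page that a posted link might point to.
page = """
<html><head>
<title>Example news story</title>
<meta property="og:title" content="Example news story">
<meta property="og:image" content="https://news.example/preview.jpg">
</head><body>...</body></html>
"""

parser = OpenGraphImageParser()
parser.feed(page)
print(parser.image_url)  # prints https://news.example/preview.jpg
```

A post containing nothing but this link is pure text, yet the image named in the og:image tag is what ends up being stored as the preview.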

Addressing User Follows

Sometimes on smaller instances you will want to check which accounts you and your users are following.

High Numbers of Follows

Are any users following an impossibly large number of accounts? It's hard for an average person to stay on top of more than a few hundred accounts. Also check whether you have permanently muted or filtered any accounts while still following them.

Following Reposting Bots

Are any users following bots or automated accounts? Look for accounts that are reposting popular hashtags like #ai or #news. Also check whether any users are following bot accounts that post lots of video, as this uses a much larger amount of the media cache.

Encourage User Follow Efficiency

Make the intent of your server more obvious and focused when taking signups. There are a number of ways to do this during the registration process for new users.

Generally it's good for users on an instance to be active and following lots of things. Users actually help others on an instance find content.

Users of the Mastodon instance QLUB in Quebec help fill the cache with Francophone and Quebec-focused content, which makes the recommendations of this instance helpfully tailored to the userbase.

An instance that is space efficient is probably also a more enjoyable instance to be on. Caches are shared, so if 100 users are following one account, you only have one copy of that data in your cache. If your instance is for people who like #cats but a new user is only following #dogs then they will not be benefitting other users on your server. Most of their follows will be filling the cache rather than making use of previously cached #cat data. This is part of the reason that instances are better if they have a focus, like a specific geography or interest.
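The shared-cache effect can be illustrated with plain set arithmetic. This is a sketch only: the function name, the user names and the account handles are all hypothetical.

```python
from typing import Dict, Set


def newly_cached_accounts(
    existing_follows: Dict[str, Set[str]], new_user_follows: Set[str]
) -> Set[str]:
    """Accounts a new user follows that nobody on the instance follows yet."""
    already_cached: Set[str] = set().union(*existing_follows.values())
    return new_user_follows - already_cached


# Hypothetical cat-focused instance with two existing users.
existing = {
    "alice": {"@cats@a.example", "@kittens@b.example"},
    "bob": {"@cats@a.example"},
}

cat_fan = {"@cats@a.example", "@kittens@b.example", "@tabby@c.example"}
dog_fan = {"@dogs@d.example", "@puppies@e.example", "@corgi@f.example"}

print(len(newly_cached_accounts(existing, cat_fan)))  # prints 1
print(len(newly_cached_accounts(existing, dog_fan)))  # prints 3
```

Both new users follow three accounts, but the cat fan adds only one new account to the cache while the dog fan adds all three.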

Relays

Relays are useful for populating a new Mastodon instance with content, but they are the number one cause of excessive storage usage on new instances. Their "firehose" approach pushes large volumes of data into both the media and content caches.

Relay List contains dozens of general relays

Addressing Relays

If you have "general interest" relays set up, you should remove them. They are almost never the solution to kickstarting an instance.

Media Cache

Once you remove a relay, media storage will go down automatically, as if you had never followed a relay in the first place. With default settings this will occur within several weeks, but it can be sped up by changing your content retention settings.

Content Cache

The content cache is a larger problem because it is harder to clear. First, a computationally heavy command, tootctl statuses remove, must be run on the database. This calculates which posts in the cache have not been interacted with by users on your instance. Following that, the website must be taken offline while a database vacuum is run to permanently delete the records.
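For the first step, a small helper can assemble the command line so you can review it before running it yourself. This sketch only builds and prints the command; it does not execute anything. The --days option sets how old a post must be before it is considered, and the value 90 here is an illustrative choice, not a recommendation from the article. The vacuum step is left out because how it is run depends on your database setup.

```python
import shlex
from typing import List


def statuses_remove_command(days: int) -> List[str]:
    """Build the argument list for `tootctl statuses remove`."""
    if days < 1:
        raise ValueError("days must be at least 1")
    return ["tootctl", "statuses", "remove", "--days", str(days)]


# Print the command for review instead of running it.
print(shlex.join(statuses_remove_command(90)))
# prints: tootctl statuses remove --days 90
```

Printing first is deliberate: the command is heavy on the database, so it is worth checking the cutoff and scheduling the run for a quiet period.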

Hashtag Relays

General relays should be replaced with specific relays like fedi.buzz that filter posts by interest before they reach your server.

Generating and using a hashtag relay

No Relays At All

Instead of relying heavily on relays, consider manually following accounts and hashtags relevant to your interests. Encourage users on your instance to do the same, effectively crowdsourcing content and creating a more tailored experience.

Content Retention Settings

Administration > Server Settings > Content Retention

If you are an admin user you have several settings available that help manage storage.

The Media Cache Retention Period

The media cache retention period determines how long media files, such as images and videos, are stored on your instance. The minimum recommended value for this is 14 days, which may be excessive for solo instances if you check your feed on a daily basis. Reducing this period can significantly decrease storage usage without affecting the availability of the original content, and copies can be refetched if needed.
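The effect of shortening the retention period can be sketched as a simple cutoff calculation. This is an assumption-laden model rather than Mastodon's actual cleanup logic: it assumes a file's age is measured from when it was cached, and the function and file names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def expired_media(
    cached_at: Dict[str, datetime], retention_days: int, now: datetime
) -> List[str]:
    """Return names of cached files older than the retention period."""
    cutoff = now - timedelta(days=retention_days)
    return sorted(name for name, ts in cached_at.items() if ts < cutoff)


now = datetime(2025, 4, 1, tzinfo=timezone.utc)
cached_at = {
    "a.png": now - timedelta(days=3),
    "b.mp4": now - timedelta(days=10),
    "c.jpg": now - timedelta(days=20),
}

print(expired_media(cached_at, 14, now))  # prints ['c.jpg']
print(expired_media(cached_at, 7, now))   # prints ['b.mp4', 'c.jpg']
```

Moving from 14 days to 7 days doubles the number of files eligible for removal in this toy data set, and any of them can still be refetched from the origin server if a user views them again.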

User Archive Retention Period

The user archive retention period controls how long user-generated backup archives of posts and media are stored. While generally less of a concern for storage issues, large archives can temporarily consume storage space when users request them. This setting becomes more relevant if you anticipate users downloading extensive archives of their past posts. However, this is typically a transient situation and less of a primary storage concern compared to media caching.

Remote Content Retention Period

The remote content retention period should almost never be set. Entering a value in this field leaves users with only "their side of a conversation" because remote posts that they are interacting with will be removed. Private messages and threads with users on other instances will look like a conversation with themselves.

The @ceo account is on a normal server. The @Consultatron account is on a server where remote content has been cleared. It has no record of the reply that was sent to it by @ceo.

Modifying this setting should only be done with advanced knowledge and a clear understanding of the risks involved. If you have a new instance and have accidentally filled the content cache with large volumes of relay data, that is one of the few times it could make sense to change this setting.

If you are on FediHost, please reach out to support and we will help fix issues with remote content on your instance.

Conclusion

Effectively managing storage on a Mastodon instance involves understanding what the caches do and what they hold. It takes a while for the relationship between these things to sink in, so consider watching a few of our videos on the topic and give yourself some time to figure it out.
