Calmcode TIL

Matplotlib Memory Leaks

I was running a FastHTML app that uses the fh-matplotlib plugin. It's a neat plugin, but when my app started getting users I noticed that the memory usage was spiking.

[Figure: Memory spikes]
You might be able to see the moment where the memory leak became very clear, as well as the moment when I deployed the fix.

This surprised me a bit initially because the app really was not doing anything fancy. To briefly explain the situation, here is a snippet:

from fh_matplotlib import matplotlib2fasthtml
from fasthtml.common import * 
import numpy as np
import matplotlib.pyplot as plt

app, rt = fast_app()  


count = 0
plotdata = []

@matplotlib2fasthtml
def generate_chart():
    global plotdata
    plt.plot(range(len(plotdata)), plotdata)

The web app is bigger than this snippet, but you can see the gist of it. There is some global data that can change, and there is a function that makes the plot. The matplotlib2fasthtml decorator is a neat way to render matplotlib plots in a FastHTML app so that users do not have to concern themselves with the translation details. You wrap it around a function that uses plt.plot and it makes sure that FastHTML can render the result. Here is what it does internally:

import base64
import io

import matplotlib.pyplot as plt
from fasthtml.common import Img

def matplotlib2fasthtml(func):
    def wrapper(*args, **kwargs):
        # Reset the figure to prevent accumulation. Maybe we need a setting for this?
        fig = plt.figure()

        # Run function as normal
        func(*args, **kwargs)

        # Store it as base64 and put it into an image.
        my_stringIObytes = io.BytesIO()
        plt.savefig(my_stringIObytes, format='jpg')
        my_stringIObytes.seek(0)
        my_base64_jpgData = base64.b64encode(my_stringIObytes.read()).decode()

        # Close the figure to prevent memory leaks
        plt.close(fig)
        plt.close('all')
        return Img(src=f'data:image/jpg;base64, {my_base64_jpgData}')
    return wrapper

The main "trick" is that we turn the figure into a base64 representation by saving the figure into a BytesIO object. All of this works, but the memory was still spiking, even after explicitly closing the figure ... twice?

It turned out the culprit was a setting. This fixes everything:

import matplotlib
matplotlib.use('Agg')

Matplotlib has many different backends to pick from. They come with different assumptions, and most of them are designed to be used interactively by a user. In many IPython settings this makes a whole lot of sense. But the Agg backend is different! It is designed to only write to files non-interactively, which skips the interactive storage of the figures. Adding those two lines to the plugin fixed the memory issue.
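To make the backend switch concrete, here is a minimal sketch that does not depend on FastHTML. The helper name render_png_base64 is hypothetical and not part of the plugin, and it saves a PNG instead of the plugin's JPG purely to keep the example simple. It assumes that matplotlib.use is called early, before any figures are created.

```python
import base64
import io

import matplotlib

matplotlib.use("Agg")  # pick the non-interactive backend before making figures

import matplotlib.pyplot as plt


def render_png_base64(values):
    """Hypothetical helper: plot the values and return the PNG as a base64 string."""
    fig = plt.figure()
    plt.plot(range(len(values)), values)

    # Save into an in-memory buffer instead of a file on disk.
    buf = io.BytesIO()
    fig.savefig(buf, format="png")

    # Still close the figure so pyplot drops its reference to it.
    plt.close(fig)
    return base64.b64encode(buf.getvalue()).decode("ascii")


encoded = render_png_base64([1, 3, 2, 5])
```

The returned string can be dropped into an image tag as a data URI, which is the same idea the decorator uses.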

ps.

There is another alternative to consider. After diving into the docs I found a section that shows a small Flask application that uses the Figure class directly instead of relying on plt.plot. This is a bit more verbose, but the Figure class is designed to save into in-memory buffers, so that should also prevent the memory leak.

This won't work with the fh-matplotlib plugin because I can't assume that the user will actually use the Figure class. But if you're building your own app, this might be a better approach.
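For reference, the Figure-based approach could look roughly like the sketch below. It leaves out the web framework, and the function name figure_to_data_uri is made up for this example. As far as I understand it, a Figure created this way is never registered with pyplot, so there is no global state holding on to it once the function returns.

```python
import base64
from io import BytesIO

from matplotlib.figure import Figure


def figure_to_data_uri(values):
    """Hypothetical helper: render values with a standalone Figure, no pyplot."""
    fig = Figure()
    ax = fig.subplots()
    ax.plot(range(len(values)), values)

    # Save the figure into an in-memory buffer.
    buf = BytesIO()
    fig.savefig(buf, format="png")

    # Encode the PNG bytes so they can be embedded directly in HTML.
    data = base64.b64encode(buf.getbuffer()).decode("ascii")
    return f"data:image/png;base64,{data}"


uri = figure_to_data_uri([1, 3, 2, 5])
```

In your own app you would pass the resulting string as the src of an image element.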

