I am running a small Python 3 service, triggered by a systemd timer, that checks some TLS certificates. Over the months its database accumulated over 1.4 million rows, and memory consumption rose to over 750MB per check run.
The initial code looked like this:
import sqlite3

db = sqlite3.connect('database.sqlite3')
cur = db.cursor()
cur.execute('SELECT timestamp, hostname, common_name, alt_names FROM samples ORDER BY id DESC')
rows = cur.fetchall()
for row in rows:
    ...  # do the things
db.close()
The biggest problem was this line: rows = cur.fetchall()
It copied the entire result set into memory at once, and this query effectively selects the whole table. That is how a 253MB sqlite3 file ends up occupying over 750MB in memory: every row becomes a tuple of Python string objects, and each of those objects carries its own header and bookkeeping overhead on top of the raw data.
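A rough way to see that overhead, using a hypothetical row of the same shape (the values below are placeholders, not my actual data):

import sys

# One result row: a tuple of four short strings.
row = ('2024-01-01T00:00:00Z', 'example.com', 'example.com', 'www.example.com')

# Each string is a full Python object with a header, length field and
# cached hash; the tuple adds a pointer per element on top of that.
size = sys.getsizeof(row) + sum(sys.getsizeof(v) for v in row)
print(size)  # a few hundred bytes for well under 100 bytes of payload

Multiply that by 1.4 million rows and the 750MB figure stops being surprising. Iterating the cursor directly, as in the fixed version below, keeps only one row in memory at a time (cur.fetchmany() would be the middle ground if you need batches).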
A small speedup came from changing the isolation_level, since I know that the database will not be accessed by any other program while the check runs.
Using the connection as a context manager (with sqlite3.connect(...) as db:) automatically commits the transaction when the block succeeds and rolls it back when it raises. Note that it does not close the connection itself.
# EXCLUSIVE prevents other database connections from reading the database while the transaction is underway.
with sqlite3.connect('database.sqlite3', isolation_level='EXCLUSIVE') as db:
    cur = db.cursor()
    rows = cur.execute('SELECT timestamp, hostname, common_name, alt_names FROM samples ORDER BY id DESC')
    for row in rows:
        ...  # do the things
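Since the with-block leaves the connection open, contextlib.closing from the standard library can wrap it if you want the close to be automatic as well; a minimal sketch of the same loop:

import sqlite3
from contextlib import closing

# closing() calls db.close() on exit; the inner with-block still handles
# committing or rolling back the transaction.
with closing(sqlite3.connect('database.sqlite3', isolation_level='EXCLUSIVE')) as db:
    with db:
        cur = db.cursor()
        for row in cur.execute('SELECT timestamp, hostname, common_name, alt_names FROM samples ORDER BY id DESC'):
            ...  # do the things

For a short-lived script that exits right after the check this hardly matters, but it keeps the teardown explicit.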
With these tweaks, I was able to reduce the memory footprint from 750MB to 37MB and the runtime from 3.7s to 2.9s 🎉
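In case you want to reproduce such measurements yourself: on Linux, the peak memory of the current process can be read from the standard resource module (a sketch; note that ru_maxrss is reported in kilobytes on Linux but in bytes on macOS):

import resource
import time

start = time.monotonic()
# ... run the check here ...
elapsed = time.monotonic() - start

# Peak resident set size of this process so far.
peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print(f'{elapsed:.1f}s, {peak_kb / 1024:.0f}MB peak')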