I am running a small python3 service that checks some TLS certificates from a systemd timer. Over the months the database accumulated over 1.4 million rows, and memory consumption rose to over 750MB per check run.
The initial code looked like this:
```python
db = sqlite3.connect('database.sqlite3')
cur = db.cursor()
rows = cur.execute('SELECT timestamp, hostname, common_name, alt_names FROM samples ORDER BY id DESC')
rows = cur.fetchall()
for row in rows:
    # do the things
db.close()
```
The biggest problem was this line:
```python
rows = cur.fetchall()
```

It copied the entire result set into memory as a list of Python tuples.
Because every row becomes a tuple of Python objects, each with its own per-object overhead, a 253MB sqlite3 file ends up occupying over 750MB in memory.
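The fix is to iterate over the cursor directly: SQLite then hands rows to Python lazily instead of materializing everything up front. A minimal sketch, using a throwaway in-memory database with a simplified `samples` table standing in for the real one:

```python
import sqlite3

# Throwaway in-memory database standing in for database.sqlite3.
db = sqlite3.connect(':memory:')
db.execute('CREATE TABLE samples (id INTEGER PRIMARY KEY, hostname TEXT)')
db.executemany('INSERT INTO samples (hostname) VALUES (?)',
               [('host%d.example.com' % i,) for i in range(1000)])

cur = db.cursor()
cur.execute('SELECT hostname FROM samples ORDER BY id DESC')

# The cursor is an iterator: rows are fetched as the loop advances,
# never all at once, so peak memory stays flat regardless of row count.
count = 0
for (hostname,) in cur:
    count += 1

db.close()
print(count)  # → 1000
```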
A small speedup came from changing the isolation_level, since I know the database will not be accessed by any other program while the check runs.
Note that `with sqlite3.connect(...) as db:` only manages the transaction: it commits on success and rolls back on an exception. Unlike file objects, it does not close the connection, so close() still has to be called explicitly.
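To get the close-on-exit behavior one might expect, `contextlib.closing` can wrap the connection. A sketch with an in-memory database (the table and hostname are illustrative):

```python
import sqlite3
from contextlib import closing

# closing() guarantees db.close() runs when the outer block exits.
with closing(sqlite3.connect(':memory:')) as db:
    with db:  # transaction scope: commits on success, rolls back on error
        db.execute('CREATE TABLE samples (id INTEGER PRIMARY KEY, hostname TEXT)')
        db.execute("INSERT INTO samples (hostname) VALUES ('host1.example.com')")
    rows = list(db.execute('SELECT hostname FROM samples'))

# Outside the outer block the connection really is closed.
try:
    db.execute('SELECT 1')
    closed = False
except sqlite3.ProgrammingError:
    closed = True
print(rows, closed)
```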
```python
# EXCLUSIVE prevents other database connections from reading
# the database while the transaction is underway.
with sqlite3.connect('database.sqlite3', isolation_level='EXCLUSIVE') as db:
    cur = db.cursor()
    rows = cur.execute('SELECT timestamp, hostname, common_name, alt_names FROM samples ORDER BY id DESC')
    for row in rows:
        # do the things
db.close()
```
With these tweaks I was able to reduce the memory footprint from 750MB to 37MB and the runtime from 3.7s to 2.9s 🎉
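Numbers like these are easy to check: Python's tracemalloc can compare the peak allocation of fetchall() against lazy iteration. A sketch with synthetic data (row count and schema are made up for the demo):

```python
import sqlite3
import tracemalloc

db = sqlite3.connect(':memory:')
db.execute('CREATE TABLE samples (id INTEGER PRIMARY KEY, hostname TEXT)')
db.executemany('INSERT INTO samples (hostname) VALUES (?)',
               [('host%d.example.com' % i,) for i in range(50_000)])

def peak(fn):
    """Return the peak traced allocation (bytes) while fn runs."""
    tracemalloc.start()
    fn()
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak_bytes

def eager():
    rows = db.execute('SELECT hostname FROM samples').fetchall()

def lazy():
    for row in db.execute('SELECT hostname FROM samples'):
        pass

peak_eager = peak(eager)
peak_lazy = peak(lazy)
db.close()
# fetchall() holds every row at once; the loop holds one at a time.
print(peak_eager > peak_lazy)
```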