Custom statistics for Anki flashcard reviews

Anki stores a record for each review. This powers the built-in stats page, which is quite limited. Luckily, it’s easy to get the data:

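File -> Export
Choose "Anki Deck Package (.apkg)" as the format, and tick "Include scheduling information" and "Support older Anki versions"
Unzip the resulting .apkg file (it is an ordinary zip archive) to get collection.anki21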
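
The unzip step can also be done from Python. This is a minimal sketch using the standard library; the function name `extract_collection` and the default destination are illustrative, and it assumes the export contains a member named collection.anki21, as used in the sqlite3 command in the next step.

```python
import zipfile
from pathlib import Path


def extract_collection(apkg_path, dest_dir="."):
    """Extract collection.anki21 from an exported .apkg and return its path."""
    with zipfile.ZipFile(apkg_path) as archive:
        # ZipFile.extract() returns the path of the file it wrote.
        return Path(archive.extract("collection.anki21", path=dest_dir))
```

Calling `extract_collection("My Deck.apkg")` (a hypothetical file name) would leave collection.anki21 in the current directory.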

Extract the review log into a CSV:

sqlite3 collection.anki21 -header -csv "SELECT * FROM revlog;" > output.csv
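
If you would rather skip the intermediate CSV, the same query can be run from Python and read straight into a DataFrame. This is a sketch, not the method used in the rest of this post; `load_revlog` is an illustrative name.

```python
import sqlite3
from contextlib import closing

import pandas as pd


def load_revlog(db_path="collection.anki21"):
    """Read the whole revlog table into a DataFrame."""
    # closing() makes sure the connection is closed once the query has run.
    with closing(sqlite3.connect(db_path)) as conn:
        return pd.read_sql_query("SELECT * FROM revlog", conn)
```

With this, `df = load_revlog()` replaces both the sqlite3 command and the CSV loading step.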

Load the CSV into pandas
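
For completeness, loading the file is a single call. A minimal sketch, assuming the CSV was written to output.csv as in the command above; the wrapper function `load_review_csv` is only there to make the snippet reusable.

```python
import pandas as pd


def load_review_csv(path="output.csv"):
    """Load the exported review log; `path` may also be a file-like object."""
    # The header row written by `sqlite3 -header` becomes the column names.
    return pd.read_csv(path)
```

Then `df = load_review_csv()` gives the DataFrame used in the snippets below.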

Add datetime and date columns (each id in revlog is the review's timestamp in milliseconds since the Unix epoch):

df['datetime'] = pd.to_datetime(df['id'], unit='ms')
df['date'] = df['datetime'].dt.date
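
A week column can be derived in the same way. This is one possible sketch, assuming you want each review labelled with the Monday that starts its week; the sample timestamps are made up and stand in for the real revlog data.

```python
import pandas as pd

# Made-up stand-in for the review log: `id` is the review time in epoch milliseconds.
df = pd.DataFrame({"id": [1736812800000, 1736899200000, 1737417600000]})
df["datetime"] = pd.to_datetime(df["id"], unit="ms")

# to_period("W") buckets timestamps into Monday-to-Sunday weeks;
# start_time gives the Monday at 00:00, and .dt.date drops the time part.
df["week"] = df["datetime"].dt.to_period("W").dt.start_time.dt.date
```

Note that `pd.to_datetime(..., unit='ms')` yields UTC times, so a late-night review may land on the neighbouring date or week. If that matters, converting first with `.dt.tz_localize("UTC").dt.tz_convert(...)` and your own time zone is one option.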

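Now you can do things like see the median number of cards per day, seconds per card, etc. for a particular period. Here the period is the last 21 days on which you reviewed, and time is the milliseconds spent on each review: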

(df.groupby('date')['time']
   .agg(['sum', 'count'])
   .assign(minutes=lambda x: round(x['sum'] / 1000 / 60, 0).astype(int))
   .assign(seconds_per_card=lambda x: round(x['sum'] / x['count'] / 1000, 0).astype(int))
   .tail(21)
).median()
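
`tail(21)` always means the most recent days in the data. To look at an explicit date range instead, you can filter before grouping. A sketch, assuming df already has the datetime and date columns added earlier; `daily_summary` and its `start`/`end` parameters are illustrative names.

```python
import pandas as pd


def daily_summary(df, start, end):
    """Per-day totals for reviews with start <= datetime < end."""
    in_period = (df["datetime"] >= pd.Timestamp(start)) & (df["datetime"] < pd.Timestamp(end))
    return (
        df[in_period]
        .groupby("date")["time"]
        .agg(["sum", "count"])
        # `time` is in milliseconds, so convert to minutes and seconds.
        .assign(minutes=lambda x: x["sum"] / 1000 / 60)
        .assign(seconds_per_card=lambda x: x["sum"] / x["count"] / 1000)
    )
```

For example, `daily_summary(df, "2025-01-01", "2025-02-01").median()` would give the medians for January 2025.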

Aggregation

(df.groupby('date')['time']
   .agg(['sum', 'count'])
   .assign(minutes=lambda x: x['sum'] / 1000 / 60)
   .assign(seconds_per_card=lambda x: x['sum'] / x['count'] / 1000)
   .tail(21)
)

Custom percentiles (instead of histogram)

custom_percentiles = [0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95, 0.99]
# 'values' is a placeholder: substitute the column you want to summarise, e.g. 'time'
df['values'].describe(percentiles=custom_percentiles)
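
If you only want the percentiles, without the count, mean and std that `describe` adds, `Series.quantile` accepts a list of fractions too. A small sketch on made-up review times, converted from milliseconds to seconds; the numbers are purely illustrative.

```python
import pandas as pd

# Made-up review durations in milliseconds.
times_ms = pd.Series([2000, 3000, 4000, 5000, 60000], name="time")

# quantile() returns a Series indexed by the requested fractions.
percentiles = (times_ms / 1000).quantile([0.5, 0.9])
```

Here `percentiles.loc[0.5]` is the median review time in seconds.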

Published Jan 14, 2025

British-born Indian, living in San Francisco after 9 years in China. Product Manager (ex-Google, ex-Amazon). Accountant (CIMA). MBA (Oxford). Entrepreneur (co-founder: Oakam, Euristix). Rahim Nathwani on Twitter