Automatic Data Purging with MongoDB TTL
A common database maintenance scenario is the need to purge unneeded or expired data after a period of time, such as log messages or perhaps sensitive user data. MongoDB handles this with a built-in feature: the TTL (time-to-live) index. Once such an index exists, a background task periodically removes documents whose indexed date has passed. This lets us either purge documents at a specific date or give each document a fixed amount of time to live. The background task runs roughly once a minute, so expired documents disappear shortly after they expire rather than at the exact instant.
Step 1: Design Decision
We have a relatively minor design choice to make regarding which field to index and how expiration is expressed. Here are our options:
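- Specify the date on which the document should be deleted (index a field that holds that date and set expireAfterSeconds to 0)
- Specify how long the document should live, measured from the date in the indexed field (set expireAfterSeconds to that lifetime)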
Oftentimes, a collection will have been designed with some audit trail information in mind. There may be fields such as: createdBy, createDate, modifiedBy, modifiedDate. In this example, I already have a collection that contains these four audit fields. Therefore, I’m going to create my TTL index on the createDate field and specify an expireAfterSeconds value. There are a few other rules to keep in mind when making this decision: a TTL index must be a single-field index, the field must hold a date (or an array of dates), and the field may not already have a regular single-field index on it. A document whose indexed field is missing or is not a date will never expire. Lucky for me, my createDate meets the criteria.
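To make the two options concrete, here is a small sketch that models when a document becomes eligible for deletion. It is only an illustration of the rules described above, not MongoDB's own code; the helper name expiresAt and the sample dates are made up for this example.

```javascript
// Illustrative model of TTL expiry rules. expiresAt is a hypothetical helper,
// not part of MongoDB or Mongoose.
function expiresAt(fieldValue, expireAfterSeconds) {
  // An array of dates expires based on the earliest date it contains.
  const dates = (Array.isArray(fieldValue) ? fieldValue : [fieldValue])
    .filter((v) => v instanceof Date && !Number.isNaN(v.getTime()));

  // A missing or non-date field means the document never expires.
  if (dates.length === 0) return null;

  const earliest = Math.min(...dates.map((d) => d.getTime()));
  return new Date(earliest + expireAfterSeconds * 1000);
}

// Option 1: the field holds the deletion date itself, so the offset is 0.
const expireOn = new Date('2024-03-01T00:00:00Z');
const option1 = expiresAt(expireOn, 0);

// Option 2: the field holds a creation date and the document lives 30 days.
const createDate = new Date('2024-01-01T00:00:00Z');
const option2 = expiresAt(createDate, 60 * 60 * 24 * 30);

console.log(option1.toISOString()); // 2024-03-01T00:00:00.000Z
console.log(option2.toISOString()); // 2024-01-31T00:00:00.000Z
```

Keep in mind that the real deletion happens some time after this moment, whenever the background task next runs.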
Step 2: Creating the TTL Index
I’m using Mongoose in my project, so in order to create the index, all I have to do is add the ‘expires’ option to the ‘createDate’ field definition:
var mongoose = require('mongoose');
var Schema = mongoose.Schema;

var doomedDataSchema = new Schema({
  name: String,
  title: String,
  reasonForDoom: String,
  notes: [{ note: String, noteDate: Date }],
  audit: {
    createdBy: String,
    // TTL index: documents expire 30 days (expressed in seconds) after createDate
    createDate: { type: Date, expires: 60*60*24*30 },
    modifiedBy: String,
    modifiedDate: Date
  }
});
My complicated math formula works out to 2,592,000 seconds, so each document will be deleted about thirty days after its createDate. According to the Mongoose API docs on this subject, we don’t need to specify an ugly number; we can substitute an easier-to-read string value, like so:
createDate: { type: Date, expires: '30d'},
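As a quick sanity check that the string and the formula describe the same lifetime, here is a rough sketch of the conversion. I believe Mongoose hands the string to the ms package internally, but treat that as an assumption; durationToSeconds below is a hypothetical stand-in written for this post and only understands a single number followed by s, m, h, or d.

```javascript
// Hypothetical helper, not Mongoose's implementation: converts a short
// duration string such as '30d' into a number of seconds.
const SECONDS_PER_UNIT = { s: 1, m: 60, h: 60 * 60, d: 60 * 60 * 24 };

function durationToSeconds(text) {
  const match = /^(\d+)\s*([smhd])$/.exec(String(text).trim());
  if (!match) {
    throw new Error(`Unsupported duration: ${text}`);
  }
  return Number(match[1]) * SECONDS_PER_UNIT[match[2]];
}

console.log(durationToSeconds('30d'));                       // 2592000
console.log(durationToSeconds('30d') === 60 * 60 * 24 * 30); // true
```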
Conclusion
The MongoDB TTL feature is a great tool to keep in mind when your project requires time-based data purging. This feature saves us the development time and overhead of writing our own, similar services and can be enabled from the Mongo shell, or in my case, using an ODM tool.
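For projects that don’t use an ODM, the same index can be created directly. The sketch below wraps the createIndex(keys, options) call that the Node.js driver exposes on a collection. The helper name ensureTtlIndex, the db variable, and the collection name doomeddatas are assumptions for illustration, so adjust them to your own setup.

```javascript
// Hypothetical helper: creates a TTL index through any object that exposes a
// driver-style createIndex(keys, options) method, such as a Node.js driver
// Collection. It returns whatever createIndex returns (a Promise in the driver).
const THIRTY_DAYS_IN_SECONDS = 60 * 60 * 24 * 30;

function ensureTtlIndex(collection, field, expireAfterSeconds) {
  // TTL indexes are single-field, so exactly one key is indexed.
  return collection.createIndex({ [field]: 1 }, { expireAfterSeconds });
}

// Usage sketch (assumes `db` is a connected driver Db instance and that the
// collection is named "doomeddatas"; both are assumptions):
//   await ensureTtlIndex(db.collection('doomeddatas'), 'audit.createDate',
//                        THIRTY_DAYS_IN_SECONDS);
//
// Rough Mongo shell equivalent:
//   db.doomeddatas.createIndex({ "audit.createDate": 1 },
//                              { expireAfterSeconds: 2592000 })
```

Note the dotted path: because createDate lives inside the audit subdocument, the indexed field is audit.createDate.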