Saturday, August 31, 2013

MongoDB: GridFS remove method deletes all files in bucket

Some time ago we ran into strange behaviour in MongoDB's GridFS, which led me to file a bug ticket for the MongoDB Java driver.

Today I found the link to the bug ticket in my browser bookmarks. The ticket is still unresolved, so I thought it would be worth a short blog post in case someone else runs into this problem.

Let's look at the following simplified Java service:
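import java.net.UnknownHostException;

import com.mongodb.DB;
import com.mongodb.DBAddress;
import com.mongodb.Mongo;
import com.mongodb.gridfs.GridFS;
import com.mongodb.gridfs.GridFSDBFile;
import org.bson.types.ObjectId;

public class GridFsService {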

  private GridFS gridFs;
  
  public void connect(String mongoDbHost, String databaseName) throws UnknownHostException {
    DB db = Mongo.connect(new DBAddress(mongoDbHost, databaseName));
    this.gridFs = new GridFS(db, "myBucket");
  }
  
  public void removeGridFsFile(String id) {
    GridFSDBFile file = this.gridFs.findOne(new ObjectId(id));
    this.gridFs.remove(file);
  }
  
  // .. other methods to create and update files
}
This service uses the MongoDB Java driver to create, update and remove files from GridFS.
However, there is a serious flaw in the removeGridFsFile() method.
Guess what happens if an invalid id is passed to removeGridFsFile()?
gridFs.findOne() returns null for non-existent ids, so null is passed to gridFs.remove(), which then removes all files in the current bucket.

Fixing this is easy: either add a null check or use the GridFS remove() method that takes an ObjectId instead of a GridFSDBFile:
public void removeGridFsFile(String id) {
  this.gridFs.remove(new ObjectId(id));
}
With this version everything works fine if an invalid id is passed to removeGridFsFile(): no file is removed.
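
The null-check variant mentioned above could look roughly like the following sketch. To keep it runnable without a MongoDB server, it uses a hypothetical in-memory stand-in (FakeBucket) that only imitates the behaviour described in this post: findOne() returns null for unknown ids and remove(null) deletes everything. Neither FakeBucket nor NullCheckSketch is part of the MongoDB Java driver; with the real driver the guard would be the same null check placed around gridFs.remove(file).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: FakeBucket is NOT the driver's GridFS class. It only
// imitates the behaviour described in this post.
final class NullCheckSketch {

  static final class FakeBucket {
    private final Map<String, String> files = new HashMap<>();

    void store(String id, String content) {
      files.put(id, content);
    }

    // Like gridFs.findOne(): returns null if no file has the given id.
    String findOne(String id) {
      return files.containsKey(id) ? id : null;
    }

    // Like gridFs.remove(DBObject) as observed in the post: a null argument
    // matches every file in the bucket.
    void remove(String fileHandle) {
      if (fileHandle == null) {
        files.clear();
      } else {
        files.remove(fileHandle);
      }
    }

    int size() {
      return files.size();
    }
  }

  // Guarded version of removeGridFsFile(): only remove what was actually found.
  static boolean removeFile(FakeBucket bucket, String id) {
    String file = bucket.findOne(id);
    if (file == null) {
      return false; // unknown id: leave the bucket untouched
    }
    bucket.remove(file);
    return true;
  }

  public static void main(String[] args) {
    FakeBucket bucket = new FakeBucket();
    bucket.store("a", "first file");
    bucket.store("b", "second file");

    System.out.println(removeFile(bucket, "missing")); // false
    System.out.println(bucket.size());                 // 2
    System.out.println(removeFile(bucket, "a"));       // true
    System.out.println(bucket.size());                 // 1
  }
}
```

Returning a boolean is just one possible design choice; it lets the caller distinguish "file removed" from "no such file" without throwing.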

To make sure this won't happen again I tested what happens if null is passed to any of the three different remove() methods:
gridFs.remove((String)null);      // nothing happens
gridFs.remove((ObjectId)null);    // nothing happens
gridFs.remove((DBObject)null);    // all files from bucket are removed
I don't know if this is intended behaviour. The Javadoc comment for gridFs.remove(DBObject query) says that it removes all files matching the given query. However, if it is intended, I think the Javadoc comment should clearly state that passing null removes all files in the bucket.
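
As long as the behaviour is undocumented, one defensive option is to route query-based removals through a small helper that rejects null before the call reaches the driver. The following is only a sketch under assumptions: SafeRemoveSketch and SafeRemover are hypothetical names that do not exist in the driver, and the Consumer stands in for a call such as query -> gridFs.remove(query).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.function.Consumer;

// Hypothetical helper, not part of the MongoDB Java driver.
final class SafeRemoveSketch {

  // Wraps any "remove by query" operation and refuses null queries.
  static final class SafeRemover<Q> {
    private final Consumer<Q> delegate;

    SafeRemover(Consumer<Q> delegate) {
      this.delegate = Objects.requireNonNull(delegate, "delegate");
    }

    void remove(Q query) {
      if (query == null) {
        // Fail fast instead of letting null act as a match-all query.
        throw new IllegalArgumentException("query must not be null");
      }
      delegate.accept(query);
    }
  }

  public static void main(String[] args) {
    // The list records which queries were forwarded; it stands in for the
    // real remove call.
    List<String> forwarded = new ArrayList<>();
    SafeRemover<String> remover = new SafeRemover<>(forwarded::add);

    remover.remove("{ filename: 'report.pdf' }");
    System.out.println(forwarded.size()); // 1

    try {
      remover.remove(null);
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage()); // rejected: query must not be null
    }
    System.out.println(forwarded.size()); // 1
  }
}
```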


1 comment:

  1. You didn't mention the driver version you were using.

    I think it was intended, as you use the remove(query) method, which is equivalent to:
    find(query) then for each result remove(_id)
    With null parameter it's like a find({}) which matches everything.
    Thus it'll end up deleting the whole bucket.

    Anyway, in the latest driver a null argument is forbidden for a remove on GridFS.
    See : https://github.com/mongodb/mongo-java-driver/blob/master/src/main/com/mongodb/gridfs/GridFS.java
