MongoDB – Text Search

MongoDB has supported text indexes for searching inside string content since version 2.4. Text search tokenizes the indexed string fields, drops language-specific stop words such as a, an, and the, and stems the remaining words so that a query also matches related word forms. At present, MongoDB supports text search for around 15 languages.
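As a rough illustration of the stop-word step only, the sketch below filters a sentence down to the terms worth indexing. It is a toy, not MongoDB's tokenizer: the function name searchableTerms and the six-word stop list are assumptions made for this example, and stemming is left out entirely.

```javascript
// Toy illustration only; MongoDB's real stop-word lists are larger
// and language-specific. This tiny list is an assumption.
const STOP_WORDS = new Set(["a", "an", "the", "on", "and", "or"]);

// Hypothetical helper: lowercase the text, split on non-alphanumeric
// characters, and drop empty strings and stop words.
function searchableTerms(text) {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((word) => word !== "" && !STOP_WORDS.has(word));
}

const terms = searchableTerms("enjoy the mongodb articles on adglob");
console.log(terms); // [ 'enjoy', 'mongodb', 'articles', 'adglob' ]
```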

Enabling Text Search

Text Search was initially an experimental feature, but since version 2.6 it has been enabled by default.

Creating Text Index

Consider the following documents in the posts collection, each containing a post's text and its tags −

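> db.posts.insert([
   { "post_text": "enjoy the mongodb articles on adglob", "tags": ["mongodb", "adglob"] },
   { "post_text": "writing tutorials on mongodb", "tags": ["mongodb", "tutorial"] }
])
BulkWriteResult({
       "writeErrors" : [ ],
       "writeConcernErrors" : [ ],
       "nInserted" : 2,
       "nUpserted" : 0,
       "nMatched" : 0,
       "nModified" : 0,
       "nRemoved" : 0,
       "upserted" : [ ]
})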

We will create a text index on the post_text field so that we can search inside our posts’ text −

> db.posts.createIndex({post_text:"text"})
{
       "createdCollectionAutomatically" : false,
       "numIndexesBefore" : 1,
       "numIndexesAfter" : 2,
       "ok" : 1
}
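A text index can also cover several fields and weight them differently. The sketch below builds the two arguments for createIndex as plain objects; weights and name are standard createIndex options, while the helper buildTextIndex, the index name posts_text_idx, and the weight values are assumptions chosen for illustration.

```javascript
// Hypothetical helper (not part of MongoDB): builds the key document
// and the options document that db.collection.createIndex() accepts
// for a weighted text index.
function buildTextIndex(weights, name) {
  const keys = {};
  for (const field of Object.keys(weights)) {
    keys[field] = "text"; // each indexed field is marked with the string "text"
  }
  return { keys, options: { weights, name } };
}

// Assumed weighting: a match in post_text counts five times as much
// as a match in tags when relevance scores are computed.
const postsIndex = buildTextIndex({ post_text: 5, tags: 1 }, "posts_text_idx");
console.log(JSON.stringify(postsIndex.keys)); // {"post_text":"text","tags":"text"}

// In the mongo shell:
//   db.posts.createIndex(postsIndex.keys, postsIndex.options)
```

A collection can have only one text index, so an existing text index on posts would have to be dropped before creating this one.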

Using Text Index

Now that we have created the text index on the post_text field, we will search for all the posts having the word adglob in their text.

> db.posts.find({$text:{$search:"adglob"}}).pretty()
{
       "_id" : ObjectId("5dd7ce28f1dd4583e7103fe0"),
       "post_text" : "enjoy the mongodb articles on adglob",
       "tags" : [
              "mongodb",
              "adglob"
       ]
}

The command returns the single document whose post_text contains the word adglob. The second post is not returned because its text does not contain that word.
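The $search string accepts more than a single word: space-separated terms are ORed, a phrase wrapped in double quotes must appear as written, and a term prefixed with a hyphen excludes documents that contain it. The sketch below assembles such a string and a relevance-score document; the helper buildTextQuery and its parameter names are assumptions made for this example, not part of MongoDB.

```javascript
// Hypothetical helper: assembles a $text filter from plain terms,
// exact phrases, and excluded terms.
function buildTextQuery({ terms = [], phrases = [], exclude = [] }) {
  const parts = [
    ...terms,                         // plain terms are ORed together
    ...phrases.map((p) => `"${p}"`),  // a quoted phrase is matched as a phrase
    ...exclude.map((t) => `-${t}`),   // a leading hyphen negates the term
  ];
  return { $text: { $search: parts.join(" ") } };
}

const query = buildTextQuery({ terms: ["mongodb"], exclude: ["tutorials"] });

// Used both as a projection and as a sort document to rank by relevance.
const byScore = { score: { $meta: "textScore" } };

console.log(JSON.stringify(query)); // {"$text":{"$search":"mongodb -tutorials"}}

// In the mongo shell:
//   db.posts.find(query, byScore).sort(byScore)
```

With the two sample posts, this query should match only the first one, since the second contains the excluded term.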

Deleting Text Index

To delete an existing text index, first find the name of the index using the following query −

>db.posts.getIndexes()
[
       {
              "v" : 2,
              "key" : {
                     "_id" : 1
              },
              "name" : "_id_",
              "ns" : "mydb.posts"
       },
       {
              "v" : 2,
              "key" : {
                     "_fts" : "text",
                     "_ftsx" : 1
              },
              "name" : "post_text_text",
              "ns" : "mydb.posts",
              "weights" : {
                     "post_text" : 1
              },
              "default_language" : "english",
              "language_override" : "language",
              "textIndexVersion" : 3
       }
]

After getting the name of your index from the above query, run the following command. Here, post_text_text is the name of the index.

>db.posts.dropIndex("post_text_text")
{ "nIndexesWas" : 2, "ok" : 1 }
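Looking up the index name by eye can be replaced with a small lookup over the getIndexes() result. The sketch below is one way to do that, assuming the text index is the only index whose key document contains the value "text"; the helper findTextIndexName and the trimmed sample array are illustrative, not part of MongoDB.

```javascript
// Hypothetical helper: given the array returned by getIndexes(),
// return the name of the text index, or null if there is none.
function findTextIndexName(indexes) {
  const textIndex = indexes.find((idx) =>
    Object.values(idx.key).includes("text") // a text index has a key whose value is "text"
  );
  return textIndex ? textIndex.name : null;
}

// Trimmed sample modeled on the getIndexes() output for the posts collection.
const sampleIndexes = [
  { v: 2, key: { _id: 1 }, name: "_id_" },
  { v: 2, key: { _fts: "text", _ftsx: 1 }, name: "post_text_text" },
];

const indexName = findTextIndexName(sampleIndexes);
console.log(indexName); // post_text_text

// In the mongo shell, check for null before dropping:
//   var name = findTextIndexName(db.posts.getIndexes())
//   if (name !== null) db.posts.dropIndex(name)
```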
