More and more organizations are using ElasticSearch alongside MongoDB. To understand how the two can work together, you first need to understand what each one is and what features it offers.
MongoDB and its Features
- MongoDB is a document-oriented database. It is classified as a NoSQL database because it stores data as BSON (JSON-like documents with dynamic schemas) instead of the relational, table-based structures common in traditional relational database management systems (RDBMSs). BSON makes integration of data in some applications faster and easier.
- The term ‘document-oriented’, in this context, means that instead of breaking a business subject into several relational structures, MongoDB stores it in the fewest possible documents. As an example, a single document with the name ‘Book’ can be used to store title, author, and other relevant information instead of having several distinct relational structures.
- You can index any field in a MongoDB document. These indices work much like those in RDBMSs, and secondary indices are supported as well.
- MongoDB’s replica sets give you two or more copies of your data, which gives you high availability. Each member of a replica set can act in the primary or the secondary role at any time. The member in the primary role handles all reads and writes by default, while members in the secondary role use built-in replication to maintain a copy of the primary’s data.
- MongoDB has a sharding feature for horizontal scaling. You choose a shard key, which determines how data is distributed and load is balanced across the shards. MongoDB can run over several servers, both to spread the load and to duplicate data so the system stays up in case of hardware failure. Configuration can be automatic, and new machines are easy to add to a running DB.
- GridFS is a MongoDB function that allows you to use MongoDB as a file system. GridFS divides your files into chunks and stores each chunk as a separate document.
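To make the GridFS chunking idea concrete, here is a minimal in-memory sketch in Python. It does not talk to MongoDB at all. The function names `split_into_chunks` and `reassemble` are hypothetical, and only the chunk fields `files_id`, `n`, and `data` and the 255 KiB default chunk size mirror what GridFS documents.

```python
from typing import Dict, List

# 255 KiB is the default chunk size documented for GridFS.
DEFAULT_CHUNK_SIZE = 255 * 1024


def split_into_chunks(file_id: str, payload: bytes,
                      chunk_size: int = DEFAULT_CHUNK_SIZE) -> List[Dict]:
    """Split a payload into GridFS-style chunk documents (in memory only)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = []
    for n, start in enumerate(range(0, len(payload), chunk_size)):
        chunks.append({
            "files_id": file_id,  # links the chunk to its parent file
            "n": n,               # position of the chunk within the file
            "data": payload[start:start + chunk_size],
        })
    return chunks


def reassemble(chunks: List[Dict]) -> bytes:
    """Rebuild the original payload by joining chunks in order of n."""
    ordered = sorted(chunks, key=lambda chunk: chunk["n"])
    return b"".join(chunk["data"] for chunk in ordered)
```

Calling `split_into_chunks("cover.png", data)` on a 600,000-byte payload yields three chunk documents, and `reassemble` restores the original bytes regardless of the order in which the chunks are read back.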
ElasticSearch and its Features
ElasticSearch is a cross-platform search server based on Lucene. It is written in Java and is open-source software. At the time of writing, it ranks as the second most popular enterprise search engine.
- ElasticSearch is used primarily as a distributed, multitenant-capable full-text search engine that stores schema-free JSON documents and exposes a RESTful web interface. The term ‘distributed’, in this context, means you can divide indices into shards. Each shard can have zero or more replicas, and each node hosts one or more shards.
- ElasticSearch offers scalable search in near real time, and it supports multi-tenancy.
- Each node acts as a coordinator that delegates operations to the correct shard or shards. ElasticSearch handles routing and rebalancing between nodes automatically.
- ElasticSearch has a ‘gateway’ feature that handles long-term index persistence. For example, you can recover an index from the gateway if your server crashes.
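The routing idea in the list above can be sketched in a few lines of Python. ElasticSearch documents its routing as a hash of the routing value (by default the document ID) taken modulo the number of primary shards. The sketch below follows that shape but uses CRC32 as a stand-in hash, which is an assumption made for illustration and not the hash ElasticSearch uses. The function names are hypothetical.

```python
import zlib
from typing import Dict, Iterable, List


def pick_shard(routing_key: str, num_primary_shards: int) -> int:
    """Map a routing key (by default the document ID) to a shard number."""
    if num_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    # Stand-in hash: CRC32 ships with Python and is stable across runs.
    # It is NOT the hash function ElasticSearch itself uses.
    digest = zlib.crc32(routing_key.encode("utf-8"))
    return digest % num_primary_shards


def distribute(doc_ids: Iterable[str],
               num_primary_shards: int) -> Dict[int, List[str]]:
    """Group document IDs by the shard each one would be routed to."""
    shards: Dict[int, List[str]] = {n: [] for n in range(num_primary_shards)}
    for doc_id in doc_ids:
        shards[pick_shard(doc_id, num_primary_shards)].append(doc_id)
    return shards
```

Because the shard number depends on the shard count, the same document ID can land on a different shard if the count changes, which is one reason the number of primary shards is normally fixed when an index is created.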
Other notable features are support for real-time GET requests (which makes ElasticSearch usable as a NoSQL data store), faceting, and percolation. Percolation is useful because it tells you whenever a new document matches a registered query.
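Percolation turns the usual search around: queries are stored first, and each incoming document is checked against them. The toy Python class below illustrates only that idea. `ToyPercolator` is a hypothetical name, a "query" here is simply a set of terms that must all appear in the document, and none of this reflects how ElasticSearch implements the feature.

```python
from typing import Dict, List, Set


class ToyPercolator:
    """Stores queries up front, then reports which ones a document matches."""

    def __init__(self) -> None:
        self._queries: Dict[str, Set[str]] = {}

    def register(self, query_id: str, required_terms: List[str]) -> None:
        # A "query" in this sketch is a set of terms that must all be present.
        self._queries[query_id] = {term.lower() for term in required_terms}

    def percolate(self, text: str) -> List[str]:
        """Return the IDs of registered queries matched by the document text."""
        tokens = set(text.lower().split())
        return sorted(query_id
                      for query_id, terms in self._queries.items()
                      if terms <= tokens)
```

A typical use is alerting: register one query per subscriber, run `percolate` on each new document, and notify whoever owns the returned query IDs.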
Advantages of Using ElasticSearch alongside MongoDB
- If you need more than five indexes on MongoDB, consider using ElasticSearch, because the search engine will give you faster results. Dealing with large indexes is difficult and time-consuming for MongoDB.
- You may be asking yourself, why not use ElasticSearch as the main DB? MongoDB is the faster option if you only have a few indexes. You should, therefore, tune MongoDB for minimal indexes, and it will outperform ElasticSearch.
- Another reason to keep using MongoDB is that ElasticSearch sometimes loses write operations while its cluster is splitting and re-forming. This common problem comes from ElasticSearch’s search engine roots, and it means ElasticSearch is not the ideal option for your main DB.
- The best option is to install drivers for both MongoDB and ElasticSearch and to use both when writing your application. Other options are to use a bilingual driver, a good example being Mongoosastic, or to go the Compose way and let Compose’s Transporter application keep the data stored in ElasticSearch up to date.
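The "use both" approach from the list above can be sketched as a small repository class that writes to the main database first and to the search index second. Everything in this Python example is a hypothetical stand-in: `PrimaryStore` plays the role of a MongoDB collection and `SearchIndex` the role of an ElasticSearch index, both held in memory, so the sketch shows the write and read paths rather than any real driver API.

```python
from typing import Dict, List, Set


class PrimaryStore:
    """Hypothetical stand-in for the MongoDB collection (system of record)."""

    def __init__(self) -> None:
        self.docs: Dict[str, dict] = {}

    def save(self, doc_id: str, doc: dict) -> None:
        self.docs[doc_id] = dict(doc)

    def get(self, doc_id: str) -> dict:
        return self.docs[doc_id]


class SearchIndex:
    """Hypothetical stand-in for the ElasticSearch index (token -> doc IDs)."""

    def __init__(self) -> None:
        self.postings: Dict[str, Set[str]] = {}

    def index(self, doc_id: str, text: str) -> None:
        for token in text.lower().split():
            self.postings.setdefault(token, set()).add(doc_id)

    def search(self, term: str) -> List[str]:
        return sorted(self.postings.get(term.lower(), set()))


class BookRepository:
    """Writes go to the primary store first, then to the search index."""

    def __init__(self, store: PrimaryStore, index: SearchIndex) -> None:
        self.store = store
        self.index = index

    def add_book(self, doc_id: str, title: str, author: str) -> None:
        self.store.save(doc_id, {"title": title, "author": author})
        # If this second write fails, the primary copy still exists and
        # the index can be rebuilt from it later.
        self.index.index(doc_id, f"{title} {author}")

    def find(self, term: str) -> List[dict]:
        # Search returns IDs only; full documents come from the primary store.
        return [self.store.get(doc_id) for doc_id in self.index.search(term)]
```

Keeping MongoDB as the system of record means a lost or stale index entry costs a search hit, not the data itself, which matches the reasoning given above for not using ElasticSearch as the main DB.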
How to Set up ElasticSearch with MongoDB
It is possible to index MongoDB data using ElasticSearch. This involves installing an ElasticSearch plugin.
- Once you have configured MongoDB as a standalone instance, the next step is to convert it to a replica set. This is because the ElasticSearch plugin updates ElasticSearch by reading the MongoDB oplog (operations log), the record of every change that MongoDB uses for its own replication, and a standalone instance does not keep one. MongoDB does not support built-in triggers, so this is the best alternative.
- The next step is to install the Java Runtime Environment (JRE), which ElasticSearch needs in order to run, and then ElasticSearch itself. After these two installations, install a service wrapper.
- Ensure ElasticSearch is working. You can do this by sending HTTP requests to localhost:9200. After you have established that ElasticSearch is running, you need to install two plugins.
- The first of these plugins is Mapper Attachments, a dependency. There is an ElasticSearch plugin script to do this. The other plugin is the ES ‘river’ for MongoDB. This is a third-party plugin, meaning its installation syntax differs from that of Mapper Attachments. After installation, restart the ElasticSearch service.
- The next step is ElasticSearch configuration. Note that search index setup is data-specific, but the default ElasticSearch analyzers will fit most data types. Index management is done through a RESTful interface: you send JSON payloads telling ElasticSearch which data to watch for.
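To show why the plugin needs the oplog, here is a rough Python sketch of replaying oplog-style entries into a search index. It is not the river plugin's code. The entries are simplified to the fields `op` (`i`, `u`, `d`), `ns`, `o`, and `o2`, real oplog entries carry more fields and their update format varies between MongoDB versions, and the "index" is just a dictionary. The function name `apply_oplog` is hypothetical.

```python
from typing import Dict, Iterable


def apply_oplog(entries: Iterable[dict], namespace: str,
                index: Dict[str, dict]) -> Dict[str, dict]:
    """Replay insert/update/delete entries for one namespace into an index."""
    for entry in entries:
        if entry.get("ns") != namespace:
            continue  # change belongs to another database.collection
        op = entry["op"]
        if op == "i":                  # insert: "o" is the new document
            doc = dict(entry["o"])
            index[str(doc.pop("_id"))] = doc
        elif op == "u":                # update: "o2" identifies the target
            doc_id = str(entry["o2"]["_id"])
            change = entry["o"]
            if "$set" in change:       # partial update of selected fields
                index.setdefault(doc_id, {}).update(change["$set"])
            else:                      # full-document replacement
                index[doc_id] = {k: v for k, v in change.items()
                                 if k != "_id"}
        elif op == "d":                # delete: "o" holds the _id
            index.pop(str(entry["o"]["_id"]), None)
        # Other entry types (no-ops, commands) are ignored in this sketch.
    return index
```

Since every write to the replica set shows up in the oplog, a process that tails it can keep the search index current without the application having to notify ElasticSearch itself.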
[Author: Jenny Richards is a well-known database administrator. She argues that, despite popular belief, even the best database will not cover all your database needs. Remote DBA Support uses ElasticSearch to optimize MongoDB queries.]
– NextBigWhat invites geeks to share useful tech notes with the audience. Feel free to get in touch: email@example.com