MongoDB for Data Storage
MongoDB is one of several database types to arise in the mid-2000s under the NoSQL banner. Instead of using tables and rows as in relational databases, MongoDB is built on an architecture of collections and documents. Documents comprise sets of key-value pairs and are the basic unit of data in MongoDB. Collections contain sets of documents and function as the equivalent of relational database tables.
Like other NoSQL databases, MongoDB supports dynamic schema design, allowing the documents in a collection to have different fields and structures. The database uses a document storage and data interchange format called BSON, which provides a binary representation of JSON-like documents. Automatic sharding enables data in a collection to be distributed across multiple systems for horizontal scalability as data volumes increase.
History:
MongoDB was created by Dwight Merriman and Eliot Horowitz, who had encountered development and scalability issues with traditional relational database approaches while building Web applications at DoubleClick, an Internet advertising company that is now owned by Google Inc. According to Merriman, the name of the database was derived from the word humongous to represent the idea of supporting large amounts of data. Merriman and Horowitz helped form 10gen Inc. in 2007 to commercialize MongoDB and related software. The company was renamed MongoDB Inc. in 2013.
The database was released to open source in 2009 and is available under the terms of the Free Software Foundation’s GNU AGPL Version 3.0 open source license. At the time of this writing, among other users, the insurance company MetLife is using MongoDB for customer service applications, the website Craigslist is using it for archiving data, the CERN physics lab is using it for data aggregation and discovery, and The New York Times newspaper is using MongoDB to support a form-building application for photo submissions.
Main features:
Some of the features include:
Ad hoc queries
MongoDB supports field queries, range queries, and regular expression searches. Queries can return specific fields of documents and can also include user-defined JavaScript functions.
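The three query styles can be illustrated without a running server. The following is a minimal sketch in plain Python, not MongoDB's implementation: the helpers `matches` and `find` and the sample data are hypothetical, and they mimic only a small subset of the filter syntax (equality, `$gt`, `$gte`, `$lt`, `$lte`, `$regex`) plus returning selected fields.

```python
import re

# Hypothetical sample documents; field names are illustrative only.
people = [
    {"name": "Alice", "age": 34, "city": "Berlin"},
    {"name": "Bob", "age": 19, "city": "Boston"},
    {"name": "Carol", "age": 52, "city": "Bern"},
]

# Comparison operators named after their MongoDB counterparts.
_OPS = {
    "$gt": lambda value, arg: value > arg,
    "$gte": lambda value, arg: value >= arg,
    "$lt": lambda value, arg: value < arg,
    "$lte": lambda value, arg: value <= arg,
    "$regex": lambda value, arg: re.search(arg, value) is not None,
}


def matches(doc, query):
    """Return True if doc satisfies every condition in query."""
    for field, condition in query.items():
        if field not in doc:
            return False
        value = doc[field]
        if isinstance(condition, dict):
            # Operator form, e.g. {"age": {"$gte": 18, "$lt": 40}}.
            if not all(_OPS[op](value, arg) for op, arg in condition.items()):
                return False
        elif value != condition:
            # Plain form is an equality test on the field.
            return False
    return True


def find(docs, query, fields=None):
    """Filter docs by query; optionally return only the listed fields."""
    hits = [d for d in docs if matches(d, query)]
    if fields is None:
        return hits
    return [{k: d[k] for k in fields if k in d} for d in hits]


by_field = find(people, {"city": "Berlin"})
by_range = find(people, {"age": {"$gte": 18, "$lt": 40}}, fields=["name"])
by_regex = find(people, {"city": {"$regex": "^Be"}}, fields=["name"])
```

Here `by_range` and `by_regex` return only the `name` field, which corresponds to the "specific fields" behaviour described above.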
Indexing
Any field in a MongoDB document can be indexed – including within arrays and embedded documents (indices in MongoDB are conceptually similar to those in RDBMSes). Primary and secondary indices are available.
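To make the idea of indexing a field inside an array concrete, here is a small sketch in plain Python. It is only an illustration of the concept: the `build_index` helper and the sample documents are hypothetical, and a real database index is an on-disk ordered structure rather than a dictionary of sets.

```python
from collections import defaultdict

# Hypothetical documents keyed by an id; "tags" is an array field.
docs = {
    1: {"title": "Intro", "tags": ["nosql", "mongodb"]},
    2: {"title": "Sharding", "tags": ["mongodb", "scaling"]},
    3: {"title": "SQL joins", "tags": ["sql"]},
}


def build_index(documents, field):
    """Map each value of `field` to the ids of the documents containing it.

    Array values produce one index entry per element, so a document can be
    found through any of its array members.
    """
    index = defaultdict(set)
    for doc_id, doc in documents.items():
        value = doc.get(field)
        values = value if isinstance(value, list) else [value]
        for v in values:
            index[v].add(doc_id)
    return index


tag_index = build_index(docs, "tags")
title_index = build_index(docs, "title")

# Lookups now avoid scanning every document.
mongodb_ids = sorted(tag_index["mongodb"])
```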
Replication
MongoDB provides high availability with replica sets. A replica set consists of two or more copies of the data. Each replica set member may act in the role of primary or secondary replica at any time. By default, all writes and reads are done on the primary replica. Secondary replicas maintain a copy of the data of the primary using built-in replication. When a primary replica fails, the replica set automatically conducts an election process to determine which secondary should become the primary. Secondaries can optionally serve read operations, but that data is only eventually consistent by default.
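The failover behaviour can be sketched as a toy model. This is a deliberately simplified illustration, not MongoDB's actual election protocol: the `Member` class, the `elect_primary` function, and the rule "the most up-to-date healthy member wins" are assumptions made for the example. The one property it does try to reflect is that an election needs a majority of the set's members to be reachable.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    healthy: bool = True
    last_applied: int = 0  # position reached in the replicated operation log


def elect_primary(members):
    """Pick a new primary from the healthy members.

    Returns None when fewer than a majority of all members are healthy.
    """
    healthy = [m for m in members if m.healthy]
    if len(healthy) * 2 <= len(members):
        return None
    # Toy rule: prefer the member with the most replicated data;
    # ties go to the alphabetically last name.
    return max(healthy, key=lambda m: (m.last_applied, m.name))


replica_set = [
    Member("a", last_applied=10),  # current primary
    Member("b", last_applied=9),
    Member("c", last_applied=10),
]

replica_set[0].healthy = False          # the primary fails
new_primary = elect_primary(replica_set)

replica_set[2].healthy = False          # a second member fails
no_majority = elect_primary(replica_set)
```

With one of three members down, the two survivors still form a majority and `c` is chosen. With two down, no primary can be elected and the set cannot accept writes.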
Load balancing
MongoDB scales horizontally using sharding. The user chooses a shard key, which determines how the data in a collection will be distributed. The data is split into ranges (based on the shard key) and distributed across multiple shards. (A shard is a master with one or more slaves.) Alternatively, the shard key can be hashed to map to a shard, enabling an even data distribution.
MongoDB can run over multiple servers, balancing the load and/or duplicating data to keep the system up and running in case of hardware failure. MongoDB is easy to deploy, and new machines can be added to a running database.
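The difference between range-based and hashed distribution can be sketched in a few lines of Python. The shard names, range boundaries, and the functions `range_shard` and `hashed_shard` are hypothetical, and mapping a hash to a shard with a modulo is a simplification chosen for brevity rather than a description of how MongoDB assigns hashed keys.

```python
import bisect
import hashlib
from collections import Counter

SHARDS = ["shard0", "shard1", "shard2"]

# Range-based: keys below 1000 go to shard0, keys from 1000 up to 1999 go
# to shard1, and everything from 2000 upward goes to shard2.
RANGE_BOUNDS = [1000, 2000]


def range_shard(key):
    """Route a numeric shard key by the range it falls into."""
    return SHARDS[bisect.bisect_right(RANGE_BOUNDS, key)]


def hashed_shard(key):
    """Route a shard key by a hash of its value."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


# Monotonically increasing keys (for example, sequence numbers).
keys = range(300)
range_counts = Counter(range_shard(k) for k in keys)
hashed_counts = Counter(hashed_shard(k) for k in keys)
```

In this sketch every one of the increasing keys lands on `shard0` under range-based routing, while hashing spreads the same keys over all shards. The trade-off is that range queries on the shard key can target a single shard with range-based routing but must consult every shard once the key is hashed.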
File storage
MongoDB can be used as a file system, taking advantage of load balancing and data replication features over multiple machines for storing files.
This function, called the Grid File System (GridFS), is included with MongoDB drivers and is available for many development languages (see “Programming language accessibility” for driver support). MongoDB exposes functions for file manipulation and content to developers. GridFS is used, for example, in plugins for NGINX and lighttpd. Instead of storing a file in a single document, GridFS divides a file into parts, or chunks, and stores each of those chunks as a separate document.
In a multi-machine MongoDB system, files can be distributed and copied multiple times between machines transparently, thus effectively creating a load-balanced and fault-tolerant system.
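The chunking idea behind GridFS can be sketched as follows. This is an illustration in plain Python rather than the driver API: `split_into_chunks` and `reassemble` are hypothetical helpers, and the chunk documents carry only the fields needed for the example (`files_id`, the sequence number `n`, and `data`). The 255 kB default is the chunk size I understand GridFS to use, so treat it as an assumption.

```python
CHUNK_SIZE = 255 * 1024  # assumed default chunk size, in bytes


def split_into_chunks(file_id, data, chunk_size=CHUNK_SIZE):
    """Split a bytes object into chunk documents of at most chunk_size bytes."""
    return [
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]


def reassemble(chunks):
    """Rebuild the original bytes from chunk documents in any order."""
    ordered = sorted(chunks, key=lambda chunk: chunk["n"])
    return b"".join(chunk["data"] for chunk in ordered)


# A 1024-byte payload split with a small chunk size for demonstration.
payload = bytes(range(256)) * 4
chunks = split_into_chunks("file-1", payload, chunk_size=400)
restored = reassemble(list(reversed(chunks)))
```

Because each chunk is an ordinary document, chunks can be sharded and replicated like any other data, which is what makes the distribution described above transparent to the application.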
Aggregation
MapReduce can be used for batch processing of data and aggregation operations.
The aggregation framework enables users to obtain the kind of results for which the SQL GROUP BY clause is used. Aggregation operators can be strung together to form a pipeline, analogous to Unix pipes. The aggregation framework includes the $lookup operator, which can join documents from multiple collections.
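The pipeline idea, where each stage consumes the output of the previous one, can be sketched in plain Python. The stage helpers `match`, `group_sum`, and `sort_by`, the `run_pipeline` function, and the sample orders are all hypothetical. The comment shows what I would expect the corresponding MongoDB pipeline to look like.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical order documents.
orders = [
    {"customer": "ann", "status": "paid", "amount": 30},
    {"customer": "bob", "status": "paid", "amount": 20},
    {"customer": "ann", "status": "open", "amount": 99},
    {"customer": "ann", "status": "paid", "amount": 15},
]


def match(predicate):
    """Stage that keeps only the documents satisfying predicate."""
    return lambda docs: [d for d in docs if predicate(d)]


def group_sum(key, field):
    """Stage that groups by `key` and sums `field`, like SQL GROUP BY + SUM."""
    def stage(docs):
        totals = defaultdict(int)
        for d in docs:
            totals[d[key]] += d[field]
        return [{"_id": k, "total": v} for k, v in totals.items()]
    return stage


def sort_by(field, descending=False):
    """Stage that orders documents by `field`."""
    return lambda docs: sorted(docs, key=lambda d: d[field], reverse=descending)


def run_pipeline(docs, stages):
    """Feed the documents through each stage in order, like a Unix pipe."""
    return reduce(lambda current, stage: stage(current), stages, docs)


# Roughly corresponds to the MongoDB pipeline:
#   [{"$match": {"status": "paid"}},
#    {"$group": {"_id": "$customer", "total": {"$sum": "$amount"}}},
#    {"$sort": {"total": -1}}]
result = run_pipeline(orders, [
    match(lambda d: d["status"] == "paid"),
    group_sum("customer", "amount"),
    sort_by("total", descending=True),
])
```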
Server-side JavaScript execution
JavaScript can be used in queries and aggregation functions (such as MapReduce), and it can be sent directly to the database to be executed.
Capped collections
MongoDB supports fixed-size collections called capped collections. This type of collection maintains insertion order and, once the specified size has been reached, behaves like a circular queue.
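The circular-queue behaviour can be imitated with a bounded deque from the Python standard library. This is an analogy only: the sketch limits the collection by document count, and the limit of three documents is chosen purely for illustration.

```python
from collections import deque

# A bounded deque drops its oldest entry when a new one would exceed maxlen.
capped = deque(maxlen=3)

for seq in range(1, 6):
    capped.append({"seq": seq})

# Insertion order is preserved; only the three newest documents remain.
remaining = [doc["seq"] for doc in capped]
```

After five inserts into a collection capped at three documents, the two oldest entries have been overwritten. This suits data such as logs, where only recent history matters.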
How does MongoDB work?
MongoDB stores data using a flexible document data model that is similar to JSON. Documents contain one or more fields, including arrays, binary data and sub-documents. Fields can vary from document to document. This flexibility allows development teams to evolve the data model rapidly as their application requirements change. When you need to lock down your data model, optional document validation enforces the rules you choose.
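As a rough illustration of flexible documents combined with optional validation, consider the sketch below. The documents, the `REQUIRED_FIELDS` rule set, and the `validate` helper are hypothetical. MongoDB expresses validation rules in its own syntax on the server side, so this shows the idea rather than the mechanism.

```python
# Documents from the same hypothetical "users" collection. The second has
# an array and a sub-document that the first lacks.
users = [
    {"name": "Ada", "email": "ada@example.com"},
    {"name": "Lin", "email": "lin@example.com",
     "phones": ["555-0100"], "address": {"city": "Oslo"}},
    {"name": "Sam"},  # no email field
]

# Rules to enforce once the data model needs locking down.
REQUIRED_FIELDS = {"name": str, "email": str}


def validate(doc, required=REQUIRED_FIELDS):
    """Return a list of rule violations; an empty list means the doc passes."""
    errors = []
    for field, expected_type in required.items():
        if field not in doc:
            errors.append(f"missing field: {field}")
        elif not isinstance(doc[field], expected_type):
            errors.append(f"wrong type for field: {field}")
    return errors


accepted = [d for d in users if not validate(d)]
rejected = [d for d in users if validate(d)]
```

Fields that the rules do not mention, such as `phones` and `address`, remain free to vary from document to document.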
Developers access documents through rich, idiomatic drivers available in all popular programming languages. Documents map naturally to the objects in modern languages, which allows developers to be extremely productive. Typically, there’s no need for an ORM layer.
MongoDB provides auto-sharding for horizontal scale-out. Native replication and automatic leader election support high availability across racks and data centers. MongoDB also makes extensive use of RAM, providing in-memory speed and on-disk capacity.
Unlike most NoSQL databases, MongoDB provides comprehensive secondary indexes, including geospatial and text search, as well as extensive security and aggregation capabilities. Together, these features cover the needs of most of the new applications an organization is likely to develop.
MongoDB offers a pluggable storage engine API, with multiple storage engines available. Select your storage engine based on your application requirements, and even mix storage engines within a replica set. The WiredTiger storage engine provides 7x-10x write performance improvements over MongoDB 2.6, while additional new storage engines provide encryption at rest and in-memory speeds.
Architecture
Programming language accessibility
MongoDB has official drivers for a variety of popular programming languages and development environments. There are also a large number of unofficial or community-supported drivers for other programming languages and frameworks.
Management and graphical front-ends
Most administration is done from command line tools such as the mongo shell because MongoDB does not include a GUI-style administrative interface. There are products and third-party projects that offer user interfaces for administration and data viewing.
Licensing and support
MongoDB is available for free under the GNU Affero General Public License. The language drivers are available under an Apache License. In addition, MongoDB Inc. offers proprietary licenses for MongoDB.