MongoDB has become one of the key technologies for storing data in recent years because of its flexibility and scalability, but many of us don't know the journey behind it since the concept of Big Data came into the picture. Knowing that background helps ensure its advantages are used correctly. We at Witspry have tried to make the explanation as simple as possible.
What is Big Data?
Big Data is data with these three characteristics:
1) Volume
2) Velocity, and
3) Variety
Big Data can be divided into three categories of data:
1) Structured data, e.g. relational data
2) Semi-structured data, e.g. XML, which is more or less flexible, and
3) Unstructured data, e.g. Word, PDF, text, audio, video etc.
Big Data technology classes
Big Data technologies fall into two main classes: operational and analytical.
One Big Data implementation - Hadoop Framework
Hadoop is an open-source framework for the distributed storage and processing of large data sets, and is one implementation of the Big Data concept.
One key technology in Big Data - NoSQL
Because of their flexible data structures, NoSQL databases are among the most widely used data storage and access technologies in Big Data implementations.
One implementation of the NoSQL document data model - MongoDB
What is MongoDB?
MongoDB is an open-source document database that provides high performance, high availability and automatic scaling.
Document Database
A record in MongoDB is a document, which is a data structure composed of field and value pairs. MongoDB documents are similar to JSON objects. The values of fields may include other documents, arrays and arrays of documents.
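To make the shape of a document concrete, here is a minimal sketch in plain Python. It does not talk to MongoDB at all; it only models a document as a dictionary and round-trips it through JSON. The customer record and all of its field names are invented for illustration.

```python
import json

# Hypothetical customer record, invented for illustration only.
customer = {
    "name": "Asha",                      # field with a string value
    "age": 31,                           # field with a number value
    "address": {                         # embedded document
        "city": "Pune",
        "zip": "411001",
    },
    "tags": ["premium", "newsletter"],   # array
    "orders": [                          # array of documents
        {"item": "keyboard", "qty": 1},
        {"item": "cable", "qty": 3},
    ],
}

# Serialize to JSON text and parse it back: the nested structure survives.
as_json = json.dumps(customer, indent=2)
restored = json.loads(as_json)

print(restored["address"]["city"])    # value inside the embedded document
print(restored["orders"][1]["item"])  # value inside an array of documents
```

The nesting is the point here: the address and the orders travel with the customer as one record instead of living in separate tables.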
Advantages of using documents:
1) Documents (i.e. objects) correspond to native data types in many programming languages.
2) Embedded documents and arrays reduce the need for expensive joins.
3) Dynamic schema supports fluent polymorphism.
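The last two advantages can be sketched in plain Python, again without a database. The "products" list below stands in for a collection, and the product records and the describe helper are hypothetical names made up for this example.

```python
# Hypothetical "products" collection modeled as a list of dictionaries.
# The two documents have different fields, which a dynamic schema allows.
products = [
    {"type": "book", "title": "Example Book", "author": "A. Writer"},
    {
        "type": "album",
        "title": "Example Album",
        "artist": "B. Player",
        # Tracks are embedded in the album, so no join is needed to list them.
        "tracks": [{"name": "Intro"}, {"name": "Finale"}],
    },
]


def describe(product):
    """Build a one-line description, branching on the document's shape."""
    if product["type"] == "book":
        return f'{product["title"]} by {product["author"]}'
    if product["type"] == "album":
        names = ", ".join(track["name"] for track in product["tracks"])
        return f'{product["title"]} by {product["artist"]} ({names})'
    return product["title"]


descriptions = [describe(p) for p in products]
for line in descriptions:
    print(line)
```

In a relational design the tracks would typically sit in their own table and be joined to the album at query time; here they are read directly from the parent document.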
Key features:
High Performance
MongoDB provides high performance data persistence. In particular,
- Support for embedded data models reduces I/O activity on the database system.
- Indexes support faster queries and can include keys from embedded documents and arrays.
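The idea of indexing a key inside an embedded document can be illustrated with a toy in-memory index. This is a simplified sketch, not how MongoDB builds its indexes internally; the get_path and build_index helpers and the sample data are assumptions made for the example. The dotted path "address.city" mirrors the dot notation MongoDB uses to address fields of embedded documents.

```python
from collections import defaultdict


def get_path(doc, dotted):
    """Follow a dotted path such as 'address.city' into nested dictionaries."""
    value = doc
    for key in dotted.split("."):
        value = value[key]
    return value


def build_index(docs, dotted):
    """Map each value found at the dotted path to the positions of matching docs."""
    index = defaultdict(list)
    for position, doc in enumerate(docs):
        index[get_path(doc, dotted)].append(position)
    return index


# Hypothetical sample documents.
people = [
    {"name": "Asha", "address": {"city": "Pune"}},
    {"name": "Ben", "address": {"city": "Leeds"}},
    {"name": "Chitra", "address": {"city": "Pune"}},
]

city_index = build_index(people, "address.city")

# The lookup goes straight to the matching positions instead of scanning every document.
in_pune = [people[i]["name"] for i in city_index["Pune"]]
print(in_pune)  # ['Asha', 'Chitra']
```

Without the index, answering "who lives in Pune?" means inspecting every document; with it, the query touches only the matches.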
High Availability
To provide high availability, MongoDB's replication facility, called replica sets, provides:
- Automatic failover.
- Data redundancy.
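The two properties above can be pictured with a deliberately tiny toy model. It is only a sketch under strong simplifying assumptions: the ToyReplicaSet class and the node names are hypothetical, writes are copied to members immediately, and the "election" just picks the first healthy member by name when a majority of members is still up. A real replica set uses an election protocol and replicates changes asynchronously.

```python
class ToyReplicaSet:
    """Toy model: each healthy member keeps its own full copy of the data."""

    def __init__(self, names):
        self.names = list(names)
        self.copies = {name: [] for name in self.names}  # data redundancy
        self.healthy = set(self.names)
        self.primary = self.names[0]

    def write(self, doc):
        """Accept a write through the primary and copy it to healthy members."""
        if self.primary is None:
            raise RuntimeError("no primary available")
        for name in self.healthy:
            self.copies[name].append(doc)

    def fail(self, name):
        """Mark a member as down and fail over if the primary was lost."""
        self.healthy.discard(name)
        has_majority = len(self.healthy) * 2 > len(self.names)
        if not has_majority:
            self.primary = None  # simplification: no majority, no primary
        elif self.primary not in self.healthy:
            self.primary = sorted(self.healthy)[0]  # simplified "election"


rs = ToyReplicaSet(["node-a", "node-b", "node-c"])
rs.write({"_id": 1})
rs.fail("node-a")     # the primary goes down
rs.write({"_id": 2})  # writes continue through the new primary

print(rs.primary)           # node-b
print(rs.copies["node-c"])  # [{'_id': 1}, {'_id': 2}]
```

The write after the failure still succeeds, and the surviving members each hold both documents, which is the behaviour the two bullet points describe.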