So, all right. Let’s review everything. This is my mental model of how MongoDB works.
In the initial version of MongoDB, whenever a user wants to work with data, the user performs CRUD operations. The first operation is Create.
So, let’s say the user is performing a create operation. The user has some data to insert into the database. In MongoDB’s initial versions, the storage engine was MMAPv1, which is built on memory-mapped files. The JSON-like document was converted into Binary JSON, also known as BSON, and stored in a page.
Obviously, the page has a fixed size. Along with the BSON data, each record also carries a small header that identifies it, so the engine knows which bytes belong to which document.
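To make the JSON-to-BSON step a bit more concrete, here is a minimal sketch of a BSON encoder in Python. It only covers flat documents with string, 32-bit integer, and boolean values, and the function names are my own, not from any MongoDB driver. A real application would use the driver's BSON library instead.

```python
import struct


def encode_element(name: str, value) -> bytes:
    """Encode one key/value pair as a BSON element (small subset of types)."""
    key = name.encode("utf-8") + b"\x00"  # field names are C strings
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return b"\x08" + key + (b"\x01" if value else b"\x00")
    if isinstance(value, int):
        # 0x10 = 32-bit signed integer, little-endian (larger ints raise here)
        return b"\x10" + key + struct.pack("<i", value)
    if isinstance(value, str):
        data = value.encode("utf-8") + b"\x00"
        # 0x02 = UTF-8 string, prefixed with its byte length (terminator included)
        return b"\x02" + key + struct.pack("<i", len(data)) + data
    raise TypeError(f"type not covered by this sketch: {type(value).__name__}")


def encode_document(doc: dict) -> bytes:
    """Encode a flat dict as a BSON document: length, elements, terminator."""
    body = b"".join(encode_element(k, v) for k, v in doc.items())
    total = 4 + len(body) + 1  # length prefix + elements + trailing 0x00
    return struct.pack("<i", total) + body + b"\x00"


encoded = encode_document({"hello": "world"})
print(len(encoded), encoded)
```

The document `{"hello": "world"}` comes out as 22 bytes: a 4-byte length, one string element, and a closing zero byte. The length prefix is what lets a storage engine skip over a document without parsing it.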
Now, whenever a new document is inserted without an ID, an ID is created for it, which is _id. By default, this is an ObjectId.
Internally, the index on _id maps each value to the location of the document, that is, where the data is stored. In MMAPv1, that location typically consisted of a data file number plus an offset within that file. Internally, it is used to find the data.
So, whenever the user wants to read a document by its _id, the _id index is used to find it.
This _id is always indexed. The first part of an ObjectId is a 4-byte timestamp, so ObjectIds sort roughly by creation time, and the index makes _id-based lookups fast.
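As a small illustration of the timestamp inside an ObjectId, the sketch below reads the first 4 bytes of a 24-character hex string as a big-endian number of seconds since the Unix epoch. The two ObjectId values and the function name are made up for the example.

```python
from datetime import datetime, timezone


def objectid_timestamp(oid_hex: str) -> int:
    """Return the creation time (seconds since the Unix epoch) stored in an ObjectId."""
    raw = bytes.fromhex(oid_hex)
    if len(raw) != 12:
        raise ValueError("an ObjectId is exactly 12 bytes (24 hex characters)")
    return int.from_bytes(raw[:4], "big")  # first 4 bytes, big-endian


# Hypothetical ObjectId values, used only for illustration.
older = "507f1f77bcf86cd799439011"
newer = "65a1b2c3d4e5f60718293a4b"

for oid in (older, newer):
    seconds = objectid_timestamp(oid)
    print(oid, datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat())

# Because the timestamp comes first, sorting the raw values also sorts by time.
print(sorted([newer, older]) == [older, newer])
```

The ordering is only approximate in practice: two ObjectIds created within the same second are ordered by their remaining bytes, not by the exact moment of creation.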
And that is how Create and Read work.
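The create and read path described above can be sketched as a toy store where an _id index points at a (file number, offset) pair. This is only a mental-model sketch: the class and method names are hypothetical, a plain dict stands in for the real B-tree index, and JSON stands in for BSON.

```python
import json
import struct


class ToyStore:
    """Toy record store: an _id index that points at (file number, offset)."""

    def __init__(self):
        self.files = [bytearray()]  # stand-ins for on-disk data files
        self.id_index = {}          # _id -> (file number, byte offset)

    def insert(self, doc: dict) -> None:
        payload = json.dumps(doc).encode("utf-8")  # JSON stands in for BSON here
        file_no = len(self.files) - 1
        offset = len(self.files[file_no])
        # Record layout: 4-byte length header, then the payload.
        self.files[file_no] += struct.pack("<i", len(payload)) + payload
        self.id_index[doc["_id"]] = (file_no, offset)

    def find_by_id(self, _id):
        location = self.id_index.get(_id)
        if location is None:
            return None
        file_no, offset = location
        data = self.files[file_no]
        (length,) = struct.unpack_from("<i", data, offset)
        start = offset + 4
        return json.loads(data[start:start + length].decode("utf-8"))


store = ToyStore()
store.insert({"_id": "a", "name": "first"})
store.insert({"_id": "b", "name": "second"})
print(store.id_index["b"])    # where the second record starts
print(store.find_by_id("b"))  # one index lookup, then one read at that offset
```

The point is that a read by _id never scans the data file: the index gives the location directly.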
Similarly, you can update or delete the data. Let’s say I delete a document. Then, in the page where that document is stored, its BSON record is removed, and an empty gap is left behind. This is fragmentation.
MongoDB does not readjust the remaining records; that place is simply left empty.
Whenever another record is inserted later, it can reuse that gap if it fits. So, that’s how Delete works.
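Here is a minimal sketch of that delete-then-reuse behavior, assuming a single fixed-size page and a simple first-fit search over the holes. The `ToyPage` class is hypothetical and much simpler than what a real storage engine does.

```python
class ToyPage:
    """Fixed-size page where deleted records leave reusable holes."""

    def __init__(self, size: int = 64):
        self.data = bytearray(size)
        self.end = 0       # first never-used byte
        self.records = {}  # _id -> (offset, length)
        self.free = []     # holes left by deletes: (offset, length)

    def insert(self, _id, payload: bytes) -> int:
        # First fit: reuse the first hole that is large enough.
        for i, (offset, length) in enumerate(self.free):
            if length >= len(payload):
                del self.free[i]
                leftover = length - len(payload)
                if leftover:
                    self.free.append((offset + len(payload), leftover))
                break
        else:
            # No hole fits, so append at the end of the used area.
            if self.end + len(payload) > len(self.data):
                raise ValueError("page is full")
            offset = self.end
            self.end += len(payload)
        self.data[offset:offset + len(payload)] = payload
        self.records[_id] = (offset, len(payload))
        return offset

    def delete(self, _id) -> None:
        # The other records are not moved; the hole is only remembered.
        self.free.append(self.records.pop(_id))


page = ToyPage()
page.insert("a", b"aaaaaaaa")
page.insert("b", b"bbbbbbbb")
page.delete("a")                  # leaves an 8-byte hole at offset 0
print(page.insert("c", b"cccc"))  # lands in the hole left by "a"
print(page.free)                  # the unused part of the hole stays free
```

Record "b" never moves during any of this, which matches the idea that the engine does not readjust neighbors after a delete.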
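And for the update part, the data may need to be readjusted: if the updated document no longer fits in the space its record occupies, it has to be moved to a new location.
And obviously, all of this works by using the _id as the filter.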
So, that’s how CRUD operations worked in the initial version. The storage engine was MMAPv1.
Later, they moved to the WiredTiger storage engine.
The initial engine had a problem with concurrency. It applied coarse-grained locking: database-level in older releases, and collection-level in MMAPv1 as of version 3.0. So, two writers could not modify different documents in the same collection at the same time.
Then, starting with version 3.0, WiredTiger became available (it became the default in 3.2), and it supports document-level concurrency control instead of collection-level locking.
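The difference between collection-level and document-level locking can be sketched with plain Python locks. This is an analogy only: the class names are hypothetical, and a real engine such as WiredTiger uses optimistic concurrency control rather than one mutex per document.

```python
import threading
from collections import defaultdict


class CollectionLocking:
    """One lock guards the whole collection, whichever document is touched."""

    def __init__(self):
        self._lock = threading.Lock()

    def lock_for(self, _id):
        return self._lock


class DocumentLocking:
    """Each document gets its own lock, created on first use."""

    def __init__(self):
        self._locks = defaultdict(threading.Lock)
        self._guard = threading.Lock()  # protects the dict itself

    def lock_for(self, _id):
        with self._guard:
            return self._locks[_id]


def can_write_concurrently(strategy) -> bool:
    """Hold the lock for document 'a', then try to lock document 'b'."""
    with strategy.lock_for("a"):
        lock_b = strategy.lock_for("b")
        acquired = lock_b.acquire(blocking=False)
        if acquired:
            lock_b.release()
        return acquired


print(can_write_concurrently(CollectionLocking()))  # False: 'b' waits for 'a'
print(can_write_concurrently(DocumentLocking()))    # True: 'a' and 'b' are independent
```

With one lock per collection, a writer on document "b" is blocked while "a" is being written, even though the two documents have nothing to do with each other.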
Also, in the initial version, the durability was weaker. I mean, there was journaling, but as I understood it, it was more limited.
With WiredTiger, durability comes from Write-Ahead Logging (WAL), the journal, combined with periodic checkpoints.
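The idea behind write-ahead logging can be sketched in a few lines: record the change in a log first, apply it second, and replay the log after a crash. The `ToyWalStore` class below is hypothetical, and a Python list stands in for a log file that a real engine would flush to disk.

```python
import json


class ToyWalStore:
    """Toy key-value store that logs every write before applying it."""

    def __init__(self, log=None):
        self.log = log if log is not None else []  # stand-in for a durable log file
        self.data = {}                             # volatile in-memory state

    def put(self, key, value) -> None:
        # 1. Append the intent to the log first (a real engine also flushes it).
        self.log.append(json.dumps({"op": "put", "key": key, "value": value}))
        # 2. Only then change the in-memory state.
        self.data[key] = value

    def delete(self, key) -> None:
        self.log.append(json.dumps({"op": "delete", "key": key}))
        self.data.pop(key, None)

    @classmethod
    def recover(cls, log):
        """Rebuild the state after a crash by replaying the log in order."""
        store = cls(log)
        for line in log:
            entry = json.loads(line)
            if entry["op"] == "put":
                store.data[entry["key"]] = entry["value"]
            else:
                store.data.pop(entry["key"], None)
        return store


store = ToyWalStore()
store.put("a", 1)
store.put("b", 2)
store.delete("a")

surviving_log = list(store.log)  # pretend only the log survived the crash
recovered = ToyWalStore.recover(surviving_log)
print(recovered.data)
```

Because every change reaches the log before it reaches the data, losing the in-memory state never loses an acknowledged write in this model.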
So, these two were the main points, and indexing got better as well.
So, that is how it worked before the engine change and how it worked after.
That’s it, I guess.
Then additional features were introduced in later versions, like the aggregation pipeline and, later on, joins through its $lookup stage. But this is not a concern right now.
Basically, the difference between SQL and NoSQL is this:
In a NoSQL document database like MongoDB, you just insert JSON-like data with a flexible schema. MongoDB converts it into a binary format (Binary JSON) and stores it in the database.
In later versions, with WiredTiger from version 3.0 onward, it also compresses the data on disk.
So, a single page can hold more documents. As a rough illustration, if a page stored three documents before compression, it might store six or seven after compression.
That was a large difference as well.
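To get a feel for why compression lets more documents fit in a page, here is a rough sketch that counts documents per page with and without compression. The page size and the documents are made up, and `zlib` from the standard library stands in for the engine's compressor (WiredTiger uses snappy by default).

```python
import json
import zlib

PAGE_SIZE = 4096  # hypothetical page size in bytes, chosen for illustration


def docs_per_page(docs, compress: bool) -> int:
    """Count how many documents fit in one page, with or without compression."""
    buffer = b""
    count = 0
    for doc in docs:
        candidate = buffer + json.dumps(doc).encode("utf-8")
        stored = zlib.compress(candidate) if compress else candidate
        if len(stored) > PAGE_SIZE:
            break
        buffer = candidate
        count += 1
    return count


# Made-up documents with repetitive field names, as is common in a collection.
docs = [
    {"_id": i, "name": f"user{i}", "city": "Pune", "active": True}
    for i in range(2000)
]

print("uncompressed:", docs_per_page(docs, compress=False))
print("compressed:  ", docs_per_page(docs, compress=True))
```

The exact ratio depends heavily on the data. Documents in one collection tend to repeat the same field names, which is exactly the kind of redundancy a block compressor removes.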
Another difference is that SQL databases support transactions.
Later versions of MongoDB also added multi-document transactions (from version 4.0 on replica sets).
But in the initial version, there was no multi-document transaction management, like no commit and all. Only single-document operations were atomic, so there were no full ACID guarantees across documents.
And there was no join function in the initial version of MongoDB either, whereas SQL has supported joins from the beginning.
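For reference, the join that later MongoDB versions added is the `$lookup` aggregation stage. The sketch below shows the shape of such a stage as a Python dict and then emulates its left outer join in pure Python so the behavior is visible without a database. The collection names, field names, and the `emulate_lookup` helper are all made up, and the emulation ignores edge cases such as array-valued fields.

```python
from collections import defaultdict

# Shape of a $lookup stage; the collection and field names are hypothetical.
lookup_stage = {
    "$lookup": {
        "from": "customers",
        "localField": "customer_id",
        "foreignField": "_id",
        "as": "customer",
    }
}


def emulate_lookup(local_docs, foreign_docs, stage):
    """Pure-Python emulation of the left outer join that the stage describes."""
    spec = stage["$lookup"]
    by_key = defaultdict(list)
    for doc in foreign_docs:
        by_key[doc.get(spec["foreignField"])].append(doc)
    joined = []
    for doc in local_docs:
        matches = by_key.get(doc.get(spec["localField"]), [])
        # Every local document is kept; unmatched ones get an empty list.
        joined.append({**doc, spec["as"]: list(matches)})
    return joined


orders = [
    {"_id": 100, "customer_id": 1, "total": 250},
    {"_id": 101, "customer_id": 9, "total": 80},
]
customers = [{"_id": 1, "name": "Asha"}]

for row in emulate_lookup(orders, customers, lookup_stage):
    print(row)
```

Unlike a SQL inner join, an order with no matching customer is still returned, just with an empty `customer` array.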
That’s it.
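Mainly, the preference comes down to the workload.
Why?
Because, as I understand it, NoSQL databases like MongoDB are usually chosen for read-heavy operations. Writing is not their main use case.
SQL, on the other hand, can be good for write-heavy workloads.
So, in my current understanding, NoSQL is a better fit for read-heavy than for write-heavy workloads, though this is a rough rule of thumb and it depends on the specific database.
That is the main difference between them.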
And this is what I understood today.
Here's a quick overview of the topics we covered in this post:
My Mental Model of How MongoDB Works
• Initial Version of MongoDB (MMAP)
• _id and Data Mapping
• Delete Operation
• Update Operation
• Engine Change and Concurrency
• Durability Improvements
• Additional Features (Later Versions)
• SQL vs NoSQL (Basic Difference)
• Compression in Later Versions
• Transactions and ACID
• Scaling and Use Case Preference