
A Beginner’s Guide to MongoDB and CRUD Operations

source link: https://www.analyticsvidhya.com/blog/2023/05/a-beginners-guide-to-mongodb-and-crud-operations/

Introduction

In this guide, we will explore the fundamentals of MongoDB and delve into the essential CRUD (Create, Read, Update, Delete) operations that form the backbone of any database system. Whether you’re new to databases or transitioning from a relational database management system, this guide will equip you with the knowledge to start working with MongoDB effectively.

This article was published as a part of the Data Science Blogathon.

Project Details

Imagine this: You are designing a schema for an e-commerce application. The requirement is straightforward: the application should display a person’s complete profile on its user interface. The profile includes details like name, dob, addresses, aadhar card, and the person’s communication information. Simple enough? But remember, requirements are dynamic!

The first table that you design is a person table. This table contains the usual identifiable information on a person – first_name, middle_name, last_name, date_of_birth, and aadhar_card_number. To this table, you assign a unique identifier person_id as the primary key. Since one person can have several addresses, this is a one-to-many relationship. Thus, after normalizing, the next table is an address_info table where you store the person_id from before along with the address_type, address_line_1, address_line_2, city, state, and country fields. A third table called contact_info stores communication details in the same way – person_id, contact_type (email, mobile, etc.), and contact_value. The data gets loaded according to this design. So far, so good.

Requirements

The business has decided to accept PAN cards and voting cards as identity proof. No problem: you can add two new columns to the person table. But the data needs to be loaded again, and if you had previously made aadhar_card_number a NOT NULL field, it now has to become nullable. OK, it can be done.

As it moves further into the digital world, your business has now decided to accept debit and credit cards. A person can have multiple cards, so you create a new table payment_method containing person_id, payment_method_identifier, payment_method_mode, and payment_card_number. To show all this information in your application, you have to join 4 tables: person, address_info, contact_info, and payment_method. That’s not too good. The more joins a query needs, the slower its response tends to be.

Can you think of a better design without joins? If you’re thinking in relational terms then probably not. But do you want to guess how many joins would be required if you used a non-relational design? The answer is none.

If you use a document database like MongoDB then you can embed all this information in a single document. Think of the document as a JSON object for now. Here’s an example denoting the storage of person and address details in the same document.

Figure 1: Data design for relational model


Creating Tables

In the case of MongoDB, we need not create 2 tables. We can store the complete information about a person in a JSON document as follows:

{_id: 1, 
first_name: "Shreya", 
last_name: "Chaturvedi", 
birth_dt: new ISODate("1999-09-09T00:00:00Z"),
addresses:[
	{type: "home", addr: ["B/104 XYZ heights"]}, 
	{type: "work", addr: ["ABX office"]}
	] 
},
{_id: 2, 
first_name: "Shanaya", 
last_name: "Park", 
birth_dt: new ISODate("2000-01-01T00:01:54Z"),
addresses:[
	{type: "home", addr: ["B/101 XYZ heights", "C/201 ABC apartments"]}, 
	{type: "work", addr: ["ABX office"]}
	] 
}

Figure 2 illustrates how the name and address information of a person are combined into a single document. In this NoSQL scenario, querying on the “_id” field returns all the information, eliminating the need for joining tables. This approach enables efficient queries and saves storage by avoiding duplication of key columns. It is known as embedding documents within documents.

And do you remember how the data had to be reloaded when two new identity methods were added? With MongoDB, we wouldn’t have to do that. Instead, from then on, we could just add a new field for the PAN card and/or voting card to a document, depending upon which identity information a person had provided. The issue of nullability would not arise either, since we simply omit the aadhar card field from documents that have no value for it. This is known as schema flexibility and is one of the many charms of MongoDB and NoSQL databases in general. Let’s learn about them now.

For the uninitiated, we have two variants of the full forms of NoSQL – “non-SQL” and “not only SQL”. Either way, the central idea remains the same – NoSQL refers to non-relational databases. It is an approach that enables storing and querying data in/from structures that are not relational. This means that instead of storing the data within a table, NoSQL databases store it within non-relational structures such as JSON documents.

Features of NoSQL Databases

Let’s look at other distinguishing features of NoSQL databases:

  • Flexible Schema: NoSQL databases can store data that doesn’t share one fixed structure. When a change arrives later in development, the schema of previously stored records need not be revised; the new design can simply be applied to incoming records. This lowers the overall cost of change.
  • Scalability: NoSQL databases are horizontally scalable, i.e. capacity can be added by increasing the number of servers, as opposed to increasing the RAM, CPU, or SSD capacity of a single server.
  • Transaction Support: Some NoSQL databases like MongoDB support ACID transactions. The way data is modeled in the NoSQL scenario usually eliminates the need for transactions, since all related data can be updated within a single document; however, multi-document transactions are also available.
  • Faster Query Capabilities: A flexible data model and a scale-out architecture allow faster querying of high volumes of data.
  • Ingest Variety of Data: NoSQL databases can ingest structured, semi-structured, or unstructured data. The stored form is often close to the way the data is used by the application.
  • Ease of Access: Because data model design, data retrieval, and schema updates are all straightforward, developers find NoSQL databases easy to work with for application development.

Types of NoSQL Database

We have four main flavors of NoSQL databases:

  1. Document Databases store data in the form of documents in formats such as JSON, BSON, and XML. They also support nested documents. MongoDB is an example of this type.
  2. Key-Value Stores are the simplest in this category. They hold items as key-value pairs. Common use cases are caching and storing user information. Examples are Redis and Memcached.
  3. Wide Column Stores organize data in tables, rows, and dynamic columns. Dynamic columns let users access only the required columns, thus not wasting memory. It is, however, a complex system. Apache Cassandra, Apache HBase, and Google Bigtable are all examples of this type.
  4. Graph Databases are very useful for modeling network data since they house data in the form of nodes, edges, and properties. Edges define the relationships between nodes. Neo4j is an example of a graph database.

MongoDB Database and its Features

As mentioned earlier, MongoDB is a document database which means it stores data in the form of documents.

  • MongoDB uses BSON, an extension of JSON, to store data in documents.
  • Documents in MongoDB map to objects in programming languages.
  • No need to join multiple tables in the application layer.
  • Data can often be accessed with just one read, depending on data model design.
  • MongoDB stores related data together based on the principle of data that needs to be accessed together.
  • Flexible schema model allows documents within a collection to have varying fields (polymorphism).
  • Schema validation is optionally available in MongoDB to enforce rules on document structure where data consistency matters.
  • Document databases are also distributed in nature, allowing the scale-out model to work efficiently. In MongoDB, this feature is provided through sharding. Sharding distributes data across servers called shards. This spreads the read/write workload across shards, increases the storage capacity, and leads to increased availability overall.
  • Replication helps in building resilient apps. MongoDB provides it through a replica set, typically a set of 3 servers, allowing for automatic failover and data redundancy.
  • Document and other NoSQL databases in general have their own query languages for manipulating data. MongoDB’s is called MQL (MongoDB Query Language). It supports
    • CRUD operations
    • Aggregation operations
    • Text search
    • Geospatial queries
  • MongoDB provides high performance through support for embedded documents and indexes on such embedded data.

Terminology: MongoDB v/s SQL

Let’s begin by comparing SQL terms with their MongoDB counterparts.

SQL term      | MongoDB term
Database      | Database
Table         | Collection
Row/Record    | Document
Column        | Field
Index         | Index
Table joins   | $lookup
Transactions  | Transactions
Views         | Views

As can be seen in the above table, documents in MongoDB correspond to rows in SQL. A document is the basic unit of data in MongoDB and stores information as a BSON object, an extended version of JSON. (Read more about BSON in the section BSON – MongoDB’s Language of Documents.)

A collection of documents is called, well, a collection. It’s the equivalent of a table in a relational system. Unlike relational tables, collections don’t enforce a schema by default; however, we can optionally turn on schema validation.

The documents store data in the form of fields which are name-value pairs.

Example

{_id: 1, 
name: {first_name: "Shreya", last_name: "Chaturvedi"}, 
birth_dt: new ISODate("1999-09-09T00:00:00Z"),
addresses:[
	{type: "home", addr: ["B/104 XYZ heights"]}, 
	{type: "work", addr: ["ABX office"]}
	] 
}

_id, name, birth_dt, and addresses are the top-level fields of this document. _id is the default primary key of the document. When the user does not provide a value for the _id field during document creation, MongoDB sets this field on its own.

The value of a field can itself be a set of name-value pairs, that is, a document. Here, the name field contains an embedded document with the fields first_name and last_name. The birth_dt field stores a value of the Date data type. The addresses field is an array of embedded documents, and the addr field within each of them is an array of strings.

MongoDB greatly reduces the need for joining multiple documents by facilitating the storage of all the data accessed together. Even then, MongoDB provides the option of joining using the $lookup operator.

The case of transactions differs for every NoSQL database. For MongoDB, we have the capability of ACID transactions. We will not touch upon this topic in this guide. Please check out the official MongoDB documentation here.

We also have the functionality of views in MongoDB, which return the results of an aggregation pipeline. There are two types: standard views and on-demand materialized views. Standard views don’t store pre-computed results; instead, the underlying query runs each time the view is read. On-demand materialized views, on the other hand, store the result on disk and read it from there. Hence, they provide better read performance than standard views. The trade-off is upkeep: we need to refresh the data in a materialized view periodically.

BSON – MongoDB’s Language of Documents

BSON stands for Binary JSON and, as mentioned earlier, it is based on JSON (JavaScript Object Notation), a very popular data interchange format. As you might already know, JSON objects are associative arrays or, in other words, collections of key-value pairs. This makes JSON easy to read as well as parse, which is why it is used in many places such as APIs, config files, log files, and database storage. However, it has 2 shortcomings:

  1. It has no support for data types such as binary data or dates, nor for specialized numeric types such as integers and floating-point numbers.
  2. JSON objects don’t record their own length, so a reader has to scan through the text to find where a value ends, which makes traversal slower.

BSON has support for many advanced data types such as date, binary data, 32-bit integers, 64-bit integers, decimal, etc. Also, BSON’s binary structure encodes information such as length and type, allowing it to be traversed more quickly than JSON.
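The first shortcoming is easy to see in plain JavaScript. The snippet below is a small sketch that runs in Node.js without a database; the sample object is made up for illustration. It shows that a Date does not survive a round trip through JSON.

```javascript
// JSON has no date type: a Date is serialized as an ISO 8601 string and
// comes back as a plain string after parsing.
const original = { birth_dt: new Date("1999-09-09T00:00:00Z"), beds: 3 };
const roundTripped = JSON.parse(JSON.stringify(original));

console.log(original.birth_dt instanceof Date); // true
console.log(typeof roundTripped.birth_dt);      // string
console.log(roundTripped.birth_dt);             // 1999-09-09T00:00:00.000Z

// JSON also has a single number type, so it cannot record whether 3 was
// meant as a 32-bit integer, a 64-bit integer, or a double.
console.log(typeof roundTripped.beds);          // number
```

BSON avoids this loss by storing a type tag next to each value.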

Thus, MongoDB stores data in the format of BSON, and the traversal of this data also takes place in the same format. When we use the MongoDB driver in our application, the driver takes care of converting our data from native data structures back to BSON and vice-versa.

If you want to have a look at all the data types supported by BSON, check out this resource.

Setting up MongoDB Database and Mongosh

We will be using the MongoDB shell in this tutorial to query the data. To do this, we need to have a MongoDB deployment in place. There are two ways to do so:

  1. Through MongoDB’s free cloud-hosted deployment – MongoDB Atlas
  2. Running a local MongoDB deployment.

In this guide, we will be using the MongoDB Atlas deployment since it’s an easier option. No installation procedure is required for MongoDB, and it offers a free tier to start with. Feel free to use a local deployment as well.

Steps to Set Up an Atlas Deployment

Step 1

 To create a deployment on Atlas, go to https://www.mongodb.com/cloud/atlas/register and register for a free account.

Step 2

If not already present, create an organization and then create a project within this organization. An organization can have multiple projects grouped under it. However, the billing is not based on individual projects; instead, it is determined at the organizational level.

Step 3

Now, we will deploy a cluster within this project. Clusters are the environment that will hold our data. Atlas free cluster will not expire but we will have access to only a handful of features. Let’s create a free cluster.

a. Click on the database in the left-hand side navigation menu.

b. Click on Build a database.

c. Select the M0 cluster option. M0 is a free cluster and useful for developing practice apps.

d. Select any of the cloud providers from AWS, GCP, and Azure.

e. Select your region

f. Click Create. It will take a few minutes for the cluster to set up. Meanwhile grab yourself a coffee!

Step 4

Next, we will add our IP address to the IP access list. We will only be able to connect to our cluster from the IPs present in the trusted list.

a. Click on Network Access in the left-hand side navigation menu.

b. Click on +ADD IP ADDRESS.

c. Click on ADD CURRENT IP ADDRESS.

d. You can make it a temporary entry by turning on the radio button at the bottom-left corner.

e. Click on confirm and wait for the status of your IP address to activate in the IP Access List.

Step 5

Now, we will create a database user that will have access to databases and collections hosted in Atlas. Please note that this database user is different from the Atlas user created previously. Atlas users can log in to Atlas but cannot access databases. To create a database user:

a. Click on the database in the left-hand side navigation menu.

b. Click on connect.

c. You will get the Setup connection security dialog box. Here, set the database username and password. Note that if your password contains special characters, you will later have to percent-encode them in your connection string.
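As a rough sketch of what that encoding looks like, the JavaScript below builds a connection string with the built-in encodeURIComponent function. The username, password, and host are hypothetical placeholders, not values from Atlas.

```javascript
// Hypothetical credentials and host, for illustration only.
const username = "appUser";
const password = "p@ss:w/rd";
const host = "cluster0.example.mongodb.net";

// encodeURIComponent percent-encodes characters such as @ : / that would
// otherwise be read as delimiters inside the connection string.
const uri =
  `mongodb+srv://${encodeURIComponent(username)}:${encodeURIComponent(password)}@${host}/`;

console.log(uri);
// mongodb+srv://appUser:p%40ss%3Aw%2Frd@cluster0.example.mongodb.net/
```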

Step 6

With all the above steps completed successfully, we are now ready to install Mongosh – a command line interface for MongoDB. It can be used to perform CRUD operations and create aggregation pipelines and much more for the data stored on the Atlas cluster.

a. Click on the database in the left-hand side navigation menu.

b. Click on connect.

c. Since the first step – Setup connection security is already completed, it will come with a green tick now. In the section of access your data through tools, select Shell.

d. After choosing a connection method, we will get 2 options – I don’t have the MongoDB shell installed and I have the MongoDB shell installed. Choose the former.

e. Select your OS from the dropdown and follow the installation instructions.

f. After you have installed the MongoDB shell, test it with the following command:

mongosh --version

Step 7

Since the setting up for Mongosh and MongoDB database both are completed now, we can connect to the database through Mongosh.

a. Follow steps 6.a through 6.c.

b. After choosing a connection method, choose the option I have the MongoDB shell installed.

c. Copy and paste the connection string in the terminal.

d. As soon as we hit Enter, we will be prompted for the database user’s password.

e. If the shell prints its connection details and shows a prompt, the connection is complete, and we can now query our database.

Step 8

This is an optional step. If we want, we can load the sample datasets provided by MongoDB in our database deployment for practice purposes.

a. Click on the database in the left-hand side navigation menu.

b. Click on ellipses (…)

c. Click on Load Sample Dataset.

d. Once the dataset loading is complete, we can click on “Browse Collections.”

e. If we are still logged into Mongosh, we can execute the following command to view all the loaded sample databases.

show dbs

This completes our setup and we can move on to the next section now where we will use MongoDB query API to manipulate the data.

MongoDB Query API

We will use MongoDB query API to query our data. There are 2 major ways to do so:

1. CRUD Operations

2. Aggregation pipelines

However, this API allows performing more than just querying. We can also:

1. Combine data with $lookup and $unionWith operators

2. Analyze geospatial and graph data

3. Perform an efficient full-text search on data

4. Design indexes to improve query performance

5. Create views and on-demand materialized views

6. Create, query, and aggregate time-series collections

We will be looking at the CRUD operations using the MongoDB query API.

CRUD Operations

CRUD is the acronym for create, read, update, and delete.

Creating a MongoDB Database and Collection with CLI

To get a list of all the databases in our cluster, we can run the following command:

show dbs

If you have loaded the sample datasets mentioned in the setup section, you will get a list of all those databases along with 2 more: admin and local. Even if you don’t have any other databases, these two will still show up in the list.

The admin database is a privileged database used for authentication and authorization; users who have access to it can run administration commands such as collMod, create, createIndexes, drop, and dropDatabase. The local database stores metadata of the node running MongoDB and is not replicated.

An interesting thing to note here is that there is no explicit command for creating a database in MongoDB. Let’s say we want to create a database named db_prod. To do so, we switch to its context with the use command.

use db_prod

However, if at this time we run the show dbs command, we won’t see the db_prod database in the list yet. This is because MongoDB will create the database when data is first stored inside it. This data could either be a collection or a document. So let’s create a collection and a document inside this database.

db.prod_meta.insertOne(
  { database: "db_prod", created_date: new Date(), admin: "admin" }
)

Note that the above command inserts this document into the prod_meta collection.

Now, if we run show dbs, we will see db_prod in the list. Let’s also run:

show collections

We can see the new collection inserted into this database. What if we want to look at the documents? We can run the find command as follows:

db.prod_meta.find()

Inserting Documents

There are two main methods for inserting documents into a collection:

  1. insertOne – inserts a single document
  2. insertMany – inserts multiple documents into a collection

Following is the syntax for the insertOne method:

db.collection.insertOne( 
<document>, 
{ writeConcern: <document> } 
)

The first argument is the document that we want to insert into the given collection.

The second argument is optional and represents a document expressing the write concern. For our purposes, we will avoid setting it in the coming examples. However, it’s good to have an idea of what a write concern is.

As mentioned before, we have the concept of a replica set in MongoDB and by default, we have 3 members in this replica set – 1 primary and 2 secondary nodes. A write concern then describes the number of nodes that must acknowledge this write operation before it is deemed successful. A write operation is acknowledged by a node when it has applied the write successfully.

To know more about write concerns, refer to this link.
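To make the idea concrete, here is a small plain JavaScript sketch. The helper majorityOf is hypothetical, and it assumes every member is a data-bearing voting member (arbiters change the calculation). The options object shows the shape of the optional second argument, using the w and wtimeout fields of a write concern document.

```javascript
// Number of acknowledgments needed for a "majority" write concern in a
// replica set with the given number of data-bearing voting members.
function majorityOf(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Shape of the optional second argument to insertOne.
// w: how many members must acknowledge; wtimeout: time limit in milliseconds.
const options = { writeConcern: { w: "majority", wtimeout: 5000 } };

console.log(majorityOf(3));          // 2 (the primary plus one secondary)
console.log(options.writeConcern.w); // majority
```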

The insertOne method returns a document with the following fields:

  • acknowledged – a boolean that is true if the operation ran with write concern and false if write concern was disabled.
  • insertedId – the value of the _id field of the document just inserted. If we don’t provide this field in our document, MongoDB generates it automatically.

Example

db.prod_users.insertOne(
  { name: "admin1", resource: "project1", as_of_date: new Date() }
)

As can be seen from the command, we did not pass any write concern. However, in the response, the acknowledged field is true. This is because the operation ran with the default write concern, majority. In the response, we also see the generated value of the _id field.

Running db.prod_users.find() shows how the document looks in the collection.

If we want to insert multiple documents, we should choose the insertMany method. In this case, the syntax just changes slightly:

db.collection.insertMany(
  [ <document1>, <document2>, ... ],
  { writeConcern: <document> }
)

Instead of a single document, we pass an array of documents to the method.

Example

db.prod_users.insertMany([
  { name: "admin2", resource: "project1", as_of_date: new Date() },
  { _id: 11, name: "dev1", resource: "project1", team: "MongoDB database dev", as_of_date: new Date() }
])

The response confirms that 2 documents were added to our collection. Take notice of 2 more things:

  • For the second document, we also added an _id field and this was populated in the document instead of the automatic value generated by MongoDB
  • We changed the schema of the second document by adding a team field and it did not give any error. Also, we didn’t have to populate the other 2 documents with this field as null. This highlights the schema flexibility that we talked about earlier.

Querying Documents

If you have followed all the examples till now, you will have noticed that we have been using a find method with our collection. This find method is nothing but the method used to query our documents. The syntax is as follows:

db.collection.find(query, projection, options)

The first argument takes in a query document. If we want to see all the documents in a collection, we can pass an empty document {} in this parameter.

The second argument is a projection document that works by either including or excluding fields. Let’s say that in the final output, we want to see only 2 of the 10 fields of all documents in a collection.

To do so, we list those fields as keys in this argument with the value 1. If we instead want to omit certain fields from the result, we give them the value 0. The values must be all 1 or all 0: we can specify which fields to include or which fields to exclude, but not a combination of both.

The only exception to this rule is the _id field since it will be shown by default for documents. To suppress this field in the output, we can mention _id as the key and the value as 0.
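These rules can be sketched in plain JavaScript. The helper project below is hypothetical (it is not part of mongosh) and only handles top-level fields, but it mimics the behavior described above: all-inclusion or all-exclusion, with _id as the one exception.

```javascript
// Hypothetical helper that mimics how find() applies a projection document.
function project(doc, projection) {
  const keys = Object.keys(projection).filter((k) => k !== "_id");
  const including = keys.some((k) => projection[k] === 1);
  const excluding = keys.some((k) => projection[k] === 0);
  if (including && excluding) {
    throw new Error("Cannot mix inclusion and exclusion (except for _id)");
  }
  const result = {};
  for (const [field, value] of Object.entries(doc)) {
    if (field === "_id") {
      if (projection._id !== 0) result._id = value; // shown unless suppressed
    } else if (including ? projection[field] === 1 : projection[field] !== 0) {
      result[field] = value;
    }
  }
  return result;
}

const user = { _id: 11, name: "dev1", resource: "project1", team: "MongoDB database dev" };
console.log(project(user, { name: 1, resource: 1 }));         // { _id: 11, name: 'dev1', resource: 'project1' }
console.log(project(user, { name: 1, resource: 1, _id: 0 })); // { name: 'dev1', resource: 'project1' }
console.log(project(user, { _id: 0, team: 0 }));              // { name: 'dev1', resource: 'project1' }
```

Calling project(user, { name: 1, team: 0 }) throws, which mirrors the error MongoDB raises for a mixed projection.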

Let’s take a look at some examples to understand this better. Running the following command returns all 3 documents inserted into the prod_users collection.

db.prod_users.find()

Querying for Equality Condition

If we want to query the document where the name is admin2, we will run the following query:

db.prod_users.find({ name: "admin2" })

All the fields in this document are returned. But we want to see only the name and resource fields. This is where projection comes into the picture.

db.prod_users.find({ name: "admin2" }, { name: 1, resource: 1 })

Oh, but the _id field is still in the output. To remove it, we explicitly mention it in the projection document.

db.prod_users.find({ name: "admin2" }, { name: 1, resource: 1, _id: 0 })

To achieve the same output, an alternate query would be:

db.prod_users.find({ name: "admin2" }, { _id: 0, as_of_date: 0 })

What we cannot do is mix inclusion and exclusion of fields, as in the query below, which returns an error:

db.prod_users.find({ name: "admin2" }, { _id: 0, name: 1, resource: 1, as_of_date: 0 })

Other Comparison Operators

The $in operator works like its SQL counterpart. It returns all the documents where the value of the field matches any of the values in the list.

db.prod_users.find({ name: { $in: ["admin1", "admin2"] } })

Apart from this, we also have operators like $gt (greater than), $gte (greater than or equal to), $lt (less than), $lte (less than or equal to), $ne (not equal to) and $nin (not in operator).

Let’s use the $gt operator. We will use the listingsAndReviews collection from the sample_airbnb database, switching the context with the use command.

Each document in this collection has many fields; the one we are interested in is beds, a numeric field containing the number of beds for a listing. We will query the collection for documents where the beds field has a value greater than 3. We will run the findOne method, which works like the find method but returns only one document.

use sample_airbnb
db.listingsAndReviews.findOne({ beds: { $gt: 3 } }, { name: 1, beds: 1 })

Other comparison operators can be used in a similar way.
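To see how the other comparison operators behave, here is a plain JavaScript sketch that evaluates condition documents against made-up listings. The helper satisfies and the sample data are hypothetical; in mongosh the same condition documents would go inside find, for example { beds: { $gte: 4, $lt: 7 } }.

```javascript
// Hypothetical helper: tests one value against a condition document
// such as { $gt: 3 }.
const comparators = {
  $gt: (v, x) => v > x,
  $gte: (v, x) => v >= x,
  $lt: (v, x) => v < x,
  $lte: (v, x) => v <= x,
  $ne: (v, x) => v !== x,
  $in: (v, list) => list.includes(v),
  $nin: (v, list) => !list.includes(v),
};

function satisfies(value, condition) {
  return Object.entries(condition).every(([op, operand]) => comparators[op](value, operand));
}

// Made-up listings, not taken from the sample_airbnb dataset.
const listings = [
  { name: "Studio", beds: 1 },
  { name: "Family home", beds: 4 },
  { name: "Villa", beds: 7 },
];
const names = (cond) => listings.filter((l) => satisfies(l.beds, cond)).map((l) => l.name);

console.log(names({ $gt: 3 }));          // [ 'Family home', 'Villa' ]
console.log(names({ $gte: 4, $lt: 7 })); // [ 'Family home' ]
console.log(names({ $nin: [1, 7] }));    // [ 'Family home' ]
```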

Logical Operators

Next up are the logical operators – $and, $or, $not, and $nor. Let’s query the collection to see the listings where the number_of_reviews is greater than 10 and the number of beds is greater than 5.

db.listingsAndReviews.findOne( {number_of_reviews: {$gt: 10}, beds: {$gt: 5}}, {name: 1, number_of_reviews: 1, beds: 1})

Let’s try to find listings where either the number_of_reviews is greater than 100 or the number of beds is greater than 5.

db.listingsAndReviews.find({$or: [{number_of_reviews: {$gt: 100}}, {beds: {$gt: 5}}]}, {name: 1, number_of_reviews:1, beds: 1}).limit(5)

Note one more thing: we have used the limit method at the end of our find query. This is called method chaining. The limit method caps the number of documents returned. In this case, it returns at most 5 documents.

$not and $nor can be used in a similar fashion.
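As a sketch of what these operators mean, the plain JavaScript below applies the equivalent boolean logic to a few made-up listings. The comments show the corresponding query documents; the data and variable names are hypothetical.

```javascript
// Made-up listings, not taken from the sample_airbnb dataset.
const listings = [
  { name: "A", beds: 2, number_of_reviews: 150 },
  { name: "B", beds: 6, number_of_reviews: 5 },
  { name: "C", beds: 3, number_of_reviews: 20 },
];
const manyBeds = (l) => l.beds > 5;                   // { beds: { $gt: 5 } }
const manyReviews = (l) => l.number_of_reviews > 100; // { number_of_reviews: { $gt: 100 } }
const namesOf = (docs) => docs.map((l) => l.name);

// { $or: [ {beds: {$gt: 5}}, {number_of_reviews: {$gt: 100}} ] }
const orResult = namesOf(listings.filter((l) => manyBeds(l) || manyReviews(l)));

// { $nor: [ {beds: {$gt: 5}}, {number_of_reviews: {$gt: 100}} ] }
// keeps documents for which none of the conditions hold.
const norResult = namesOf(listings.filter((l) => !(manyBeds(l) || manyReviews(l))));

// { beds: { $not: { $gt: 5 } } } negates a single condition.
const notResult = namesOf(listings.filter((l) => !manyBeds(l)));

console.log(orResult);  // [ 'A', 'B' ]
console.log(norResult); // [ 'C' ]
console.log(notResult); // [ 'A', 'C' ]
```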

Querying Embedded Documents

Examples of embedded documents are the availability and host fields within the listingsAndReviews collection.

availability_30, availability_60, availability_90, and availability_365 are all fields within the availability document, which is embedded within the main document. Let’s say we want the documents where the value of availability_365 is 32. Since this field lies within the availability document, we have two ways of querying it.

The first method is to provide an exact match of the availability document:

db.listingsAndReviews.find({availability: {availability_30:0, availability_60: 0, availability_90:0, availability_365: 32}}, {name: 1, availability:1})

Providing an exact match means that the order of the fields in the availability document must match too. For example, the following query, where we have just swapped the positions of the availability_30 and availability_60 fields, returns no results.

db.listingsAndReviews.find({availability: {availability_60:0, availability_30: 0, availability_90:0, availability_365: 32}}, {name: 1, availability:1})

A second way to query this embedded document would be to use the dot notation – “field.nestedField”.

db.listingsAndReviews.find({ "availability.availability_365": 32 }, { name: 1, availability: 1 })

Note that here, the only equality condition checked is on the availability_365 field. Thus, the other three nested fields could have any value and the document would still qualify for the output of this query.
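The difference between the two approaches can be sketched in plain JavaScript. The helpers matchesExactly and getPath are hypothetical stand-ins for what the server does, and the listing is made up.

```javascript
// Exact match: same fields, same values, same field order.
const matchesExactly = (subdoc, query) => JSON.stringify(subdoc) === JSON.stringify(query);

// Dot notation: follow the path and look only at that one nested value.
const getPath = (doc, path) =>
  path.split(".").reduce((obj, key) => (obj == null ? undefined : obj[key]), doc);

const listing = {
  name: "Sample listing",
  availability: { availability_30: 0, availability_60: 0, availability_90: 0, availability_365: 32 },
};

const sameOrder = matchesExactly(listing.availability,
  { availability_30: 0, availability_60: 0, availability_90: 0, availability_365: 32 });
const swapped = matchesExactly(listing.availability,
  { availability_60: 0, availability_30: 0, availability_90: 0, availability_365: 32 });
const byPath = getPath(listing, "availability.availability_365") === 32;

console.log(sameOrder); // true
console.log(swapped);   // false, because the field order differs
console.log(byPath);    // true, regardless of the other nested fields
```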

Querying an Array

The amenities field and the host_verifications field (inside the host document) in the documents of the listingsAndReviews collection are examples of array fields.

As usual, we have multiple ways to query an array field.

The first is to match the array exactly. We will use the host_verifications field for this purpose. We want to find listings where the verifications are exactly email and phone, in that order.

db.listingsAndReviews.findOne({ "host.host_verifications": ["email", "phone"] }, { name: 1, host: 1 })

However, if we want to query regardless of order and presence of other elements, we can use the $all operator.

db.listingsAndReviews.findOne({ "host.host_verifications": { $all: ["email", "phone"] } }, { name: 1, host: 1 })
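The two matching modes can be compared side by side with a plain JavaScript sketch. The helpers exactArrayMatch and containsAll are hypothetical, and the hosts are made up.

```javascript
// Exact match: same elements in the same order, nothing extra.
const exactArrayMatch = (arr, query) =>
  arr.length === query.length && arr.every((v, i) => v === query[i]);

// $all: every listed element is present, in any order, extras allowed.
const containsAll = (arr, required) => required.every((v) => arr.includes(v));

// Made-up hosts, not taken from the sample_airbnb dataset.
const hosts = [
  { host_name: "H1", host_verifications: ["email", "phone"] },
  { host_name: "H2", host_verifications: ["phone", "email"] },
  { host_name: "H3", host_verifications: ["email", "phone", "reviews"] },
  { host_name: "H4", host_verifications: ["email"] },
];
const wanted = ["email", "phone"];

const exact = hosts.filter((h) => exactArrayMatch(h.host_verifications, wanted)).map((h) => h.host_name);
const all = hosts.filter((h) => containsAll(h.host_verifications, wanted)).map((h) => h.host_name);

console.log(exact); // [ 'H1' ]
console.log(all);   // [ 'H1', 'H2', 'H3' ]
```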

Querying for Null or Missing Fields

To query a collection for documents where a field is not present or is null, we use the following statement

db.collectionName.find({field: null})
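The sketch below uses plain JavaScript and made-up documents to show that this filter matches both cases. To match only documents that lack the field entirely, the $exists operator can be used instead, as in { team: { $exists: false } }.

```javascript
// Made-up documents: team is set, null, or missing.
const docs = [
  { _id: 1, team: "dev" },
  { _id: 2, team: null },
  { _id: 3 },
];

// Like { team: null }: matches a null value and a missing field.
const nullOrMissing = docs.filter((d) => d.team == null).map((d) => d._id);

// Like { team: { $exists: false } }: matches only a missing field.
const missingOnly = docs.filter((d) => !("team" in d)).map((d) => d._id);

console.log(nullOrMissing); // [ 2, 3 ]
console.log(missingOnly);   // [ 3 ]
```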

Check out other MongoDB Query operators at this link.

Updating Documents

Following are the update methods available in MongoDB:

  1. updateOne – updates the first document matching the filter criteria
  2. updateMany – updates all the documents matching the filter criteria
  3. replaceOne – replaces the first document matching the filter criteria with the update document

Following is the syntax for updateOne method:

db.collection.updateOne(
   <filter>,
   <update>,
   {
     upsert: <boolean>,
     writeConcern: <document>,
     collation: <document>,
     arrayFilters: [ <filterdocument1>, ... ],
     hint:  <document|string>        // Available starting in MongoDB 4.2.1
   }
)

The first argument is the filter criteria selecting the document to be updated. In SQL terms, this argument is the equivalent of the WHERE clause. If you specify an empty document here, the first document in the collection is updated. The second argument is the update document specifying the changes to apply. The third argument is the options document. Of all the options shown here, we only care about upsert for now: if the filter matches no document and upsert is set to true, a new document built from the filter and the update document is inserted into the collection.

It returns a document that contains:

  • matchedCount – number of documents matching the filter criteria,
  • modifiedCount – number of documents that were modified,
  • upsertedId – _id of the upserted document, if an upsert took place, and
  • acknowledged – a boolean that is true if the operation ran with write concern and false if write concern was disabled.
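The difference between matchedCount and modifiedCount is easiest to see when an update changes nothing. Here is a minimal plain JavaScript model of it. It is a sketch under assumptions, not the MongoDB implementation: updateOneModel is a hypothetical helper that handles only top-level equality filters and a $set-style list of fields, and it leaves out upsertedId.

```javascript
// Simplified in-memory model of updateOne with $set (not the MongoDB driver).
// Assumptions: the filter is a set of top-level equality conditions
// and all values are primitives.
function updateOneModel(docs, filter, fieldsToSet) {
  const target = docs.find((doc) =>
    Object.entries(filter).every(([key, value]) => doc[key] === value)
  );
  if (!target) {
    return { acknowledged: true, matchedCount: 0, modifiedCount: 0 };
  }
  let changed = false;
  for (const [key, value] of Object.entries(fieldsToSet)) {
    if (target[key] !== value) {
      target[key] = value;
      changed = true;
    }
  }
  return { acknowledged: true, matchedCount: 1, modifiedCount: changed ? 1 : 0 };
}

const modelDocs = [{ _id: 1, name: "name1", value: "value 1" }];

const first = updateOneModel(modelDocs, { _id: 1 }, { value: "changed" });
const second = updateOneModel(modelDocs, { _id: 1 }, { value: "changed" }); // same value again
const third = updateOneModel(modelDocs, { _id: 99 }, { value: "changed" }); // no such document

console.log(first.matchedCount, first.modifiedCount);   // 1 1
console.log(second.matchedCount, second.modifiedCount); // 1 0
console.log(third.matchedCount, third.modifiedCount);   // 0 0
```

The second call matches the document but has nothing to change, so it counts as matched and not as modified.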

To experiment with this method, let’s create a new database called test_db and insert the following documents into the collection testColl using the insertMany method:

db.testColl.insertMany(
[{_id:1, name: "name1", value: "value 1"},
{_id:2, name: "name2", value: "value 2"},
{_id:3, name: "name3", value: "value3", arr: [1,2,3]},
{_id:4, name: "name4", value: "value 4", nest_doc: {k1: 1, k2: 2, k3: 3}}
])

Updating the Value of a Field in a Document

Now, let’s say we want to update the ‘value’ field of the document with _id=3. To do so, we will use the $set operator.

db.testColl.updateOne(
{_id:3},
{$set: {value: "value 3 upd"}}
)

Inserting a New Field in the Document

Let’s say we want to add a new field to the document. The $set operator handles that too: if the field does not exist yet, $set creates it.

db.testColl.updateOne(
{_id:2},
{$set: {"newField": "new value"}}
)

Incrementing the Value of a Field

What if we want to increment the value of a field? $inc is the operator to use for that. We will increase the value of k1 field in nest_doc in _id=4 document by 10 units:

db.testColl.updateOne(
{_id:4},
{$inc: {"nest_doc.k1": 10}}
)
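The dotted key "nest_doc.k1" is a path into the embedded document. As a rough illustration, the plain JavaScript sketch below walks such a path and adds to the value it finds. The helper incrementPath is hypothetical and assumes that every intermediate object on the path already exists; it is not how MongoDB implements $inc.

```javascript
// Model of {$inc: {"nest_doc.k1": 10}}: follow the dotted path, then add.
// Assumption: every object along the path (here nest_doc) already exists.
function incrementPath(doc, path, amount) {
  const keys = path.split(".");
  const lastKey = keys.pop();
  let parent = doc;
  for (const key of keys) {
    parent = parent[key];
  }
  // A missing field is treated as 0, so it ends up equal to `amount`.
  parent[lastKey] = (parent[lastKey] ?? 0) + amount;
  return doc;
}

const nestedDoc = { _id: 4, name: "name4", nest_doc: { k1: 1, k2: 2, k3: 3 } };

incrementPath(nestedDoc, "nest_doc.k1", 10);
console.log(nestedDoc.nest_doc.k1); // 11

// A negative amount decrements; k4 does not exist yet, so it is created.
incrementPath(nestedDoc, "nest_doc.k4", -2);
console.log(nestedDoc.nest_doc.k4); // -2
```

$inc behaves the same way on these two points: a negative value decrements the field, and a field that does not exist is created with the given value.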

Appending Values to An Array

If we want to add a value to an array, we can use the $push operator.

db.testColl.updateOne(
{_id:3},
{$push: {arr: 4}}
)
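One detail worth knowing about $push is that it appends its argument as a single element, even when that argument is itself an array. The plain JavaScript sketch below models this; pushValue is a hypothetical helper written for this illustration, not a MongoDB function.

```javascript
// Model of {$push: {field: value}} on an in-memory document.
function pushValue(doc, field, value) {
  if (!(field in doc)) {
    doc[field] = []; // a missing field becomes a new array
  }
  if (!Array.isArray(doc[field])) {
    throw new Error(`Field "${field}" is not an array`);
  }
  doc[field].push(value); // the value is appended as ONE element
  return doc;
}

const arrayDoc = { _id: 3, name: "name3", arr: [1, 2, 3] };

pushValue(arrayDoc, "arr", 4);
console.log(JSON.stringify(arrayDoc.arr)); // [1,2,3,4]

// Pushing an array nests it instead of appending its items one by one.
pushValue(arrayDoc, "arr", [5, 6]);
console.log(JSON.stringify(arrayDoc.arr)); // [1,2,3,4,[5,6]]
```

In the shell, appending several values as separate elements is done with the $each modifier, for example {$push: {arr: {$each: [5, 6]}}}.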

More array update operators are described in the MongoDB documentation.

UpdateMany Method

The syntax for updateMany method is as follows:

db.collection.updateMany(
   <filter>,
   <update>,
   {
     upsert: <boolean>,
     writeConcern: <document>,
     collation: <document>,
     arrayFilters: [ <filterdocument1>, ... ],
     hint:  <document|string>        // Available starting in MongoDB 4.2.1
   }
)

If we want to add a status field with the value "Active" to all documents, we pass an empty filter and update the documents in the following manner:

db.testColl.updateMany(
{},
{$set: {status: "Active"}}
)

ReplaceOne Method

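Now we will talk about the replaceOne method. Its syntax is similar to that of the previous methods, with two differences: the second argument is a replacement document, which cannot contain update operators, and there is no arrayFilters option. The replacement takes over all fields of the matched document except _id.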

db.collection.replaceOne(
   <filter>,
   <replacement>,
   {
     upsert: <boolean>,
     writeConcern: <document>,
     collation: <document>,
     hint: <document|string>                   // Available starting in 4.2.1
   }
)

An example will clear up the concept. Let’s say we want to replace the document with _id=1. We can write the following statement:

db.testColl.replaceOne(
{_id:1},
{name: "new name", value: "new value", status: "Inactive"}
)
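The key difference from $set is that fields missing from the replacement are dropped. Here is a plain JavaScript sketch of that behaviour. It is an in-memory model, not MongoDB code: replaceOneModel is a hypothetical helper, and for simplicity it looks documents up by _id only.

```javascript
// Model of replaceOne: swap the whole document, keeping only its _id.
// Returns the number of documents replaced (0 or 1).
function replaceOneModel(docs, id, replacement) {
  const index = docs.findIndex((doc) => doc._id === id);
  if (index === -1) {
    return 0;
  }
  docs[index] = { _id: docs[index]._id, ...replacement };
  return 1;
}

const replaceDocs = [{ _id: 3, name: "name3", value: "value3", arr: [1, 2, 3] }];

const replaced = replaceOneModel(replaceDocs, 3, { name: "new name" });

console.log(replaced);                       // 1
console.log(JSON.stringify(replaceDocs[0])); // {"_id":3,"name":"new name"}
```

The value and arr fields are gone because the replacement did not include them, whereas a $set update would have left them in place.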

Updating with Upsert Option

If the filter criteria don’t match any document and the upsert option is true, MongoDB inserts a new document built from the equality conditions in the filter and the fields in the update document.

Example

db.testColl.updateOne(
{_id:10},
{$set: {name: "upsert name", value: "upsert value"}},
{upsert: true}
)
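To make the shape of the inserted document concrete, here is a plain JavaScript model of an upsert with $set. Treat it as a sketch under assumptions: upsertModel is a hypothetical helper, the filter contains only top-level equality conditions, and the filter includes _id (when it does not, MongoDB generates an _id itself, which this model does not imitate).

```javascript
// Model of updateOne(filter, {$set: fields}, {upsert: true}).
// Assumption: the filter holds top-level equality conditions and includes _id.
function upsertModel(docs, filter, fieldsToSet) {
  const target = docs.find((doc) =>
    Object.entries(filter).every(([key, value]) => doc[key] === value)
  );
  if (target) {
    Object.assign(target, fieldsToSet); // ordinary update, nothing inserted
    return { matchedCount: 1, upsertedId: null };
  }
  // No match: the new document combines the filter fields and the $set fields.
  const inserted = { ...filter, ...fieldsToSet };
  docs.push(inserted);
  return { matchedCount: 0, upsertedId: inserted._id };
}

const upsertDocs = [{ _id: 1, name: "name1", value: "value 1" }];

const upsertResult = upsertModel(
  upsertDocs,
  { _id: 10 },
  { name: "upsert name", value: "upsert value" }
);

console.log(upsertResult.upsertedId);        // 10
console.log(JSON.stringify(upsertDocs[1]));  // {"_id":10,"name":"upsert name","value":"upsert value"}
```

Running the same call a second time finds the document with _id 10, so it takes the ordinary update path and inserts nothing.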

This option is available for all three update methods mentioned before.

Deleting Documents

Now, we arrive at the final section of our guide – deleting documents. As you would have guessed, we have the following two methods for deleting MongoDB documents:

  1. deleteOne() – deletes one document from the list of documents matching the filter criteria
  2. deleteMany() – deletes all the documents matching the filter criteria

The syntax for the deleteOne method is similar to that of its counterparts among the other CRUD methods:

db.collection.deleteOne(
   <filter>,
   {
      writeConcern: <document>,
      collation: <document>,
      hint: <document|string>        // Available starting in MongoDB 4.4
   }
)

The first argument is the filter document. Passing an empty document here will delete the first document returned by the collection. The second argument is the options document, containing fields such as writeConcern and collation. For the purposes of this guide, we will ignore these options.

This method returns the following fields:

  • acknowledged – a boolean that is true if the operation ran with write concern and false if write concern was disabled
  • deletedCount – the number of deleted documents

Let’s try to delete the document from our testColl where _id is 2.

db.testColl.deleteOne(
{_id:2})

The deleteMany method follows the same pattern:

db.collection.deleteMany(
   <filter>,
   {
      writeConcern: <document>,
      collation: <document>
   }
)

The first argument is again the filter document. Passing an empty document here will delete all the documents from the collection.
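The contrast between the two delete methods, including what an empty filter does in each, can be modelled in a few lines of plain JavaScript. This is an illustration only: deleteOneModel, deleteManyModel, and the sample documents are hypothetical, and the filter is limited to top-level equality conditions.

```javascript
// True when every key in the filter equals the document's value.
// An empty filter has no conditions, so it matches every document.
function matchesFilter(doc, filter) {
  return Object.entries(filter).every(([key, value]) => doc[key] === value);
}

// Model of deleteOne: remove only the first matching document.
function deleteOneModel(docs, filter) {
  const index = docs.findIndex((doc) => matchesFilter(doc, filter));
  if (index === -1) {
    return { acknowledged: true, deletedCount: 0 };
  }
  docs.splice(index, 1);
  return { acknowledged: true, deletedCount: 1 };
}

// Model of deleteMany: remove every matching document.
function deleteManyModel(docs, filter) {
  const kept = docs.filter((doc) => !matchesFilter(doc, filter));
  const deletedCount = docs.length - kept.length;
  docs.length = 0;
  docs.push(...kept);
  return { acknowledged: true, deletedCount };
}

const deleteDocs = [
  { _id: 1, status: "Active" },
  { _id: 2, status: "Active" },
  { _id: 3, status: "Inactive" },
];

const oneResult = deleteOneModel(deleteDocs, { status: "Active" });
console.log(oneResult.deletedCount, deleteDocs.map((d) => d._id)); // 1 [ 2, 3 ]

const manyResult = deleteManyModel(deleteDocs, {});
console.log(manyResult.deletedCount, deleteDocs.length); // 2 0
```

Because an empty filter empties the whole collection, it is worth running find with the same filter first to check what a deleteMany call is about to remove.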

Let’s try to delete documents having “name” in the name field followed by a digit. To find such documents, we will use the $regex operator.

db.testColl.find({name: {$regex: 'name[0-9]{1}'}})

Hence, if we run the deleteMany method with the same regex filter, the two documents returned by the find query above should get deleted. Let’s check it out:

db.testColl.deleteMany({name: {$regex: 'name[0-9]{1}'}})

The deletedCount field in the result confirms that all the matching documents were deleted.
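The pattern itself can be tried out in plain JavaScript before it goes anywhere near a delete. The sketch below assumes that all the earlier examples in this guide were run in order, so the name values in testColl are the four listed here; the extra string "myname12" is made up to show that the pattern is not anchored.

```javascript
// The same pattern that was passed to $regex above.
const pattern = new RegExp("name[0-9]{1}");

// Assumed name values left in testColl after the earlier examples.
const names = ["new name", "name3", "name4", "upsert name"];

const matchingNames = names.filter((name) => pattern.test(name));
console.log(matchingNames); // [ 'name3', 'name4' ]

// The pattern matches anywhere in the string, so this also matches.
console.log(pattern.test("myname12")); // true

// Anchoring with ^ and $ restricts it to "name" plus exactly one digit.
const anchored = new RegExp("^name[0-9]$");
console.log(anchored.test("myname12")); // false
console.log(anchored.test("name3"));    // true
```

Since deleteMany removes everything that matches, an anchored pattern is the safer choice when only names of the exact form name plus one digit should go.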

With this, we conclude our guide on MongoDB CRUD operations. The next section summarizes what we learned.

Conclusion

In this article, we have gained a basic understanding of the MongoDB NoSQL database. We now know that it is a document database supporting two essential features: schema flexibility and data model scalability. The most important principle in MongoDB database design is that data which is accessed together should be stored together, to reduce the need for joins and make the model more comprehensible overall.

  • A database is called the same in MongoDB as in SQL. A table, however, translates to a collection in MongoDB. These databases and collections get created when data is stored in them for the first time.
  • We have two methods for inserting documents into a collection – insertOne() and insertMany().
  • The methods for querying documents from a collection are – findOne() and find().
  • We have three methods for updating documents in a collection – updateOne(), updateMany(), and replaceOne().
  • The methods for deleting documents from a collection are – deleteOne() and deleteMany().

References

  • https://www.mongodb.com/nosql-explained
  • https://www.ibm.com/topics/nosql-databases
  • https://www.mongodb.com/docs/manual/sharding/
  • https://www.mongodb.com/docs/manual/introduction/
  • https://www.mongodb.com/docs/manual/core/views/
  • https://www.mongodb.com/docs/manual/reference/sql-comparison/
  • https://www.mongodb.com/docs/manual/core/document/#std-label-document-structure
  • https://www.mongodb.com/docs/manual/reference/sql-aggregation-comparison/
  • https://www.mongodb.com/json-and-bson
  • https://www.mongodb.com/docs/manual/query-api/
  • https://www.mongodb.com/docs/manual/core/replica-set-write-concern/
  • https://www.mongodb.com/docs/manual/reference/write-concern/
  • https://www.mongodb.com/docs/manual/reference/operator/query/#std-label-query-selectors

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.
