
How To Configure a MongoDB Replica Set on Ubuntu from Scratch

  • Writer: Shreyansh Kumar
  • Sep 5, 2023
  • 1 min read

Goal

The goal is to create a MongoDB replica set with one primary node and two secondary nodes on Ubuntu 20.04 instances. For this we will be referencing the DigitalOcean and MongoDB official documentation.

Prerequisites

  • Three instances provisioned in the same VPC sharing the same security group.

  • Inbound traffic allowed in the security group shared by instances for port 27017.

Steps

Welcome to the steps! I am overjoyed that you are reading this.

At this point we have one EC2 instance (Ubuntu 20.04) that can communicate on port 27017 with other instances sharing the same security group.

Step 01: Install MongoDB Community Edition on EC2 Instances


wget -qO - https://www.mongodb.org/static/pgp/server-5.0.asc | sudo apt-key add -
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/5.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-5.0.list
sudo apt-get update
sudo apt-get install -y mongodb-org=5.0.5 mongodb-org-database=5.0.5 mongodb-org-server=5.0.5 mongodb-org-shell=5.0.5 mongodb-org-mongos=5.0.5 mongodb-org-tools=5.0.5
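Because the install pins version 5.0.5, a later apt-get upgrade could move the packages to a newer release. A minimal sketch of how the pinned packages could be put on hold, assuming you want to keep that exact version; the helper name mongodb_hold_selections is hypothetical, and the package list simply mirrors the install command above.

```shell
# Print one "<package> hold" selection line for each MongoDB package
# installed above, in the format that dpkg --set-selections reads.
mongodb_hold_selections() {
  for pkg in mongodb-org mongodb-org-database mongodb-org-server \
             mongodb-org-shell mongodb-org-mongos mongodb-org-tools; do
    echo "$pkg hold"
  done
}

# Usage on the instance (requires root):
#   mongodb_hold_selections | sudo dpkg --set-selections
```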

Step 02: Update mongod.conf file.

Before proceeding, back up the original MongoDB configuration file:

sudo cp -prv /etc/mongod.conf /etc/mongod.conf.bkp

Now update /etc/mongod.conf as below :

[Image: contents of the updated /etc/mongod.conf, not reproduced in this text version]

Note: Replace <db name> with the desired database name wherever you find it. Copy the contents of the file as they are (with proper indentation) and paste them into mongod.conf. Don’t worry! I’m gonna explain each field of this configuration in detail in a later section of this blog.

Copy the pem key of the EC2 instance into the /home/ubuntu/.ssh folder so that the instances can reach one another over SSH (it’s not mandatory, but recommended).
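Since the exact file is not available as text here, the following is only a sketch of what such a configuration might look like. It is assembled from the fields covered in the reference section and the paths used in later steps (/data/<db name>/db, /data/logs/mongodb, the POCRep replica set name). The log file name, the bindIp value, and the helper name write_sample_mongod_conf are assumptions, not the author's settings. The security section is left out because it is added in Step 10.

```shell
# Write a sample mongod.conf to the given path for the given database name.
# Every value below is an illustrative assumption; review before use.
write_sample_mongod_conf() {
  out_path="$1"
  db_name="$2"
  cat > "$out_path" <<EOF
storage:
  dbPath: /data/${db_name}/db
  directoryPerDB: true
  journal:
    enabled: true
  engine: wiredTiger
systemLog:
  destination: file
  logAppend: true
  path: /data/logs/mongodb/mongod.log
net:
  port: 27017
  bindIp: 0.0.0.0   # assumption: access is restricted by the security group
processManagement:
  timeZoneInfo: /usr/share/zoneinfo
replication:
  replSetName: POCRep
EOF
}

# Usage: write to a scratch file first, review it, then copy it into place.
#   write_sample_mongod_conf /tmp/mongod.conf.sample mydb
```

Writing to a scratch path rather than directly to /etc/mongod.conf keeps the backup from the previous command meaningful and gives you a chance to check the indentation.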

Step 03: Prerequisite for Keyfile Authentication.

Sign in to the instance and run the commands below:

sudo su
openssl rand -base64 768 > /etc/mongo-key
chmod 400 /etc/mongo-key
sudo chown mongodb:mongodb /etc/mongo-key

We’ve successfully created a keyfile and set the permissions it needs. The DB instances will use it to communicate securely with each other for synchronisation.
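mongod refuses a keyfile that is readable by group or others, and it expects between 6 and 1024 base64 characters. A small sanity check along those lines could look like this; the function name check_keyfile is hypothetical, and stat -c assumes GNU coreutils as shipped with Ubuntu.

```shell
# Report whether a keyfile has owner-only permissions and a valid length.
check_keyfile() {
  keyfile="$1"
  if [ ! -f "$keyfile" ]; then
    echo "missing"
    return 1
  fi
  mode=$(stat -c '%a' "$keyfile")
  # Count characters, ignoring the line breaks openssl inserts.
  chars=$(tr -d '[:space:]' < "$keyfile" | wc -c)
  case "$mode" in
    400|600) ;;
    *) echo "bad mode $mode"; return 1 ;;
  esac
  if [ "$chars" -lt 6 ] || [ "$chars" -gt 1024 ]; then
    echo "bad length $chars"
    return 1
  fi
  echo "ok"
}

# Usage on the instance (the file is readable only by its owner, so run as root):
#   sudo sh -c '. ./check_keyfile.sh; check_keyfile /etc/mongo-key'
```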

Step 04: Update Limit configuration

We are going to update the soft and hard limits on open files by appending the lines below to the end of /etc/security/limits.conf:

* soft nofile 8096
* hard nofile 8096
root soft nofile 8096
root hard nofile 8096
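The values in limits.conf are applied at login, so it helps to confirm what a process actually ended up with. A minimal sketch that reads the effective open-file limits of any process from /proc; the helper name proc_nofile is hypothetical.

```shell
# Print the soft and hard "Max open files" limits of a running process.
proc_nofile() {
  pid="$1"
  awk '/^Max open files/ { print $4, $5 }' "/proc/$pid/limits"
}

# Usage: limits of the current shell, then of mongod once it is running.
#   proc_nofile $$
#   proc_nofile "$(pidof mongod)"
```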

Step 05: Attach an additional volume(EBS) of type io1

In this step we provision one EBS volume of type io1. It is then mounted at a specific destination (in our case, /data) with an XFS filesystem. Please note that the io1 type is an ideal choice for a production environment; you may consider a different volume type as per your needs.

lsblk #To ensure that recently attached volume appears to be available on the server

sudo file -s /dev/<volume name> #To ensure that recently attached volume is not formatted yet

Proceed with the next step only if the output appears as /dev/<volume name>: data, because this means that the volume is not formatted yet. If you see any other output, it’s most likely that a filesystem is already configured, which also implies that you have, unfortunately, selected the wrong volume to format.
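Since formatting the wrong volume is destructive, the check can be made explicit before running mkfs. A sketch, assuming the output of file -s is passed in as a string; the function name is_unformatted and the device names in the usage lines are hypothetical.

```shell
# Succeed only when the `file -s` output ends in exactly ": data",
# which is what an unformatted volume reports.
is_unformatted() {
  case "$1" in
    *": data") return 0 ;;
    *) return 1 ;;
  esac
}

# Usage on the instance:
#   out=$(sudo file -s /dev/<volume name>)
#   is_unformatted "$out" && echo "safe to format" || echo "STOP: $out"
```

Matching the whole suffix matters here, because the output for an already formatted XFS volume also contains the word "data".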

sudo mkfs -t xfs /dev/<volume name> #format volume with XFS filesystem
sudo mkdir /data #Create a directory named data at root directory
sudo mount /dev/<volume name> /data #Mount recently added volume at /data
sudo cp /etc/fstab /etc/fstab.orig #Create a backup of your /etc/fstab file that you can use if you accidentally destroy or delete this file while editing it.
sudo blkid #Command to find the UUID of the device
sudo vim /etc/fstab #Open the /etc/fstab file using vim text editor

Add the following entry to /etc/fstab to mount the attached EBS volume on every system reboot:

UUID=<UUID of /dev/<volume name>> /data xfs defaults,nofail 0 2
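To avoid copy and paste mistakes in the UUID, the entry can be generated rather than typed. A minimal sketch, assuming the same mount options as the entry above; the helper name fstab_entry is hypothetical.

```shell
# Build the /etc/fstab line for an XFS volume identified by UUID.
fstab_entry() {
  uuid="$1"
  mount_point="$2"
  printf 'UUID=%s %s xfs defaults,nofail 0 2\n' "$uuid" "$mount_point"
}

# Usage on the instance:
#   uuid=$(sudo blkid -s UUID -o value /dev/<volume name>)
#   fstab_entry "$uuid" /data | sudo tee -a /etc/fstab
```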

To verify that your entry works, run the following commands to unmount the device and then mount all file systems in /etc/fstab. If there are no errors, the /etc/fstab file is OK and your file system will mount automatically after it is rebooted.

sudo umount /data
sudo mount -a

If you receive an error message, address the errors in the file.

Warning: Errors in the /etc/fstab file can render a system unbootable. Do not shut down a system that has errors in the /etc/fstab file.

If the previous step succeeds, run the commands below to create the directories and set permissions:

sudo mkdir -p /data/<db name>/db
sudo chown -R mongodb:mongodb /data
sudo chown -R mongodb:mongodb /data/<db name>/
sudo chown -R mongodb:mongodb /data/<db name>/db
sudo mkdir -p /data/logs/mongodb/
sudo chown -R mongodb:mongodb /data/logs
sudo chown -R mongodb:mongodb /data/logs/mongodb

Step 06: Create an AMI of this instance

Note: A machine launched from this AMI will have the same UUID for the attached additional EBS volume, so the same fstab entry will keep working.

Provision two more instances using the AMI created above. Create an A record for each of these instances in the desired hosted zone. As a result, the instances will be able to discover each other by record name.

Step 07: Start MongoDB service and Test connectivity


sudo systemctl restart mongod
sudo systemctl enable mongod
sudo systemctl status mongod
nc -zv <Record name of first instance> 27017; nc -zv <Record name of second instance> 27017; nc -zv <Record name of third instance> 27017
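The same connectivity test can be wrapped in a loop that reports each member and returns a failure status if any of them is unreachable. This is a sketch only: the function names check_port and check_members are hypothetical, and the 3 second timeout is an arbitrary choice.

```shell
# Probe one host:port with nc and report the result.
check_port() {
  target_host="$1"
  target_port="$2"
  if nc -z -w 3 "$target_host" "$target_port" >/dev/null 2>&1; then
    echo "$target_host:$target_port reachable"
  else
    echo "$target_host:$target_port unreachable"
    return 1
  fi
}

# Probe every member on the same port; fail if any probe fails.
check_members() {
  member_port="$1"
  shift
  failed=0
  for member in "$@"; do
    check_port "$member" "$member_port" || failed=1
  done
  return "$failed"
}

# Usage:
#   check_members 27017 <Record name of first instance> \
#     <Record name of second instance> <Record name of third instance>
```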

If all previous steps have been successfully completed, proceed to the next step.

Step 08: Establish a Replica Set

SSH into any of these three instances and execute the command listed below.


mongo
rs.initiate( { _id: "POCRep", members: [ { _id: 0, host: "<Record name of first instance>" }, { _id: 1, host: "<Record name of second instance>" }, { _id: 3, host: "<Record name of third instance>" } ] })
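If you prefer to run this step without typing into the shell, the rs.initiate() document can be generated from a list of host names and passed to mongo --eval. A sketch, assuming member ids are simply numbered from 0 (ids only need to be unique); the function name rs_initiate_js and the example host names are hypothetical.

```shell
# Print an rs.initiate() call for the given set name and member hosts.
rs_initiate_js() {
  set_name="$1"
  shift
  members=""
  id=0
  for host in "$@"; do
    if [ -n "$members" ]; then
      members="$members, "
    fi
    members="${members}{ _id: $id, host: \"$host\" }"
    id=$((id + 1))
  done
  printf 'rs.initiate({ _id: "%s", members: [ %s ] })\n' "$set_name" "$members"
}

# Usage on one of the instances:
#   mongo --eval "$(rs_initiate_js POCRep mongo1.example.internal \
#     mongo2.example.internal mongo3.example.internal)"
```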

Step 09: Create database admin account

Navigate to the primary Mongo instance, sign in to the Mongo shell, and switch to the admin database:

use admin
db.createUser( { user: "UserAdmin", pwd: "password", roles: [ { role: "userAdminAnyDatabase", db: "admin" } ] } ) // Create an administrator user account
db.auth( "UserAdmin", "password" ) // Verify that the password was entered properly
exit
mongo -u "UserAdmin" -p --authenticationDatabase "admin"
or
mongo -u UserAdmin --authenticationDatabase "admin" -p # Connect to the database using the administrator account
use admin
db.createUser( { user: "ClusterAdmin", pwd: "password", roles: [ { role: "clusterAdmin", db: "admin" } ] } ) // Create the cluster administrator account
exit
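Typing the password inline leaves it in the shell history. The mongo shell provides passwordPrompt(), which asks for the password interactively instead. A sketch that prints a createUser statement using it, ready to paste into the Mongo shell; the function name create_user_js is hypothetical.

```shell
# Print a createUser statement for the admin database that prompts
# for the password instead of embedding it in the command.
create_user_js() {
  user="$1"
  role="$2"
  printf 'db.getSiblingDB("admin").createUser({ user: "%s", pwd: passwordPrompt(), roles: [ { role: "%s", db: "admin" } ] })\n' "$user" "$role"
}

# Usage: print the statements, then paste them into the Mongo shell.
#   create_user_js UserAdmin userAdminAnyDatabase
#   create_user_js ClusterAdmin clusterAdmin
```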

Step 10: Enable database authentication

Sign in to the secondary replica set members one by one and update the security section of /etc/mongod.conf as below, so that the member accepts authenticated as well as unauthenticated connections:

. . .
security:
  keyFile: /etc/mongo-key
  transitionToAuth: true
. . .

sudo systemctl restart mongod #restart Mongo

Return to the primary Mongo instance and step it down to secondary using the command below. Another member is then elected primary, so the replica set stays operational while we change this instance.


mongo
rs.stepDown()
exit

Now, update security configuration on this instance as well by editing /etc/mongod.conf. Restart mongod service.
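The order of this rolling change (secondaries first, the current primary last) can be written down as a tiny helper. A sketch only: the function name rolling_order is hypothetical, and the usage line assumes db.isMaster().primary reports the current primary in the same form as the host names you pass in.

```shell
# Given the current primary followed by all members, print the members
# in the order they should be changed: secondaries first, primary last.
rolling_order() {
  primary="$1"
  shift
  for member in "$@"; do
    if [ "$member" != "$primary" ]; then
      echo "$member"
    fi
  done
  echo "$primary"
}

# Usage:
#   primary=$(mongo --quiet --eval 'db.isMaster().primary')
#   rolling_order "$primary" host1:27017 host2:27017 host3:27017
```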

Step 11: Allow only authenticated connections

Comment out transitionToAuth: true on all of the servers. As a result, the security configuration will look like this:

. . .
security:
  keyFile: /etc/mongo-key
  #transitionToAuth: true
. . .


sudo systemctl restart mongod #restart mongod service
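Commenting out the line by hand on three servers is easy to get wrong, so the edit can be scripted. A minimal sketch using GNU sed (as shipped with Ubuntu), which keeps the original indentation; the function name comment_transition_to_auth is hypothetical.

```shell
# Comment out the transitionToAuth line in place, preserving indentation.
# Running it a second time changes nothing.
comment_transition_to_auth() {
  conf_file="$1"
  sed -i 's/^\([[:space:]]*\)transitionToAuth:/\1#transitionToAuth:/' "$conf_file"
}

# Usage on each member, followed by a restart:
#   sudo sh -c '. ./comment_auth.sh; comment_transition_to_auth /etc/mongod.conf'
#   sudo systemctl restart mongod
```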

Note: Personally, I prefer not to make changes on active primary members; instead, I prefer to perform on secondary instances first, then move on to primary, make it secondary, and repeat the process.


mongo --eval 'rs.status()' # This must fail from now on, because only an authenticated admin can run these commands on the cluster; there should also be no security-related warnings

You can change other configuration options using the same method: apply the change on the secondaries first, restart the service, then move on to the primary, step it down to secondary, and repeat the process.

We’ve successfully provisioned a MongoDB replica set on Ubuntu 20.04 EC2 instances. Give yourself a HUGE pat on the back! This is probably a good moment to grab a coffee (or tea) or something to eat to re-energize yourself.



MongoDB configuration file options reference

This section is not required reading, but as promised earlier, I will explain each configuration field of /etc/mongod.conf in detail here.

storage.dbPath

Type: string Default:

  • /data/db on Linux and macOS

The directory where the mongod instance stores its data.

storage.directoryPerDB

Type: boolean
Default: false

When true, MongoDB uses a separate directory to store data for each database. The directories are under the storage.dbPath directory, and each subdirectory name corresponds to the database name.

storage.journal.enabled

Type: boolean
Default: true on 64-bit systems, false on 32-bit systems

Enable or disable the durability journal to ensure data files remain valid and recoverable. This option applies only when you specify the storage.dbPath setting. mongod enables journaling by default.

Starting in MongoDB 4.0, you cannot specify the --nojournal option or storage.journal.enabled: false for replica set members that use the WiredTiger storage engine. For the WiredTiger storage engine, storage.journal.enabled: false cannot be used in conjunction with replication.replSetName.

storage.engine

Default: wiredTiger

The storage engine for the mongod database. Available values include:

  • wiredTiger: the WiredTiger storage engine.

  • inMemory: the In-Memory storage engine (available in MongoDB Enterprise only).

NOTE: Starting in version 4.2, MongoDB removes the deprecated MMAPv1 storage engine.

If you attempt to start a mongod with a storage.dbPath that contains data files produced by a storage engine other than the one specified by storage.engine, mongod will refuse to start.

storage.wiredTiger.engineConfig.directoryForIndexes

Type: boolean
Default: false

When storage.wiredTiger.engineConfig.directoryForIndexes is true, mongod stores indexes and collections in separate subdirectories under the data (i.e. storage.dbPath) directory. Specifically, mongod stores the indexes in a subdirectory named index and the collection data in a subdirectory named collection.

By using a symbolic link, you can specify a different location for the indexes. Specifically, when the mongod instance is not running, move the index subdirectory to the destination and create a symbolic link named index under the data directory to the new destination.

systemLog.destination

Type: string

The destination to which MongoDB sends all log output. Specify either file or syslog. If you specify file, you must also specify systemLog.path. If you do not specify systemLog.destination, MongoDB sends all log output to standard output.

systemLog.logAppend

Type: boolean
Default: false

When true, mongos or mongod appends new entries to the end of the existing log file when the mongos or mongod instance restarts. Without this option, mongod will back up the existing log and create a new file.

systemLog.logRotate

Type: string
Default: rename

Determines the behavior for the logRotate command when rotating the server log and/or the audit log. Specify either rename or reopen:

  • rename renames the log file.

  • reopen closes and reopens the log file following the typical Linux/Unix log rotate behavior. Use reopen when using the Linux/Unix logrotate utility to avoid log loss.

  • If you specify reopen, you must also set systemLog.logAppend to true.

systemLog.path

Type: string

The path of the log file to which mongod or mongos should send all diagnostic logging information, rather than the standard output or the host’s syslog. MongoDB creates the log file at the specified path.

The Linux package init scripts do not expect systemLog.path to change from the defaults. If you use the Linux packages and change systemLog.path, you will have to use your own init scripts and disable the built-in scripts.

systemLog.verbosity

Type: integer
Default: 0

The default log message verbosity level for components. The verbosity level determines the amount of Informational and Debug messages MongoDB outputs. The verbosity level can range from 0 to 5:

  • 0 is MongoDB's default log verbosity level, which includes Informational messages.

  • 1 to 5 increase the verbosity level to include Debug messages.

To use a different verbosity level for a named component, use the component’s verbosity setting. For example, use the systemLog.component.accessControl.verbosity to set the verbosity level specifically for ACCESS components. See the systemLog.component.<name>.verbosity settings for specific component verbosity settings. For various ways to set the log verbosity level, see Configure Log Verbosity Levels.

systemLog.quiet

Type: boolean
Default: false

Run mongos or mongod in a quiet mode that attempts to limit the amount of output. systemLog.quiet is not recommended for production systems as it may make tracking problems during particular connections much more difficult.

systemLog.traceAllExceptions

Type: boolean
Default: false

Print verbose information for debugging. Use for additional logging for support-related troubleshooting.

systemLog.timeStampFormat

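Type: string
Default: iso8601-local

The time format for timestamps in log messages. Specify one of the following values:

  • iso8601-utc: displays timestamps in Coordinated Universal Time (UTC) in the ISO-8601 format.

  • iso8601-local: displays timestamps in local time in the ISO-8601 format.

net.port

Type: integer
Default: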

  • 27017 for mongod (if not a shard member or a config server member) or mongos instance

  • 27018 if mongod is a shard member

  • 27019 if mongod is a config server member

The TCP port on which the MongoDB instance listens for client connections.

net.bindIp

Type: string
Default: localhost

The hostnames and/or IP addresses and/or full Unix domain socket paths on which mongos or mongod should listen for client connections. You may attach mongos or mongod to any interface. To bind to multiple addresses, enter a list of comma-separated values. To bind to all IPv4 addresses, enter 0.0.0.0.

processManagement.timeZoneInfo

Type: string

The full path from which to load the time zone database. If this option is not provided, then MongoDB will use its built-in time zone database. The configuration file included with Linux and macOS packages sets the time zone database path to /usr/share/zoneinfo by default.

The built-in time zone database is a copy of the Olson/IANA time zone database. It is updated along with MongoDB releases, but the time zone database release cycle differs from the MongoDB release cycle. The most recent release of the time zone database is available on the MongoDB download site.

security.keyFile

Type: string

The path to a key file that stores the shared secret that MongoDB instances use to authenticate to each other in a sharded cluster or replica set. keyFile implies security.authorization. See Internal/Membership Authentication for more information.

Starting in MongoDB 4.2, keyfiles for internal membership authentication use YAML format to allow for multiple keys in a keyfile. The YAML format accepts content of:

  • a single key string (same as in earlier versions),

  • multiple key strings (each string must be enclosed in quotes), or

  • sequence of key strings.

The YAML format is compatible with the existing single-key keyfiles that use the text file format.

security.transitionToAuth

Type: boolean
Default: false

New in version 3.4: Allows the mongod or mongos to accept and create authenticated and non-authenticated connections to and from other mongod and mongos instances in the deployment. Used for performing a rolling transition of replica sets or sharded clusters from a no-auth configuration to internal authentication. Requires specifying an internal authentication mechanism such as security.keyFile.

For example, if using keyfiles for internal authentication, the mongod or mongos creates an authenticated connection with any mongod or mongos in the deployment using a matching keyfile. If the security mechanisms do not match, the mongod or mongos utilizes a non-authenticated connection instead.

A mongod or mongos running with security.transitionToAuth does not enforce user access controls. Users may connect to your deployment without any access control checks and perform read, write, and administrative operations.

security.clusterAuthMode

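Type: string
Default: keyFile

The authentication mode used for cluster authentication. If you use internal x.509 authentication, specify so here. This option can have one of the following values:

  • keyFile: use a keyfile for authentication and accept only keyfiles.

  • sendKeyFile: for rolling upgrade purposes. Send a keyfile for authentication but accept both keyfiles and x.509 certificates.

  • sendX509: for rolling upgrade purposes. Send the x.509 certificate for authentication but accept both keyfiles and x.509 certificates.

  • x509: recommended. Send the x.509 certificate for authentication and accept only x.509 certificates.

If --tlsCAFile or tls.CAFile is not specified and you are not using x.509 authentication, the system-wide CA certificate store will be used when connecting to a TLS-enabled server. If using x.509 authentication, --tlsCAFile or tls.CAFile must be specified unless using --tlsCertificateSelector.

For more information about TLS and MongoDB, see Configure mongod and mongos for TLS/SSL and TLS/SSL Configuration for Clients.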

security.authorization

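Type: string
Default: disabled

Enable or disable Role-Based Access Control (RBAC) to govern each user’s access to database resources and operations. Set this option to one of the following:

  • enabled: a user can access only the database resources and actions for which they have been granted privileges.

  • disabled: a user can access any database and perform any action.

See Role-Based Access Control for more information. The security.authorization setting is available only for mongod.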

replication.replSetName

Type: string

The name of the replica set that the mongod is part of. All hosts in the replica set must have the same set name. If your application connects to more than one replica set, each set must have a distinct name. Some drivers group replica set connections by replica set name.

The replication.replSetName setting is available only for mongod.

Starting in MongoDB 4.0:

  • The setting replication.replSetName cannot be used in conjunction with storage.indexBuildRetry.

  • For the WiredTiger storage engine, storage.journal.enabled: false cannot be used in conjunction with replication.replSetName.
