Greetings, friends!

Imagine this scenario: you wake up at three in the morning to a call from an irate client or a message from management: "The website is down, we are losing orders!" You frantically open your laptop, log in to the server via SSH, and realize that the OOM killer (the Linux kernel's last-resort defense against running out of memory) silently killed your Docker container with the database a few hours ago because the server ran out of RAM. All that time, you were losing clients and potential profits.

This is the worst-case scenario for any system administrator, developer, or business owner. Relying on user complaints as a notification system is a surefire way to lose reputation and money. In 2026, the reactive approach to infrastructure is dead. Monitoring must be proactive: you need to know that the server is about to crash before it actually does. Of course, some incidents cannot be prevented, such as a massive DDoS attack, but most problems can be spotted in advance and fixed before they cause an outage.

In this article, we will break down how to build a reliable monitoring system, which metrics to track, and how to set up instant alerts in Telegram or Discord.

Key Takeaways: Main Points About Server Monitoring

The golden rule of architecture: Never deploy a monitoring system on the same server you are monitoring. If the server "dies" completely, the monitoring will go down with it, and you won't get that crucial alert. Therefore, the right decision is to rent a minimal VPS and deploy the monitoring of your primary server on it.
Ping is an illusion: A server that answers ping is not necessarily serving your project. It may respond to ICMP requests while the Nginx web server returns a 502 Bad Gateway error.
Two levels of control: True monitoring consists of external (checking port availability and HTTP responses) and internal (gathering CPU, disk, RAM, and Docker container metrics).
Smart notifications: Set up alerts so that they only wake you up during critical emergencies. If you get spammed every 5 minutes about CPU usage being at 85%, you will simply mute the channel and miss a real catastrophe. This is why it is essential to configure the notification system properly.
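
One way to keep alerts quiet until a problem persists is to require several bad readings in a row before notifying anyone. The sketch below is only an illustration of that idea, not a feature of any tool named in this article; the function name `should_alert`, the 85% threshold, and the three-in-a-row rule are assumptions chosen for the example.

```shell
#!/bin/sh
# Hypothetical helper: decide whether to alert based on recent CPU readings.
# Usage: should_alert THRESHOLD REQUIRED_STREAK READING...
# Returns 0 (alert) only if the last REQUIRED_STREAK readings are all above THRESHOLD.
should_alert() {
    threshold=$1
    required=$2
    shift 2
    streak=0
    for reading in "$@"; do
        if [ "$reading" -gt "$threshold" ]; then
            streak=$((streak + 1))
        else
            streak=0   # a single normal reading resets the streak
        fi
    done
    [ "$streak" -ge "$required" ]
}

# Example: readings are integers in percent, oldest first.
if should_alert 85 3 70 90 92 95; then
    echo "ALERT: CPU above 85% for 3 checks in a row"
else
    echo "OK: no sustained overload"
fi
```

With a rule like this, a short spike to 90% stays silent, while a sustained overload still reaches you.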

External vs Internal Monitoring: What is the Difference?

To sleep peacefully, you need two independent layers of monitoring. They solve completely different tasks, but combined, they give you a near-complete picture of the situation.

Comparison Table of Monitoring Layers

Criterion	External Monitoring (Blackbox)	Internal Monitoring (Whitebox)
What it checks	External availability of the site, API, and SSL certificates.	Host state: CPU, RAM, disk (NVMe), Docker containers, logs.
Popular tools	Uptime Kuma, Better Stack, Pingdom.	Prometheus + Grafana, Netdata, Telegraf.
Main benefit	Simulates a real user and their experience.	Shows the cause of a crash long before it begins.
Alert example	"Warning! The site returned status 502 instead of 200".	"Warning! Free disk space is below 10%".
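
The "502 instead of 200" alert from the table can be reproduced by hand from the monitoring VPS. This is a minimal sketch, assuming `curl` is installed; the function names `http_status` and `is_healthy` are hypothetical, and the URL is whatever you pass as the first argument.

```shell
#!/bin/sh
# Hypothetical helpers for a manual blackbox check.

# Print the HTTP status code for a URL; curl reports 000 if no response arrives.
http_status() {
    curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$1"
}

# Treat 2xx and 3xx as healthy, everything else (including 000) as a failure.
is_healthy() {
    case "$1" in
        2??|3??) return 0 ;;
        *)       return 1 ;;
    esac
}

# Run the check only when a URL is passed as the first argument.
if [ -n "${1:-}" ]; then
    code=$(http_status "$1") || true
    if is_healthy "$code"; then
        echo "OK: $1 returned $code"
    else
        echo "ALERT: $1 returned $code"
    fi
fi
```

Saved as, say, `check.sh`, it could be run as `sh check.sh https://example.com`. A server that answers ping but returns 502 would be reported as a failure here.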

For most of these tools, we have a video installation guide available on our YouTube channel — https://www.youtube.com/@MivoCloud

Which Metrics Are Critically Important to Track?

The metrics worth tracking depend on your server. For instance, if you rent a powerful NVMe VDS based on the AMD Ryzen 9 7950X, the performance headroom is usually more than enough, so CPU load is not always the first thing to watch. Even top-tier hardware, however, can be brought down by faulty code or endlessly growing logs. Here are the metrics you need to set triggers on:

Disk Usage (Disk Space / Inodes)

The silliest and most frequent reason for server crashes is a 100% full disk. A MySQL/MariaDB database immediately stops working if it has nowhere to write a temporary file or a transaction log. From my past work experience, I can tell you that this issue occurs very often, and it turns out to be a problem that people do not notice right away.

Important: In addition to gigabytes, monitor the number of inodes (index nodes). If your software generates millions of small session files, free disk space might remain, but you will run out of inodes, and the OS will refuse to create new files.
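
Both values can be read with `df`. The following is a rough sketch, assuming GNU `df` on Linux; the helper names, the `/` mount point, and the 90% limit are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helpers: warn when disk space or inode usage crosses a limit.

# Print used space in percent for a mount point (df -P output, column 5).
space_pct() {
    df -P "$1" | awk 'NR==2 { gsub("%", "", $5); print $5 }'
}

# Print used inodes in percent; some filesystems report "-" instead of a number.
inode_pct() {
    df -Pi "$1" | awk 'NR==2 { gsub("%", "", $5); print $5 }'
}

# Return 0 if VALUE is a plain integer at or above LIMIT.
at_or_above() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # not a plain integer (e.g. "-"), skip
    esac
    [ "$1" -ge "$2" ]
}

mount_point=/
limit=90   # assumed threshold: 90% used, i.e. less than 10% free

if at_or_above "$(space_pct "$mount_point")" "$limit"; then
    echo "ALERT: disk space on $mount_point is at or above ${limit}%"
fi
if at_or_above "$(inode_pct "$mount_point")" "$limit"; then
    echo "ALERT: inodes on $mount_point are at or above ${limit}%"
fi
```

The two checks are kept separate on purpose: a disk can be half empty in gigabytes and still have no inodes left.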

Memory (RAM / Swap)

Watch the trend rather than the average load. If a web application's memory consumption keeps growing and never drops, you have a memory leak. Sooner or later, the server will hit the ceiling, and the Linux OOM killer will start killing processes.
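
"Growing without drops" can be written as a simple rule over a series of samples. The sketch below is one possible reading of that rule, with made-up sample values. The helper names are hypothetical, and `used_mib` measures the whole host from `/proc/meminfo` (Linux only), which is only an approximation of a single application's consumption.

```shell
#!/bin/sh
# Hypothetical helper: flag memory usage that only ever goes up.
# Usage: grows_without_drops SAMPLE... (integers, oldest first, e.g. MiB used)
# Returns 0 if no sample is lower than the one before it and the last
# sample is higher than the first.
grows_without_drops() {
    [ "$#" -ge 2 ] || return 1
    first=$1
    prev=$1
    shift
    for sample in "$@"; do
        [ "$sample" -ge "$prev" ] || return 1   # a drop means memory was released
        prev=$sample
    done
    [ "$prev" -gt "$first" ]
}

# Print host memory currently in use, in MiB, derived from /proc/meminfo.
used_mib() {
    awk '/^MemTotal:/ { t = $2 } /^MemAvailable:/ { a = $2 }
         END { printf "%d\n", (t - a) / 1024 }' /proc/meminfo
}

# Example with made-up hourly samples.
if grows_without_drops 512 530 561 590 640; then
    echo "WARNING: memory grew at every sample, possible leak"
fi
```

In practice the samples would come from something like `used_mib` called on a schedule; a healthy application usually shows at least occasional drops.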

Response Time

If your site opens in 200 ms, but during peak hours the response time increases to 3-5 seconds, the server is formally working (returned a 200 status), but clients are already leaving. Monitoring response time helps to notice the need for a hardware upgrade or database optimization in time.
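
Response time can be measured with `curl` from the monitoring VPS. This is a minimal sketch, assuming `curl` and `awk` are available; the function names and the 1.0-second limit are assumptions, and the script measures time only, not the status code.

```shell
#!/bin/sh
# Hypothetical helpers: measure response time and compare it with a limit.

# Print the total request time in seconds (e.g. 0.213) for a URL.
response_seconds() {
    curl -s -o /dev/null --max-time 15 -w '%{time_total}' "$1"
}

# Return 0 if the measured time is greater than the limit; awk handles decimals.
is_slow() {
    awk -v t="$1" -v limit="$2" 'BEGIN { exit !(t + 0 > limit + 0) }'
}

# Run the check only when a URL is passed as the first argument.
if [ -n "${1:-}" ]; then
    if t=$(response_seconds "$1"); then
        if is_slow "$t" 1.0; then
            echo "SLOW: $1 answered in ${t}s"
        else
            echo "OK: $1 answered in ${t}s"
        fi
    else
        echo "ALERT: $1 did not respond"
    fi
fi
```

Running it during quiet hours and again at peak time gives a quick picture of how far the response time drifts.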

Practical Guide: Setting Up Monitoring in 5 Minutes Using Docker

To start out, you do not need to deploy heavy enterprise systems like Zabbix. The best solution for a small or medium project in 2026 is Uptime Kuma. It is a lightweight, beautiful, and absolutely free open-source tool that can be configured in a couple of clicks.

Let's deploy it on a separate (backup) VPS running Ubuntu 24.04 via Docker.

Command List:

Bash
# Update package lists and upgrade installed packages
sudo apt update && sudo apt upgrade -y
# Install prerequisites and add Docker's official GPG key
sudo apt install -y ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
# Add the Docker repository to the apt sources
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
# Install Docker Engine and its plugins
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
# Start Uptime Kuma on port 3001 with a persistent data volume
sudo docker run -d --restart=always -p 3001:3001 -v uptime-kuma:/app/data --name uptime-kuma louislam/uptime-kuma:1
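
Before opening the browser, it can help to confirm that the container started above is actually running. The check below is a sketch, assuming it runs on the monitoring VPS as a user allowed to talk to the Docker daemon (root or a member of the `docker` group); the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical post-install check for the uptime-kuma container.

# Print the container state as reported by Docker (e.g. "running", "exited"),
# or "missing" if Docker cannot find or inspect the container.
kuma_state() {
    state=$(docker inspect -f '{{.State.Status}}' uptime-kuma 2>/dev/null) || state="missing"
    echo "$state"
}

# Return 0 if the web UI answers on port 3001 of this host.
kuma_ui_up() {
    curl -s -o /dev/null --max-time 5 http://localhost:3001
}

# Run the check only when called with the argument "check".
if [ "${1:-}" = "check" ]; then
    echo "container: $(kuma_state)"
    if kuma_ui_up; then
        echo "web UI: reachable on port 3001"
    else
        echo "web UI: not reachable yet, see 'docker logs uptime-kuma'"
    fi
fi
```

If the state is anything other than `running`, the container logs are usually the fastest way to see what went wrong.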

What to do next:

Open http://YOUR_VPS_IP:3001 in a browser and create the administrator account.
Click "Add New Monitor".
Select the check type (e.g., HTTP(s) for a website, TCP Port for a CS2/GTA5 game server, or Ping).
Enter your project's URL.
In the "Setup Notification" section, select Telegram and paste your bot token and chat ID.

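Now, if your main server stops responding, Uptime Kuma will detect it at the next check and send a push notification to your phone. How quickly that happens depends on the monitor's Heartbeat Interval: set to 20-30 seconds, detection takes about that long.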
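
If no notification arrives, the bot token and chat ID can be tested outside Uptime Kuma by calling the Telegram Bot API `sendMessage` method directly. This is a minimal sketch, assuming `curl` is installed; the environment variable names `TELEGRAM_TOKEN` and `TELEGRAM_CHAT_ID` and the function names are assumptions for this example.

```shell
#!/bin/sh
# Hypothetical helper: send a test message through the Telegram Bot API
# to confirm the bot token and chat ID before entering them in Uptime Kuma.

# Build the sendMessage endpoint URL for a given bot token.
telegram_url() {
    printf 'https://api.telegram.org/bot%s/sendMessage' "$1"
}

# Send a text to a chat: arguments are token, chat ID, text.
# curl URL-encodes both form fields.
send_telegram() {
    curl -s --max-time 10 \
        --data-urlencode "chat_id=$2" \
        --data-urlencode "text=$3" \
        "$(telegram_url "$1")"
}

# TELEGRAM_TOKEN and TELEGRAM_CHAT_ID are assumed environment variables;
# nothing is sent unless both are set.
if [ -n "${TELEGRAM_TOKEN:-}" ] && [ -n "${TELEGRAM_CHAT_ID:-}" ]; then
    send_telegram "$TELEGRAM_TOKEN" "$TELEGRAM_CHAT_ID" "Test alert from the monitoring VPS"
fi
```

A JSON reply containing `"ok":true` means the token and chat ID are valid; an error reply points to the value that needs fixing.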

Video Installation Guide for Uptime Kuma on Ubuntu 24.04

Many people need to see the installation process itself to understand at what point something went wrong. We filmed a video that shows the entire process using Docker:

FAQ: Quick Summary

Can Grafana be used without Prometheus?
Technically yes, but not on its own. Grafana is strictly a visualizer (beautiful graphs and dashboards) and needs a data source to build those graphs from. Prometheus is the most common choice: it acts as the database that collects and stores metrics from your servers. Grafana also supports other data sources, such as InfluxDB.
What is Alert Fatigue and how to deal with it?
This is a state where an admin receives so many non-critical notifications that they stop reacting to them. Separate your communication channels: send minor warnings (disk filled to 80%) to email or a quiet chat, and critical ones (site is down, database is dead) to a channel with a loud sound.
Will monitoring help protect against DDoS attacks?
Monitoring itself does not protect, but it is the first to signal an anomaly. If you see a sharp spike in incoming traffic (Bandwidth) and a simultaneous surge in PHP request processing time, it is a clear sign that it's time to enable traffic filtering.
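
The channel separation suggested in the Alert Fatigue answer above comes down to a small routing rule. The sketch below shows the idea only; the severity labels and channel names are hypothetical placeholders.

```shell
#!/bin/sh
# Hypothetical routing rule: map an alert severity to a notification channel.
route_alert() {
    case "$1" in
        critical) echo "loud-channel" ;;    # site is down, database is dead
        warning)  echo "quiet-channel" ;;   # disk filled to 80%, minor issues
        *)        echo "log-only" ;;        # anything unclassified is only logged
    esac
}

echo "disk at 80%  -> $(route_alert warning)"
echo "site is down -> $(route_alert critical)"
```

The point of the design is that only the `critical` branch is allowed to make noise at night.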

Conclusion

Quality monitoring is an insurance policy for your infrastructure. You can use Docker, set up complex connections, and optimize software, but if you cannot see what is happening inside the system right now, you are managing the server blindly.

Start small: spin up a single independent virtual machine for Uptime Kuma, add your projects there, and set up notifications to your messenger. This will cover most of the basic risks.

Article Author — Anatolie Cohaniuc

Server monitoring: How do you know that everything has fallen before your customers?