Monitoring Bare Metal Servers: Prometheus, Grafana, and Alerts

Datadog charges $23/host/month for infrastructure monitoring. New Relic starts at $0.30/GB of ingested data. On bare metal, you can run the same monitoring stack — Prometheus, Grafana, and alerting — for $0 extra. Here is how to set it up on a RAW server.

Why Self-Host Monitoring

SaaS monitoring tools are expensive, and their cost scales linearly with infrastructure. A 10-server fleet on Datadog costs $230+/mo just for basic metrics. Prometheus and Grafana are open-source, run on your own hardware, and give you retention limited only by disk space, custom dashboards, and full data ownership.

On a RAW server, the entire monitoring stack consumes roughly 200 MB of RAM and negligible CPU. It runs alongside your application at no additional cost.

The Monitoring Stack

Prometheus — time-series database that scrapes metrics from your services
Node Exporter — exposes CPU, memory, disk, and network metrics from your server
Grafana — visualization layer for building dashboards and exploring data
Alertmanager — routes alerts to Slack, Discord, email, or PagerDuty

Step 1: Install Node Exporter

Node Exporter collects hardware and OS metrics. Install it as a systemd service so it starts automatically on boot.

# Deploy a RAW server
npx rawhq deploy

# SSH in
ssh root@your-server-ip

# Download and install Node Exporter
wget https://github.com/prometheus/node_exporter/releases/download/v1.8.2/node_exporter-1.8.2.linux-amd64.tar.gz
tar xzf node_exporter-1.8.2.linux-amd64.tar.gz
cp node_exporter-1.8.2.linux-amd64/node_exporter /usr/local/bin/

# Create systemd service
cat > /etc/systemd/system/node_exporter.service <<EOF
[Unit]
Description=Node Exporter
After=network.target
[Service]
ExecStart=/usr/local/bin/node_exporter
Restart=always
[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable --now node_exporter

Node Exporter now serves metrics on port 9100. Verify with curl localhost:9100/metrics.
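Scrolling through the full metrics dump is noisy, so a quick sanity check is to pull out one or two well-known values. The sketch below assumes the standard Prometheus text format and the Linux metric names node_load1 and node_memory_MemAvailable_bytes; node_summary is a hypothetical helper name, not part of Node Exporter.

```shell
# node_summary: read Prometheus text-format metrics on stdin and print
# the 1-minute load average and available memory in MiB.
# Hypothetical helper for illustration only.
node_summary() {
  awk '
    $1 == "node_load1"                     { printf "load1=%s\n", $2 }
    $1 == "node_memory_MemAvailable_bytes" { printf "mem_available_mib=%d\n", $2 / 1048576 }
  '
}
```

Run it as curl -s localhost:9100/metrics | node_summary. If neither line prints, the exporter is probably not running or not reachable on port 9100.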

Step 2: Install Prometheus

# Download Prometheus
wget https://github.com/prometheus/prometheus/releases/download/v2.53.0/prometheus-2.53.0.linux-amd64.tar.gz
tar xzf prometheus-2.53.0.linux-amd64.tar.gz
cp prometheus-2.53.0.linux-amd64/prometheus /usr/local/bin/
cp prometheus-2.53.0.linux-amd64/promtool /usr/local/bin/
mkdir -p /etc/prometheus /var/lib/prometheus

Prometheus Configuration

Create /etc/prometheus/prometheus.yml with your scrape targets:

global:
  scrape_interval: 15s
  evaluation_interval: 15s

rule_files:
  - "alerts.yml"

alerting:
  alertmanagers:
    - static_configs:
        - targets: ["localhost:9093"]

scrape_configs:
  - job_name: "node"
    static_configs:
      - targets: ["localhost:9100"]
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

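Start Prometheus the same way as Node Exporter: create a systemd unit at /etc/systemd/system/prometheus.service whose ExecStart runs /usr/local/bin/prometheus with the flags --config.file=/etc/prometheus/prometheus.yml and --storage.tsdb.path=/var/lib/prometheus, then run systemctl daemon-reload and systemctl enable --now prometheus. Prometheus serves its UI and API on port 9090.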
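The unit described above could look roughly like the following. This is a minimal sketch that mirrors the Node Exporter unit from Step 1 (including running as root); prometheus_unit is a hypothetical helper name that only prints the unit text, so you can inspect it before writing it to disk.

```shell
# prometheus_unit: print a systemd unit for Prometheus to stdout.
# Paths and flags are the ones used in this guide; the function name is
# a hypothetical helper, not something Prometheus ships.
prometheus_unit() {
  cat <<'EOF'
[Unit]
Description=Prometheus
After=network.target

[Service]
ExecStart=/usr/local/bin/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/var/lib/prometheus
Restart=always

[Install]
WantedBy=multi-user.target
EOF
}

# Usage (as root):
#   prometheus_unit > /etc/systemd/system/prometheus.service
#   systemctl daemon-reload
#   systemctl enable --now prometheus
```

Before starting the service, promtool check config /etc/prometheus/prometheus.yml validates the configuration along with any rule files it references. Prometheus refuses to start if a file listed under rule_files is missing or invalid, so create alerts.yml (Step 3) first.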

Step 3: Alert Rules

Create /etc/prometheus/alerts.yml to define conditions that trigger notifications:

groups:
  - name: server
    rules:
      - alert: HighCpuUsage
        expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 85
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "CPU usage above 85% for 5 minutes"

      - alert: DiskSpaceLow
        expr: (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100 < 10
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Disk space below 10%"

      - alert: HighMemoryUsage
        expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 > 90
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Memory usage above 90%"
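Each of the three expressions reduces to a simple percentage compared against a threshold. The sketch below works through that arithmetic for one set of made-up readings; pct_alerts and its sample inputs are hypothetical, and it does not model the "for" clause, which additionally requires the condition to hold for the whole duration before the alert fires.

```shell
# pct_alerts: evaluate the three alert thresholds for one set of sample readings.
# Arguments (hypothetical sample values, not live metrics):
#   $1 idle CPU fraction averaged over 5m (0..1)
#   $2 bytes available on /    $3 total bytes on /
#   $4 MemAvailable bytes      $5 MemTotal bytes
pct_alerts() {
  awk -v idle="$1" -v avail="$2" -v size="$3" -v mavail="$4" -v mtotal="$5" 'BEGIN {
    cpu  = 100 - idle * 100              # HighCpuUsage: fires above 85
    disk = avail / size * 100            # DiskSpaceLow: fires below 10
    mem  = (1 - mavail / mtotal) * 100   # HighMemoryUsage: fires above 90
    printf "cpu=%.1f%% %s\n",       cpu,  (cpu  > 85 ? "FIRING" : "ok")
    printf "disk_free=%.1f%% %s\n", disk, (disk < 10 ? "FIRING" : "ok")
    printf "mem=%.1f%% %s\n",       mem,  (mem  > 90 ? "FIRING" : "ok")
  }'
}

# 10% idle CPU, 8 of 100 units of disk free, 4 of 16 units of memory available
pct_alerts 0.10 8 100 4 16
```

With these inputs the CPU and disk conditions are met and the memory condition is not. To check the real rule file for syntax errors, run promtool check rules /etc/prometheus/alerts.yml.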

Step 4: Alertmanager with Slack and Discord

Install Alertmanager and configure it to route alerts to your team channels.

# Download Alertmanager
wget https://github.com/prometheus/alertmanager/releases/download/v0.27.0/alertmanager-0.27.0.linux-amd64.tar.gz
tar xzf alertmanager-0.27.0.linux-amd64.tar.gz
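cp alertmanager-0.27.0.linux-amd64/alertmanager /usr/local/bin/
cp alertmanager-0.27.0.linux-amd64/amtool /usr/local/bin/
mkdir -p /etc/alertmanager /var/lib/alertmanager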

Create /etc/alertmanager/alertmanager.yml:

route:
  receiver: "slack"
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h

receivers:
  - name: "slack"
    slack_configs:
      - api_url: "https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
        channel: "#alerts"
        title: "Server Alert"
        text: "Alert: {{ .CommonAnnotations.summary }}"

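  - name: "discord"
    discord_configs:
      - webhook_url: "https://discord.com/api/webhooks/YOUR/WEBHOOK"

For Discord, use the native discord_configs receiver (available since Alertmanager 0.25) with your Discord webhook URL. The generic webhook_configs receiver posts Alertmanager's own JSON format, which Discord webhooks do not accept. Note that the route above sends every alert to "slack"; to deliver to Discord, set route.receiver to "discord" or add a child route that matches on a label such as severity.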
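Alertmanager also needs to run as a service, and it helps to confirm routing with a synthetic alert before a real one fires. The sketch below follows the same systemd pattern as Node Exporter. The storage directory /var/lib/alertmanager is an assumption (the guide does not name one), and alertmanager_unit and test_alert_json are hypothetical helper names.

```shell
# alertmanager_unit: print a systemd unit for Alertmanager to stdout.
# Assumption: state is kept in /var/lib/alertmanager.
alertmanager_unit() {
  cat <<'EOF'
[Unit]
Description=Alertmanager
After=network.target

[Service]
ExecStart=/usr/local/bin/alertmanager \
  --config.file=/etc/alertmanager/alertmanager.yml \
  --storage.path=/var/lib/alertmanager
Restart=always

[Install]
WantedBy=multi-user.target
EOF
}

# test_alert_json: print a one-alert payload for Alertmanager's v2 API.
# The alert name and summary are placeholders.
test_alert_json() {
  cat <<'EOF'
[{"labels": {"alertname": "TestAlert", "severity": "warning"},
  "annotations": {"summary": "Test alert sent with curl"}}]
EOF
}

# Usage (as root):
#   mkdir -p /etc/alertmanager /var/lib/alertmanager
#   alertmanager_unit > /etc/systemd/system/alertmanager.service
#   systemctl daemon-reload && systemctl enable --now alertmanager
#   test_alert_json | curl -s -H 'Content-Type: application/json' \
#     -d @- http://localhost:9093/api/v2/alerts
```

The test alert should show up in the configured channel after the group_wait delay (30s in the config above). amtool check-config /etc/alertmanager/alertmanager.yml catches YAML and schema errors before you start the service.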

Step 5: Install Grafana

apt install -y apt-transport-https software-properties-common wget gnupg
wget -qO- https://apt.grafana.com/gpg.key | gpg --dearmor > /usr/share/keyrings/grafana.gpg
echo "deb [signed-by=/usr/share/keyrings/grafana.gpg] https://apt.grafana.com stable main" > /etc/apt/sources.list.d/grafana.list
apt update && apt install -y grafana
systemctl enable --now grafana-server

Grafana runs on port 3000. Log in with admin/admin (Grafana prompts you to set a new password on first login), then add Prometheus as a data source with the URL http://localhost:9090.
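If you would rather not click through the UI, Grafana can also load data sources from provisioning files at startup. The sketch below assumes the default provisioning directory used by the Debian package (/etc/grafana/provisioning/datasources); grafana_datasource is a hypothetical helper name that prints the file contents.

```shell
# grafana_datasource: print a Grafana provisioning file that registers
# the local Prometheus as the default data source.
grafana_datasource() {
  cat <<'EOF'
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true
EOF
}

# Usage (as root):
#   grafana_datasource > /etc/grafana/provisioning/datasources/prometheus.yml
#   systemctl restart grafana-server
```

After the restart, the data source appears in Grafana without any manual setup, which makes the configuration easy to reproduce on a rebuilt server.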

Dashboard Templates

Import community dashboards to get started instantly:

Dashboard 1860 (Node Exporter Full): CPU, RAM, disk, network in one view
Dashboard 13978 (Node Exporter Quick): lightweight overview for multi-server setups
Dashboard 3662 (Prometheus 2.0 Overview): monitor Prometheus itself

Import via Grafana UI: Dashboards → Import → enter the dashboard ID → select your Prometheus data source.

Resource Monitoring Best Practices

Set retention wisely: the default 15 days of metrics at 15s intervals uses roughly 2 GB for a small fleet, and actual usage scales with the number of time series. Adjust with --storage.tsdb.retention.time=30d
Monitor the monitor: set up alerts for Prometheus itself (scrape failures, storage errors)
Use recording rules: pre-compute expensive queries to keep dashboards fast
Secure endpoints: bind Node Exporter and Prometheus to 127.0.0.1 with --web.listen-address (for example 127.0.0.1:9100 and 127.0.0.1:9090), or use firewall rules when other servers need to scrape them
Label consistently: use standardized labels (environment, service, region) across all exporters
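To put a number on the retention advice, the Prometheus documentation gives a rough sizing rule: disk is approximately retention time in seconds, times ingested samples per second, times bytes per sample (typically 1 to 2). The sketch below applies that rule. The inputs are assumptions chosen for illustration (10 hosts, about 1,500 series per host, 2 bytes per sample), and tsdb_bytes is a hypothetical helper name.

```shell
# tsdb_bytes: rough Prometheus disk estimate in bytes.
#   disk = retention_seconds * samples_per_second * bytes_per_sample
# Arguments: retention_days hosts series_per_host scrape_interval_s bytes_per_sample
tsdb_bytes() {
  local retention_s=$(( $1 * 86400 ))
  local samples_per_s=$(( $2 * $3 / $4 ))
  echo $(( retention_s * samples_per_s * $5 ))
}

# Assumed inputs: 15 days, 10 hosts, ~1500 series each, 15s scrapes, 2 bytes/sample
tsdb_bytes 15 10 1500 15 2   # prints 2592000000 (about 2.6 GB)
```

Doubling retention to 30 days doubles the estimate, and so does halving the scrape interval, so check the real series count on your own servers before committing disk space.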

Cost Comparison

Provider              10 Servers    50 Servers
Datadog               $230/mo       $1,150/mo
New Relic             $150/mo       $800/mo
Grafana Cloud         $50/mo        $290/mo
Self-hosted on RAW    $0            $0

Self-hosted monitoring costs nothing beyond the server you are already paying for. At 50 servers, you save over $13,000/year compared to Datadog.
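The savings figure follows directly from the per-host price quoted at the top of this article. As a quick check, here is the arithmetic as a small sketch; annual_cost is a hypothetical helper, and the $23/host/month rate is the Datadog price cited above.

```shell
# annual_cost: yearly cost in whole dollars of a per-host monthly plan.
# Arguments: hosts price_per_host_per_month
annual_cost() {
  echo $(( $1 * $2 * 12 ))
}

annual_cost 50 23   # prints 13800
annual_cost 10 23   # prints 2760
```

Against a self-hosted cost of $0, the full amount is the saving: $13,800/year at 50 servers and $2,760/year at 10.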

Deploy Monitoring on RAW

npx rawhq deploy

7-day free trial. 13 seconds to deploy. Run Prometheus, Grafana, and Alertmanager on the same dedicated server as your app for $0 extra.
