Introduction

I’ve been using Linux for a while (almost ten years). In the early days I had a very cheap, slow computer, and I always used to wonder why on earth it had suddenly become so slow. I was new to Linux after years on Windows, so I had no idea how to even open a terminal and check what was wrong. Eventually, of course, I learned through tutorials and the like. Well, that’s the intro; I have nothing more to say, so let’s get straight into it.

Note: This is a multi-part series. In this part, we’ll cover some basic tools.

Goal

The goal is simple: quickly collect system data and form a rough diagnosis.

We want to identify where the problem lies:

CPU?
Memory?
Disk?
Network?

Commands we will be covering

uptime
vmstat
dmesg
iostat
free
top
mpstat
pidstat

uptime

$ uptime
19:41:35 up 10:50,  1 user,  load average: 0.87, 1.06, 1.11

uptime is a quick way to check the load averages over time. As you can see (got the reference??) above, there are three numbers. Linux updates them continuously using an exponential moving average.
The kernel recalculates them roughly every 5 seconds. The three values (1, 5 and 15 minutes) are the same measurement with different smoothing windows.
So:
- 1-min load → reacts quickly
- 5-min load → smoother
- 15-min load → very slow, shows the stable trend
For example, when I started playing a random 4K video, the 1-min load increased dramatically, while the 5-min load rose a little and the 15-min load rose only marginally.
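To make the smoothing concrete, here is a rough sketch of the moving-average update in awk. It assumes one update every 5 seconds with a decay factor of exp(-5/window); the kernel does the same thing in fixed-point arithmetic, so real values will differ slightly. The function name ema_load is made up for this example.

```shell
# ema_load WINDOW_SECONDS ACTIVE STEPS [START]
# Apply  load = load*e + active*(1 - e)  with e = exp(-5/WINDOW) once per
# 5-second tick, assuming ACTIVE tasks stay runnable the whole time.
# (Floating-point approximation; the kernel uses fixed-point math.)
ema_load() {
    awk -v w="$1" -v n="$2" -v steps="$3" -v start="${4:-0}" 'BEGIN {
        e = exp(-5 / w)
        load = start
        for (i = 0; i < steps; i++) load = load * e + n * (1 - e)
        printf "%.2f\n", load
    }'
}

# One minute (12 ticks) of 4 runnable tasks on a previously idle machine:
ema_load 60 4 12     # 1-min window  -> 2.53
ema_load 300 4 12    # 5-min window  -> 0.73
ema_load 900 4 12    # 15-min window -> 0.26
```

Same workload, three very different numbers: the 1-min value has already covered most of the distance to 4, while the 15-min value has hardly noticed.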

💡 Tip: Use watch to observe load changes live:
watch -n 2 uptime
This is especially useful when testing something like this. Learn more about watch using man watch

Important:

Load average is NOT CPU usage. It includes:

Running processes
Runnable (waiting for CPU)
Uninterruptible sleep (usually I/O wait)

What can we infer from this and what to look at?

Compare load to CPU count:
- Load ≈ CPU cores → system is busy but fine
- Load » CPU cores → contention/bottleneck
If load spikes suddenly:
- Check top or pidstat
If load is high but CPU usage is low:
- Likely I/O bottleneck → check iostat
Trend matters:
- 1-min » 15-min → recent spike
- All high → sustained pressure
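The load-versus-cores comparison can be sketched as a small helper. The function name classify_load and the cut-offs (1.0 and 2.0 per core) are assumptions picked for illustration, not official thresholds.

```shell
# classify_load LOAD CORES -> prints "fine", "busy" or "contention".
# The 1.0 and 2.0 per-core cut-offs are illustrative assumptions.
classify_load() {
    awk -v load="$1" -v cores="$2" 'BEGIN {
        per_core = load / cores
        if (per_core >= 2.0)     print "contention"
        else if (per_core > 1.0) print "busy"
        else                     print "fine"
    }'
}

# Usage on a live system (1-min load from /proc/loadavg, core count from nproc):
#   classify_load "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```

Treat the result as a hint only: a "contention" verdict with low CPU usage still points at I/O rather than the CPU.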

vmstat

> $ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- -------cpu-------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st gu
 4  0      0 5521268 1002176 4910376 0    0   355   286 1793    5  8  1 91  0  0  0
 1  0      0 5522828 1002176 4910400 0    0     0     0  443 1167  2  0 97  0  0  0
 0  0      0 5522884 1002176 4910400 0    0     0     0 1049 3614  3  0 96  0  0  0
 0  0      0 5541796 1002176 4894200 0    0     0     0  871 1882  5  1 94  0  0  0
 0  0      0 5541816 1002176 4894200 0    0     0     0  327  844  3  0 97  0  0  0
 0  0      0 5546768 1002176 4894200 0    0     0     0  586 1292  3  0 96  0  0  0
 0  0      0 5546064 1002184 4894192 0    0     0    40  414 1005  2  0 97  1  0  0
 0  0      0 5546416 1002184 4894200 0    0     0     0  523 1118  2  0 97  0  0  0
 0  0      0 5548712 1002184 4894200 0    0     0     0  401 1063  2  0 98  0  0  0
 0  0      0 5548964 1002184 4894200 0    0     0     0  506 1074  2  0 97  0  0  0
 0  0      0 5548964 1002184 4894200 0    0     0     0  354  991  2  0 97  0  0  0

vmstat stands for Virtual Memory Statistics (though it covers more than memory, btw).
It gives a compact, real-time view of the whole system:
- CPU
- memory
- processes
- I/O
- context switching

Proc

r is the number of processes running on a CPU or waiting for their turn (does not include processes waiting on I/O).
b is the number of processes blocked in uninterruptible sleep (usually waiting on I/O).

Memory And Swap

swpd amount of swap in use, in KiB (0 in the sample above)
free free memory, in KiB
buff memory used as kernel buffers
cache memory used as filesystem cache (yea, very important)
si memory swapped in from disk per second
so memory swapped out to disk per second
si and so should mostly be 0 (they only matter when a swap device is configured). If they stay above 0, you are running out of memory, bro.

IO

bi blocks read from a block device per second
bo blocks written to a block device per second

Cpu

us user CPU %
sy kernel CPU %
id idle %
wa % of time idle while waiting on I/O
st % of time stolen from this VM by the hypervisor
gu % of time spent running guest VMs (KVM guest code)

What can we infer from this and what to look at?

r > CPU cores → CPU contention
b > 0 → I/O blocking → check disk
si/so > 0 → memory pressure (bad sign)
High wa → disk bottleneck
High us → user-space CPU heavy workload
High sy → kernel/system overhead
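These rules are easy to automate. Below is a minimal sketch that reads vmstat output and prints a warning for each suspicious sample. The function name vmstat_flags and the wa threshold of 20 are assumptions; the column numbers follow the header shown above and may differ on other vmstat versions.

```shell
# vmstat_flags CORES: read `vmstat` output on stdin, print one warning per
# suspicious sample. Column numbers follow the header shown above
# (r=1, b=2, si=7, so=8, wa=16). The wa >= 20 cut-off is an assumption.
vmstat_flags() {
    awk -v cores="$1" '
        $1 !~ /^[0-9]+$/ { next }              # skip the two header lines
        $1 > cores       { print "cpu contention: r=" $1 }
        $2 > 0           { print "blocked on I/O: b=" $2 }
        $7 + $8 > 0      { print "swapping: si=" $7 " so=" $8 }
        $16 >= 20        { print "high iowait: wa=" $16 }
    '
}

# Usage: vmstat 1 5 | vmstat_flags "$(nproc)"
```

Keep in mind that the first line vmstat prints is an average since boot, so judge the system by the lines that follow it.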

dmesg

> $ sudo dmesg | tail
[   10.707102] wlan0: associate with 20:0c:86:47:59:88 (try 1/3)
[   10.711612] wlan0: RX AssocResp from 20:0c:86:47:59:88 (capab=0x411 status=0 aid=4)
[   10.718726] wlan0: associated
[  992.059500] Bluetooth: RFCOMM TTY layer initialized
[  992.059508] Bluetooth: RFCOMM socket layer initialized
[  992.059511] Bluetooth: RFCOMM ver 1.11
[ 1031.713561] input: realme Buds Wireless 3 (AVRCP) as /devices/virtual/input/input32
[ 1266.860351] warning: `Socket Thread' uses wireless extensions which will stop working for Wi-Fi 7 hardware; use nl80211
[10544.899092] nvme nvme0: using unchecked data buffer
[10544.903147] block nvme0n1: No UUID available providing old NGUID

dmesg shows the kernel ring buffer, which mostly holds informational messages, warnings, errors, and sometimes debug logs.
You will typically see:
- Hardware information
- Driver messages
- Filesystem events
- Security messages (if you are running SELinux)
- Kernel warnings

What can we infer from this and what to look at?

Look for:
- Hardware errors (disk, GPU, USB)
- Driver failures
- Filesystem issues
To show only errors and warnings, use:

$ sudo dmesg --level=err,warn

If you see:
- Disk errors → check iostat
- OOM killer → check memory (free, vmstat)
- Repeated warnings → likely root cause
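If the log is long, a quick filter helps. The sketch below keeps only lines containing a few well-known trouble markers. The function name kernel_red_flags is made up, and the pattern list is an illustrative starting point, not a complete catalogue of kernel error messages.

```shell
# kernel_red_flags: filter kernel log lines on stdin down to a few common
# trouble markers (case-insensitive). The pattern list is an assumption
# meant as a starting point; extend it for your own hardware.
kernel_red_flags() {
    grep -E -i 'out of memory|oom-kill|i/o error|ext4-fs error|call trace'
}

# Usage: sudo dmesg --level=err,warn | kernel_red_flags
```

An empty result does not prove the kernel is happy; it only means none of these particular strings showed up.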

iostat

> $ iostat -xz 1
Linux 6.18.16-1-lts (vv) 	04/22/2026 	_x86_64_	(12 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           7.89    0.00    0.97    0.10    0.00   91.04

Device            r/s     rkB/s   rrqm/s  %rrqm r_await rareq-sz     w/s     wkB/s   wrqm/s  %wrqm w_await wareq-sz     d/s     dkB/s   drqm/s  %drqm d_await dareq-sz     f/s f_await  aqu-sz  %util
loop0            0.00      0.07     0.00   0.00    0.04    14.57    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00
nvme0n1         18.64    345.87     7.12  27.64    0.19    18.55    9.90    282.46    11.42  53.56    3.12    28.52    0.00      0.00     0.00   0.00    0.00     0.00    1.06    1.90    0.04   1.26
zram0            0.01      0.12     0.00   0.00    0.00    16.46    0.00      0.00     0.00   0.00    0.00     4.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00

iostat shows per-device I/O metrics, plus a CPU summary.
Options:
- -x = extended stats
- -z = hide devices with no activity
- 1 = refresh every second

Key Fields

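r/s, w/s, rkB/s, and wkB/s are the reads, writes, kilobytes read, and kilobytes written per second, as completed by the device.
r_await, w_await average time (ms) a read or write request takes, including time spent in the queue
aqu-sz average queue length
%util percentage of time the device was busy servicing requests
%user CPU time running user-space programs
%nice CPU time running user processes with a modified priority (nice)
%system CPU time in the kernel (syscalls, interrupts, etc.)
%iowait time the CPU was idle while waiting for outstanding disk I/O
%steal time stolen by the hypervisor (VMs)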

What can we infer from this and what to look at?

%util ~100% → disk saturated (less reliable for NVMe and RAID, which serve requests in parallel)
High r_await/w_await (>10–20ms SSD, >50ms HDD) → latency issue
High aqu-sz → queue buildup
Low %util but high await → possible driver/fs issue
High writes → check logs, journaling, apps
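Here is a small sketch that applies the first two rules to iostat output. It looks columns up by header name because the column order differs between sysstat versions. The function name iostat_flags and the cut-offs (90% utilisation, 20 ms await) are assumptions.

```shell
# iostat_flags: read `iostat -x` output on stdin and report busy devices.
# Columns are located by header name because their order differs between
# sysstat versions. The 90% and 20 ms cut-offs are assumptions.
iostat_flags() {
    awk '
        $1 ~ /^Device/ {                       # header row: remember positions
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        !("%util" in col) || NF < col["%util"] { next }   # not a device row
        $(col["%util"]) + 0 >= 90 {
            print $1 ": saturated, %util=" $(col["%util"])
        }
        ("r_await" in col) && $(col["r_await"]) + 0 >= 20 {
            print $1 ": slow reads, r_await=" $(col["r_await"])
        }
        ("w_await" in col) && $(col["w_await"]) + 0 >= 20 {
            print $1 ": slow writes, w_await=" $(col["w_await"])
        }
    '
}

# Usage: iostat -xz 1 5 | iostat_flags
```

Tune the await cut-off to the device type; 20 ms is generous for an SSD and strict for a spinning disk.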

free

> $ free -h
               total        used        free      shared  buff/cache   available
Mem:            15Gi       4.9Gi       5.4Gi       308Mi       5.6Gi        10Gi
Swap:          7.7Gi          0B       7.7Gi

What can we infer from this and what to look at?

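Focus on available, not free. available estimates how much memory new programs can get without swapping, and it counts cache that can be reclaimed.
Low available memory → pressure
High and growing swap usage → bad sign
High cache is GOOD (Linux uses otherwise idle memory for caching)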
If swapping:
- Check vmstat
- Identify heavy processes (top, pidstat)
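The available figure comes from the MemAvailable line in /proc/meminfo, so it is easy to turn into a percentage for scripts or alerts. A minimal sketch, where the function name mem_available_pct is an assumption:

```shell
# mem_available_pct: read /proc/meminfo-style text on stdin and print
# MemAvailable as a whole-number percentage of MemTotal (truncated).
mem_available_pct() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        END { if (total > 0) printf "%d\n", avail * 100 / total }
    '
}

# Usage: mem_available_pct < /proc/meminfo
```

What counts as "low" depends on the workload, so no threshold is baked in here.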

top

$ top
top - 12:39:27 up  3:19,  1 user,  load average: 1.71, 1.38, 1.30
Tasks: 345 total, 3 running, 339 sleep, 0 d-sleep, 0 stopped, 3 zombie
%Cpu(s): 10.2 us,  0.7 sy,  0.0 ni, 88.9 id,  0.0 wa,  0.2 hi,  0.1 si,  0.0 st
MiB Mem :  15684.2 total,   5625.4 free,   4921.3 used,   5763.8 buff/cache
MiB Swap:   7842.0 total,   7842.0 free,      0.0 used.  10762.9 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  70437 vamsi     20   0 2677100 269184 207060 R  88.5   1.7   0:41.27 alacritty
   1724 sddm      20   0 3242032 390572 111668 S  20.3   2.4  39:12.91 sddm-greeter-qt
   2006 vamsi     20   0   25.2g 441924 408932 R  10.3   2.8   8:50.27 Xorg
   2435 vamsi     20   0 2667188 244772 163784 S   6.0   1.5  18:40.36 alacritty
  49158 vamsi     20   0 2666860 281160 202704 S   1.7   1.8   0:07.98 alacritty
    921 root      20   0 3090444 111104  79396 S   0.7   0.7   2:14.21 opensnitchd
  82014 vamsi     20   0   11140   8108   5812 R   0.7   0.1   0:00.05 top
    106 root     -51   0       0      0      0 S   0.3   0.0   0:14.27 irq/9-acpi
   2034 vamsi     20   0  128376  28972  21728 S   0.3   0.2   0:03.41 i3bar
   2305 vamsi     20   0 1582668  77056  48636 S   0.3   0.5   0:08.00 blueman-tray
   4139 vamsi     20   0  583744  28100  10432 S   0.3   0.2   1:06.48 btop
  78677 root      20   0       0      0      0 I   0.3   0.0   0:00.24 kworker/u48:2-i915
      1 root      20   0   23312  14100   9764 S   0.0   0.1   0:02.14 systemd

What can we infer from this and what to look at?

Identify top CPU consumers
Look for:
- Runaway processes
- Zombies
- High memory users
CPU breakdown:
- High us → apps
- High sy → kernel
- High wa → disk wait
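top is interactive, so for scripts it is often easier to ask ps for the same information. The sketch below lists zombies together with their parent, since it is the parent that has to reap them. The function name list_zombies is made up for this example.

```shell
# list_zombies: read `ps -eo pid,ppid,stat,comm` output on stdin and print
# the zombie entries (STAT starting with Z) together with their parent PID.
list_zombies() {
    awk 'NR > 1 && $3 ~ /^Z/ { print "zombie pid=" $1 " parent=" $2 " cmd=" $4 }'
}

# Usage:
#   ps -eo pid,ppid,stat,comm | list_zombies
#   ps -eo pid,comm,%cpu,%mem --sort=-%cpu | head -n 6    # top CPU consumers
```

A handful of zombies is usually harmless; a steadily growing count points at a parent process that never collects its children.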

mpstat

$ mpstat -P ALL 1

What can we infer from this and what to look at?

Per-core CPU usage
Detect:
- Single-core bottleneck (one core at 100%)
- Imbalanced workloads
High %iowait → disk issue
Useful for:
- Multi-threading issues
- CPU pinning problems
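Spotting a single hot core by eye gets tedious on machines with many CPUs. The sketch below prints only the cores whose busy time (100 minus %idle) crosses a limit. It assumes the usual mpstat layout with a header row containing CPU and %idle columns; the function name hot_cores is made up.

```shell
# hot_cores THRESHOLD: read `mpstat -P ALL` output on stdin and print every
# core whose busy time (100 - %idle) is at or above THRESHOLD percent.
# Assumes a header row that contains the column names "CPU" and "%idle".
hot_cores() {
    awk -v limit="$1" '
        /^Average/ { next }                     # skip the closing summary
        /%idle/ {                               # header row: locate columns
            for (i = 1; i <= NF; i++) {
                if ($i == "CPU")   cpu = i
                if ($i == "%idle") idle = i
            }
            next
        }
        idle && NF >= idle && $cpu ~ /^[0-9]+$/ {
            busy = 100 - $idle
            if (busy >= limit) printf "cpu%s busy=%.1f%%\n", $cpu, busy
        }
    '
}

# Usage: mpstat -P ALL 1 5 | hot_cores 90
```

If one core keeps showing up while the "all" row looks relaxed, you are probably looking at a single-threaded bottleneck.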

pidstat

> $ pidstat 1
Linux 6.18.16-1-lts (vv) 	04/22/2026 	_x86_64_	(12 CPU)

12:41:02 PM   UID       PID    %usr %system  %guest   %wait    %CPU   CPU  Command
12:41:03 PM     0       921    0.00    0.99    0.00    0.00    0.99     3  opensnitchd
12:41:03 PM   965      1724   19.80    0.00    0.00    0.00   19.80     3  sddm-greeter-qt
12:41:03 PM  1000      2006    0.99    0.00    0.00    0.00    0.99     9  Xorg
12:41:03 PM  1000      2435    8.91    0.00    0.00    0.00    8.91     6  alacritty
12:41:03 PM  1000      2667    1.98    0.99    0.00    0.00    2.97     2  zen
12:41:03 PM  1000      4139    0.99    0.00    0.00    0.00    0.99     4  btop

What can we infer from this and what to look at?

Per-process breakdown over time
Identify:
- CPU-heavy processes
- Processes waiting on I/O (%wait)
This is better than top for trends, because every interval is printed as a new line instead of overwriting the screen
Use when:
- The issue is intermittent
- You need a per-process view over time
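Because pidstat prints one line per process per interval, the output is easy to summarise. Here is a rough sketch that averages %CPU per command across samples. The function name pidstat_avg is an assumption, and commands containing spaces are cut at the first word.

```shell
# pidstat_avg: read `pidstat 1 N` output on stdin and print the mean %CPU
# per command, highest first. The mean covers only the intervals in which
# the process was listed; command names are cut at the first space.
pidstat_avg() {
    awk '
        /^Average/ { next }                     # skip the closing summary
        /%CPU/ {                                # header row: locate columns
            for (i = 1; i <= NF; i++) {
                if ($i == "%CPU")    pc = i
                if ($i == "Command") cmd = i
            }
            next
        }
        pc && NF >= cmd { sum[$cmd] += $pc; n[$cmd]++ }
        END { for (c in sum) printf "%.2f %s\n", sum[c] / n[c], c }
    ' | sort -rn
}

# Usage: pidstat 1 30 | pidstat_avg | head
```

Run it while the problem is happening; thirty one-second samples are usually enough to make the culprit stand out.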

Final Thoughts

A quick workflow:

uptime → is load high?
vmstat → CPU vs memory vs I/O?
iostat → disk bottleneck?
top / pidstat → which process?
dmesg → any kernel-level issues?

More advanced tools later 🙂

Linux Perf Analysis - Quickly Check Your Systems Health
