Essential command line tools
In this series (15 parts)
- What is Linux and how it differs from other OSes
- Installing Linux and setting up your environment
- The Linux filesystem explained
- Users, groups, and permissions
- Essential command line tools
- Shell scripting fundamentals
- Processes and job control
- Standard I/O, pipes, and redirection
- The Linux networking stack
- Package management and software installation
- Disk management and filesystems
- Logs and system monitoring
- SSH and remote access
- Cron jobs and task scheduling
- Linux security basics for sysadmins
The command line is where Linux users spend most of their time. A handful of commands, combined properly, can replace entire GUI applications. This article covers the essential tools in three categories: file operations, text processing, and process management.
Prerequisites
You should understand the Linux filesystem layout and file permissions before starting.
File operations
Copying, moving, and deleting
# Copy a file
cp source.txt destination.txt
# Copy a directory recursively
cp -r src/ backup/
# Move (rename) a file
mv old-name.txt new-name.txt
# Move a file to another directory
mv file.txt /tmp/
# Delete a file
rm file.txt
# Delete a directory and everything in it
rm -rf old-project/
⚠ rm -rf is permanent. There is no recycle bin on the command line. Double-check the path before you press Enter.
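A safer habit is to ask rm to confirm before it deletes. A quick demo with throwaway files (echo answers the prompt non-interactively here, just so the snippet runs unattended):

```shell
# Create throwaway files for a safe demo
touch demo.txt
mkdir -p demo-dir && touch demo-dir/a demo-dir/b demo-dir/c demo-dir/d
# -i prompts before every single removal (echo y answers the prompt)
echo y | rm -i demo.txt
# -I prompts only once, when removing more than three files or recursing
echo y | rm -I -r demo-dir/
```

Interactively you would just type `rm -i` or `rm -I -r` and answer the prompts yourself.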
Finding files with find
find searches the filesystem based on file properties: name, type, size, modification time, permissions.
# Find all .py files under the current directory
find . -name "*.py"
# Find all directories named "test"
find /home -type d -name "test"
# Find files larger than 100MB
find / -type f -size +100M 2>/dev/null
# Find files modified in the last 24 hours
find /etc -type f -mtime -1
# Find files with specific permissions
find / -perm -4000 -type f 2>/dev/null # SUID files
# Find and delete all .tmp files
find /tmp -name "*.tmp" -type f -delete
# Find and execute a command on each result
find . -name "*.log" -exec gzip {} \;
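Note that -exec ... \; runs gzip once per matching file. For large numbers of files, a common alternative is to pipe to xargs, which batches arguments into far fewer invocations (the demo directory below is made up so the snippet is self-contained):

```shell
# Create a couple of demo logs so the pipeline has input
mkdir -p demo-logs
echo "line" > demo-logs/a.log
echo "line" > demo-logs/b.log
# -print0 and -0 delimit names with NUL bytes, so filenames
# containing spaces or newlines cannot break the pipeline
find demo-logs -name "*.log" -print0 | xargs -0 gzip
```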
Finding files with locate
locate searches a pre-built database instead of walking the filesystem, so it is much faster than find. The trade-off: results are only as fresh as the last database update, which typically runs as a daily cron job.
# Update the database
sudo updatedb
# Find files
locate nginx.conf
Output:
/etc/nginx/nginx.conf
/usr/share/doc/nginx/nginx.conf.example
Text processing
Text processing is where Linux really shines. You can chain these tools together with pipes to build powerful data pipelines.
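As a taste of what chaining looks like, here is a toy pipeline on made-up data (each tool involved is covered below):

```shell
# Build a small sample file: user and shell, colon-separated
printf 'alice:/bin/bash\nbob:/bin/zsh\ncarol:/bin/bash\n' > users.txt
# Extract field 2, then count how often each shell appears
cut -d: -f2 users.txt | sort | uniq -c | sort -rn
```

This prints /bin/bash with a count of 2 and /bin/zsh with a count of 1 — four small tools doing the work of a spreadsheet.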
grep: search text
# Search for a pattern in a file
grep "error" /var/log/syslog
# Case-insensitive search
grep -i "warning" app.log
# Show line numbers
grep -n "TODO" *.py
# Recursive search in directories
grep -r "database_url" /etc/
# Invert match (show lines that DON'T match)
grep -v "DEBUG" app.log
# Count matches
grep -c "404" access.log
# Show context (2 lines before and after)
grep -B2 -A2 "FATAL" error.log
# Extended regex
grep -E "error|warning|critical" syslog
sed: stream editor
sed transforms text line by line. Most commonly used for search and replace.
# Replace first occurrence per line
sed 's/old/new/' file.txt
# Replace ALL occurrences per line
sed 's/old/new/g' file.txt
# Replace in-place (modifies the file)
sed -i 's/old/new/g' file.txt
# Delete lines matching a pattern
sed '/^#/d' config.txt # Remove comment lines
# Print only specific lines
sed -n '5,10p' file.txt # Print lines 5-10
# Insert text before a line
sed '/\[server\]/i # Server configuration' config.ini
awk: pattern scanning and processing
awk processes text field by field. Each line is split into fields by whitespace (by default).
# Print the second column
awk '{print $2}' file.txt
# Print specific columns with custom separator
awk -F: '{print $1, $3}' /etc/passwd
# Filter rows where column 3 > 100
awk '$3 > 100 {print $0}' data.txt
# Sum a column
awk '{sum += $5} END {print sum}' sales.txt
# Count unique values in a column
awk '{count[$1]++} END {for (k in count) print k, count[k]}' access.log
cut, sort, uniq: quick data transforms
# Extract the first field (colon-separated)
cut -d: -f1 /etc/passwd
# Sort alphabetically
sort names.txt
# Sort numerically, reverse
sort -rn numbers.txt
# Remove duplicate lines (input must be sorted)
sort data.txt | uniq
# Count duplicates
sort data.txt | uniq -c | sort -rn
Process management
Every running program is a process. A later article in this series, Processes and job control, covers how processes work in depth; here are the essential tools for day-to-day use.
ps: list processes
# Show your processes
ps
# Show all processes with full details
ps aux
# Show process tree
ps auxf
# Find a specific process
ps aux | grep nginx
Output of ps aux | grep nginx:
root 1230 0.0 0.0 12345 2345 ? Ss 10:00 0:00 nginx: master process
www-data 1234 0.0 0.1 12345 5678 ? S 10:00 0:00 nginx: worker process
www-data 1235 0.0 0.1 12345 5678 ? S 10:00 0:00 nginx: worker process
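One catch with ps aux | grep: the grep process itself often appears in the results, because its own command line contains the search term. pgrep sidesteps this; a self-contained demo using a background sleep as a stand-in:

```shell
# Start a harmless background process to search for
sleep 300 &
pid=$!
# pgrep prints matching PIDs only -- unlike ps | grep, the search
# itself never shows up in the results
pgrep sleep
# -a adds the full command line to each match
pgrep -a sleep
# Clean up the demo process
kill "$pid"
```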
top and htop: real-time monitoring
# Basic real-time process monitor
top
Output (top section):
top - 10:30:00 up 5 days, 3:00, 2 users, load average: 0.50, 0.45, 0.40
Tasks: 200 total, 1 running, 199 sleeping, 0 stopped, 0 zombie
%Cpu(s): 5.0 us, 2.0 sy, 0.0 ni, 92.0 id, 1.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem: 16000.0 total, 8000.0 free, 5000.0 used, 3000.0 buff/cache
MiB Swap: 8000.0 total, 8000.0 free, 0.0 used. 10000.0 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1234 www-data 20 0 123456 56789 12345 S 3.0 0.3 1:23.45 nginx
5678 postgres 20 0 234567 89012 23456 S 2.0 0.5 5:07.89 postgres
Key fields: PID (process ID), %CPU, %MEM, COMMAND. Press q to quit.
htop is an improved version (on Debian/Ubuntu, install it with sudo apt install htop). It adds colors, mouse support, and easier process management.
kill: stop processes
# Send SIGTERM (polite shutdown request)
kill 1234
# Send SIGKILL (force kill, cannot be caught)
kill -9 1234
# Kill by name
pkill nginx
# Kill all processes matching a pattern
pkill -f "python server.py"
nice and renice: process priority
# Start a process with lower priority (nice value 10)
nice -n 10 ./heavy-computation.sh
# Change priority of a running process
renice -n 15 -p 1234
Nice values range from -20 (highest priority) to 19 (lowest). Only root can set negative nice values.
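To confirm a nice value took effect, ps can print it directly. A quick check, using sleep as a stand-in for real work:

```shell
# Start a low-priority background job
nice -n 10 sleep 60 &
pid=$!
# Print just the nice value of that PID (the "=" suppresses the header)
ps -o ni= -p "$pid"
# Clean up the demo job
kill "$pid"
```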
Example 1: Filter a log file with grep and awk
Suppose you have a web server access log and you want to find the IP addresses that made the most requests returning a 404 status.
First, look at the log format:
head -3 /var/log/nginx/access.log
Output:
192.168.1.100 - - [15/Jun/2024:10:00:01 +0000] "GET /index.html HTTP/1.1" 200 612
10.0.0.50 - - [15/Jun/2024:10:00:02 +0000] "GET /missing.html HTTP/1.1" 404 169
192.168.1.100 - - [15/Jun/2024:10:00:03 +0000] "GET /api/users HTTP/1.1" 200 1234
The IP is field 1, the status code is field 9. Let’s filter for 404s and count by IP:
# Step 1: Filter for 404 status codes
grep '" 404 ' /var/log/nginx/access.log | head -3
Output:
10.0.0.50 - - [15/Jun/2024:10:00:02 +0000] "GET /missing.html HTTP/1.1" 404 169
10.0.0.50 - - [15/Jun/2024:10:00:15 +0000] "GET /old-page HTTP/1.1" 404 169
203.0.113.5 - - [15/Jun/2024:10:01:30 +0000] "GET /wp-login.php HTTP/1.1" 404 169
# Step 2: Extract just the IP addresses
grep '" 404 ' /var/log/nginx/access.log | awk '{print $1}' | head -5
Output:
10.0.0.50
10.0.0.50
203.0.113.5
203.0.113.5
203.0.113.5
# Step 3: Count and sort
grep '" 404 ' /var/log/nginx/access.log | awk '{print $1}' | sort | uniq -c | sort -rn | head -10
Output:
147 203.0.113.5
89 198.51.100.23
45 10.0.0.50
12 192.168.1.100
3 172.16.0.5
203.0.113.5 triggered 147 404 errors. That /wp-login.php request suggests someone is scanning for WordPress vulnerabilities. You might want to block this IP in your firewall.
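A sketch of that follow-up, reusing the pipeline above. The log path and the iptables rule are assumptions for illustration, and the snippet only prints the command so you can review it before running anything as root:

```shell
# Find the IP with the most 404s and print a block rule for review.
# Log path is an assumption; adjust for your server.
log=/var/log/nginx/access.log
top_ip=$(grep '" 404 ' "$log" | awk '{print $1}' \
         | sort | uniq -c | sort -rn | head -1 | awk '{print $2}')
echo "Review, then run: sudo iptables -A INPUT -s $top_ip -j DROP"
```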
Example 2: Find all files modified in the last 24 hours
Suppose you deployed code yesterday and something broke. You need to know exactly which files changed.
# Find all files modified in the last 24 hours under /etc
find /etc -type f -mtime -1 -ls 2>/dev/null
Output:
12345 4 -rw-r--r-- 1 root root 234 Jun 15 14:30 /etc/nginx/sites-enabled/default
12346 4 -rw-r--r-- 1 root root 567 Jun 15 14:30 /etc/nginx/nginx.conf
12347 4 -rw-r--r-- 1 root root 89 Jun 15 15:00 /etc/resolv.conf
Now let’s make this more useful: loop over the changed files and show the details of each:
# Loop over each modified file and list its details
find /etc -type f -mtime -1 2>/dev/null | while read -r file; do
echo "=== $file ==="
ls -l "$file"
echo "---"
done
Output:
=== /etc/nginx/sites-enabled/default ===
-rw-r--r-- 1 root root 234 Jun 15 14:30 /etc/nginx/sites-enabled/default
---
=== /etc/nginx/nginx.conf ===
-rw-r--r-- 1 root root 567 Jun 15 14:30 /etc/nginx/nginx.conf
---
=== /etc/resolv.conf ===
-rw-r--r-- 1 root root 89 Jun 15 15:00 /etc/resolv.conf
---
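The loop only prints to the terminal. Piping it through tee writes a report file while still showing the output (the report path here is an arbitrary choice):

```shell
# Same loop, but tee also saves a copy of the report to a file
find /etc -type f -mtime -1 2>/dev/null | while read -r file; do
  ls -l "$file"
done | tee /tmp/changed-files-report.txt
```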
Combine with diff if you keep backups:
# Compare current config to backup
diff /etc/nginx/nginx.conf /etc/nginx/nginx.conf.bak
Output:
5c5
< worker_connections 2048;
---
> worker_connections 1024;
Someone doubled the worker connections. That might be the cause of the issue.
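That diff is only possible because a backup existed. If you don't keep backups yet, a timestamped copy before each edit is cheap insurance (paths below are illustrative; the snippet creates its own demo file):

```shell
# Make a demo "config" so the command below has something to copy
echo "worker_connections 1024;" > nginx.conf
# Timestamped backup: produces names like nginx.conf.20240615-143000.bak
cp nginx.conf "nginx.conf.$(date +%Y%m%d-%H%M%S).bak"
ls nginx.conf.*
```

Timestamps in the name mean repeated edits never overwrite an earlier backup.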
For a more comprehensive approach to finding what changed on a system, check the system logs and monitoring article.
Quick reference
| Task | Command |
|---|---|
| Copy file | cp src dst |
| Move/rename | mv old new |
| Delete file | rm file |
| Delete directory | rm -rf dir/ |
| Find by name | find . -name "*.py" |
| Find by time | find . -mtime -1 |
| Search text | grep "pattern" file |
| Replace text | sed 's/old/new/g' file |
| Column extract | awk '{print $2}' file |
| Sort | sort file |
| Count uniques | sort file \| uniq -c |
| List processes | ps aux |
| Kill process | kill PID |
What comes next
You now know the core commands. The next article, Shell scripting fundamentals, teaches you how to combine these commands into reusable scripts with variables, loops, and conditionals.
If you want to understand how to chain commands together with pipes and redirection, see Standard I/O, pipes, and redirection.