Sed
Sed 基础
sed(Stream Editor)是一个面向字符流的编辑器,对文本进行过滤和替换操作。sed 一次仅读取一行进行操作,非常适合处理大文件。
sed 工作原理
sed 默认不修改源文件,仅仅将处理结果输出到屏幕。它的工作流程如下:
┌─────────────────────────────────────────────────────────┐ │ sed 工作流程 │ └─────────────────────────────────────────────────────────┘
输入文件 ──▶ 模式空间 ──▶ 处理命令 ──▶ 输出 (Buffer) (执行指令) (屏幕/文件) - 模式空间:sed 一次读取一行到模式空间 - 保持空间:用于暂存数据,可与模式空间交换
|
基本语法
sed [选项] '命令' [输入文件...] sed [选项] -f 脚本文件 [输入文件...]
|
常用选项
| 选项 |
说明 |
-n, --quiet, --silent |
静默模式,不自动打印模式空间内容 |
-e 脚本 |
添加多个编辑命令 |
-f 脚本文件 |
从文件读取编辑命令 |
-i[SUFFIX] |
直接修改源文件(in-place),可选备份 |
-r, --regexp-extended |
使用扩展正则表达式 |
-l N |
指定 l 命令的输出行长度 |
-s, --separate |
将多个文件视为独立文件 |
-u, --unbuffered |
最小化缓冲区输入输出 |
定界符
sed 默认使用 / 作为定界符,但可以更换为其他符号:
sed 's/old/new/' file
sed 's|/old/path|/new/path|' file
sed 's#old#new#' file
sed 's:old:new:' file
|
Sed 命令详解
sed 的基本命令格式:[地址]指令 内容
1. a\ - 追加
在指定行之后添加新内容:
sed '2a\新添加的内容' file
sed '/error/a\处理完成' file
sed '1,4a\新内容' file
sed '$a\最后一行内容' file
sed '1a\第一行\第二行\第三行' file
|
2. i\ - 插入
在指定行之前添加新内容:
sed '2i\插入的内容' file
sed '/start/i\--- 开始 ---' file
sed '1i\文件头部' file
|
3. c\ - 替换
替换整行内容:
sed '2c\替换后的内容' file
sed '/old/c\新内容' file
sed '1,3c\这是一行替换内容' file
|
4. d - 删除
删除指定的行:
sed '2d' file
sed '$d' file
sed '/^$/d' file
sed '/error/d' file
sed '1,3d' file
sed '/^[[:space:]]*$/d' file
sed '/^#/d' file
|
5. s - 替换
最常用的替换命令:
sed 's/old/new/' file
sed 's/old/new/g' file
sed 's/old/new/2' file
sed 's/old/new/2g' file
sed 's/old/new/I' file
sed -n 's/old/new/p' file
|
6. p - 打印
打印模式空间的内容:
sed 'p' file
sed -n '5p' file
sed -n '/pattern/p' file
sed -n '5,10p' file
|
7. n - 下一行
移动到下一行:
sed -n 'p;n' file
sed -n 'n;p' file
|
8. y - 字符转换
字符一对一转换:
sed 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/' file
sed 'y/123/456/' file
|
9. w - 写入文件
将匹配行写入指定文件:
sed -n '/error/w error.log' file
sed '1,10w part.txt' file
|
10. r - 读取文件
读取文件内容插入:
sed '5r file2.txt' file1
sed '/END/r footer.txt' file
|
Sed 高级应用
地址与范围
sed '5d' file
sed '1,10d' file
sed '/start/,/end/d' file
sed '/pattern/,+5d' file
|
多命令组合
sed -e '1d' -e 's/old/new/' file
sed '1d;s/old/new/' file
sed '/pattern/{ s/old/new/ p }' file
|
正则表达式
sed '/^$/d' file
sed '/.$/d' file
sed 's/.../xxx/' file
sed '/ab*c/d' file
sed -r '/a+/d' file
sed -r '/colou?r/d' file
sed '/[aeiou]/d' file
sed '/[[:digit:]]/d' file sed '/[[:alpha:]]/d' file sed '/[[:space:]]/d' file
|
特殊字符应用
sed 's/word/[&]/g' file
sed 's/\(abc\)/\1def/' file sed 's/\(a\)\(b\)/\2\1/' file
echo "abc def" | sed 's/\(.*\) \(.*\)/\2 \1/'
|
保持空间应用
sed 有另一个缓冲区称为”保持空间”(hold space):
sed 'x' file
sed 'h' file
sed 'H' file
sed 'g' file
sed 'G' file
sed -n '1!G;h;$!d' file
sed 'N;s/\n/ /' file
|
标签与跳转
:b
sed ':b;s/old/new/;tb' file
sed '/error/b; s/old/new/' file
|
Sed 实战案例
案例1:批量替换文件中的路径
sed -i 's|/old/path|/new/path|g' file.txt
find /path -type f -name "*.txt" -exec sed -i 's|/old|/new|g' {} \;
|
案例2:删除 HTML 标签
sed 's/<[^>]*>//g' file.html
sed 's/<[^>]*><[^>]*>//g' file.html
|
案例3:添加行号
sed = file.txt | sed 'N;s/\n/\t/'
sed = file.txt | sed 'N;s/ /\t/'
|
案例4:提取 IP 地址
grep -o '[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}' file.txt
sed -n '/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/p' file.txt
|
案例5:格式化文本
sed 's/^[ ]*//' file
sed 's/[ ]*$//' file
sed 's/^[ ]*//;s/[ ]*$//' file
sed 's/\t/ /g' file
|
案例6:修改配置文件
sed -i '/^PORT/s/^/#/' config.conf
sed -i '/^#PORT/s/^#//' config.conf
sed -i '1i# Configuration File' config.conf
sed -i '$a# End of Config' config.conf
|
案例7:处理日志
sed -n '/2024-01-01/p' access.log
sed -n '/ERROR/p' application.log
sed -n 's/^.*from \([0-9.]*\).*/\1/p' access.log | sort | uniq -c
|
Awk
Awk 基础
Awk 是一种模式匹配的程序设计语言,用于对文本和数据进行扫描和处理。常见操作是将数据转换为格式化的报表。
工作原理
Awk 逐行扫描文件,寻找匹配特定模式的行,并对其进行处理:
┌─────────────────────────────────────────────────────────┐ │ Awk 工作流程 │ └─────────────────────────────────────────────────────────┘
输入文件 ──▶ 读取一行 ──▶ 匹配模式 ──▶ 执行动作 ──▶ 下一行 │ ▼ BEGIN { } # 初始化(执行一次) │ ▼ /pattern/ { } # 匹配模式(每行执行) │ ▼ END { } # 收尾(执行一次)
|
基本语法
awk 'pattern {action}' file awk -F ':' 'pattern {action}' file awk -f script.awk file
|
常用选项
| 选项 |
说明 |
-F |
指定字段分隔符 |
-v |
定义变量 |
-f |
从文件读取 awk 脚本 |
-v OFS='\t' |
指定输出字段分隔符 |
-v RS='' |
指定记录分隔符(多行记录) |
内置变量
| 变量 |
说明 |
$0 |
整行记录 |
$1-$n |
第 n 个字段 |
NF |
字段数量 (Number of Fields) |
NR |
当前记录号 (Number of Records) |
FNR |
当前文件的记录号 |
FS |
输入字段分隔符 (默认空格/tab) |
OFS |
输出字段分隔符 (默认空格) |
RS |
输入记录分隔符 (默认换行) |
ORS |
输出记录分隔符 (默认换行) |
FILENAME |
当前文件名 |
ARGC |
命令行参数数量 |
ARGV |
命令行参数数组 |
字段分隔符
awk '{print $1, $2}' file
awk -F ':' '{print $1, $3}' /etc/passwd
awk -F '[:/]' '{print $1, $2}' file
awk -F '/[:#]+/' '{print $1}' file
awk 'BEGIN{FS=":"} {print $1}' /etc/passwd
|
Awk 变量与操作符
变量使用
awk '{name="test"; print name}' file
awk '{print $1, $3}' file
awk 'NR==5 {print}' file
awk '{arr[$1]++} END{for(k in arr) print k, arr[k]}' file
|
运算符
+ - * / % ^
= += -= *= /= %= ^=
== != < > <= >= ~ !~
&& || !
condition ? value1 : value2
awk '{str = $1 " - " $2; print str}' file
|
模式匹配
awk '/error/ {print}' log.txt awk '$1 ~ /pattern/ {print}' file
awk '$1 == "value" {print}' file
awk '$3 > 100 {print}' file
awk '/error/ && $3 > 50 {print}' file awk '/error/ || /warning/ {print}' file
awk '/start/,/end/ {print}' file
|
Awk 流程控制
条件判断
awk '{if ($1 > 100) print "big"}' file
awk '{if ($1 > 100) print "big"; else print "small"}' file
awk '{ if ($1 >= 90) print "A" else if ($1 >= 80) print "B" else if ($1 >= 70) print "C" else print "D" }' file
|
循环
awk '{ for (i=1; i<=NF; i++) sum += $i print sum }' file
awk '{arr[$1]++} END{for (k in arr) print k, arr[k]}' file
awk '{ i=1 while (i<=NF) { print $i i++ } }' file
awk '{ do { print $0 getline } while ($0 != "end") }' file
|
数组
awk 'BEGIN{ arr[0] = "zero" arr[1] = "one" arr["name"] = "value" }'
awk '{ for (key in array) print key, array[key] }' file
awk 'BEGIN{ arr["a"] = 1 if ("b" in arr) print "exists" else print "not exists" }'
awk 'BEGIN{ arr[1] = "one" delete arr[1] }'
|
Awk 函数
内置函数
awk 'BEGIN{ str = "Hello World" print length(str) # 字符串长度 print index(str, "World") # 子串位置 print substr(str, 1, 5) # 子串截取 print toupper(str) # 转大写 print tolower(str) # 转小写 print split(str, arr, " ") # 分割字符串 print gsub(/o/, "0", str) # 全局替换(返回替换次数) print sub(/o/, "0", str) # 替换第一个 print sprintf("%05d", 42) # 格式化 }'
awk 'BEGIN{ print sqrt(16) # 平方根 print exp(1) # e 的幂 print log(2.718) # 自然对数 print sin(3.14) # 三角函数 print int(3.14) # 取整 print rand() # 随机数 0-1 srand(10) # 设置随机种子 }'
awk 'BEGIN{ print systime() # Unix 时间戳(秒) print strftime("%Y-%m-%d %H:%M:%S") # 格式化时间 }'
|
自定义函数
awk 'function max(a, b) { return (a > b) ? a : b } { print max($1, $2) }' file
|
Awk 实战案例
案例1:统计文件大小
ls -l | awk '{sum += $5} END {print "Total:", sum, "bytes"}'
du -sh /path | awk '{print $1}'
|
案例2:分析日志
awk '{print $1}' access.log | sort | uniq -c | sort -nr | head -10
awk '{print $9}' access.log | sort | uniq -c | sort -rn
awk '{print $4}' access.log | cut -d: -f1,2 | sort | uniq -c | sort -rn
awk '/ERROR/ {print}' application.log
awk '{sum+=$11; count++} END {print "Average:", sum/count}' access.log
|
案例3:处理 CSV 文件
awk -F',' '{print $1, $3}' data.csv
awk -F',' 'NR==1 {print}' data.csv
awk -F',' '$3 > 1000 {print}' data.csv
awk -F',' '{print NR, $0}' data.csv
|
案例4:数据格式化
awk 'BEGIN{printf "%-10s %-5s\n", "Name", "Score"} {printf "%-10s %-5d\n", $1, $2}' data.txt
awk 'BEGIN{print "=========="} {printf "| %-8s |\n", $1} END{print "=========="}' file
|
案例5:文本处理
awk 'NR==FNR{a[$1]=$2; next} {print $1, a[$1]}' file1 file2
awk '!seen[$0]++' file
awk 'seen[$0]++' file
awk '{sum=0; for(i=1;i<=NF;i++) sum+=$i; print sum}' file
|
案例6:条件统计
awk '$3=="M" {male++} $3=="F" {female++} END {print "Male:", male, "Female:", female}' data.txt
awk '{ if ($2 >= 90) grade="A" else if ($2 >= 80) grade="B" else if ($2 >= 70) grade="C" else grade="D" count[grade]++ } END { for (g in count) print g, count[g] }' scores.txt
|
Shell 编程
Shell 基础
Shell 脚本执行方式
bash script.sh
chmod +x script.sh ./script.sh
source script.sh
echo 'echo "hello"' | bash
|
Shell 类型
cat /etc/shells
echo $SHELL echo $0
bash sh zsh
|
注释
: << 'EOF' 这是 多行 注释 EOF
: ' 这是 多行 注释 '
|
变量与参数
系统变量
echo $HOME echo $USER echo $UID echo $PATH echo $PWD echo $LANG echo $IFS echo $HOSTNAME echo $RANDOM echo $SECONDS
|
变量定义
NAME="Tom" AGE=25
readonly PI=3.14159
declare -i NUM=10
ARR=(1 2 3 4) ARR[0]=1 echo ${ARR[0]} echo ${ARR[@]} echo ${#ARR[@]}
declare -A DICT DICT[name]="Tom" DICT[age]=25 echo ${DICT[name]}
|
环境变量
export VAR=value
echo 'export PATH=$PATH:/new/path' >> ~/.bashrc source ~/.bashrc
echo 'PATH=$PATH:/new/path' >> /etc/profile
|
变量引用
NAME="Tom"
echo $NAME echo ${NAME}
echo ${#NAME}
STR="Hello World" echo ${STR:0:5} echo ${STR:6} echo ${STR:-default}
echo ${STR/Hello/Hi} echo ${STR//l/L}
echo ${STR#Hello} echo ${STR##Hello} echo ${STR%World} echo ${STR%%World}
|
位置参数
#!/bin/bash echo $0 echo $1 echo $2 echo ${10} echo $# echo $@ echo $*
|
shift 命令
#!/bin/bash while [ $# -gt 0 ]; do echo "Current: $1, Remaining: $#" shift done
shift 3
|
特殊变量
运算符
算术运算
result=$[ 1 + 2 ] result=$[ $a * $b ]
result=$(( 1 + 2 )) result=$(($a + $b))
result=`expr 1 + 2` result=$(expr $a + $b)
let result=1+2 let a+=1
result=$(echo "1.5 + 2.3" | bc) result=$(echo "scale=2; 10/3" | bc)
|
关系运算
[ $a -eq $b ] [ $a -ne $b ] [ $a -gt $b ] [ $a -lt $b ] [ $a -ge $b ] [ $a -le $b ]
[ "$str1" = "$str2" ] [ "$str1" != "$str2" ] [ -z "$str" ] [ -n "$str" ] [ "$str" ]
[ -e file ] [ -f file ] [ -d file ] [ -r file ] [ -w file ] [ -x file ] [ -s file ] [ -L file ] [ file1 -nt file2 ] [ file1 -ot file2 ]
|
逻辑运算
[ $a -gt 0 -a $b -gt 0 ] [[ $a -gt 0 && $b -gt 0 ]]
[ $a -gt 0 -o $b -gt 0 ] [[ $a -gt 0 || $b -gt 0 ]]
[ ! $a -gt 0 ] [[ ! $a -gt 0 ]]
|
流程控制
条件判断
if [ condition ]; then command fi
if [ condition ]; then command1 else command2 fi
if [ $a -gt 10 ]; then echo "big" elif [ $a -gt 5 ]; then echo "medium" else echo "small" fi
if [ -f /path/to/file ]; then echo "File exists" fi
if [ -d /path/to/dir ]; then echo "Directory exists" fi
if [ -z "$var" ]; then echo "Empty" fi
if [ $a -gt 0 ] && [ $b -gt 0 ]; then echo "Both positive" fi
|
case 语句
case $variable in value1) command1 ;; value2) command2 ;; value3|value4) command3 ;; *) default_command ;; esac
case $1 in start) echo "Starting..." ;; stop) echo "Stopping..." ;; restart) echo "Restarting..." ;; *) echo "Usage: $0 {start|stop|restart}" exit 1 ;; esac
|
循环
for 循环
for i in 1 2 3 4 5; do echo $i done
for i in {1..5}; do echo $i done
for i in {0..10..2}; do echo $i done
for ((i=0; i<10; i++)); do echo $i done
for file in *.txt; do echo "Processing $file" done
for item in "${array[@]}"; do echo $item done
|
while 循环
while [ condition ]; do command done
while read line; do echo "$line" done < file.txt
while IFS=: read -r user pass uid gid; do echo "User: $user, UID: $uid" done < /etc/passwd
while true; do command done
while :; do command done
|
until 循环
until [ condition ]; do command done
until [ $i -gt 10 ]; do echo $i ((i++)) done
|
循环控制
for i in 1 2 3 4 5; do if [ $i -eq 3 ]; then break fi echo $i done
for i in 1 2 3 4 5; do if [ $i -eq 3 ]; then continue fi echo $i done
|
函数
函数定义
function name { commands return value }
name() { commands return value }
|
函数参数
#!/bin/bash
greet() { echo "Hello, $1!" echo "You are $2 years old" }
greet "Tom" 25
count_args() { echo $# }
echo_all() { echo "$@" echo "$*" }
|
函数返回值
get_status() { return 0 return 1 }
if get_status; then echo "Success" fi
get_data() { echo "result1" echo "result2" }
result=$(get_data)
|
局部变量
example() { local var="I'm local" echo $var }
|
递归函数
factorial() { if [ $1 -le 1 ]; then echo 1 else local temp=$(($1 - 1)) local result=$(factorial $temp) echo $(($1 * $result)) fi }
factorial 5
|
文本处理实战
案例1:日志分析脚本
#!/bin/bash
LOG_FILE=${1:-/var/log/nginx/access.log}
echo "========== 日志分析报告 ==========" echo "文件: $LOG_FILE" echo ""
total=$(wc -l < "$LOG_FILE") echo "总请求数: $total"
ips=$(awk '{print $1}' "$LOG_FILE" | sort -u | wc -l) echo "独立 IP 数: $ips"
echo "" echo "状态码统计:" awk '{print $9}' "$LOG_FILE" | sort | uniq -c | sort -rn
echo "" echo "访问最频繁的 Top 10 IP:" awk '{print $1}' "$LOG_FILE" | sort | uniq -c | sort -rn | head -10
echo "" echo "访问最频繁的 Top 10 URL:" awk '{print $7}' "$LOG_FILE" | sort | uniq -c | sort -rn | head -10
echo "" echo "按小时统计访问量:" awk '{print $4}' "$LOG_FILE" | cut -d: -f1 | sort | uniq -c | sort -rn
|
案例2:批量重命名文件
#!/bin/bash
for file in *.txt; do if [ -f "$file" ]; then mv "$file" "$file.bak" echo "Renamed: $file -> $file.bak" fi done
for file in *\ *; do if [ -f "$file" ]; then newname=$(echo "$file" | tr ' ' '_') mv "$file" "$newname" fi done
PREFIX="backup_" for file in *.txt; do [ -f "$file" ] && mv "$file" "$PREFIX$file" done
for file in *.TXT; do [ -f "$file" ] && mv "$file" "${file%.TXT}.txt" done
|
案例3:数据库备份脚本
#!/bin/bash
DB_HOST="localhost" DB_USER="backup" DB_PASS="password" DB_NAME="mydb" BACKUP_DIR="/backup" DATE=$(date +%Y%m%d_%H%M%S)
mkdir -p "$BACKUP_DIR"
echo "Starting backup of $DB_NAME..." mysqldump -h "$DB_HOST" -u "$DB_USER" -p"$DB_PASS" "$DB_NAME" | gzip > "$BACKUP_DIR/${DB_NAME}_${DATE}.sql.gz"
if [ $? -eq 0 ]; then echo "Backup completed successfully!" find "$BACKUP_DIR" -name "*.sql.gz" -mtime +7 -delete echo "Old backups cleaned up." else echo "Backup failed!" exit 1 fi
|
案例4:系统监控脚本
#!/bin/bash
cpu_usage() { top -bn1 | grep "Cpu(s)" | awk '{print "CPU: " 100 - $8 "%"}' }
mem_usage() { free -h | awk '/Mem:/ {print "Memory: " $3 "/" $2}' }
disk_usage() { df -h | awk '/\/$/ {print "Disk: " $3 "/" $2 " (" $5 " used)"}' }
load_avg() { uptime | awk -F'load average:' '{print "Load:" $2}' }
online_users() { who | wc -l | awk '{print "Online users:" $1}' }
process_count() { ps aux | wc -l | awk '{print "Total processes:" $1-1}' }
echo "========== 系统监控 ==========" echo "$(date)" echo "" cpu_usage mem_usage disk_usage load_avg online_users process_count
|
案例5:自动部署脚本
#!/bin/bash
set -e
APP_DIR="/opt/myapp" BACKUP_DIR="/opt/backup" TIMESTAMP=$(date +%Y%m%d_%H%M%S)
echo "========== 开始部署 ==========" echo "时间: $(date)" echo ""
echo "[1/6] 停止服务..." systemctl stop myapp || true
if [ -d "$APP_DIR" ]; then echo "[2/6] 备份当前版本..." mv "$APP_DIR" "$BACKUP_DIR/myapp_$TIMESTAMP" fi
echo "[3/6] 创建应用目录..." mkdir -p "$APP_DIR"
echo "[4/6] 解压新版本..." unzip -q /tmp/myapp.zip -d "$APP_DIR"
echo "[5/6] 设置权限..." chown -R myuser:mygroup "$APP_DIR" chmod +x "$APP_DIR/bin/myapp"
echo "[6/6] 启动服务..." systemctl start myapp
if systemctl is-active --quiet myapp; then echo "" echo "========== 部署成功 ==========" else echo "部署失败!请检查服务状态" exit 1 fi
|
案例6:文本格式化工具
#!/bin/bash
trim_left() { sed 's/^[ ]*//' }
trim_right() { sed 's/[ ]*$//' }
remove_blank() { sed '/^$/d' }
add_line_numbers() { awk '{printf "%5d %s\n", NR, $0}' }
to_upper() { tr 'a-z' 'A-Z' }
to_lower() { tr 'A-Z' 'a-z' }
tab_to_space() { expand -t 4 }
space_to_tab() { unexpand -t 4 }
case "$1" in -l|--trim-left) trim_left ;; -r|--trim-right) trim_right ;; -b|--remove-blank) remove_blank ;; -n|--number) add_line_numbers ;; -u|--upper) to_upper ;; -d|--down) to_lower ;; -t|--tab2space) tab_to_space ;; *) echo "Usage: $0 {option}" echo " -l, --trim-left 去除行首空格" echo " -r, --trim-right 去除行尾空格" echo " -b, --remove-blank 去除空行" echo " -n, --number 添加行号" echo " -u, --upper 转为大写" echo " -d, --down 转为小写" echo " -t, --tab2space Tab 转空格" ;; esac
|
Shell 脚本调试
调试选项
bash -n script.sh
bash -x script.sh
set -x set +x
set -e set -u set -o pipefail
|
常用调试技巧
DEBUG=1 [ "$DEBUG" = "1" ] && echo "Debug: variable = $var"
debug() { echo "Called from: ${FUNCNAME[1]}" }
start_time=$(date +%s)
end_time=$(date +%s) echo "Elapsed: $((end_time - start_time)) seconds"
|
常见问题与技巧
字符串处理技巧
str="hello world" echo ${str/world/linux}
filename="/path/to/file.txt" echo ${filename##*/} echo ${filename#*/}
arr=(one two three) echo ${arr[@]:1:2} arr+=(four) unset arr[0]
|
文件操作技巧
mapfile -t lines < file.txt
while IFS= read -r line; do echo "$line" done < file.txt
shred -u file
|
性能优化
for i in $(cat file); do echo $i done
while read -r line; do echo "$line" done < file
result=$(echo "1 2 3 4 5" | tr ' ' '\n' | sort -n | tail -1)
arr=(1 2 3 4 5) max=${arr[0]} for i in "${arr[@]}"; do ((i > max)) && max=$i done echo $max
|
参考资料
- 《Shell 从入门到精通》
- GNU Coreutils 文档
- AWK Language Programming
- Sed - An Introduction and Tutorial