Match email addresses in a file
grep -Eiorh '[[:alnum:]_.+-]+@[[:alnum:]_.-]+\.[[:alpha:]]{2,}' [filename] | sort -u
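As a quick check of this kind of pattern, the following is a minimal sketch that reads from stdin instead of a file. The function name `extract_emails` and the sample addresses are made up for illustration, and the pattern is a loose approximation of an email address rather than a full validator.

```shell
# Hypothetical helper: print the unique email-like strings found on stdin.
extract_emails() {
  # Local part, "@", one or more dot-separated labels, then an alphabetic TLD.
  grep -Eio '[[:alnum:]_.+-]+@[[:alnum:]-]+(\.[[:alnum:]-]+)*\.[[:alpha:]]{2,}' | sort -u
}

# Made-up input: one address appears twice, one is followed by a full stop.
printf '%s\n' 'Contact: alice@example.com, bob@mail.example.org.' \
              'Again: alice@example.com' | extract_emails
```

Because the last dot is matched literally and the TLD must be alphabetic, trailing punctuation after an address should not end up in the output.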
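Extract the IP and port of each proxy with sed
The proxies in this list work well, but the fields do not appear in the same order on every line, which makes the regular expression hard to write.
# Sample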
{"type": "https", "host": "180.210.201.57", "export_address": ["180.210.201.57"], "response_time": 11.28, "country": "SG", "anonymity": "high_anonymous", "from": "proxylist", "port": 3130}
{"country": "US", "anonymity": "high_anonymous", "export_address": ["12.162.8.175"], "host": "12.162.8.175", "response_time": 6.42, "type": "https", "port": 80, "from": "txt"}
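# sed command
# The regex captures groups \1 to \4; only \2 (host) and \3 (port) are needed
# Note: this pattern only matches lines where "host" appears before "port"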
sed -r 's/(\{.*"host":\s")([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})".*"port":\s([0-9]{2,5})(.*\})/\2:\3/g'
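The sed command above relies on "host" appearing before "port" on the line. One possible way around the field-order problem is to pull each field out with its own expression, as in the minimal sketch below. The function name `proxy_hostport` and the test line are made up for illustration, and the sketch assumes one JSON object per line with the `"key": value` spacing shown in the sample.

```shell
# Hypothetical helper: read one JSON object per line on stdin, print host:port.
proxy_hostport() {
  while IFS= read -r line; do
    # Each field has its own expression, so their relative order does not matter.
    host=$(printf '%s\n' "$line" | sed -nE 's/.*"host": *"([0-9.]+)".*/\1/p')
    port=$(printf '%s\n' "$line" | sed -nE 's/.*"port": *([0-9]+).*/\1/p')
    # Skip lines where either field is missing.
    if [ -n "$host" ] && [ -n "$port" ]; then
      printf '%s:%s\n' "$host" "$port"
    fi
  done
}

# Made-up line with "port" before "host".
printf '%s\n' '{"port": 8080, "type": "http", "host": "10.0.0.1"}' | proxy_hostport
```

This spawns two sed processes per line, so it trades speed for readability. For large lists a real JSON parser would likely be a better fit.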
# grep command
# @TODO needs further refinement
grep -o -E "[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"
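The grep command above prints every IPv4-looking string, so a line whose "export_address" repeats the host yields the same address twice. A minimal sketch of one way to narrow it down is to match the "host" field first and then strip the key. The function name `proxy_hosts` and the test line are made up for illustration, and the port is not handled here.

```shell
# Hypothetical helper: print only the value of the "host" field for each line on stdin.
proxy_hosts() {
  # First pass keeps the whole "host": "a.b.c.d" pair, second pass keeps just the address.
  grep -oE '"host": *"[0-9]{1,3}(\.[0-9]{1,3}){3}"' \
    | grep -oE '[0-9]{1,3}(\.[0-9]{1,3}){3}'
}

# Made-up line where the same address also appears in "export_address".
printf '%s\n' '{"export_address": ["10.0.0.1"], "host": "10.0.0.1", "port": 8080}' | proxy_hosts
```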