正则表达式关于 Metacharacter [ ]
引用自 wiki
A bracket expression. Matches a single character that is contained within the brackets. For example, [abc] matches “a”, “b”, or “c”. [a-z] specifies a range which matches any lowercase letter from “a” to “z”. These forms can be mixed: [abcx-z] matches “a”, “b”, “c”, “x”, “y”, or “z”, as does [a-cx-z].
The - character is treated as a literal character if it is the last or the first (after the ^, if present) character within the brackets: [abc-], [-abc]. Note that backslash escapes are not allowed. The ] character can be included in a bracket expression if it is the first (after the ^) character: []abc].
说的很明白了不翻译了,重点:
- ‘-‘ 的用法需特殊注意
- ‘]’ 的用法需要特殊注意
其他
1. sed如何对待一些特殊字符
sed 匹配一些非常规字符,比如字符串 “helloæworld”,当我们想替换中间的 æ,操作如下:
1 | ...]$ echo helloæworld | sed -n 'l0' |
打印 sed 看到的特殊字符是什么,便于操作1
sed -n 'l0'
参考并感谢相应作者
2. 正则表达式 Metacharacter [] 中 ‘-’的用法
当 ‘-’ 被用在[]之间表示字符范围,那么是否任意两个字符都可以作为范围的起始和结束。比如,用十进制表示字符的ASCII码,[\d1-\d255] 是否表示任意一个字符。将范围表示为[begin - end]
测试结果
- 正则表达式显示声明的三类范围支持这种用法,比如0-9,a-z,A-Z。
例如 [0-9] 等于 [\d48-\d57] - begin ASCII码 必须小于 end ASCII码
- 只在相应区间0-9,a-z,A-Z内成立,不能跨区间,不能用其他字符
3. 正则表达式 Metacharacter [] 中 无法用 ‘\s’ 匹配空格或tab
- 空格用 ‘ ‘
- tab用 ‘\t’
4. 其他
sed 中用不同进制的数位表示ASCII
\dxxx
Produces or matches a character whose decimal ASCII value is xxx.
\oxxx
Produces or matches a character whose octal ASCII value is xxx.
\xxx
Produces or matches a character whose hexadecimal ASCII value is xx