Running regular expressions using grep against a single HTML file

In this section, you'll see a variety of regular expressions executed against a single file of HTML. The HTML content used for the demonstration follows. Store the content in a file named regex-content-01.

Different tools, and different versions of them, support different variants of regular expressions. The documentation of each will tell you what it supports.

Standards exist so that one can rely on a minimum set of features that are available across all conforming applications. For instance, all modern implementations of sed and grep implement basic regular expressions as specified by POSIX (one version or another of the standard, which has not evolved much in that regard in the last few decades).

In POSIX BRE and ERE, you have the [:alnum:] character class. It matches letters and digits in your locale (which often includes a lot more than a-zA-Z0-9 unless the locale is C).

Inside a bracket expression, [\w] is required by POSIX to match either a backslash or w. So you won't find a grep or sed implementation where \w works as a class inside brackets (unless via non-standard options).

The behaviour of \w on its own is not specified by POSIX, so implementations are allowed to do what they want. GNU grep used to have its own regexp engine; it now uses the GNU libc one (though it embeds its own copy). There, \w is meant to match alnums and underscore in your locale. However, it currently has a bug in that it only matches single-byte characters (for instance, it does not match é in a UTF-8 locale, even though that is clearly a letter and even though it does match é in all the locales where é is a single-byte character).

There is also a \w regexp operator in perl regexps and in PCRE. PCRE and perl regexps are not POSIX regular expressions; they are another thing altogether. Now, with the way GNU grep -P uses PCRE, it has the same issue as without -P. It can be worked around there by using (*UCP), though that also has side effects in non-UTF-8 locales.

GNU sed also uses the GNU libc regexps for its own regexps. It uses them in such a way, though, that it doesn't have the same bug as GNU grep. There is some evidence in the GNU sed code that PCRE support has been attempted before, but it doesn't seem to be on the agenda anymore. If you want Perl's regular expressions, just use perl.

Otherwise, I'd say that rather than trying to rely on a bogus non-standard feature of your particular implementation of sed/grep, it would be better to stick with the standard and use [[:alnum:]_].
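As a rough illustration of the point about POSIX bracket expressions, the sketch below runs grep twice over a few made-up sample words: once with the alnum class plus underscore, and once with \w placed inside brackets. The sample words and the variable names (words, portable, bracket) are assumptions for the demonstration, and LC_ALL=C is forced so that the result does not depend on the current locale. In other locales the alnum class can match many more characters.

```shell
# Hypothetical sample input: one candidate word per line.
words='foo_bar
abc123
w\w
has space
dash-ed'

# Portable form: POSIX alnum class plus underscore, matching whole lines (-x).
portable=$(printf '%s\n' "$words" | LC_ALL=C grep -x '[[:alnum:]_]*')

# Inside a bracket expression, \w is not a class: it is a literal
# backslash or a literal w, so only lines made of those two match.
bracket=$(printf '%s\n' "$words" | LC_ALL=C grep -x '[\w]*')

printf '%s\n' 'alnum class plus underscore matched:' "$portable"
printf '%s\n' 'backslash-w inside brackets matched:' "$bracket"
```

With this input, the first command should keep only foo_bar and abc123, while the second keeps only the line made of backslashes and w characters. That is why the bracket form with the alnum class is the safer choice for scripts meant to run on more than one grep or sed implementation.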