Assuming $_ contains HTML, which of the following substitutions will remove all tags in it?

Q

Assuming $_ contains HTML, which of the following substitutions will remove all tags in it?
1.s/<.*>//g;
2.s/<.*?>//gs;
3.s/<\/?[A-Z]\w*(?:\s+[A-Z]\w*(?:\s*=\s*(?:(["']).*?\1|[\w-.]+))?)*\s*>//gsix;

✍: Guest

A

You can't do that.
If it weren't for HTML comments, improperly formatted HTML, and tags with interesting data like < SCRIPT >, you could do this. Alas, you cannot. It takes a lot more smarts, and quite frankly, a real parser.

2013-08-29, 1807👍, 0💬