In this post we share tips for removing HTML tags with the sed command. While migrating a site from WordPress to a static site, I ended up using sed a lot.
Here are some sed commands that have worked reliably for me.
HTML tags are generally written as <tag-name>, which gives tags such as <h1>, </h1>, <body>, and <pre>. You may already know the sed syntax for replacing a keyword:
sed 's/Keyword/Replacing-keyword/g' FileName
Now here is the trick. Closing HTML tags contain a /, which clashes with the / delimiter used in the syntax above, so to replace tags enclosed in < and > we use | as the delimiter instead:
sed 's|HTML-tag|Replacing-Keyword|g' file-name
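To see why the delimiter matters, here is a small sketch that runs the same replacement both ways. The sample line and the variable names are made up for illustration and are not from the migration itself.

```shell
# Hypothetical sample line, used only for this illustration.
line='<p>Hello</p>'

# With "/" as the delimiter, the slash inside </p> has to be escaped.
with_slash=$(printf '%s\n' "$line" | sed 's/<\/p>/<\/div>/g')

# With "|" as the delimiter, the same tags can be written as they appear.
with_pipe=$(printf '%s\n' "$line" | sed 's|</p>|</div>|g')

echo "$with_slash"
echo "$with_pipe"
```

Both commands print the same line; the | version is simply easier to read and to type.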
Let's look at some examples.
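Example 1: Remove all h2 tags from a file
In this example we select the h2 tags, and every one in the file is removed, leaving nothing in its place. The opening and closing tags are different strings, so each gets its own expression.
sed -e 's|<h2>||g' -e 's|</h2>||g' file-name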
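Real pages often have attributes on the opening tag, which a literal <h2> will not match. The sketch below is one way to handle that, assuming the tag does not span several lines; the sample input and variable names are hypothetical.

```shell
# Hypothetical sample input: one plain h2 and one with an attribute.
sample=$(printf '%s\n' '<h2>Plain</h2>' '<h2 id="intro">Title</h2>')

# <h2[^>]*> matches "<h2", then any characters other than ">", then ">",
# so it covers both <h2> and <h2 id="intro">. The second expression
# removes the closing tag.
stripped=$(printf '%s\n' "$sample" | sed -e 's|<h2[^>]*>||g' -e 's|</h2>||g')

echo "$stripped"
```

Only the heading text is left on each line.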
Example 2: Replace a specific HTML tag with another tag
In this example every closing br tag is replaced with a closing p tag.
sed 's|</br>|</p>|g' file-name
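Keep in mind that sed prints the result to the terminal and leaves the file itself unchanged. A minimal sketch of saving the result, assuming you want it in a separate file; the directory and file names here are placeholders:

```shell
# Work in a throwaway directory so no real file is touched.
workdir=$(mktemp -d)
printf '%s\n' 'first line</br>' 'second line</br>' > "$workdir/page.html"

# Redirect the output of sed to a new file; page.html stays as it was.
sed 's|</br>|</p>|g' "$workdir/page.html" > "$workdir/page.new.html"

converted=$(cat "$workdir/page.new.html")
original=$(cat "$workdir/page.html")
echo "$converted"

rm -r "$workdir"
```

Writing to a new file first lets you compare it with the original before replacing anything.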
Example 3: Replace an HTML tag with plain text
In this example, every closing p tag is replaced with the plain text Mywords.
sed 's|</p>|Mywords|g' file-name
You can try other tags and keywords in the same way.
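If the goal is to strip every tag rather than one particular tag, the same substitution can take a pattern instead of a fixed tag name. This is a rough sketch with a made-up sample line, not a full HTML parser: it assumes each tag sits on a single line and that no attribute value contains a > character.

```shell
# Hypothetical sample line, used only for this illustration.
sample='<h1 class="title">Hello</h1> <p>world</p>'

# <[^>]*> matches "<", then any characters other than ">", then ">".
plain=$(printf '%s\n' "$sample" | sed 's|<[^>]*>||g')

echo "$plain"
```

Because sed works on one line at a time, a tag that is broken across two lines will be left in place by this pattern.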