Remove HTML tags using the sed command

In this post we share tips for removing HTML tags with the sed command. While migrating a site from WordPress to a static site, I used a lot of sed commands.

Here are some sed commands that have worked reliably for me.

HTML tags are written in the form <tag-name>, which gives tags such as <h1>, </h1>, <body>, and <pre>. You may already know the sed syntax for replacing a keyword:

sed 's/Keyword/Replacing-keyword/g' FileName

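Now here is the trick. HTML tags are enclosed in < and > signs, and closing tags also contain a /, which clashes with the / delimiter used above. Using | as the delimiter instead lets us write the tag exactly as it appears: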

sed 's|HTML-tag|Replacing-Keyword|g' file-name
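
To see why the | delimiter helps, here is a small sketch that removes a closing tag both ways. The sample string and the variable names are made up for illustration; only the sed syntax comes from the text above.

```shell
# Hypothetical sample input, for illustration only.
sample='<h1>Title</h1>'

# With "/" as the delimiter, the slash inside </h1> has to be escaped.
with_slash=$(printf '%s\n' "$sample" | sed 's/<\/h1>//g')

# With "|" as the delimiter, the tag is written exactly as it appears.
with_pipe=$(printf '%s\n' "$sample" | sed 's|</h1>||g')

# Both commands produce the same result: <h1>Title
printf '%s\n' "$with_slash" "$with_pipe"
```

Both forms are equivalent; the | version is simply easier to read and harder to mistype.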

Let's look at some examples.

Example 1: Remove all h2 tags from a file
In this example we match the h2 tag, and every occurrence in the file is removed, that is, replaced with nothing.

sed 's|<h2>||g' file-name
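
A heading usually has both an opening and a closing tag, so in practice you may want to strip both in one pass. The sketch below assumes a made-up sample line and uses two -e expressions; it only matches the bare tags, so an opening tag with attributes such as <h2 class="x"> would be left alone.

```shell
# Hypothetical sample input, for illustration only.
sample='<h2>Section one</h2>'

# Each -e adds one substitution; the first removes <h2>, the second </h2>.
stripped=$(printf '%s\n' "$sample" | sed -e 's|<h2>||g' -e 's|</h2>||g')

# Prints: Section one
printf '%s\n' "$stripped"
```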

Example 2: Replace a specific HTML tag with another tag

In this example the closing br tag is replaced with a closing p tag.

sed 's|</br>|</p>|g' file-name
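
Note that sed prints the result to standard output and leaves the input file untouched unless told otherwise. A minimal sketch of saving the result to a new file follows; the temporary directory and the file names page.html and page.new.html are assumptions made for this example.

```shell
# Work in a throwaway directory so nothing real is modified.
tmpdir=$(mktemp -d)

# Hypothetical input file with two closing br tags.
printf '%s\n' 'first line</br>' 'second line</br>' > "$tmpdir/page.html"

# Write the substituted text to a new file; page.html itself is unchanged.
sed 's|</br>|</p>|g' "$tmpdir/page.html" > "$tmpdir/page.new.html"

result=$(cat "$tmpdir/page.new.html")
original=$(cat "$tmpdir/page.html")

# Prints:
# first line</p>
# second line</p>
printf '%s\n' "$result"

rm -r "$tmpdir"
```

Writing to a separate file keeps the original available for comparison until you are happy with the result.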

Example 3: Replace an HTML tag with non-HTML text

In this example, an HTML tag is replaced with plain text rather than with another tag.

sed 's|<tag-name>|Mywords|g' file-name

You can apply the same approach to other tags and keywords.
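
If the goal is to drop every tag rather than one specific tag, the same s|...|...|g form can take a pattern instead of a literal tag. This is a rough sketch, not a real HTML parser: it assumes each tag starts and ends on the same line, and the sample string is made up for illustration.

```shell
# Hypothetical sample input, for illustration only.
sample='<p>Hello <b>world</b></p>'

# <[^>]*> matches "<", then any run of characters that are not ">", then ">".
plain=$(printf '%s\n' "$sample" | sed 's|<[^>]*>||g')

# Prints: Hello world
printf '%s\n' "$plain"
```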
