Skip to main content

Posts

Showing posts with the label Regex

How To Convert HTML to Text, Easily

Whether you want to convert an HTML page into pure text so you can parse out that special piece of information, or you simply want to load a page from the Net into your own word processing package, this mini function could come in handy. It’s called StripTags and accepts an HTML string. Using a regular expression, it identifies all <tags>, removes them, and returns the modified string. Here’s the code: <%@ Page Language="C#" ValidateRequest="False" AutoEventWireup="true" CodeFile="StripTag.aspx.cs" Inherits="StripTag" %><!DOCTYPEhtmlPUBLIC"-//W3C//DTD XHTML 1.0 Transitional//EN""http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><htmlxmlns="http://www.w3.org/1999/xhtml"><headrunat="server"><title>Untitled Page</title></head><body><formid="form1"runat="server"><div><asp:TextBoxID="TextBox1&quo…