HTML Entities Class


By Peter Bromberg

I could not find an easy way to get the HTML entity string (e.g. "&gt;") for a given character, so I created this simple static HtmlEntities class based on the generic Dictionary.



There are many situations (blog posts, web.config entries and others) where you need the HTML entity string for a given character. Rather than looking them up every time, I created this handy static HtmlEntities class as an experiment.

It proved to be an instructive experiment. I started out with a NameValueCollection, only to find that by default it compares keys case-insensitively, so "a" and "A" count as the same key. In addition, when a NameValueCollection holds multiple values under one key, it returns them as a single comma-delimited string.
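
The difference between the two collections can be sketched as follows. This is only an illustration of the behavior described above; the class and method names (CollectionComparison, NameValueLookup, DictionaryLookup) are made up for the example and are not part of the article's download.

```csharp
using System.Collections.Generic;
using System.Collections.Specialized;

// Hypothetical helper class, used only to contrast the two collections.
public static class CollectionComparison
{
    // A default NameValueCollection ignores case in keys and joins
    // multiple values stored under one key with commas.
    public static string NameValueLookup()
    {
        var nvc = new NameValueCollection();
        nvc.Add("a", "&#097;");
        nvc.Add("A", "&#065;");   // treated as the same key as "a"
        return nvc["a"];          // both values, comma-delimited
    }

    // A Dictionary<string, string> with the default comparer
    // treats "a" and "A" as two distinct keys.
    public static string DictionaryLookup()
    {
        var map = new Dictionary<string, string>();
        map.Add("a", "&#097;");
        map.Add("A", "&#065;");
        return map["a"];          // only the value stored under "a"
    }
}
```

Here NameValueLookup() returns "&#097;,&#065;", while DictionaryLookup() returns just "&#097;".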

By switching over to the Generic Dictionary class, I got what I wanted, and was able to quickly eliminate duplicates through a simple try / catch in the static constructor of the class.
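
As an alternative to hunting duplicates one exception at a time, the check can be done up front. The following is a minimal sketch, not the approach used in the class below; DuplicateCheck and AddAll are hypothetical names, and the rule that the first entry wins is an assumption made for the example.

```csharp
using System.Collections.Generic;

// Hypothetical helper: loads key/value pairs and reports duplicate keys
// instead of letting Dictionary.Add throw on the first one.
public static class DuplicateCheck
{
    // pairs is an N x 2 array: column 0 is the character, column 1 the entity.
    // Returns the keys that were already present (the first entry wins).
    public static List<string> AddAll(Dictionary<string, string> target, string[,] pairs)
    {
        var duplicates = new List<string>();
        for (int i = 0; i < pairs.GetLength(0); i++)
        {
            string key = pairs[i, 0];
            if (target.ContainsKey(key))
                duplicates.Add(key);            // remember the offender, keep going
            else
                target.Add(key, pairs[i, 1]);
        }
        return duplicates;
    }
}
```

Calling AddAll with the pairs ("&", "&amp;"), ("<", "&lt;") and ("&", "&#038;") leaves two entries in the dictionary and returns a list containing "&".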

On a side note, this is a case where one might rightfully catch the generic System.Exception while debugging new code. Without a try / catch block around the body of the static constructor, the failure surfaces only as a type initializer exception, and it is very hard to tell which statement actually threw. With the Debug.WriteLine(ex.Message + ex.StackTrace) call in the catch, the StackTrace (in a debug build) gives you the exact line number of the offender:

 

using System;
using System.Collections.Generic;

// Static lookup table: character (as a one-character string) -> HTML entity.
public static class HtmlEntities
{
    public static readonly Dictionary<string, string> Entities = new Dictionary<string, string>();
 
    static HtmlEntities ()
 {
     try
     {
         //Entities.Add(" ","&#032;");
         Entities.Add("!", "&#033;");
         // Entities.Add("\"","&#034;");
         Entities.Add("#", "&#035;");
         Entities.Add("$", "&#036;");
         Entities.Add("%", "&#037;");
         //Entities.Add("&","&#038;");
         Entities.Add("'", "&#039;");
         Entities.Add("(", "&#040;");
         Entities.Add(")", "&#041;");
         Entities.Add("*", "&#042;");
         Entities.Add("+", "&#043;");
         Entities.Add(",", "&#044;");
         Entities.Add("-", "&#045;");
         Entities.Add(".", "&#046;");
         Entities.Add("/", "&#047;");
         Entities.Add("0", "&#048;");
         Entities.Add("1", "&#049;");
         Entities.Add("2", "&#050;");
         Entities.Add("3", "&#051;");
         Entities.Add("4", "&#052;");
         Entities.Add("5", "&#053;");
         Entities.Add("6", "&#054;");
         Entities.Add("7", "&#055;");
         Entities.Add("8", "&#056;");
         Entities.Add("9", "&#057;");
         Entities.Add(":", "&#058;");
         Entities.Add(";", "&#059;");
         //Entities.Add("<","&#060;");
         Entities.Add("=", "&#061;");
         //Entities.Add(">","&#062;");
         Entities.Add("?", "&#063;");
         Entities.Add("@", "&#064;");
         Entities.Add("A", "&#065;");
         Entities.Add("B", "&#066;");
         Entities.Add("C", "&#067;");
         Entities.Add("D", "&#068;");
         Entities.Add("E", "&#069;");
         Entities.Add("F", "&#070;");
         Entities.Add("G", "&#071;");
         Entities.Add("H", "&#072;");
         Entities.Add("I", "&#073;");
         Entities.Add("J", "&#074;");
         Entities.Add("K", "&#075;");
         Entities.Add("L", "&#076;");
         Entities.Add("M", "&#077;");
         Entities.Add("N", "&#078;");
         Entities.Add("O", "&#079;");
         Entities.Add("P", "&#080;");
         Entities.Add("Q", "&#081;");
         Entities.Add("R", "&#082;");
         Entities.Add("S", "&#083;");
         Entities.Add("T", "&#084;");
         Entities.Add("U", "&#085;");
         Entities.Add("V", "&#086;");
         Entities.Add("W", "&#087;");
         Entities.Add("X", "&#088;");
         Entities.Add("Y", "&#089;");
         Entities.Add("Z", "&#090;");
         Entities.Add("[", "&#091;");
         Entities.Add(@"\", "&#092;");
         Entities.Add("]", "&#093;");
         Entities.Add("^", "&#094;");
         Entities.Add("_", "&#095;");
         Entities.Add("`", "&#096;");
         Entities.Add("a", "&#097;");
         Entities.Add("b", "&#098;");
         Entities.Add("c", "&#099;");
         Entities.Add("d", "&#100;");
         Entities.Add("e", "&#101;");
         Entities.Add("f", "&#102;");
         Entities.Add("g", "&#103;");
         Entities.Add("h", "&#104;");
         Entities.Add("i", "&#105;");
         Entities.Add("j", "&#106;");
         Entities.Add("k", "&#107;");
         Entities.Add("l", "&#108;");
         Entities.Add("m", "&#109;");
         Entities.Add("n", "&#110;");
         Entities.Add("o", "&#111;");
         Entities.Add("p", "&#112;");
         Entities.Add("q", "&#113;");
         Entities.Add("r", "&#114;");
         Entities.Add("s", "&#115;");
         Entities.Add("t", "&#116;");
         Entities.Add("u", "&#117;");
         Entities.Add("v", "&#118;");
         Entities.Add("w", "&#119;");
         Entities.Add("x", "&#120;");
         Entities.Add("y", "&#121;");
         Entities.Add("z", "&#122;");
         Entities.Add("{", "&#123;");
         Entities.Add("|", "&#124;");
         Entities.Add("}", "&#125;");
         Entities.Add("~", "&#126;");
         // &#127; is DEL. Keys for &#128;-&#159; follow the Windows-1252 table, which is
         // how browsers interpret these codes; slots that table leaves unassigned are
         // written as \u escapes.
         Entities.Add("\u007F", "&#127;");
         Entities.Add("€", "&#128;");
         Entities.Add("\u0081", "&#129;");
         Entities.Add("‚", "&#130;");
         Entities.Add("ƒ", "&#131;");
         Entities.Add("„", "&#132;");
         Entities.Add("…", "&#133;");
         Entities.Add("†", "&#134;");
         Entities.Add("‡", "&#135;");
         Entities.Add("ˆ", "&#136;");
         Entities.Add("‰", "&#137;");
         Entities.Add("Š", "&#138;");
         Entities.Add("‹", "&#139;");
         Entities.Add("Œ", "&#140;");
         Entities.Add("\u008D", "&#141;");
         Entities.Add("Ž", "&#142;");
         Entities.Add("\u008F", "&#143;");
         Entities.Add("\u0090", "&#144;");
         Entities.Add("‘", "&#145;");
         Entities.Add("’", "&#146;");
         Entities.Add("“", "&#147;");
         Entities.Add("”", "&#148;");
         Entities.Add("•", "&#149;");
         Entities.Add("\u2013", "&#150;"); // en dash
         Entities.Add("\u2014", "&#151;"); // em dash
         Entities.Add("˜", "&#152;");
         Entities.Add("™", "&#153;");
         Entities.Add("š", "&#154;");
         Entities.Add("›", "&#155;");
         Entities.Add("œ", "&#156;");
         Entities.Add("\u009D", "&#157;");
         Entities.Add("ž", "&#158;");
         Entities.Add("Ÿ", "&#159;");
         //Entities.Add("\u00A0","&#160;");
         Entities.Add("¡", "&#161;");
        // Entities.Add("¢", "&#162;");
         Entities.Add("£", "&#163;");
         Entities.Add("¤", "&#164;");
         Entities.Add("¥", "&#165;");
         Entities.Add("¦", "&#166;");
         Entities.Add("§", "&#167;");
         Entities.Add("¨", "&#168;");
        // Entities.Add("©", "&#169;");
         Entities.Add("ª", "&#170;");
         Entities.Add("«", "&#171;");
         //Entities.Add("¬", "&#172;");
         Entities.Add("\u00AD", "&#173;"); // soft hyphen
         //Entities.Add("®", "&#174;");
         Entities.Add("¯", "&#175;");
         //Entities.Add("°", "&#176;");
         Entities.Add("±", "&#177;");
         Entities.Add("²", "&#178;");
         Entities.Add("³", "&#179;");
         Entities.Add("´", "&#180;");
         Entities.Add("µ", "&#181;");
         //Entities.Add("¶", "&#182;");
       //  Entities.Add("·", "&#183;");
         Entities.Add("¸", "&#184;");
         Entities.Add("¹", "&#185;");
         Entities.Add("º", "&#186;");
         Entities.Add("»", "&#187;");
         Entities.Add("¼", "&#188;");
         Entities.Add("½", "&#189;");
         Entities.Add("¾", "&#190;");
         Entities.Add("¿", "&#191;");
         Entities.Add("À", "&#192;");
         Entities.Add("Á", "&#193;");
         Entities.Add("Â", "&#194;");
         Entities.Add("Ã", "&#195;");
         Entities.Add("Ä", "&#196;");
         Entities.Add("Å", "&#197;");
         Entities.Add("Æ", "&#198;");
         Entities.Add("Ç", "&#199;");
         Entities.Add("È", "&#200;");
         Entities.Add("É", "&#201;");
         Entities.Add("Ê", "&#202;");
         Entities.Add("Ë", "&#203;");
         Entities.Add("Ì", "&#204;");
         Entities.Add("Í", "&#205;");
         Entities.Add("Î", "&#206;");
         Entities.Add("Ï", "&#207;");
         Entities.Add("Ð", "&#208;");
         Entities.Add("Ñ", "&#209;");
         Entities.Add("Ò", "&#210;");
         Entities.Add("Ó", "&#211;");
         Entities.Add("Ô", "&#212;");
         Entities.Add("Õ", "&#213;");
         Entities.Add("Ö", "&#214;");
         Entities.Add("×", "&#215;");
         Entities.Add("Ø", "&#216;");
         Entities.Add("Ù", "&#217;");
         Entities.Add("Ú", "&#218;");
         Entities.Add("Û", "&#219;");
         Entities.Add("Ü", "&#220;");
         Entities.Add("Ý", "&#221;");
         Entities.Add("Þ", "&#222;");
         Entities.Add("ß", "&#223;");
         Entities.Add("à", "&#224;");
         Entities.Add("á", "&#225;");
         Entities.Add("â", "&#226;");
         Entities.Add("ã", "&#227;");
         Entities.Add("ä", "&#228;");
         Entities.Add("å", "&#229;");
         Entities.Add("æ", "&#230;");
         Entities.Add("ç", "&#231;");
         Entities.Add("è", "&#232;");
         Entities.Add("é", "&#233;");
         Entities.Add("ê", "&#234;");
         Entities.Add("ë", "&#235;");
         Entities.Add("ì", "&#236;");
         Entities.Add("í", "&#237;");
         Entities.Add("î", "&#238;");
         Entities.Add("ï", "&#239;");
         Entities.Add("ð", "&#240;");
         Entities.Add("ñ", "&#241;");
         Entities.Add("ò", "&#242;");
         Entities.Add("ó", "&#243;");
         Entities.Add("ô", "&#244;");
         Entities.Add("õ", "&#245;");
         Entities.Add("ö", "&#246;");
         Entities.Add("÷", "&#247;");
         Entities.Add("ø", "&#248;");
         Entities.Add("ù", "&#249;");
         Entities.Add("ú", "&#250;");
         Entities.Add("û", "&#251;");
         Entities.Add("ü", "&#252;");
         Entities.Add("ý", "&#253;");
         Entities.Add("þ", "&#254;");
         Entities.Add("ÿ", "&#255;");
         Entities.Add("&", "&amp;");
         Entities.Add("¢", "&cent;");
         Entities.Add("©","&copy;");
         Entities.Add("°", "&deg;");
         Entities.Add("·", "&middot;");
         Entities.Add(">", "&gt;");
         Entities.Add("<", "&lt;");
         Entities.Add("\u00A0", "&nbsp;"); // non-breaking space
         Entities.Add("¬", "&not;");
         Entities.Add("¶", "&para;");
         Entities.Add("\"", "&quot;");
         Entities.Add("®", "&reg;");
     }
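        catch (Exception ex)
        {
            // Debugging aid only: a duplicate key aborts the remaining Add calls,
            // so the table stays incomplete until the offending line is fixed.
            System.Diagnostics.Debug.WriteLine(ex.Message + ex.StackTrace);
        }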
 }
}
I've left the "offenders" in there, commented out of course. The downloadable zip file contains a simple Console app that goes through all of the entries. To use the class, all you need to do is pass the "key" to the static Entities Dictionary, and you get back the HTML entity string you need:

Console.WriteLine(HtmlEntities.Entities["&"]);
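
Note that the Dictionary indexer throws a KeyNotFoundException for any character that is not in the table. For encoding a whole string, a TryGetValue-based loop is safer. The following is a minimal sketch under that assumption; EntityEncoder and Encode are hypothetical names and are not part of the downloadable sample.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: replaces every character that has an entry in the
// supplied table with its entity and leaves all other characters unchanged.
public static class EntityEncoder
{
    public static string Encode(string text, IDictionary<string, string> entities)
    {
        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            string entity;
            // TryGetValue avoids the KeyNotFoundException the indexer would throw.
            if (entities.TryGetValue(c.ToString(), out entity))
                sb.Append(entity);
            else
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

With a small table containing only "<" and "&", Encode("a < b & c", table) returns "a &lt; b &amp; c". Passing HtmlEntities.Entities instead would also replace letters, digits and punctuation, because the table above maps those as well.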



Biography
Peter Bromberg is a C# MVP, MCP, and .NET expert who has worked in banking, finance and telephony for 20 years. Pete focuses exclusively on the .NET Platform, and his samples at GotDotNet.com have been downloaded over 56,000 times. Peter enjoys producing 3D raytraced digital photo collages with Maya, the beach, and fine wines. You can view Peter's UnBlog, IttyUrl, and BlogMetafinder sites.

Article Discussion: HTML Entities Class
Peter Bromberg posted at 03-Jul-07 12:01
Original Article

 
System.Web.HtmlEntities and HttpUtility.HtmlEncode
Simone Busoli replied to Peter Bromberg at 31-Aug-07 08:46
The .NET Framework already has a class, although internal, called System.Web.HtmlEntities, which keeps a list of mappings.

Besides, the HttpUtility.HtmlEncode method does just that: it escapes the characters that need HTML entities.
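
For reference, a call to the method mentioned above might look like this. It is only a sketch: BuiltInEncoding is a hypothetical wrapper name, and on the .NET Framework the project needs a reference to System.Web.

```csharp
using System.Web;

// Hypothetical wrapper around the framework's own encoder.
public static class BuiltInEncoding
{
    public static string Encode(string text)
    {
        // Escapes markup-significant characters such as <, > and &;
        // ordinary letters and digits are left as they are.
        return HttpUtility.HtmlEncode(text);
    }
}
```

For example, Encode("<b>Tom & Jerry</b>") returns "&lt;b&gt;Tom &amp; Jerry&lt;/b&gt;".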

 
System.Web.HtmlEntities
Peter Bromberg replied to Simone Busoli at 01-Sep-07 08:43
System.Web.HtmlEntities, besides being internal, also "looks up" entities in the wrong direction (from entity name to character). However, it has 252 entries, so its string[] array _entities is useful. Good point on the HtmlEncode method.
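
If both directions are needed, a character-to-entity table like the one in the article can be inverted in a few lines. This is a sketch only; EntityLookup and Invert are hypothetical names, and it assumes each entity string appears at most once in the source table.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: builds an entity -> character table from a
// character -> entity table.
public static class EntityLookup
{
    public static Dictionary<string, string> Invert(IDictionary<string, string> forward)
    {
        // Entity names are case-sensitive, so compare them ordinally.
        var reverse = new Dictionary<string, string>(StringComparer.Ordinal);
        foreach (KeyValuePair<string, string> pair in forward)
        {
            // The indexer overwrites instead of throwing if an entity repeats.
            reverse[pair.Value] = pair.Key;
        }
        return reverse;
    }
}
```

With this, EntityLookup.Invert(HtmlEntities.Entities)["&amp;"] would give back "&".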