One of the most annoying issues that surfaces after you have delivered a site (and users start creating pages and entering content) is the appearance of characters that don’t display correctly. We’ve all encountered this at some stage while browsing the interwebs: those annoying characters and symbols that rear their ugly heads from time to time:
And this is what we should see:

What is particularly frustrating is that these strange characters appear only after a page has been published to the live website – not within SmartEdit mode.
This occurs when the characters (letters, numbers, symbols) making up your page’s text are not uniformly encoded in the same character set. Some might be in Western European (iso-8859-1) and others might be saved in Unicode (utf-8).
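To make the mismatch concrete, here is a minimal Python sketch (not anything RedDot runs itself) that reproduces the effect outside the CMS. It assumes the text was written out as UTF-8 and then read back as Western European. I use cp1252 for the misread because browsers treat iso-8859-1 as windows-1252 in practice.

```python
# A right single quote (U+2019), the kind Word inserts automatically.
original = "It\u2019s"

# The text is written out as UTF-8 bytes...
published_bytes = original.encode("utf-8")

# ...but read back as a Western European character set.
# cp1252 is an assumption here: it stands in for what a browser does
# when it is told the page is iso-8859-1.
misread = published_bytes.decode("cp1252")

# Decoding with the character set the bytes were actually written in.
correct = published_bytes.decode("utf-8")

print(misread)  # Itâ€™s
print(correct)  # It’s
```

One curly apostrophe becomes three junk characters because UTF-8 stores it as three bytes, and a single-byte character set displays each byte separately.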
To fix this, you need to explicitly tell the browser which character set the page content uses, and ensure that all characters on your page are stored using that same character set. Alternatively, characters that cannot be expressed within a specific character set can be embedded into the page using character entity references (in the form of numeric or named values).
It’s particularly important that you specify the correct character set or include the appropriate character entities when publishing RSS feeds out of RedDot CMS; otherwise you will encounter XML parsing issues.
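As a rough illustration of the entity approach, the Python sketch below turns every character that iso-8859-1 cannot represent into a numeric character reference. This only shows the idea using Python’s built-in `xmlcharrefreplace` error handler; it is not how RedDot performs the conversion.

```python
import html

# Curly double quotes and an en dash, none of which exist in iso-8859-1.
text = "\u201cHello\u201d \u2013 world"

# Characters outside the target character set become numeric references.
as_entities = text.encode("iso-8859-1", errors="xmlcharrefreplace").decode("iso-8859-1")
print(as_entities)  # &#8220;Hello&#8221; &#8211; world

# A browser resolves the references back to the original characters;
# html.unescape does the same thing here.
round_tripped = html.unescape(as_entities)
```

The result is plain ASCII, so it displays correctly whichever character set the page declares.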
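Here is a small Python sketch of the kind of parsing failures I mean, using the standard library’s XML parser as a stand-in for a feed reader. The feed snippets and the `parses` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET


def parses(feed_bytes: bytes) -> bool:
    """Return True if the bytes are well-formed XML, False otherwise."""
    try:
        ET.fromstring(feed_bytes)
        return True
    except ET.ParseError:
        return False


# "Café" stored as iso-8859-1 bytes, but the declaration claims utf-8.
wrong_declaration = b'<?xml version="1.0" encoding="utf-8"?><rss><title>Caf\xe9</title></rss>'

# The same bytes with a declaration that matches how they were stored.
right_declaration = b'<?xml version="1.0" encoding="iso-8859-1"?><rss><title>Caf\xe9</title></rss>'

# Named HTML entities such as &ndash; are not defined in plain XML.
named_entity = b'<?xml version="1.0" encoding="utf-8"?><rss><title>News &ndash; today</title></rss>'

# Numeric character references are always valid in XML.
numeric_entity = b'<?xml version="1.0" encoding="utf-8"?><rss><title>News &#8211; today</title></rss>'

for name, feed in [
    ("wrong declaration", wrong_declaration),
    ("right declaration", right_declaration),
    ("named entity", named_entity),
    ("numeric entity", numeric_entity),
]:
    print(name, "->", "ok" if parses(feed) else "parse error")
```

The takeaway for feeds: make the declared encoding match the stored bytes, and prefer numeric references over named HTML entities.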
Selecting the correct published character set will fix the majority of your issues.
For each language variant, select the appropriate published character set from the drop-down list under ‘Edit Language Variant’.
As a rule of thumb, I would suggest selecting UTF-8 as the character encoding set as this encoding can support many languages and can accommodate pages displaying content using a mixture of those languages.
Also, ensure that the appropriate declaration is added within your page. For XML (including XHTML), use the encoding pseudo-attribute in the XML declaration at the start of a document or the text declaration at the start of an entity.
<?xml version="1.0" encoding="utf-8" ?>
For HTML or XHTML served as HTML, you should always use the <meta> tag inside <head>. Example:
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" >
For XHTML, you need a slash at the end:
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
NOTE: When using UTF-8, make sure that the BOM (Byte Order Mark) is not included under the publishing target. Check out this article for more information about the BOM.
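If you want to confirm that a published file really is BOM-free, a quick check from outside the CMS can help. The Python sketch below relies on the fact that a UTF-8 BOM is the three bytes EF BB BF at the very start of the file. The helper names are my own, and the sample bytes stand in for a published page read from disk.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 BOM, if one is present."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data


# Stand-ins for the raw bytes of two published pages.
page_with_bom = codecs.BOM_UTF8 + "<html></html>".encode("utf-8")
page_without_bom = "<html></html>".encode("utf-8")

print(has_utf8_bom(page_with_bom))     # True
print(has_utf8_bom(page_without_bom))  # False
```

For a real file you would read the bytes with `open(path, "rb").read()` and pass them to `has_utf8_bom`.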

Convert specific characters into the appropriate character entity
Most developers rely on the ISO 8859-1 (Western European) character set for English-based sites. However, this set does not include characters such as em and en dashes or left and right double quotes, which inevitably find their way into HTML Text placeholders when content is cut and pasted from Word or PDF documents. I’ve found this to be the main cause of most of those annoying character issues.
If you need to include these specific kinds of characters within your page when using a character set other than UTF-8, the HTML Convert table within the CMS will convert them to the appropriate entity so they can be displayed correctly.
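Conceptually, a convert table is just a lookup from a character to its entity. The Python sketch below shows that idea only: the `CONVERT_TABLE` entries and the `convert` function are hypothetical and do not reflect the file format RedDot actually expects.

```python
# Hypothetical mapping of commonly pasted characters to named HTML entities.
CONVERT_TABLE = {
    "\u2013": "&ndash;",   # en dash
    "\u2014": "&mdash;",   # em dash
    "\u2018": "&lsquo;",   # left single quote
    "\u2019": "&rsquo;",   # right single quote
    "\u201c": "&ldquo;",   # left double quote
    "\u201d": "&rdquo;",   # right double quote
    "\u2026": "&hellip;",  # ellipsis
}

# str.translate needs a table keyed by code point; maketrans builds it.
_TRANSLATION = str.maketrans(CONVERT_TABLE)


def convert(text: str) -> str:
    """Replace every character listed in CONVERT_TABLE with its entity."""
    return text.translate(_TRANSLATION)


print(convert("\u201cQuoted\u201d \u2013 it\u2019s done\u2026"))
# &ldquo;Quoted&rdquo; &ndash; it&rsquo;s done&hellip;
```

Characters that are not in the table pass through untouched, which is why the table needs to cover everything your editors are likely to paste in.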
I’ve attached an HTML Convert table that I use frequently, which includes the most commonly used characters that need to be encoded. (NOTE: I’ve found that this file needs to be saved in ANSI format in order to work within RedDot, however some other users find that Unicode works fine for them. Just make sure you test any changes thoroughly!)
Copy this file into the ‘ASP’ folder where the CMS is installed (typically C:\Program Files\Open Text\WS\MS\ASP). Within the Project Variant settings, make sure you specify the file you wish to use:

For more information about the HTML Convert Table, check out Stefen Buchali’s post.
Your character encoding issues within your RedDot CMS projects should now be a thing of the past!

Kim Dezen is a Senior RedDot CMS (Open Text Web Solutions) CMS Consultant, Developer and Freelancer. Part time DJ and obsessed music / vinyl junkie. This site is my personal blog for all things related to Red Dot, SEO/SEM and Web Development.
6 comments
Tweets that mention RedDot CMS Character encoding issues -- Topsy.com says:
Feb 24, 2011
[...] This post was mentioned on Twitter by Owen Brandt and Robert Daniel Moore, Kim Dezen. Kim Dezen said: New blog post: RedDot CMS Character encoding issues http://bit.ly/igu0xI #reddotcms #reddot #opentext #otws #html [...]
carole says:
Mar 16, 2011
After I followed these steps, all my pages that had headlines containing double-quote marks (“) failed to publish and generated the following error when Previewed:
BuildPage_PreExecute: The remote server returned an error: (500) Internal Server Error.
Any fix for that?
Kim Dezen says:
Mar 16, 2011
Hi Carole,
Are you assigning the headline values to a VBScript/ASP variable???
Kim
carole says:
Mar 18, 2011
Yes.
Kim Dezen says:
Mar 22, 2011
An error will occur when assigning a string value containing quotes to a VBScript variable unless it is ‘escaped’ properly.
Check out this thread, which will help you fix this issue:
http://groups.google.com/group/reddot-cms-users/browse_thread/thread/288add801395efd7/ff00fc7c6d046045?lnk=gst&q=quote#ff00fc7c6d046045
hytest says:
Apr 6, 2012
Thanks! It’s very helpful.