Print special characters in php

After much banging-head-on-table, I have a bit better understanding of the issue that I wanted to post for anyone else who may have had this issue.

While the UTF-8 character set will display special characters on the client, the server, on the other hand, may not be so accomodating and would print special characters such as à and è as and .

To make sure your server will print them correctly, use the ISO-8859-1 charset:




    
        
        Untitled Document
    
    
        
    

This will print correctly: àè


Edit (4 years later):

I have a little better understanding now. The reason this works is that the client (browser) is being told, through the response header(), to expect an ISO-8859-1 text/html file. (As others have mentioned, you can also do this by updating your .ini or .htaccess files.) Then, once the browser begins to parse that given file into the DOM, the output will obey any rule but keep your ISO characters intact.

❮ PHP String Reference

Example

Convert the predefined characters "<" (less than) and ">" (greater than) to HTML entities:

$str = "This is some bold text.";
echo htmlspecialchars($str);
?>

The HTML output of the code above will be (View Source):




This is some <b>bold</b> text.

The browser output of the code above will be:

This is some bold text.

Try it Yourself »


Definition and Usage

The htmlspecialchars() function converts some predefined characters to HTML entities.

The predefined characters are:

  • & (ampersand) becomes &
  • " (double quote) becomes "
  • ' (single quote) becomes '
  • < (less than) becomes <
  • > (greater than) becomes >

Tip: To convert special HTML entities back to characters, use the htmlspecialchars_decode() function.


Syntax

htmlspecialchars(string,flags,character-set,double_encode)

Parameter Values

ParameterDescription
string Required. Specifies the string to convert
flags Optional. Specifies how to handle quotes, invalid encoding and the used document type.

The available quote styles are:

  • ENT_COMPAT - Default. Encodes only double quotes
  • ENT_QUOTES - Encodes double and single quotes
  • ENT_NOQUOTES - Does not encode any quotes

Invalid encoding:

  • ENT_IGNORE - Ignores invalid encoding instead of having the function return an empty string. Should be avoided, as it may have security implications.
  • ENT_SUBSTITUTE - Replaces invalid encoding for a specified character set with a Unicode Replacement Character U+FFFD (UTF-8) or &#FFFD; instead of returning an empty string.
  • ENT_DISALLOWED - Replaces code points that are invalid in the specified doctype with a Unicode Replacement Character U+FFFD (UTF-8) or &#FFFD;

Additional flags for specifying the used doctype:

  • ENT_HTML401 - Default. Handle code as HTML 4.01
  • ENT_HTML5 - Handle code as HTML 5
  • ENT_XML1 - Handle code as XML 1
  • ENT_XHTML - Handle code as XHTML
character-set Optional. A string that specifies which character-set to use.

Allowed values are:

  • UTF-8 - Default. ASCII compatible multi-byte 8-bit Unicode
  • ISO-8859-1 - Western European
  • ISO-8859-15 - Western European (adds the Euro sign + French and Finnish letters missing in ISO-8859-1)
  • cp866 - DOS-specific Cyrillic charset
  • cp1251 - Windows-specific Cyrillic charset
  • cp1252 - Windows specific charset for Western European
  • KOI8-R - Russian
  • BIG5 - Traditional Chinese, mainly used in Taiwan
  • GB2312 - Simplified Chinese, national standard character set
  • BIG5-HKSCS - Big5 with Hong Kong extensions
  • Shift_JIS - Japanese
  • EUC-JP - Japanese
  • MacRoman - Character-set that was used by Mac OS

Note: Unrecognized character-sets will be ignored and replaced by ISO-8859-1 in versions prior to PHP 5.4. As of PHP 5.4, it will be ignored an replaced by UTF-8.

double_encode Optional. A boolean value that specifies whether to encode existing html entities or not.
  • TRUE - Default. Will convert everything
  • FALSE - Will not encode existing html entities


Technical Details

Return Value:Returns the converted string

If the string contains invalid encoding, it will return an empty string, unless either the ENT_IGNORE or ENT_SUBSTITUTE flags are set

PHP Version:4+
Changelog:PHP 5.6 - Changed the default value for the character-set parameter to the value of the default charset (in configuration).
PHP 5.4 - Changed the default value for the character-set parameter to UTF-8.
PHP 5.4 - Added ENT_SUBSTITUTE, ENT_DISALLOWED, ENT_HTML401, ENT_HTML5, ENT_XML1 and ENT_XHTML
PHP 5.3 - Added ENT_IGNORE constant.
PHP 5.2.3 - Added the double_encode parameter.
PHP 4.1 - Added the character-set parameter.

More Examples

Example

Convert some predefined characters to HTML entities:

$str = "Jane & 'Tarzan'";
echo htmlspecialchars($str, ENT_COMPAT); // Will only convert double quotes
echo "
";
echo htmlspecialchars($str, ENT_QUOTES); // Converts double and single quotes
echo "
";
echo htmlspecialchars($str, ENT_NOQUOTES); // Does not convert any quotes
?>

The HTML output of the code above will be (View Source):




Jane & 'Tarzan'

Jane & 'Tarzan'

Jane & 'Tarzan'

The browser output of the code above will be:

Jane & 'Tarzan'
Jane & 'Tarzan'
Jane & 'Tarzan'

Try it Yourself »

Example

Convert double quotes to HTML entities:

$str = 'I love "PHP".';
echo htmlspecialchars($str, ENT_QUOTES); // Converts double and single quotes
?>

The HTML output of the code above will be (View Source):




I love "PHP".

The browser output of the code above will be:

I love "PHP".

Try it Yourself »


❮ PHP String Reference


How do I allow special characters in PHP?

Tip: To convert special HTML entities back to characters, use the htmlspecialchars_decode() function..
& (ampersand) becomes &.
" (double quote) becomes ".
' (single quote) becomes '.
< (less than) becomes <.
> (greater than) becomes >.

What does Htmlspecialchars mean in PHP?

Description. The htmlspecialchars() function is used to converts special characters ( e.g. & (ampersand), " (double quote), ' (single quote), < (less than), > (greater than)) to HTML entities ( i.e. & (ampersand) becomes &, ' (single quote) becomes ', < (less than) becomes < (greater than) becomes > ).

What is the difference between Htmlentities and Htmlspecialchars in PHP?

Difference between htmlentities() and htmlspecialchars() function: The only difference between these function is that htmlspecialchars() function convert the special characters to HTML entities whereas htmlentities() function convert all applicable characters to HTML entities.

Can UTF 8 handle special characters?

UTF-8 represents ASCII invariant characters a-z, A-Z, 0-9, and certain special characters such as ' @ , . + - = / * ( ) the same way that they are represented in ASCII.