I have a database which stores video game names with Unicode characters but I can't figure out how to properly escape these Unicode characters when printing them to an HTML response.
For instance, when I print all games with the name like Uncharted, I get this:
Uncharted: Drake's Fortuneâ„¢
Uncharted 2: Among Thievesâ„¢
Uncharted 3: Drake's Deceptionâ„¢
but it should display this:
Uncharted: Drake's Fortune™
Uncharted 2: Among Thieves™
Uncharted 3: Drake's Deception™
I ran a quick JavaScript escape function to see which Unicode character the ™ is and found that it's \u2122.
I don't have a problem fully escaping every character in the string if I can get the ™ character to display correctly. My guess is to somehow find the hex representation of each character in the string and have PHP render the Unicode characters like this:
print "&#x2122;";
Please guide me through the best approach for Unicode-escaping a string so that it is HTML friendly. I've done something similar for JavaScript a while back, but JavaScript has built-in escape and unescape functions.
I'm not aware of any PHP functions with similar functionality, however. I have read about the ord function, but it just returns the ASCII character code for a given character, hence the improper display of the ™. I would like this function to be versatile enough to apply to any string containing valid Unicode characters.
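One way the escaping described above could be approached is sketched here. It is only a sketch: the helper name escape_non_ascii is hypothetical, and it assumes the input is valid UTF-8 and that the mbstring extension is available (PHP 7.2 or later for mb_ord). ASCII characters go through htmlspecialchars, and everything else is written as a decimal numeric entity.

```php
<?php
// Hypothetical helper (not a built-in): escape every non-ASCII character
// as a decimal HTML numeric entity. Assumes $text is valid UTF-8.
function escape_non_ascii(string $text): string
{
    $out = '';
    // The /u modifier makes preg_split work on UTF-8 characters, not bytes.
    foreach (preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY) as $char) {
        $codepoint = mb_ord($char, 'UTF-8');
        $out .= $codepoint < 128
            ? htmlspecialchars($char, ENT_QUOTES, 'UTF-8') // &, <, >, quotes
            : '&#' . $codepoint . ';';                     // e.g. &#8482;
    }
    return $out;
}

echo escape_non_ascii("Uncharted: Drake's Fortune\u{2122}"), "\n";
// prints: Uncharted: Drake&#039;s Fortune&#8482;
```

Output such as â„¢ usually appears when UTF-8 bytes are read as Windows-1252, so it may also be worth checking that the response declares a UTF-8 charset before escaping anything.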
In PHP, we can use the mb_ord() function to get the Unicode code point value of a given character. This function is supported in PHP 7.2 and later versions. The mb_ord() function complements the mb_chr() function.
Syntax
int mb_ord($str_string, $str_encoding)
Parameters
mb_ord() accepts the following two parameters −
$str_string − The string whose first character is examined.
$str_encoding − The character encoding of the string. If it is absent or NULL, the internal encoding value is used.
Return Values
mb_ord() returns the Unicode code point value of the first character of the given string. It returns false on failure.
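To illustrate the "first character" behaviour, here is a small sketch comparing ord() with mb_ord() on a string that starts with ™. It assumes the mbstring extension is loaded and that the string is UTF-8 encoded; the variable name is arbitrary.

```php
<?php
// Assumes the string is UTF-8 and mbstring is loaded (PHP 7.2+).
$text = "\u{2122} symbol"; // "™ symbol"

// ord() looks only at the first byte (0xE2 for the UTF-8 encoded ™).
var_dump(ord($text));             // int(226)

// mb_ord() decodes the whole first character and ignores the rest.
var_dump(mb_ord($text, 'UTF-8')); // int(8482), i.e. U+2122
```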
Example
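The example code is missing from this copy of the page. What follows is a reconstruction rather than the original listing: the input characters and encodings are assumptions chosen to be consistent with the output shown under Output.

```php
<?php
// Reconstructed example; the inputs are assumptions, not the original code.
echo "Get the numeric value of characters\n";
var_dump(mb_ord("B", "UTF-8"));
var_dump(mb_ord("d", "UTF-8"));
// The same byte maps to different code points in different encodings.
var_dump(mb_ord("\x80", "ISO-8859-1"));
var_dump(mb_ord("\x80", "Windows-1251"));
```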
Output
It will produce the following output −
Get the numeric value of characters
int(66)
int(100)
int(128)
int(1026)
Updated on 11-Oct-2021 13:22:14
(PHP 7 >= 7.2.0, PHP 8)
mb_chr — Return character by Unicode code point value
Description
mb_chr(int $codepoint, ?string $encoding = null): string|false
This function complements mb_ord().
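As a small illustration of how the two functions complement each other, the sketch below round-trips the ™ character through its code point. It assumes UTF-8 as the working encoding and that mbstring is loaded; the variable names are arbitrary.

```php
<?php
// Assumes mbstring is loaded (PHP 7.2+) and UTF-8 is the working encoding.
$codepoint = mb_ord("\u{2122}", 'UTF-8'); // code point of ™ (U+2122)
$char = mb_chr($codepoint, 'UTF-8');      // back to the UTF-8 encoded character

var_dump($codepoint);           // int(8482)
var_dump($char === "\u{2122}"); // bool(true)
```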
Parameters
codepoint
A Unicode codepoint value, e.g. 128024 for U+1F418 ELEPHANT
encoding
The encoding parameter is the character encoding. If it is omitted or null, the internal character encoding value will be used.
Return Values
A string containing the requested character, if it can be represented in the specified encoding, or false on failure.
Changelog
8.0.0: encoding is nullable now.
Examples
Example #1 Testing different code points
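The code for this example did not survive in this copy. The listing below is a reconstruction: the code points and the two encodings are assumptions inferred from the output that follows, which alternates between a UTF-8 result and a single-byte-encoding result for each value.

```php
<?php
// Reconstructed example; the values and encodings are assumptions.
$values = [65, 63, 0x20AC, 128024];
foreach ($values as $value) {
    // UTF-8 can represent every code point in the list.
    var_dump(mb_chr($value, 'UTF-8'));
    // ISO-8859-1 cannot represent U+20AC or U+1F418.
    var_dump(mb_chr($value, 'ISO-8859-1'));
}
```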
The above example will output:
string(1) "A"
string(1) "A"
string(1) "?"
string(1) "?"
string(3) "€"
bool(false)
string(4) "🐘"
bool(false)
See Also
- mb_internal_encoding() - Set/Get internal character encoding
- mb_ord() - Get Unicode code point of character
- IntlChar::ord() - Return Unicode code point value of character
- chr() - Generate a single-byte string from a number