The answer by Aasmund Eldhuset is what I was attempting to do but I was beaten to the punch. It shows a lot of research and should definitely be the accepted answer.
If you want confirmation of that answer (or just want to test it in a different implementation, such as a non-CPython one, or a later one which may use a different Unicode standard under the covers), the following short program will print out the actual characters that cause a split when using .split() with no arguments.
It does this by constructing a string with the a and b characters(a) separated by the character being tested, then checking whether split() returns a list with more than one element:
    int_ch = 0
    while True:
        try:
            test_str = "a" + chr(int_ch) + "b"
        except Exception as e:
            print(f'Stopping, {e}')
            break
        if len(test_str.split()) != 1:
            print(f'0x{int_ch:06x} ({int_ch})')
        int_ch += 1
The output (for my system) is as follows:
    0x000009 (9)
    0x00000a (10)
    0x00000b (11)
    0x00000c (12)
    0x00000d (13)
    0x00001c (28)
    0x00001d (29)
    0x00001e (30)
    0x00001f (31)
    0x000020 (32)
    0x000085 (133)
    0x0000a0 (160)
    0x001680 (5760)
    0x002000 (8192)
    0x002001 (8193)
    0x002002 (8194)
    0x002003 (8195)
    0x002004 (8196)
    0x002005 (8197)
    0x002006 (8198)
    0x002007 (8199)
    0x002008 (8200)
    0x002009 (8201)
    0x00200a (8202)
    0x002028 (8232)
    0x002029 (8233)
    0x00202f (8239)
    0x00205f (8287)
    0x003000 (12288)
    Stopping, chr() arg not in range(0x110000)
You can ignore the error at the end; that's just to confirm it doesn't fail until we've moved out of the valid Unicode area (code points 0x000000 - 0x10ffff, making up the seventeen planes).
(a) I'm hoping that no future version of Python ever considers a or b to be whitespace, as that would totally break this (and a lot of other) code. I think the chances of that are rather slim, so it should be fine :-)
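For what it's worth, here is a rough sketch of the same check in a more compact form. It assumes Python 3, where sys.maxunicode is 0x10ffff, so the loop can be bounded explicitly instead of waiting for chr() to raise. The function name is made up for illustration. The final comparison with str.isspace() is something I'd expect to hold on CPython, but treat that as an assumption to verify on your own interpreter rather than a guarantee.

```python
import sys


def split_whitespace_code_points():
    """Return the code points that str.split() with no arguments splits on."""
    return [
        cp
        for cp in range(sys.maxunicode + 1)
        if len(("a" + chr(cp) + "b").split()) != 1
    ]


code_points = split_whitespace_code_points()
for cp in code_points:
    print(f"0x{cp:06x} ({cp})")

# Assumption to check: the same set of characters reports True for isspace().
isspace_points = [cp for cp in range(sys.maxunicode + 1) if chr(cp).isspace()]
print("matches str.isspace():", code_points == isspace_points)
```

The list it prints depends on the Unicode database bundled with the interpreter, so the exact entries may differ from the output shown above.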
The Python string method isspace() returns True if every character in the string is a whitespace character and the string contains at least one character; otherwise, it returns False. It is used to check whether a string consists only of whitespace characters, such as:
- ' ' – Space
- '\t' – Horizontal tab
- '\n' – Newline
- '\v' – Vertical tab
- '\f' – Form feed
- '\r' – Carriage return
Python String isspace() Method Syntax
Syntax: string.isspace()
Returns:
- True – If all characters in the string are whitespace characters.
- False – If the string is empty or contains 1 or more non-whitespace characters.
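The return values can be illustrated with a few edge cases. The following is a minimal sketch; the sample strings and labels are chosen here for illustration. It shows that an empty string yields False and that isspace() also recognizes Unicode whitespace beyond the ASCII characters listed earlier.

```python
# Sample strings chosen for illustration; the labels are only for display.
samples = {
    "empty string": "",
    "single space": " ",
    "tab and newline": "\t\n",
    "no-break space (U+00A0)": "\u00a0",
    "ideographic space (U+3000)": "\u3000",
    "text with a space": "a b",
}

results = {label: text.isspace() for label, text in samples.items()}
for label, value in results.items():
    print(f"{label}: {value}")

# empty string: False
# single space: True
# tab and newline: True
# no-break space (U+00A0): True
# ideographic space (U+3000): True
# text with a space: False
```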
Python String isspace() Method Example
Python3
    string = "\n\t\n"
    print(string.isspace())
Output:
    True
Example 1: Basic Intuition of isspace() in a Program
Here we check whether several strings consist entirely of whitespace using the isspace() method.
Python3
    string = 'Geeksforgeeks'
    print(string.isspace())

    string = '\n \n \n'
    print(string.isspace())

    string = 'Geeks\nfor\ngeeks'
    print(string.isspace())
Output:
    False
    True
    False
Example 2: Practical Application
Given a string in Python, count the number of whitespace characters in the string.
    Input : string = 'My name is Ayush'
    Output : 3
    Input : string = 'My name is \n\n\n\n\nAyush'
    Output : 8
Algorithm:
- Traverse the given string character by character, checking whether each character is a whitespace character.
- If it is a whitespace character, increment the counter by 1; otherwise, move on to the next character.
- Print the value of the counter.
Python3
    string = 'My name is Ayush'
    count = 0

    # Count every character for which isspace() is true
    for a in string:
        if a.isspace():
            count += 1
    print(count)

    string = 'My name is \n\n\n\n\nAyush'
    count = 0

    # Newlines count as whitespace too
    for a in string:
        if a.isspace():
            count += 1
    print(count)
Output:
    3
    8
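The same counting algorithm can be written more compactly. The sketch below is one possible alternative, not the article's own method; the helper name count_whitespace is hypothetical. It uses sum() with a generator expression instead of an explicit counter.

```python
def count_whitespace(text):
    """Count the characters in text for which str.isspace() is true."""
    return sum(1 for ch in text if ch.isspace())


print(count_whitespace('My name is Ayush'))            # 3
print(count_whitespace('My name is \n\n\n\n\nAyush'))  # 8
```

Wrapping the loop in a function avoids repeating it for each input string.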