I first saw it used when building regular expressions across multiple lines as an argument to re.compile(), so I assumed that r stands for RegEx.
For example:
import re

regex = re.compile(
    r'^[A-Z]'
    r'[A-Z0-9-]'
    r'[A-Z]$', re.IGNORECASE
)
So what does r mean in this case? Why do we need it?
Remi Guan
asked Jan 24, 2011 at 8:48
The r means that the string is to be treated as a raw string, which means all escape codes will be ignored.
For example, '\n' will be treated as a newline character, while r'\n' will be treated as the characters \ followed by n.
When an 'r' or 'R' prefix is present, a character following a backslash is included in the string without change, and all backslashes are left in the string. For example, the string literal r"\n" consists of two characters: a backslash and a lowercase 'n'. String quotes can be escaped with a backslash, but the backslash remains in the string; for example, r"\"" is a valid string literal consisting of two characters: a backslash and a double quote; r"\" is not a valid string literal (even a raw string cannot end in an odd number of backslashes). Specifically, a raw string cannot end in a single backslash (since the backslash would escape the following quote character). Note also that a single backslash followed by a newline is interpreted as those two characters as part of the string, not as a line continuation.
Source: Python string literals
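The rules quoted above can be checked in an interpreter. This is only a minimal sketch: the variable names are illustrative, and compile() is used just so the invalid literal can be tested without breaking the script itself.

```python
# A regular literal translates the escape; a raw literal keeps both characters.
newline = '\n'
raw_newline = r'\n'
print(len(newline), len(raw_newline))  # 1 2

# A quote can follow a backslash in a raw string, but the backslash stays.
escaped_quote = r"\""
print(len(escaped_quote))  # 2

# A raw literal ending in a single backslash is rejected by the parser.
# The source text is built with chr(92) (the backslash) so that the
# invalid literal never appears in this file.
bad_source = 'r"' + chr(92) + '"'
try:
    compile(bad_source, '<example>', 'eval')
    trailing_backslash_ok = True
except SyntaxError:
    trailing_backslash_ok = False
print(trailing_backslash_ok)  # False
```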
answered Jan 24, 2011 at 8:49
It means that escapes won’t be translated. For example, r'\n' is a string with a backslash followed by the letter n. (Without the r it would be a newline.)
b does stand for byte-string and is used in Python 3, where strings are Unicode by default. In Python 2.x strings were byte-strings by default and you’d use u to indicate Unicode.
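To illustrate the prefixes mentioned here, a small sketch assuming Python 3 (the variable names are made up for the example):

```python
# In Python 3, a plain literal is a Unicode str and b'...' is a bytes object.
text = 'abc'
data = b'abc'
print(type(text).__name__, type(data).__name__)  # str bytes

# The u prefix is still accepted in Python 3 and changes nothing.
print(u'abc' == text)  # True

# Prefixes can be combined: rb'...' is a raw bytes literal.
raw_bytes = rb'\n'
print(len(raw_bytes))  # 2
```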
answered Jan 24, 2011 at 8:49
Summary: in this tutorial, you will learn about Python raw strings and how to use them to write strings in which backslashes are literal characters.
Introduction to Python raw strings
In Python, when you prefix a string literal with the letter r or R, as in r'...' and R'...', that string becomes a raw string. Unlike a regular string, a raw string treats backslashes (\) as literal characters.
Raw strings are useful when you deal with strings that have many backslashes, for example, regular expressions or directory paths on Windows.
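As a sketch of why this matters for regular expressions, the example below compares the same pattern written as a raw and as a regular string. The sample text and variable names are invented for illustration.

```python
import re

text = 'cat concat cat.'

# In a raw string, \b reaches the regex engine as a word-boundary token.
raw_matches = re.findall(r'\bcat\b', text)
print(raw_matches)  # ['cat', 'cat']

# In a regular string, '\b' is the backspace character (U+0008),
# so this pattern searches for backspace + 'cat' + backspace.
plain_matches = re.findall('\bcat\b', text)
print(plain_matches)  # []
```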
To represent special characters such as tabs and newlines, Python uses the backslash (\) to signify the start of an escape sequence. For example:
s = 'lang\tver\nPython\t3'
print(s)
Output:
lang    ver
Python  3
However, raw strings treat the backslash (\) as a literal character. For example:
s = r'lang\tver\nPython\t3'
print(s)
Output:
lang\tver\nPython\t3
A raw string is equal to the regular string in which every backslash (\) is written as a double backslash (\\):
s1 = r'lang\tver\nPython\t3'
s2 = 'lang\\tver\\nPython\\t3'
print(s1 == s2)  # True
In a regular string, Python counts an escape sequence as a single character:
s = '\n'
print(len(s))  # 1
However, in a raw string, Python counts the backslash (\) as one character:
s = r'\n'
print(len(s))  # 2
Since the backslash (\) escapes the single quote (') or double quote ("), a raw string cannot end with an odd number of backslashes.
For example:
s = r'\'
Error:
SyntaxError: EOL while scanning string literal
Or:
s = r'\\\'
Error:
SyntaxError: EOL while scanning string literal
In Python 3.10 and later, the message reads "SyntaxError: unterminated string literal (detected at line 1)" instead.
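One common way around this restriction is to supply the final backslash from a regular literal. This is a sketch only; the path and variable names are illustrative.

```python
# Adjacent string literals are joined at compile time, so a raw literal
# can be followed by a regular literal that supplies the final backslash.
folder = r'c:\user\tasks' '\\'
print(folder.endswith('\\'))  # True

# The same result using explicit concatenation.
same_folder = r'c:\user\tasks' + '\\'
print(folder == same_folder)  # True
```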
Use raw strings to handle file paths on Windows
Windows uses backslashes to separate the components of a path. For example:
c:\user\tasks\new
If you use this path as a regular string, Python will raise a syntax error:
dir_path = 'c:\user\tasks\new'
Error:
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \uXXXX escape
Python treats \u in the path as a Unicode escape but couldn’t decode it.
Now, if you escape the first backslash, you’ll have other issues:
dir_path = 'c:\\user\tasks\new'
print(dir_path)
Output:
c:\user asks
ew
In this example, the \t is a tab and the \n is a newline.
To make it easy, you can turn the path into a raw string like this:
dir_path = r'c:\user\tasks\new'
print(dir_path)
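As a quick check, the sketch below compares the raw literal with the version that escapes every backslash by hand. The names escaped_path and parts are introduced only for this example.

```python
# The raw literal and the fully escaped literal produce the same string.
dir_path = r'c:\user\tasks\new'
escaped_path = 'c:\\user\\tasks\\new'
print(dir_path == escaped_path)  # True

# Splitting on the backslash shows that every component survived intact.
parts = dir_path.split('\\')
print(parts)  # ['c:', 'user', 'tasks', 'new']
```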
Convert a regular string into a raw string
A raw string is only a way of writing a literal, so an existing string cannot be converted into one as such. To get a string that shows its escape sequences as they would be written, you can use the built-in repr() function. For example:
s = '\n'
raw_string = repr(s)
print(raw_string)
Output:
'\n'
Note that the resulting string has quotes at the beginning and end. To remove them, you can use slices:
s = '\n'
raw_string = repr(s)[1:-1]
print(raw_string)
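The slicing approach works for simple text, but repr() may also escape quotes. The following sketch shows both cases, using made-up sample strings:

```python
s = 'lang\tver\n'

# repr() returns a new string in which each escape is spelled out.
shown = repr(s)[1:-1]
print(shown)                    # lang\tver\n
print(shown == r'lang\tver\n')  # True

# Caveat: when the text contains both kinds of quotes, repr() escapes
# the single quote, so the slice gains an extra backslash.
quoted = 'it\'s "x"'
sliced = repr(quoted)[1:-1]
print(sliced == quoted)  # False
```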
Summary
- Prefix a string literal with the letter r or R to turn it into a raw string.
- Raw strings treat the backslash as a literal character.