How do I extract numbers from a string in Python pandas?

Given the following data frame:

import pandas as pd
import numpy as np
df = pd.DataFrame({'A': ['1a', np.nan, '10a', '100b', '0b']})
df

    A
0   1a
1   NaN
2   10a
3   100b
4   0b

I'd like to extract the numbers from each cell (where they exist). The desired result is:

    A
0   1
1   NaN
2   10
3   100
4   0

I know it can be done with str.extract, but I'm not sure how.
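One possible way to do this with str.extract is sketched below. It assumes that only the first run of digits in each cell matters; the names extracted and numeric are just illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['1a', np.nan, '10a', '100b', '0b']})

# (\d+) captures the first run of digits in each cell.
# expand=False returns a Series; cells without a match (including NaN) stay NaN.
extracted = df['A'].str.extract(r'(\d+)', expand=False)

# The extracted values are still strings; convert them if numbers are needed.
# Because of the NaN, the resulting column is floating point.
numeric = pd.to_numeric(extracted)

print(extracted)
print(numeric)
```

Assigning extracted back to df['A'] gives the layout shown in the desired result, with the digits kept as strings.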

In this article, we will learn to extract the numbers from a given string in Python.

Table Of Contents

  • What is a String in Python
  • Extract numbers from string using isdigit() in List Comprehension
  • Extract numbers from string using re.findall() method
  • Extract numbers from string using split() and append() methods
  • Extract numbers from string using nums_from_string library

What is a String in Python

A string is a sequence of Unicode characters enclosed in single, double or triple quotes. The enclosed characters can be digits, letters or special symbols. A string is ordinary, human-readable text. Strings are immutable in Python, which means that once a string object is created it cannot be changed.
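A small sketch of what immutability means in practice; the variable names are only for illustration.

```python
text = 'runs'

# Item assignment is not supported on str objects, so this raises TypeError.
try:
    text[0] = 'R'
    changed_in_place = True
except TypeError:
    changed_in_place = False

# "Changing" a string really means building a new string object.
capitalized = 'R' + text[1:]

print(changed_in_place, text, capitalized)
```

The original string is left untouched; only the new name refers to the modified text.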

Here we have a string that is made up of numbers and letters:

string_var = 'MSD scored 10773  runs in ODI cricket at the avg of 50 in 350 matched.'

# type() will print the data type of string_var
print(type(string_var))

OUTPUT :

<class 'str'>

You can see we have a string with some numbers in it. Our job is to extract those numbers using Python.

Extract numbers from string using isdigit() in List Comprehension

In this method we combine three tools to extract numbers from a given string: a list comprehension, the isdigit() method and the split() method.

A list comprehension is a compact syntax for building a new list, optionally filtering values with a condition. In this method,

  • The split() method converts the string to a list of substrings.
  • The list comprehension iterates over this list of substrings.
  • During iteration, the isdigit() method checks whether each substring consists only of digits.

This way we can extract all whole numbers from a string into a list. Let’s see the complete example,

EXAMPLE :

string_var = 'MSD scored 10773  runs in ODI cricket at the avg of 50.58 in 350 matched.'

numbers = [int(word) for word in string_var.split() if word.isdigit()]

print(numbers)

# type() will print the data type of numbers
print(type(numbers))

OUTPUT :

[10773, 350]
<class 'list'>

Here you can see that combining these three tools successfully extracts numbers from a string. But this method has a flaw: it does not pick up the average, 50.58, because isdigit() returns False for a substring that contains a decimal point.
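The flaw can be checked directly. The short sentence below is a made-up sample, not the string from the example.

```python
words = 'avg of 50.58 in 350 matches'.split()

# str.isdigit() is True only when every character is a digit,
# so the decimal point in '50.58' makes the check fail.
checks = {word: word.isdigit() for word in words}

print(checks['350'], checks['50.58'])

# A leading minus sign has the same effect.
negative_is_digit = '-5'.isdigit()
print(negative_is_digit)
```

So this approach is only suitable when the text contains non-negative whole numbers separated by whitespace.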

Extract numbers from string using re.findall() method

Now we will use the findall() function of Python's re module. The re module provides regular expression support and comes bundled with Python's standard library.

Regular expressions use the backslash character ('\') to indicate special forms. re.findall() scans the given string from left to right and looks for every non-overlapping match of the specified pattern, which here describes a run of digits with an optional sign and decimal part. It returns a list with all the matching substrings. Let's see an example.

EXAMPLE :

import re

string_var = 'MSD scored 10773  runs in ODI cricket at the avg of 50.58 in 350 matched.'

x = [float(num) for num in re.findall(r'-?\d+\.?\d*', string_var)]

print(x)

OUTPUT :

[10773.0, 50.58, 350.0]

In the above example you can see that re.findall() found all the numbers in string_var, and a list comprehension converted them to floats in the list x.
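If whole numbers should stay as int rather than become float, a variation like the following could be used. This is a sketch: extract_numbers is a hypothetical helper, and its pattern differs slightly from the one above in that it requires digits after the decimal point.

```python
import re

def extract_numbers(text):
    """Return the numbers in text: int when there is no decimal point, float otherwise."""
    numbers = []
    # -?          optional minus sign
    # \d+         one or more digits
    # (?:\.\d+)?  optional decimal part (non-capturing, so findall returns whole matches)
    for match in re.findall(r'-?\d+(?:\.\d+)?', text):
        numbers.append(float(match) if '.' in match else int(match))
    return numbers

result = extract_numbers('MSD scored 10773  runs in ODI cricket at the avg of 50.58 in 350 matched.')
print(result)
```

The group is written as non-capturing because re.findall() returns only the captured groups when a pattern contains capturing groups.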

Extract numbers from string using split() and append() methods

Another way to extract numbers from a given string is to combine the split() and append() methods. In this method we use split() to break the given string into words and append() to add each numeric word to a list.

  • split() : A built-in string method that splits a string into a list.
  • append() : A built-in list method that adds an item to the end of a list.

Let's see an example of this method.

EXAMPLE :

string_var = 'MSD scored 10773  runs in ODI cricket at the avg of 50.58 in 350 matched.'
x = []

# Iterate over the words in the string
for i in string_var.split():
    try:
        # Convert the word to float and add it to the list
        x.append(float(i))
    except ValueError:
        pass

print(x)

OUTPUT :

[10773.0, 50.58, 350.0]

In the above example, you can see how we used both split() and append() to extract numbers from string_var. Here we always catch ValueError. If try and except are not used, the code raises an error like this:

    x.append(float(i))
ValueError: could not convert string to float: 'MSD'

Basically, we iterated over all the words in the string, converted each word to float and added it to the list. If a word was not numeric, float() raised a ValueError, which we caught and skipped.
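One limitation of this approach is that a number followed directly by punctuation, such as '50,', is not a valid float and gets skipped. A possible workaround is sketched below; numbers_from_words and its trailing parameter are hypothetical names chosen for illustration.

```python
def numbers_from_words(text, trailing=',;:!?'):
    """Collect every whitespace-separated word that float() accepts."""
    numbers = []
    for word in text.split():
        try:
            # Drop trailing punctuation before converting, so '50,' becomes '50'.
            numbers.append(float(word.rstrip(trailing)))
        except ValueError:
            # Not a number: skip this word.
            continue
    return numbers

result = numbers_from_words('He made 50, then 7.5! and -3 more')
print(result)
```

Note that float() also accepts words such as 'nan' and 'inf', so text containing them would need an extra check.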

Extract numbers from string using nums_from_string library

The next method uses the get_nums() function of the nums_from_string library. This library does not come bundled with Python, so we have to install it first: just type pip install nums_from_string in your terminal. Once it is installed, this is the easiest way to extract numbers from a string.

Look at the code below.

EXAMPLE :

import nums_from_string

string_var = 'MSD scored 10773  runs in ODI cricket at the avg of 50.58 in 350 matched.'
print(nums_from_string.get_nums(string_var))

OUTPUT :

[10773, 50.58, 350]

You can see in the above example that nums_from_string successfully extracts numbers from the string without our specifying any data type such as float or int.

Summary

So we have seen four different methods for extracting numbers from a string in Python. The easiest is get_nums(), a function of the nums_from_string library. Its only drawback is that it does not come bundled with Python, so you have to install it. The isdigit() method may be less useful because it does not extract floating-point numbers. In the split() and append() method you have to handle errors, otherwise it raises a ValueError. We used Python 3.10.1 for the example code. To check your version, type python --version in your terminal.

How do I extract numbers from a string in Python?

Summary: To extract numbers from a given string in Python you can use one of the following methods:

  • Use the regex module.
  • Use split() and append() functions on a list.
  • Use a List Comprehension with isdigit() and split() functions.
  • Use the nums_from_string module.

How do I extract numbers from a string?

With our Ultimate Suite added to your Excel ribbon, this is how you can quickly retrieve a number from any alphanumeric string:

  • Go to the Ablebits Data tab > Text group, and click Extract.
  • Select all cells with the source strings.
  • On the Extract tool's pane, select the Extract numbers radio button.

How do I extract a numeric value from a column?

Step 1 - Identify numbers. The ISNUMBER function checks whether a value is a number and returns TRUE or FALSE. Function syntax: ISNUMBER(value) ...
Step 2 - Filter numbers. The FILTER function extracts values/rows based on a condition or criteria.

How do you extract values from a DataFrame in Python?

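The get_value() function was used to quickly retrieve a single value in the data frame at the passed row and column labels. It was deprecated and then removed in pandas 1.0; the .at accessor (or .iat for integer positions) now does the same job. The input is the row label and the column label.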
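A minimal sketch of single-value lookup with the .at and .iat accessors, which work on current pandas versions. The frame and its labels are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': ['1', '10', '100']}, index=['x', 'y', 'z'])

# .at takes a row label and a column label.
value_by_label = df.at['y', 'A']

# .iat takes integer positions: third row, first column.
value_by_position = df.iat[2, 0]

print(value_by_label, value_by_position)
```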
