Guide: "does not contain" in Python (pandas)

I've done some searching and found how to filter a DataFrame by

df["col"].str.contains(word)

however, I'm wondering if there is a way to do the reverse: filter a DataFrame by that set's complement, e.g. to the effect of

!(df["col"].str.contains(word))

Can this be done through a DataFrame method?

asked Jun 13, 2013 at 21:43

You can use the invert (~) operator (which acts like a not for boolean data):

new_df = df[~df["col"].str.contains(word)]

where new_df is the copy returned by the right-hand side.

contains also accepts a regular expression...

If the above throws a ValueError or TypeError, the likely cause is mixed datatypes or missing values in the column, so use na=False:

new_df = df[~df["col"].str.contains(word, na=False)]

Or,

new_df = df[df["col"].str.contains(word) == False]
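To make the inversion concrete, here is a minimal sketch on made-up data. The DataFrame contents and the value of `word` are assumptions for illustration only; the calls themselves (`str.contains` with `na=False`, and `~` on the resulting boolean mask) are the ones described above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one text column that includes a missing value.
df = pd.DataFrame({"col": ["apple pie", "banana", np.nan, "apple tart"]})
word = "apple"

# na=False makes missing values count as "does not contain",
# so the mask is purely boolean and can be inverted with ~.
mask = df["col"].str.contains(word, na=False)
new_df = df[~mask]

print(new_df)
```

Note that `word` is interpreted as a regular expression by default; passing `regex=False` should make it a literal substring match instead.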

answered Jun 13, 2013 at 21:51

Andy Hayden

I was having trouble with the not (~) symbol as well, so here's another way from another StackOverflow thread:

df[df["col"].str.contains('this|that')==False]

answered Dec 15, 2016 at 21:10

nanselm2

You can use apply with a lambda:

df[df["col"].apply(lambda x: word not in x)]

Or, if you want to define a more complex rule, you can use and:

df[df["col"].apply(lambda x: word_1 not in x and word_2 not in x)]
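One caveat with the apply approach: `word not in x` raises a TypeError when `x` is not a string (for example a NaN float). A possible sketch that guards against this follows; the helper name `lacks_word` and the sample data are hypothetical, and treating non-string cells as "does not contain" is an assumption you may want to change.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with a missing value in the column.
df = pd.DataFrame({"col": ["red fox", "blue jay", np.nan, "red panda"]})
word = "red"

def lacks_word(value):
    # Non-string cells (such as NaN) are treated as not containing the word.
    return not (isinstance(value, str) and word in value)

filtered = df[df["col"].apply(lacks_word)]
print(filtered)
```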

answered Jan 14, 2019 at 3:13

Arash

The main answers are already posted.

I am adding a pattern for searching for multiple words and excluding the matching rows from a DataFrame.

Here 'word1', 'word2', 'word3', 'word4' is the list of patterns to search for,

df is the DataFrame, and

column_a is a column name in df.

values_to_remove = ['word1','word2','word3','word4']

pattern = '|'.join(values_to_remove)

result = df.loc[~df['column_a'].str.contains(pattern, case=False)]
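Because the joined pattern is treated as a regular expression, words containing characters such as `(`, `+` or `.` can misbehave or raise an error. A hedged variation on the snippet above escapes each word with `re.escape` first; the sample data and word list are made up for illustration, and `na=False` is added on the assumption that missing values should be kept.

```python
import re

import pandas as pd

# Hypothetical sample data; some values contain regex metacharacters.
df = pd.DataFrame(
    {"column_a": ["Price (USD)", "c++ guide", "plain text", "WORD1 here"]}
)
values_to_remove = ["(usd)", "c++", "word1"]

# Escape each word so it is matched literally, then join with "|" (regex OR).
pattern = "|".join(re.escape(v) for v in values_to_remove)

# case=False ignores case; na=False treats missing values as non-matches.
result = df.loc[~df["column_a"].str.contains(pattern, case=False, na=False)]
print(result)
```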

answered Feb 8, 2019 at 13:37

Nursnaaz

I had to get rid of the NULL values before using the command recommended by Andy above. An example:

import pandas as pd

df = pd.DataFrame(index=[0, 1, 2], columns=['first', 'second', 'third'])
df.loc[:, 'first'] = 'myword'
df.loc[0, 'second'] = 'myword'
df.loc[2, 'second'] = 'myword'
df.loc[1, 'third'] = 'myword'
df

    first   second  third
0   myword  myword  NaN
1   myword  NaN     myword
2   myword  myword  NaN

Now running the command:

~df["second"].str.contains(word)

I get the following error:

TypeError: bad operand type for unary ~: 'float'

I got rid of the NULL values using dropna() or fillna() first and retried the command with no problem.
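For reference, the two workarounds mentioned above might look like the sketch below. The data is a cut-down, hypothetical version of the example, and the two options are not equivalent: dropna() discards the rows with missing values, while fillna("") keeps them as non-matching rows.

```python
import numpy as np
import pandas as pd

# Hypothetical cut-down version of the example: one column with a NaN.
df = pd.DataFrame({"second": ["myword", np.nan, "myword"]})
word = "myword"

# Option 1: drop rows with missing values first, then filter.
no_na = df.dropna(subset=["second"])
after_dropna = no_na[~no_na["second"].str.contains(word)]

# Option 2: replace missing values with an empty string, then filter.
# The NaN row survives because "" does not contain the word.
after_fillna = df[~df["second"].fillna("").str.contains(word)]

print(after_dropna)
print(after_fillna)
```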

answered Nov 22, 2016 at 22:06

Shoresh

In addition to nanselm2's answer, you can use 0 instead of False:

df["col"].str.contains(word)==0

answered Oct 16, 2018 at 7:01

U12-Forward

To negate your query, use ~. Using query has the advantage of returning the valid observations of df directly:

df.query('~col.str.contains("word").values')
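A small sketch of the query approach on made-up data follows. Passing `engine="python"` is an assumption on my part: string methods inside `query` are generally evaluated by the Python engine, so requesting it explicitly tends to avoid engine-related errors, but behaviour may differ between pandas versions.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"col": ["a word here", "nothing", "another word"]})

# ~ negates the boolean result of str.contains inside the query string.
result = df.query('~col.str.contains("word")', engine="python")
print(result)
```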

answered Apr 16 at 21:09

rachwa

To complement the above question, if someone wants to remove all the rows with strings, one could do:

df_new=df[~df['col_name'].apply(lambda x: isinstance(x, str))]

answered Aug 5, 2021 at 14:28

vasanth

Somehow '.contains' didn't work for me, but when I tried '.isin', as mentioned by @kenan in the answer to (How to drop rows from pandas data frame that contains a particular string in a particular column?), it worked. In addition, if you want to look at the entire dataframe and remove the rows that have a specific word (or set of words), just use the loop below:

for col in df.columns:
    df = df[~df[col].isin(['string or string list separated by comma'])]

Just remove ~ to get the dataframe that contains the word.
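Keep in mind that isin matches whole cell values exactly, whereas str.contains matches substrings, so the two can give different results. The sketch below contrasts them across every column of a made-up DataFrame; the data and variable names are hypothetical, and casting each column with astype(str) is an assumption to cope with non-string columns.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame(
    {
        "a": ["spam", "eggs", "ham and spam", "bread"],
        "b": ["toast", "spam", "jam", "butter"],
    }
)

# Exact match: drop rows where any cell equals "spam".
exact = df[~df.isin(["spam"]).any(axis=1)]

# Substring match: drop rows where any cell contains "spam".
has_spam = df.apply(
    lambda s: s.astype(str).str.contains("spam", regex=False)
).any(axis=1)
substring = df[~has_spam]

print(exact)
print(substring)
```

Here the row with "ham and spam" survives the exact-match filter but not the substring one.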

answered Jun 15 at 12:03

Bhanu Chander
