Scatter plot with condition python

Suppose I have a data frame

name = ['A', 'B', 'C'] 
score = [2,4,6] 

I want to create a scatter plot with the following conditions: color a bubble green if its score is greater than 3 and red otherwise. I'd also like to label each bubble with its respective name.

So far I'm only able to create a scatter plot in which each bubble is labeled with its name.

j.doe

asked May 11, 2019 at 7:21

You can use a list comprehension to build a list of colors, one per score, and pass it to the c parameter of scatter() to set the color of each point.

To label the bubbles, call annotate() on the axes, as in the example below.

import matplotlib.pyplot as plt

name = ['A', 'B', 'C'] 
score = [2,4,6] 

# Set color for every score
color = ['green' if x>3 else 'red' for x in score]

# Create scatter plot
fig, ax = plt.subplots()
ax.scatter(name, score, c=color)

# Set label for every score inside scatter plot
for i, n in enumerate(name):
    ax.annotate(n, (n, score[i]))

plt.show()
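
If the data really is held in a pandas DataFrame, as the question suggests, the same idea might be sketched as follows. This is only an illustration: the column names, the THRESHOLD constant and the label offset are assumptions, and the Agg backend is selected just so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove it for an interactive window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Assumed layout: one row per bubble
df = pd.DataFrame({"name": ["A", "B", "C"], "score": [2, 4, 6]})

# Vectorised form of the condition: green above the threshold, red otherwise
THRESHOLD = 3
df["color"] = np.where(df["score"] > THRESHOLD, "green", "red")

fig, ax = plt.subplots()
ax.scatter(df["name"], df["score"], c=df["color"].tolist())

# Shift each label a few points so it does not sit on top of its marker
for n, s in zip(df["name"], df["score"]):
    ax.annotate(n, (n, s), xytext=(5, 5), textcoords="offset points")

# Call plt.show() or fig.savefig("scatter.png") to view the result
```

np.where keeps the condition in one place, which is convenient if the threshold later changes.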

answered May 11, 2019 at 10:24

ilja

    A conditioning plot (also called a co-plot or subset plot) is a scatter plot of two variables conditioned on a third variable, called the conditioning variable. The conditioning variable can be either continuous or categorical. For a continuous variable, subsets are created by dividing its range into smaller intervals. For a categorical variable, the subsets are the categories themselves.

    Take three variables X, Y and Z, where Z is the conditioning variable that we divide into k groups. The groups can be formed in several ways:

    • By dividing the data into k groups of equal size.
    • By dividing the data into clusters on the basis of the scatter plot.
    • By dividing the range of the data into intervals of equal width.
    • By using the natural grouping of categorical data, i.e. the categories present in the dataframe.
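
Two of the grouping strategies listed above (equal-size groups and equal-width ranges) can be sketched with pandas. The data here is synthetic and the variable names are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd

# Synthetic continuous conditioning variable Z (assumption: uniform on 0..100)
rng = np.random.default_rng(0)
z = pd.Series(rng.uniform(0, 100, size=200), name="Z")
k = 4

# Equal-size groups: quantile bins, so each group holds the same number of points
equal_count = pd.qcut(z, q=k, labels=False)

# Equal-width groups: the range of Z is split into k intervals of the same length
equal_width = pd.cut(z, bins=k, labels=False)

print(equal_count.value_counts().sort_index())
print(equal_width.value_counts().sort_index())
```

Quantile bins give balanced panels, while equal-width bins keep the interval boundaries easy to interpret; which is preferable depends on how Z is distributed.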

    Then we plot a matrix of n rows and m columns, where n*m >= k. Each (row, column) cell holds an individual scatter plot with the following components:

    • Vertical Axis: Variable Y
    • Horizontal Axis: Variable X

    Only the points in the group corresponding to row i and column j are plotted in that cell.
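
The grid layout described above can be built by hand with matplotlib. The following is a rough sketch on synthetic data: the way n and m are chosen (a near-square grid) is one possible convention, not something the article prescribes.

```python
import math

import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch
import matplotlib.pyplot as plt
import numpy as np

# Synthetic X, Y and an integer-coded conditioning variable Z (assumptions)
rng = np.random.default_rng(1)
x = rng.normal(size=300)
z = rng.integers(0, 5, size=300)
y = x * (z + 1) + rng.normal(size=300)

groups = np.unique(z)
k = len(groups)
m = math.ceil(math.sqrt(k))  # number of columns
n = math.ceil(k / m)         # number of rows, chosen so that n * m >= k

fig, axes = plt.subplots(n, m, sharex=True, sharey=True, squeeze=False)
for idx, group in enumerate(groups):
    i, j = divmod(idx, m)    # row i, column j of the panel for this group
    mask = z == group
    axes[i, j].scatter(x[mask], y[mask], s=10)
    axes[i, j].set_xlabel("X")
    axes[i, j].set_ylabel("Y")
    axes[i, j].set_title(f"Z = {group}")

# Hide the panels left over when n * m > k
for idx in range(k, n * m):
    i, j = divmod(idx, m)
    axes[i, j].set_visible(False)
```

Sharing the axes across panels makes it easier to compare the groups by eye.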

    The conditioning plot provides the answer to the following questions:

    • Is there any relationship between the two variables?
    • If there is a relationship, does its nature depend on the third variable?
    • Do different groups in the data behave similarly?
    • Are there any outliers in the data?
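
As a numeric companion to the questions above, one could compare the overall correlation with the correlation inside each group. This is only a sketch on synthetic data in which the slope is deliberately made to depend on Z; the column names are assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n = 200
df = pd.DataFrame({
    "X": rng.normal(size=n),
    "Z": rng.choice(["a", "b"], size=n),
})

# Synthetic relationship: slope +2 in group "a", slope -2 in group "b"
slope = np.where(df["Z"] == "a", 2.0, -2.0)
df["Y"] = slope * df["X"] + rng.normal(scale=0.5, size=n)

# Correlation ignoring Z, then within each level of Z
overall = df["X"].corr(df["Y"])
per_group = {level: g["X"].corr(g["Y"]) for level, g in df.groupby("Z")}

print(f"overall: {overall:.2f}")
for level, r in per_group.items():
    print(f"Z = {level}: {r:.2f}")
```

In data like this the pooled correlation hides two strong but opposite relationships, which is the situation a conditioning plot is meant to expose.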

    Implementation

    Python3

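    # Jupyter-only magic; remove this line when running as a plain script
    %matplotlib inline

    import numpy as np
    import seaborn as sns
    import matplotlib.pyplot as plt
    import pandas as pd

    # Load the Titanic training data and preview the first rows
    titanic_dataset = pd.read_csv('train.csv')
    titanic_dataset.head()

    # Fare vs. Age, conditioned on Sex (one panel per value)
    sns.lmplot(x='Age', y='Fare', hue='Survived', col='Sex', data=titanic_dataset)

    # Fare vs. Age, conditioned on passenger class
    sns.lmplot(x='Age', y='Fare', hue='Survived', col='Pclass', data=titanic_dataset)

    # Split the continuous conditioning variable (Age) into two ranges
    df1 = titanic_dataset.loc[titanic_dataset['Age'] < 20]
    df2 = titanic_dataset.loc[titanic_dataset['Age'] >= 20]

    # Fare vs. Parch for passengers younger than 20
    lm = sns.lmplot(x='Parch', y='Fare', hue='Survived', data=df1)
    ax1 = plt.gca()
    ax1.set_title('Age < 20')

    # Fare vs. Parch for passengers aged 20 and over
    lm_2 = sns.lmplot(x='Parch', y='Fare', hue='Survived', data=df2)
    ax2 = plt.gca()
    ax2.set_title('Age >= 20')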
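
The age split above could also be expressed as a binned column passed to col, so that a single lmplot call draws both panels side by side. The sketch below uses a synthetic stand-in for train.csv; the AgeGroup column name and its labels are assumptions introduced for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop it in a notebook
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for train.csv (assumption: only the columns used below)
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "Age": rng.uniform(1, 80, size=n),
    "Parch": rng.integers(0, 4, size=n),
    "Fare": rng.uniform(5, 100, size=n),
    "Survived": rng.integers(0, 2, size=n),
})

# Bin the continuous conditioning variable into [0, 20) and [20, inf)
df["AgeGroup"] = pd.cut(
    df["Age"],
    bins=[0, 20, np.inf],
    right=False,
    labels=["Age < 20", "Age >= 20"],
)

# One panel per age group, drawn by a single call
g = sns.lmplot(data=df, x="Parch", y="Fare", hue="Survived", col="AgeGroup", ci=None)
```

With the real dataset, rows with a missing Age would get a missing AgeGroup and would not appear in either panel.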

    [Figure: conditional plot on the basis of Sex]

    [Figure: conditional plot on the basis of Passenger_Class]

    [Figure: conditional plot on the basis of Age]

    What is a conditional scatter plot?

    A conditional plot, also known as a coplot or subset plot, is a plot of two variables conditional on the value of a third variable (called the conditioning variable).

    How do you make a scatter plot in Python?

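    Scatterplot example:

    import numpy as np
    import matplotlib.pyplot as plt

    # Create data (x and y are missing from the snippet; random values are assumed here)
    N = 500
    x = np.random.rand(N)
    y = np.random.rand(N)
    colors = (0, 0, 0)  # a single RGB color (black)
    area = np.pi * 3

    # Plot; a single color is passed via the color keyword
    plt.scatter(x, y, s=area, color=colors, alpha=0.5)
    plt.show()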

    How do you plot multiple features in Python?

    You can plot multiple lines from the data in an array using matplotlib by passing different columns of the array as the x and y arguments of the matplotlib.pyplot.plot() function.
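
A minimal sketch of that idea, assuming a 2-D NumPy array whose first column holds the x values and whose remaining columns hold one series each (the layout and labels are illustrative):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch
import matplotlib.pyplot as plt
import numpy as np

# Column 0 is x; columns 1 and 2 are two y series (assumed layout)
t = np.linspace(0, 2 * np.pi, 100)
data = np.column_stack([t, np.sin(t), np.cos(t)])

fig, ax = plt.subplots()
ax.plot(data[:, 0], data[:, 1], label="sin")
ax.plot(data[:, 0], data[:, 2], label="cos")
ax.legend()

# Call plt.show() or fig.savefig("lines.png") to view the result
```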
