How do I print a list of column names in Python?

    When analyzing real datasets, which are often very large, we frequently need the column names before we can operate on specific columns. This article shows several ways to get the column names of a pandas DataFrame. First, let’s create a simple DataFrame from the nba.csv file.

    Python3

    import pandas as pd

    # Load the dataset into a DataFrame
    data = pd.read_csv("nba.csv")

    Now let’s get the column names from the dataset above.

    Method #1: Simply iterating over columns 

    Python3

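    import pandas as pd

    # Load the dataset into a DataFrame
    data = pd.read_csv("nba.csv")

    # Iterating over data.columns yields one column label at a time
    for col in data.columns:
        print(col)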


    Method #2: Using the columns attribute of the DataFrame object

    Python3

    import pandas as pd

    data = pd.read_csv("nba.csv")

    # Convert the Index of column labels to a plain Python list
    print(list(data.columns))


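    Method #3: Using the keys() method, which also returns the column labels of the DataFrame.

    Python3

    import pandas as pd

    data = pd.read_csv("nba.csv")

    # keys() returns the same Index of labels as data.columns
    print(data.keys())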

    Method #4: Using columns.values, an attribute that returns the column labels as an array.

    Python3

    import pandas as pd

    data = pd.read_csv("nba.csv")

    # columns.values is an array of labels; list() turns it into a list
    print(list(data.columns.values))


    Method #5: Using the tolist() method on columns.values, which returns the column names as a list.

    Python3

    import pandas as pd

    data = pd.read_csv("nba.csv")

    # tolist() already returns a list, so no extra list() call is needed
    print(data.columns.values.tolist())


    Method #6: Using sorted(): the built-in sorted() function returns the list of column names in alphabetical order.

    Python3

    import pandas as pd

    data = pd.read_csv("nba.csv")

    # Iterating a DataFrame yields its column labels, so sorted() sorts the names
    print(sorted(data))


    How do you get or print pandas DataFrame column names? Use the DataFrame.columns.values attribute to get them as an array, and call tolist() on it to get them as a list. Each column in a pandas DataFrame has a label (name) that identifies the data it holds. Getting the column names is useful when you want to access all columns by name programmatically or manipulate the values of every column. This article explains different ways to get column names from pandas DataFrame headers, with examples.

    To get a list of columns from the DataFrame header, use DataFrame.columns.values.tolist(). Each part of the expression is explained below.

    • .columns returns an Index object holding the column names, in their original order.
    • .columns.values returns those names as an array; the array's .tolist() method converts it to a list of column names.
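
To make the chain concrete, the sketch below prints the type produced at each step. The two-column DataFrame and its labels are made up for illustration, and the exact array type returned by .values can depend on the pandas version.

```python
import pandas as pd

# Hypothetical two-column frame used only for this illustration
df = pd.DataFrame({"name": ["a", "b"], "score": [1, 2]})

cols = df.columns          # Index object holding the labels
arr = df.columns.values    # array holding the same labels
names = arr.tolist()       # plain Python list

print(type(cols).__name__)
print(type(arr).__name__)
print(names)               # ['name', 'score']
```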

    1. Quick Examples of Getting Column Names

    The following are some quick examples of how to get column names from a pandas DataFrame. If you want to print them to the console, use the print() function.

    
    # Below are some quick examples

    # Get the list of all column names from the header
    column_names = list(df.columns.values)

    # Same result, using the array's tolist() method
    column_names = df.columns.values.tolist()

    # Using list(df.columns) to get the column headers as a list
    column_names = list(df.columns)

    # Using list(df) to get the list of all column names
    column_names = list(df)

    # Get the column names as an alphabetically sorted list
    column_names = sorted(df)

    # Print each column header label on its own line
    for column_headers in df.columns:
        print(column_headers)

    # Using keys() to get the column names as a list
    column_names = df.keys().values.tolist()

    # Get all numeric columns (_get_numeric_data() is a private pandas method)
    numeric_columns = df._get_numeric_data().columns.values.tolist()

    # Get only the columns whose dtype is int64
    numeric_columns = df.dtypes[df.dtypes == "int64"].index.values.tolist()
    

    Now create a pandas DataFrame from a dict with a few rows and the column names Courses, Fee, Duration, and Discount.

    
    import pandas as pd
    import numpy as np
    
    technologies= {
        'Courses':["Spark","PySpark","Hadoop","Python","Pandas"],
        'Fee' :[22000,25000,23000,24000,26000],
        'Duration':['30days','50days','30days', None,np.nan],
        'Discount':[1000,2300,1000,1200,2500]
              }
    df = pd.DataFrame(technologies)
    print(df)
    

    You can get the column names from a pandas DataFrame using df.columns.values and pass the result to Python's list() function to get a list; you can then print it with print(). Here is what happens in this expression: the df.columns attribute returns an Index object, the basic pandas object that stores axis labels. The Index object provides a values property that returns its data as an array, which in this case contains the column names.

    Note that df.columns preserves the order of the columns as-is.

    To convert the array of column names into a list, either call .tolist() on the array or pass the array to list().
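
For string labels the two conversions give the same list. A small difference appears with numeric labels, as this sketch suggests; the frame with integer column labels is a made-up example, not part of the data used in this article.

```python
import pandas as pd

# Hypothetical frame whose column labels are integers rather than strings
df_int = pd.DataFrame([[1, 2]], columns=[10, 20])

via_list = list(df_int.columns.values)        # elements keep their NumPy scalar type
via_tolist = df_int.columns.values.tolist()   # elements become built-in Python ints

print(type(via_list[0]).__name__)
print(type(via_tolist[0]).__name__)   # int
print(via_list == via_tolist)         # True: the values still compare equal
```

If the names are going to be serialized (for example to JSON), tolist() is usually the more convenient choice because it produces built-in Python types.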

    
    # Get the list of all column names from headers
    column_headers = list(df.columns.values)
    print("The Column Header :", column_headers)
    

    This yields the output below.

    
    The Column Header : ['Courses', 'Fee', 'Duration', 'Discount']
    

    You can also use df.columns.values.tolist() to get the DataFrame column names.

    
    # Get the list of all column names from headers
    column_headers = df.columns.values.tolist()
    print("The Column Header :", column_headers)
    

    3. Use list(df) to Get Column Names from DataFrame

    Use list(df) to get the column headers from a pandas DataFrame; iterating over a DataFrame yields its column labels. You can also use list(df.columns) to get the column names.

    
    # Using list(df.columns) to get the column headers as a list
    column_headers = list(df.columns)

    # Using list(df) to get the list of all column names
    column_headers = list(df)
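
As a quick check that both forms agree, the sketch below uses a trimmed, two-column copy of the example data (an assumption made to keep it short). It also shows that membership tests on a DataFrame look at column labels.

```python
import pandas as pd

# Trimmed copy of the example data, for illustration only
df = pd.DataFrame({"Courses": ["Spark"], "Fee": [22000]})

same = list(df) == list(df.columns)
has_fee = "Fee" in df              # membership checks the column labels
labels = [label for label in df]   # iterating a DataFrame yields labels

print(same)      # True
print(has_fee)   # True
print(labels)    # ['Courses', 'Fee']
```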
    

    4. Get Column Names in Sorted Order

    To get a list of column names in sorted order, use sorted(df). This function returns the column names in alphabetical order.

    
    # Get the column names as an alphabetically sorted list
    col_headers = sorted(df)
    print(col_headers)
    

    This yields the output below. Notice that the order differs from the earlier output.

    
    ['Courses', 'Discount', 'Duration', 'Fee']
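
Note that sorted(df) only builds a new list and leaves the DataFrame's own column order unchanged. The sketch below, using a one-row stand-in for the example data, shows that and one possible way to reorder the columns with the sorted names.

```python
import pandas as pd

# One-row stand-in for the example data, for illustration only
df = pd.DataFrame({
    "Courses": ["Spark"],
    "Fee": [22000],
    "Duration": ["30days"],
    "Discount": [1000],
})

sorted_names = sorted(df)        # new list; df itself is untouched
print(list(df.columns))          # ['Courses', 'Fee', 'Duration', 'Discount']

df_sorted = df[sorted_names]     # new frame with columns in sorted order
print(list(df_sorted.columns))   # ['Courses', 'Discount', 'Duration', 'Fee']

# Case-insensitive ordering, useful when labels mix upper and lower case
mixed = sorted(["fee", "Courses", "discount"], key=str.lower)
print(mixed)                     # ['Courses', 'discount', 'fee']
```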
    

    5. Access All Column Names by Iterating

    Sometimes you need to iterate over all columns and apply some function to each one. You can do this as shown below.

    
    # Print each column header label on its own line
    for column_headers in df.columns:
        print(column_headers)
    

    This yields the output below.

    
    Courses
    Fee
    Duration
    Discount
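
Printing is only the simplest use of the loop. As a rough sketch of applying a function per column, the example below counts the non-null values in each column; the two-column frame and the name non_null are made up for illustration.

```python
import pandas as pd

# Small illustrative frame, not the full example data
df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [22000, 25000]})

# Collect one piece of information per column: the count of non-null values
non_null = {}
for name in df.columns:
    non_null[name] = int(df[name].notna().sum())

print(non_null)   # {'Courses': 2, 'Fee': 2}

# df.items() yields (label, Series) pairs when the column data is needed too
for name, series in df.items():
    print(name, len(series))
```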
    

    6. Get Column Headers Using the keys() Method

    df.keys() is another way to get the column names from a pandas DataFrame. It returns the same Index as df.columns, so call .values.tolist() on the result to get a list.

    
    # Using keys() to get the column names as a list
    column_headers = df.keys().values.tolist()
    print("The Column Header :", column_headers)
    

    This yields the output below.

    
    The Column Header : ['Courses', 'Fee', 'Duration', 'Discount']
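
The relationship between keys() and columns can be checked directly. This is a minimal sketch on a trimmed, made-up copy of the example data; it also shows that the Index has its own tolist() method, so going through .values is optional.

```python
import pandas as pd

# Trimmed copy of the example data, for illustration only
df = pd.DataFrame({"Courses": ["Spark"], "Fee": [22000]})

keys = df.keys()
same_labels = keys.equals(df.columns)
key_list = keys.tolist()   # Index.tolist() works without .values

print(same_labels)   # True
print(key_list)      # ['Courses', 'Fee']
```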
    

    7. Get All Numeric Column Names

    Sometimes in analytics work you need only the numeric columns, which means getting all columns of a specific data type. For example, you can get all numeric columns using the undocumented method df._get_numeric_data(). The leading underscore marks it as private, so it may change between pandas versions.

    
    # Get all numeric columns
    numeric_columns = df._get_numeric_data().columns.values.tolist()
    print(numeric_columns)
    

    This yields the output below.

    
    ['Fee', 'Discount']
    

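    You can also filter df.dtypes by data type, for example with df.dtypes[df.dtypes == "int64"].index. This is another simple way to find numeric columns in a pandas DataFrame, although it matches only the columns stored as int64.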

    
    # Get only the columns whose dtype is int64
    numeric_columns = df.dtypes[df.dtypes == "int64"].index.values.tolist()
    

    This yields the same output as above.
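
A public alternative that this article does not cover is DataFrame.select_dtypes(). The sketch below assumes a small made-up frame with an extra float column, Rating, to show that include="number" picks up both integer and float columns, whereas the int64 filter would skip the float one.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [22000, 25000],
    "Rating": [4.5, 4.0],   # hypothetical float column, not in the original data
})

# Public API: select every numeric column regardless of its exact dtype
numeric_columns = df.select_dtypes(include="number").columns.tolist()
print(numeric_columns)   # ['Fee', 'Rating']

# The dtype filter from the text matches int64 columns only
int_only = df.dtypes[df.dtypes == "int64"].index.tolist()
print(int_only)
```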

    9. Complete Example of pandas Get Column Names

    
    import pandas as pd
    import numpy as np
    
    technologies= {
        'Courses':["Spark","PySpark","Hadoop","Python","Pandas"],
        'Fee' :[22000,25000,23000,24000,26000],
        'Duration':['30days','50days','30days', None,np.nan],
        'Discount':[1000,2300,1000,1200,2500]
              }
    df = pd.DataFrame(technologies)
    print(df)
    
    # Get the list of all column names from the header
    column_headers = list(df.columns.values)
    print("The Column Header :", column_headers)

    # Same result, using the array's tolist() method
    column_headers = df.columns.values.tolist()
    print("The Column Header :", column_headers)

    # Using list(df.columns) to get the column headers as a list
    column_headers = list(df.columns)

    # Using list(df) to get the list of all column names
    column_headers = list(df)

    # Get the column names as an alphabetically sorted list
    col_headers = sorted(df)
    print(col_headers)

    # Print each column header label on its own line
    for column_headers in df.columns:
        print(column_headers)

    # Using keys() to get the column names as a list
    column_headers = df.keys().values.tolist()
    print("The Column Header :", column_headers)

    # Get all numeric columns (_get_numeric_data() is a private pandas method)
    numeric_columns = df._get_numeric_data().columns.values.tolist()
    print(numeric_columns)

    # Get only the columns whose dtype is int64
    numeric_columns = df.dtypes[df.dtypes == "int64"].index.values.tolist()
    print(numeric_columns)
    

    Conclusion

    In this article, you have learned how to get or print the column names using df.columns, list(df), and df.keys(), how to get the names of all integer columns, and how to get the column names in sorted order.

    Happy Learning !!


    How do I get a list of column names in Python?

    You can get the column names from a pandas DataFrame using df.columns.values and pass the result to Python's list() function to get a list. Once you have the list, you can print it with print().

    How do I print a list of column names?

    There are several ways to print the column names of a DataFrame:
    • Use the DataFrame.columns attribute to get all the names.
    • Use the keys() method.
    • Iterate over the columns.
    • Use list() to get the names as a list.
    • Use tolist() to get the names as a list.
    • Use sorted() to get an ordered list.

    How do I print multiple columns in Python?

    There are three basic methods you can use to select multiple columns of a pandas DataFrame:
    • Method 1: Select columns by index: df_new = df.iloc[:, [0,1,3]]
    • Method 2: Select columns in an index range: df_new = df.iloc[:, 0:3]
    • Method 3: Select columns by name: df_new = df[['col1', 'col2']]
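
The three selections can be tried on a small frame. This is a sketch under the assumption of four made-up columns named col1 to col4; the variable names by_position, by_range, and by_name are illustrative.

```python
import pandas as pd

# Hypothetical four-column frame used only for this illustration
df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4], "col3": [5, 6], "col4": [7, 8]})

by_position = df.iloc[:, [0, 1, 3]]   # columns at positions 0, 1 and 3
by_range = df.iloc[:, 0:3]            # positions 0 through 2; the end is exclusive
by_name = df[["col1", "col2"]]        # columns chosen by label

print(list(by_position.columns))   # ['col1', 'col2', 'col4']
print(list(by_range.columns))      # ['col1', 'col2', 'col3']
print(list(by_name.columns))       # ['col1', 'col2']
```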

    How do you print columns in Python?

    There are several ways to print with column alignment in Python:
    • Use % formatting.
    • Use the str.format() method.
    • Use f-strings.
    • Use the expandtabs() method.
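
As a brief sketch of the four approaches, the example below formats one made-up row (a course name and a fee) with a 10-character left-aligned name column. The 8-character right-aligned fee column and the tab size of 10 are arbitrary choices for illustration.

```python
name, fee = "Spark", 22000   # illustrative values

line_percent = "%-10s%8d" % (name, fee)          # % formatting
line_format = "{:<10}{:>8}".format(name, fee)    # str.format()
line_fstring = f"{name:<10}{fee:>8}"             # f-string
line_tabs = f"{name}\t{fee}".expandtabs(10)      # tab stops every 10 columns

for line in (line_percent, line_format, line_fstring, line_tabs):
    print(repr(line))
```

The first three lines produce the same padded string. expandtabs() only pads up to the next tab stop, so the fee is left-aligned at column 10 instead of right-aligned.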