Hello, readers! In this article, we will be focusing on different ways to print column names in Python. So, let us get started! We often come across questions and problem statements wherein we feel the need to deal with data in an excel or
csv file i.e. in the form of rows and columns. Python, as a programming language, provides us with a data structure called ‘DataFrame’ to deal with rows and columns. A Python DataFrame consists of rows and columns and the Pandas module offers us various functions to manipulate and deal with the data occupied within these rows and columns. Today, we will be having a look at the various different ways through which we can fetch and display the column header/names of a dataframe or a csv file. We would be referring the below csv
file in the below examples–First, where do you find columns in Python?
1. Using pandas.dataframe.columns to print column names in Python
We can use pandas.dataframe.columns
variable to print the column tags or headers at ease. Have a look at the below syntax!
Example:
import pandas file = pandas.read_csv["D:/Edwisor_Project - Loan_Defaulter/bank-loan.csv"] for col in file.columns: print[col]
In this example, we have loaded the csv file into the environment. Further, we have printed the column names through a for loop using dataframe.columns variable.
Output:
age ed employ address income debtinc creddebt othdebt default
2. Using pandas.dataframe.columns.values
Python provides us with pandas.dataframe.columns.values
to extract the column names from the dataframe or csv file and print them.
Syntax:
Example:
import pandas file = pandas.read_csv["D:/Edwisor_Project - Loan_Defaulter/bank-loan.csv"] print[file.columns.values]
So, the data.columns.values gives us a list of column names/headers present in the dataframe.
Output:
['age' 'ed' 'employ' 'address' 'income' 'debtinc' 'creddebt' 'othdebt' 'default']
3. Python sorted[] method to get the column names
Python sorted[]
method can be used to get the list of column names of a dataframe in an
ascending order of columns.
Have a look at the below syntax!
Syntax:
Example:
import pandas file = pandas.read_csv["D:/Edwisor_Project - Loan_Defaulter/bank-loan.csv"] print[sorted[file]]
Output:
['address', 'age', 'creddebt', 'debtinc', 'default', 'ed', 'employ', 'income', 'othdebt']
Conclusion
By this, we have come to the end of this topic. Hope this article turns out to be a hack for you in terms of different solutions for a single problem statement.
For more such posts related to Python, Stay tuned and till then, Happy Learning!! 🙂
Comment: still not working when header of
fieldnames = ['Cascade', 'Market', 'Schedule', 'Name]
and
Sheet[['Cascade', 'Schedule', 'Name', 'Market']]
are equal.
Keep order of fieldnames
in col_idx
, was not my initial goal.
Question: I want to fetch data by column name
The following OOP
solution will work:
class OrderedByName[]:
"""
Privides a generator method, to iterate in Column Name ordered sequence
Provides subscription, to get columns index by name. using class[name]
"""
def __init__[self, sheet, fieldnames, row=0]:
"""
Create a OrderedDict {name:index} from 'fieldnames'
:param sheet: The Worksheet to use
:param fieldnames: Ordered List of Column Names
:param row: Default Row Index for the Header Row
"""
from collections import OrderedDict
self.columns = OrderedDict[].fromkeys[fieldnames, None]
for n in range[sheet.ncols]:
self.columns[sheet.cell[row, n].value] = n
@property
def ncols[self]:
"""
Generator, equal usage as range[xlrd.ncols],
to iterate columns in ordered sequence
:return: yield Column index
"""
for idx in self.columns.values[]:
yield idx
def __getitem__[self, item]:
"""
Make class object subscriptable
:param item: Column Name
:return: Columns index
"""
return self.columns[item]
Usage:
# Worksheet Data
sheet[[['Schedule', 'Cascade', 'Market'],
['SF05UB0', 'DO Macro Upgrade', 'Upper Cnetral Valley'],
['DE03HO0', 'DO Macro Upgrade', 'Toledo'],
['SF73XC4', 'DO Macro Upgrade', 'SF Bay']]
]
# Instantiate with Ordered List of Column Names
# NOTE the different Order of Column Names
by_name = OrderedByName[sheet, ['Cascade', 'Market', 'Schedule']]
# Iterate all Rows and all Columns Ordered as instantiated
for row in range[sheet.nrows]:
for col in by_name.ncols:
value = sheet.cell[row, col].value
print["cell[{}].value == {}".format[[row,col], value]]
Output:
cell[[0, 1]].value == Cascade cell[[0, 2]].value == Market cell[[0, 0]].value == Schedule cell[[1, 1]].value == DO Macro Upgrade cell[[1, 2]].value == Upper Cnetral Valley cell[[1, 0]].value == SF05UB0 cell[[2, 1]].value == DO Macro Upgrade cell[[2, 2]].value == Toledo cell[[2, 0]].value == DE03HO0 cell[[3, 1]].value == DO Macro Upgrade cell[[3, 2]].value == SF Bay cell[[3, 0]].value == SF73XC4
Get Index of one Column by Name
print["cell{}.value == {}".format[[1, by_name['Schedule']], sheet.cell[1, by_name['Schedule']].value]] #>>> cell[1, 0].value == SF05UB0
Tested with Python: 3.5