How do I read a CSV file row-wise in Python?
There are two ways to get all (or specific) columns with plain Python code:
1. csv.DictReader: easy to understand.
2. csv.reader with zip: less obvious, but efficient.
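Because the csv.reader with zip approach is the less obvious of the two, here is a minimal sketch of it. The helper name `columns_with_zip` is hypothetical, and the in-memory `SAMPLE` text is an assumption that mirrors a few of the students.csv rows used in this article. Note that `zip(*rows)` has to consume every row before it can produce columns, so this variant suits files that fit in memory.

```python
import csv
import io

# Illustrative sample (an assumption): a few rows in the shape of students.csv.
SAMPLE = "Id,Name,Course\n21,Mark,Python\n22,John,Python\n32,Shaun,Java\n"


def columns_with_zip(text):
    """Transpose CSV rows into columns using csv.reader and zip."""
    rows = csv.reader(io.StringIO(text))
    header = next(rows)  # the first row holds the column names
    # zip(*rows) regroups the i-th cell of every row into one tuple per column.
    return dict(zip(header, zip(*rows)))


columns = columns_with_zip(SAMPLE)
print(columns["Name"])  # ('Mark', 'John', 'Shaun')
```

Each column comes back as a tuple of strings keyed by its header name, so a specific column is a plain dictionary lookup.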
In this article we will discuss how to read a CSV file line by line, with or without its header, and how to select specific columns while iterating over the file.

Suppose we have a CSV file, students.csv, with the following contents:

    Id,Name,Course,City,Session
    21,Mark,Python,London,Morning
    22,John,Python,Tokyo,Evening
    23,Sam,Python,Paris,Morning
    32,Shaun,Java,Tokyo,Morning

We want to read the rows of this file one at a time and process each row as we go. We do not want to read all lines into a list of lists and then iterate over it, because that is inefficient for large CSV files, i.e. files that are gigabytes in size. Instead, we are looking for solutions that read and process only one line at a time while iterating through the rows, so that memory use stays minimal. Let's see how to do this.

Python's csv module provides two tools for reading the contents of a CSV file: csv.reader and csv.DictReader. Let's use them one by one to read a CSV file line by line.

Read a CSV file line by line using csv.reader

With a csv module reader object we can iterate over the lines of a CSV file and receive each line as a list of values, where each value in the list is a cell value. Let's look at an example:

    from csv import reader

    # open the file in read mode; newline='' lets the csv module handle line endings
    with open('students.csv', 'r', newline='') as read_obj:
        # pass the file object to reader() to get the reader object
        csv_reader = reader(read_obj)
        # iterate over each row in the csv using the reader object
        for row in csv_reader:
            # row is a list that represents a row in the csv
            print(row)

Output:

    ['Id', 'Name', 'Course', 'City', 'Session']
    ['21', 'Mark', 'Python', 'London', 'Morning']
    ['22', 'John', 'Python', 'Tokyo', 'Evening']
    ['23', 'Sam', 'Python', 'Paris', 'Morning']
    ['32', 'Shaun', 'Java', 'Tokyo', 'Morning']

It iterates over all the rows of students.csv. For each row it fetches the contents of that row as a list and prints that list.

How did it work? The code opens the file in read mode, passes the file object to reader() to get a reader object, and then iterates over that reader object with a for loop, which yields one row at a time as a list.
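A minimal sketch of those steps, written so that each row is processed and then discarded: it tallies students per course without ever building a list of rows. The `count_by_course` helper and the in-memory `SAMPLE` are illustrative assumptions. With a real file you would pass `open('students.csv', newline='')` instead of the `io.StringIO` object.

```python
import csv
import io
from collections import Counter

# Illustrative stand-in for students.csv (an assumption for this sketch).
SAMPLE = (
    "Id,Name,Course,City,Session\n"
    "21,Mark,Python,London,Morning\n"
    "22,John,Python,Tokyo,Evening\n"
    "23,Sam,Python,Paris,Morning\n"
    "32,Shaun,Java,Tokyo,Morning\n"
)


def count_by_course(file_obj):
    """Stream rows from an open CSV file and count students per course."""
    csv_reader = csv.reader(file_obj)
    next(csv_reader, None)       # consume the header row, if there is one
    counts = Counter()
    for row in csv_reader:       # the reader yields one row (a list) at a time
        counts[row[2]] += 1      # index 2 is the Course column
    return counts


print(count_by_course(io.StringIO(SAMPLE)))
```

Only the running totals are kept between iterations, so memory use does not grow with the number of rows.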
This way only one line is in memory at a time while iterating through the CSV file, which makes it a memory-efficient solution.

In the previous example we iterated through all the rows of the CSV file, including the header. But suppose we want to skip the header and iterate over the remaining rows:

    from csv import reader

    # read the header first, then iterate over each remaining row of the csv as a list
    with open('students.csv', 'r', newline='') as read_obj:
        csv_reader = reader(read_obj)
        # next() with a default returns None instead of raising if the file is empty
        header = next(csv_reader, None)
        # check that the file is not empty
        if header is not None:
            # iterate over each row after the header in the csv
            for row in csv_reader:
                # row is a list that represents a row in the csv
                print(row)
            print('Header was:')
            print(header)

Output:

    ['21', 'Mark', 'Python', 'London', 'Morning']
    ['22', 'John', 'Python', 'Tokyo', 'Evening']
    ['23', 'Sam', 'Python', 'Paris', 'Morning']
    ['32', 'Shaun', 'Java', 'Tokyo', 'Morning']
    Header was:
    ['Id', 'Name', 'Course', 'City', 'Session']

It skipped the header row of the CSV file and iterated over all the remaining rows of students.csv. For each row it fetched the contents of that row as a list and printed that list. It initially saved the header row in a separate variable and printed it at the end.

How did it work? The reader() function returns an iterator object, which we can use with a Python for loop to iterate over the rows. In the above example we first called next() on this iterator, which returned the first row of the CSV file. After that we used the iterator with a for loop to iterate over the remaining rows.

Read a CSV file line by line using the csv module's DictReader object

With a csv module DictReader object we can iterate over the lines of a CSV file and receive each line as a dictionary, in which the keys are the column names and the values are the cell values for that row:
    from csv import DictReader

    # open the file in read mode
    with open('students.csv', 'r', newline='') as read_obj:
        # pass the file object to DictReader() to get the DictReader object
        csv_dict_reader = DictReader(read_obj)
        # iterate over each line as a dictionary
        for row in csv_dict_reader:
            # row is a dictionary that represents a row in the csv
            print(row)

Output:

    {'Id': '21', 'Name': 'Mark', 'Course': 'Python', 'City': 'London', 'Session': 'Morning'}
    {'Id': '22', 'Name': 'John', 'Course': 'Python', 'City': 'Tokyo', 'Session': 'Evening'}
    {'Id': '23', 'Name': 'Sam', 'Course': 'Python', 'City': 'Paris', 'Session': 'Morning'}
    {'Id': '32', 'Name': 'Shaun', 'Course': 'Java', 'City': 'Tokyo', 'Session': 'Morning'}

It iterates over all the rows of students.csv. For each row it fetches the contents of that row as a dictionary and prints that dictionary.

How did it work? The code opens the file in read mode, passes the file object to DictReader() to get a DictReader object, and then iterates over that object, which takes the keys from the header row and yields each remaining row as a dictionary.
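The same behaviour can be sketched by driving the DictReader by hand with next(), which makes the laziness visible: the header is read the first time it is needed, and each call after that returns exactly one row as a dictionary. The in-memory `SAMPLE` stands in for students.csv and is an assumption for illustration.

```python
import csv
import io

# Illustrative stand-in for the first rows of students.csv (an assumption).
SAMPLE = "Id,Name,Course\n21,Mark,Python\n22,John,Python\n"

csv_dict_reader = csv.DictReader(io.StringIO(SAMPLE))

# Accessing fieldnames reads the header line and nothing else.
print(csv_dict_reader.fieldnames)   # ['Id', 'Name', 'Course']

first = next(csv_dict_reader)       # parses only the first data row
print(first["Name"])                # Mark

second = next(csv_dict_reader)      # parses the second data row
print(second)

# Once the rows are exhausted, next() with a default returns that default.
print(next(csv_dict_reader, None))  # None
```

Every value arrives as a string (for example '21' rather than 21), so numeric columns need an explicit conversion such as int(row['Id']).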
It is a memory-efficient solution, because only one line is in memory at a time.

Get column names from the header of a CSV file

The DictReader class has a fieldnames attribute that returns the column names of the CSV file as a list:

    from csv import DictReader

    # open the file in read mode
    with open('students.csv', 'r', newline='') as read_obj:
        # pass the file object to DictReader() to get the DictReader object
        csv_dict_reader = DictReader(read_obj)
        # get the column names from the csv file
        column_names = csv_dict_reader.fieldnames
        print(column_names)

Output:

    ['Id', 'Name', 'Course', 'City', 'Session']

Read specific columns from a CSV file while iterating line by line

Read specific columns (by column name) in a CSV file while iterating row by row

Iterate over all the rows of students.csv line by line, but print only two columns for each row:

    from csv import DictReader

    # iterate over each line as a dictionary and print only a few columns, by column name
    with open('students.csv', 'r', newline='') as read_obj:
        csv_dict_reader = DictReader(read_obj)
        for row in csv_dict_reader:
            print(row['Id'], row['Name'])

Output:

    21 Mark
    22 John
    23 Sam
    32 Shaun

DictReader returns a dictionary for each line during iteration. In this dictionary the keys are column names and the values are the cell values for that column. So, to select specific columns in every row, we index the dictionary with the column name.

Read specific columns (by column number) in a CSV file while iterating row by row

Iterate over all the rows of students.csv and, for each row, print the contents of the 2nd and 3rd columns:

    from csv import reader

    # iterate over each line as a list and print only a few columns, by column number
    with open('students.csv', 'r', newline='') as read_obj:
        csv_reader = reader(read_obj)
        for row in csv_reader:
            print(row[1], row[2])

Output:

    Name Course
    Mark Python
    John Python
    Sam Python
    Shaun Java

With csv.reader each row of the CSV file is fetched as a list of values, where each value represents a column value.
So, to select the 2nd and 3rd columns of each row, we pick the elements at index 1 and 2 from the list.

The complete example is as follows:

    from csv import reader
    from csv import DictReader


    def main():
        print('*** Read csv file line by line using csv module reader object ***')
        print('*** Iterate over each row of a csv file as list using reader object ***')
        # open the file in read mode
        with open('students.csv', 'r', newline='') as read_obj:
            # pass the file object to reader() to get the reader object
            csv_reader = reader(read_obj)
            # iterate over each row in the csv using the reader object
            for row in csv_reader:
                # row is a list that represents a row in the csv
                print(row)

        print('*** Read csv line by line without header ***')
        # read the header first, then iterate over each remaining row of the csv as a list
        with open('students.csv', 'r', newline='') as read_obj:
            csv_reader = reader(read_obj)
            # next() with a default returns None instead of raising if the file is empty
            header = next(csv_reader, None)
            # check that the file is not empty
            if header is not None:
                # iterate over each row after the header in the csv
                for row in csv_reader:
                    print(row)
                print('Header was: ')
                print(header)

        print('*** Read csv file line by line using csv module DictReader object ***')
        with open('students.csv', 'r', newline='') as read_obj:
            # pass the file object to DictReader() to get the DictReader object
            csv_dict_reader = DictReader(read_obj)
            # iterate over each line as a dictionary
            for row in csv_dict_reader:
                # row is a dictionary that represents a row in the csv
                print(row)

        print('*** select elements by column name while reading csv file line by line ***')
        with open('students.csv', 'r', newline='') as read_obj:
            csv_dict_reader = DictReader(read_obj)
            for row in csv_dict_reader:
                print(row['Name'], 'is from', row['City'], 'and he is studying', row['Course'])

        print('*** Get column names from header in csv file ***')
        with open('students.csv', 'r', newline='') as read_obj:
            csv_dict_reader = DictReader(read_obj)
            # get the column names from the csv file
            column_names = csv_dict_reader.fieldnames
            print(column_names)

        print('*** Read specific columns from a csv file while iterating line by line ***')
        print('*** Read specific columns (by column name) in a csv file while iterating row by row ***')
        # iterate over each line as a dictionary and print only a few columns, by column name
        with open('students.csv', 'r', newline='') as read_obj:
            csv_dict_reader = DictReader(read_obj)
            for row in csv_dict_reader:
                print(row['Id'], row['Name'])

        print('*** Read specific columns (by column Number) in a csv file while iterating row by row ***')
        # iterate over each line as a list and print only a few columns, by column number
        with open('students.csv', 'r', newline='') as read_obj:
            csv_reader = reader(read_obj)
            for row in csv_reader:
                print(row[1], row[2])


    if __name__ == '__main__':
        main()

Output:

    *** Read csv file line by line using csv module reader object ***
    *** Iterate over each row of a csv file as list using reader object ***
    ['Id', 'Name', 'Course', 'City', 'Session']
    ['21', 'Mark', 'Python', 'London', 'Morning']
    ['22', 'John', 'Python', 'Tokyo', 'Evening']
    ['23', 'Sam', 'Python', 'Paris', 'Morning']
    ['32', 'Shaun', 'Java', 'Tokyo', 'Morning']
    *** Read csv line by line without header ***
    ['21', 'Mark', 'Python', 'London', 'Morning']
    ['22', 'John', 'Python', 'Tokyo', 'Evening']
    ['23', 'Sam', 'Python', 'Paris', 'Morning']
    ['32', 'Shaun', 'Java', 'Tokyo', 'Morning']
    Header was: 
    ['Id', 'Name', 'Course', 'City', 'Session']
    *** Read csv file line by line using csv module DictReader object ***
    {'Id': '21', 'Name': 'Mark', 'Course': 'Python', 'City': 'London', 'Session': 'Morning'}
    {'Id': '22', 'Name': 'John', 'Course': 'Python', 'City': 'Tokyo', 'Session': 'Evening'}
    {'Id': '23', 'Name': 'Sam', 'Course': 'Python', 'City': 'Paris', 'Session': 'Morning'}
    {'Id': '32', 'Name': 'Shaun', 'Course': 'Java', 'City': 'Tokyo', 'Session': 'Morning'}
    *** select elements by column name while reading csv file line by line ***
    Mark is from London and he is studying Python
    John is from Tokyo and he is studying Python
    Sam is from Paris and he is studying Python
    Shaun is from Tokyo and he is studying Java
    *** Get column names from header in csv file ***
    ['Id', 'Name', 'Course', 'City', 'Session']
    *** Read specific columns from a csv file while iterating line by line ***
    *** Read specific columns (by column name) in a csv file while iterating row by row ***
    21 Mark
    22 John
    23 Sam
    32 Shaun
    *** Read specific columns (by column Number) in a csv file while iterating row by row ***
    Name Course
    Mark Python
    John Python
    Sam Python
    Shaun Java

How do I read column-wise data from a CSV file in Python?

Import the csv library, open the file in read mode, and use DictReader() to read the data of the CSV file. DictReader works like a regular reader, but it maps each row to a dictionary whose keys are the column names and whose values are the cell values.
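One way to sketch that column-wise read: stream the rows with DictReader and append each value to a list kept per column name. The `read_columns` helper is hypothetical and the in-memory `SAMPLE` mirrors students.csv for illustration. The result holds every value in memory, so this fits files of modest size.

```python
import csv
import io
from collections import defaultdict

# Illustrative stand-in for students.csv (an assumption for this sketch).
SAMPLE = (
    "Id,Name,Course,City,Session\n"
    "21,Mark,Python,London,Morning\n"
    "22,John,Python,Tokyo,Evening\n"
    "23,Sam,Python,Paris,Morning\n"
    "32,Shaun,Java,Tokyo,Morning\n"
)


def read_columns(file_obj):
    """Return a dict that maps each column name to the list of its values."""
    columns = defaultdict(list)
    for row in csv.DictReader(file_obj):   # each row is a dict keyed by column name
        for name, value in row.items():
            columns[name].append(value)
    return dict(columns)


columns = read_columns(io.StringIO(SAMPLE))
print(columns["City"])  # ['London', 'Tokyo', 'Paris', 'Tokyo']
```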
How do I read a CSV file row by row in Python using pandas?

15 ways to read CSV file with pandas:
Example 1: Read CSV file with header row.
Example 2: Read CSV file with header in second row.
Example 3: Skip rows but keep header.
Example 4: Read CSV file without header row.
Example 5: Specify missing values.
Example 6: Set index column.
Example 7: Read CSV file from external URL.

How do you read a CSV file in a table in Python?

Steps to read a CSV file:
1. Import the csv library: import csv.
2. Open the CSV file. The ...
3. Use the csv.reader object to read the CSV file: csvreader = csv.reader(file).
4. Extract the field names. Create an empty list called header. ...
5. Extract the rows/records. ...
6. Close the file.

How do I make pandas read only a few rows?

nrows : int, default None. Number of rows of file to read. Useful for reading pieces of large files.
skiprows : list-like or integer. Row numbers to skip (0-indexed) or number of rows to skip (int) at the start of the file.
chunksize : int, default None. Return TextFileReader object for iteration.
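As a rough illustration of those three pandas.read_csv parameters, the sketch below reads from an in-memory sample instead of a file. It assumes pandas is installed, and the `SAMPLE` data is an assumption that mirrors students.csv.

```python
import io

import pandas as pd

# Illustrative stand-in for students.csv (an assumption for this sketch).
SAMPLE = (
    "Id,Name,Course,City,Session\n"
    "21,Mark,Python,London,Morning\n"
    "22,John,Python,Tokyo,Evening\n"
    "23,Sam,Python,Paris,Morning\n"
    "32,Shaun,Java,Tokyo,Morning\n"
)

# nrows: read only the first two data rows (the header is not counted).
first_two = pd.read_csv(io.StringIO(SAMPLE), nrows=2)
print(first_two["Name"].tolist())       # ['Mark', 'John']

# skiprows: a list skips those 0-indexed file lines; line 0 is the header here,
# so skipping line 1 drops the first data row.
without_first = pd.read_csv(io.StringIO(SAMPLE), skiprows=[1])
print(without_first["Name"].tolist())   # ['John', 'Sam', 'Shaun']

# chunksize: iterate over DataFrames that hold at most two rows each.
chunk_sizes = [len(chunk) for chunk in pd.read_csv(io.StringIO(SAMPLE), chunksize=2)]
print(chunk_sizes)                      # [2, 2]
```

With chunksize, only one chunk is held in memory at a time, which is the pandas counterpart of the row-by-row iteration shown with the csv module.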