Python csv count value in column

I am currently trying to count repeated values in a column of a CSV file and return the value to another CSV column in a python.

For example, my CSV file :

KeyID    GeneralID
145258   KL456
145259   BG486
145260   HJ789
145261   KL456

What I want to achieve is to count how many data have the same GeneralID and insert it into a new CSV column. For example,

KeyID    Total_GeneralID
145258   2
145259   1
145260   1
145261   2

I have tried to split each column using the split method but it didn't work so well.

My code :

case_id_list_data = []

with open[file_path_1, "rU"] as g:
    for line in g:
        case_id_list_data.append[line.split['\t']]
        #print case_id_list_data[0][0] #the result is dissatisfying 
        #I'm stuck here.. 

Stephen Rauch

45.7k30 gold badges105 silver badges126 bronze badges

asked Apr 25, 2017 at 5:21

And if you are adverse to pandas and want to stay with the standard library:

Code:

import csv
from collections import Counter
with open['file1', 'rU'] as f:
    reader = csv.reader[f, delimiter='\t']
    header = next[reader]
    lines = [line for line in reader]
    counts = Counter[[l[1] for l in lines]]

new_lines = [l + [str[counts[l[1]]]] for l in lines]
with open['file2', 'wb'] as f:
    writer = csv.writer[f, delimiter='\t']
    writer.writerow[header + ['Total_GeneralID']]
    writer.writerows[new_lines]

Results:

KeyID   GeneralID   Total_GeneralID
145258  KL456   2
145259  BG486   1
145260  HJ789   1
145261  KL456   2

answered Apr 25, 2017 at 5:42

Stephen RauchStephen Rauch

45.7k30 gold badges105 silver badges126 bronze badges

2

You have to divide the task in three steps: 1. Read CSV file 2. Generate new column's value 3. Add value to the file back import csv import fileinput import sys

# 1. Read CSV file
# This is opening CSV and reading value from it.
with open["dev.csv"] as filein:
    reader = csv.reader[filein, skipinitialspace = True]
    xs, ys = zip[*reader]

result=["Total_GeneralID"]

# 2. Generate new column's value
# This loop is for counting the "GeneralID" element.
for i in range[1,len[ys],1]:
    result.append[ys.count[ys[i]]]

# 3. Add value to the file back
# This loop is for writing new column
for ind,line in enumerate[fileinput.input["dev.csv",inplace=True]]:
    sys.stdout.write["{} {}, {}\n".format["",line.rstrip[],result[ind]]]

I haven't use temp file or any high level module like panda or anything.

answered Apr 25, 2017 at 6:39

Raj DamaniRaj Damani

7756 silver badges19 bronze badges

2

import pandas as pd
#read your csv to a dataframe
df = pd.read_csv['file_path_1']
#generate the Total_GeneralID by counting the values in the GeneralID column and extract the occurrance for the current row.
df['Total_GeneralID'] = df.GeneralID.apply[lambda x: df.GeneralID.value_counts[][x]]
df = df[['KeyID','Total_GeneralID']]
Out[442]: 
    KeyID  Total_GeneralID
0  145258                2
1  145259                1
2  145260                1
3  145261                2

answered Apr 25, 2017 at 5:29

Allen QinAllen Qin

18.8k6 gold badges48 silver badges62 bronze badges

You can use pandas library:

  • first read_csv
  • get counts of values in column GeneralID by value_counts, rename by output column
  • join to original DataFrame
import pandas as pd

df = pd.read_csv['file']
s = df['GeneralID'].value_counts[].rename['Total_GeneralID']
df = df.join[s, on='GeneralID']
print [df]
    KeyID GeneralID  Total_GeneralID
0  145258     KL456                2
1  145259     BG486                1
2  145260     HJ789                1
3  145261     KL456                2

answered Apr 25, 2017 at 5:37

jezraeljezrael

756k83 gold badges1203 silver badges1139 bronze badges

Use csv.reader instead of split[] method. Its easier.

Thanks

answered Apr 25, 2017 at 6:48

Vikram SharmaVikram Sharma

1992 gold badges2 silver badges13 bronze badges

How do you count values in a column in Python?

We can count by using the value_counts[] method. This function is used to count the values present in the entire dataframe and also count values in a particular column.

How do I count data in a CSV file in Python?

Use len[] and list[] on a CSV reader to count lines in a CSV file.
Open the CSV file within Python using the open[file] function with file as a CSV file..
Create a CSV reader by calling the function csv. ... .
Get a list representation of the CSV file by calling list[[*args]] with *args as the reader from the previous step..

How do I count the number of entries in a column in pandas?

How do you Count the Number of Occurrences in a data frame? To count the number of occurrences in e.g. a column in a dataframe you can use Pandas value_counts[] method. For example, if you type df['condition']. value_counts[] you will get the frequency of each unique value in the column “condition”.

How do I count the number of rows and columns in a csv file in Python?

To get the number of rows, and columns we can use len[df. axes[]] function in Python.

Chủ Đề