You can use the following basic syntax to group rows by month in a pandas DataFrame:
df.groupby(df.your_date_column.dt.month)['values_column'].sum()
This particular formula groups the rows by the month of your_date_column and calculates the sum of the values in values_column for each month.
Note that the dt.month accessor extracts the month number from a datetime column in pandas.
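The dt accessor is only available on datetime columns. As a minimal sketch, assuming a hypothetical DataFrame whose dates arrive as plain strings (for example, from a CSV file), the column can be converted with pd.to_datetime before grouping:

```python
import pandas as pd

# Hypothetical data: the dates are plain strings, not datetimes
df = pd.DataFrame({'date': ['2020-01-05', '2020-01-12', '2020-02-02'],
                   'sales': [6, 8, 13]})

# Convert to datetime64 so that the .dt accessor becomes available
df['date'] = pd.to_datetime(df['date'])

# Group by month number and sum the sales in each month
monthly_sales = df.groupby(df['date'].dt.month)['sales'].sum()
print(monthly_sales)
```

Without the conversion step, accessing .dt on a column of strings raises an AttributeError.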
The following example shows how to use this syntax in practice.
Suppose we have the following pandas DataFrame that shows the sales made by some company on various dates:
import pandas as pd
#create DataFrame
df = pd.DataFrame({'date': pd.date_range(start='1/1/2020', freq='W', periods=10),
                   'sales': [6, 8, 9, 11, 13, 8, 8, 15, 22, 9],
                   'returns': [0, 3, 2, 2, 1, 3, 2, 4, 1, 5]})
#view DataFrame
print(df)
        date  sales  returns
0 2020-01-05      6        0
1 2020-01-12      8        3
2 2020-01-19      9        2
3 2020-01-26     11        2
4 2020-02-02     13        1
5 2020-02-09      8        3
6 2020-02-16      8        2
7 2020-02-23     15        4
8 2020-03-01     22        1
9 2020-03-08      9        5
Related: How to Create a Date Range in Pandas
We can use the following syntax to calculate the sum of sales grouped by month:
#calculate sum of sales grouped by month
df.groupby(df.date.dt.month)['sales'].sum()
date
1 34
2 44
3 31
Name: sales, dtype: int64
Here's how to interpret the output:
- The total sales made during month 1 (January) was 34.
- The total sales made during month 2 (February) was 44.
- The total sales made during month 3 (March) was 31.
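One caveat worth keeping in mind: the month number alone does not distinguish between years, so January 2020 and January 2021 would land in the same group. The sketch below, which uses hypothetical data spanning two years, contrasts grouping by dt.month with grouping by a monthly period from dt.to_period('M'), which is one possible way to keep the years apart:

```python
import pandas as pd

# Hypothetical data spanning two calendar years
df = pd.DataFrame({'date': pd.to_datetime(['2020-01-05', '2020-01-19',
                                           '2021-01-03', '2021-02-07']),
                   'sales': [6, 9, 4, 10]})

# The month number alone merges January 2020 with January 2021
by_month_number = df.groupby(df['date'].dt.month)['sales'].sum()

# A monthly period keeps each year's months separate
by_year_month = df.groupby(df['date'].dt.to_period('M'))['sales'].sum()

print(by_month_number)
print(by_year_month)
```

Here by_month_number has two groups (months 1 and 2), while by_year_month has three (2020-01, 2021-01 and 2021-02).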
We can use similar syntax to calculate the max of the sales values grouped by month:
#calculate max of sales grouped by month
df.groupby(df.date.dt.month)['sales'].max()
date
1 11
2 15
3 22
Name: sales, dtype: int64
We can use similar syntax to calculate any statistic we like, grouped by the month value of a date column.
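As an illustration of this, several statistics can be computed in one pass with agg. The sketch below reuses the example DataFrame from this tutorial; the output column names (total_sales, mean_sales, max_returns) are made up for the illustration:

```python
import pandas as pd

df = pd.DataFrame({'date': pd.date_range(start='1/1/2020', freq='W', periods=10),
                   'sales': [6, 8, 9, 11, 13, 8, 8, 15, 22, 9],
                   'returns': [0, 3, 2, 2, 1, 3, 2, 4, 1, 5]})

# Named aggregation: new_column=(source_column, aggregation)
summary = df.groupby(df['date'].dt.month).agg(
    total_sales=('sales', 'sum'),
    mean_sales=('sales', 'mean'),
    max_returns=('returns', 'max'),
)
print(summary)
```

The result is a DataFrame with one row per month and one column per requested statistic.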
Note: The complete documentation for the GroupBy operation is available in the official pandas documentation.
Additional Resources
The following tutorials explain how to perform other common operations in pandas:
Pandas: How to Calculate Cumulative Sum by Group
Pandas: How to Count Unique Values by Group
Pandas: How to Calculate Correlation By Group
In this article, we will discuss how to group a DataFrame on the basis of date and time in pandas. We will see how to group a time-series DataFrame by year, month, day, and so on. We will also see how to group on smaller time units such as minutes.
pandas GroupBy lets us specify a grouping instruction for an object through the Grouper class. The instruction selects a column via the key parameter or, if the level and/or axis parameters are given, a level of the target object's index.
Syntax: pandas.Grouper(key=None, level=None, freq=None, axis=0, sort=False)
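To make the role of key concrete, here is a minimal sketch that groups the same hypothetical sales data twice: once by naming the date column with key, and once by moving the dates into the index and omitting key, in which case Grouper targets the index. The 'MS' (month start) frequency is an assumption chosen for the illustration:

```python
import pandas as pd

df = pd.DataFrame({'date': pd.date_range(start='1/1/2020', freq='W', periods=10),
                   'sales': [6, 8, 9, 11, 13, 8, 8, 15, 22, 9]})

# key= names the datetime column to group on
by_column = df.groupby(pd.Grouper(key='date', freq='MS'))['sales'].sum()

# Without key, Grouper uses the index, which must then be datetime-like
by_index = df.set_index('date').groupby(pd.Grouper(freq='MS'))['sales'].sum()

print(by_column)
print(by_index)
```

Both calls should produce the same monthly totals, labelled by the first day of each month.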
Below are some examples that show how to group a DataFrame on the basis of date and time using the pandas Grouper class.
Example 1: Group by month
In the above example, the DataFrame is grouped by the date column. Because we provided freq='M', which means month, the data is grouped month by month, up to the last day of each month, and the sum of the price column is returned for each group. We did not provide values for every month, so the groupby operation still displays every month in the range and assigns the value 0 to the months that have no data.
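The behaviour described above can be sketched with a small, hypothetical DataFrame (the date and price column names and all values are assumptions for illustration). The sketch uses 'MS' (month start) instead of 'M' because recent pandas releases spell the month-end alias 'ME', while 'MS' is accepted by old and new versions alike; the only visible difference is that each group is labelled with the first day of the month rather than the last:

```python
import pandas as pd

# Hypothetical data: no rows fall in February or March 2020
df = pd.DataFrame({'date': pd.to_datetime(['2020-01-10', '2020-01-25',
                                           '2020-04-03']),
                   'price': [100, 50, 70]})

# Group by calendar month and sum the price column
monthly = df.groupby(pd.Grouper(key='date', freq='MS'))['price'].sum()
print(monthly)
```

The result contains four rows, January through April, with February and March showing a sum of 0 because no rows fall in those months.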
Example 2: Group by days
In the above example, the DataFrame is grouped by the date column. Because we provided freq='5D', which means five days, the data is grouped into consecutive 5-day intervals up to the last date given in the date column.
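A minimal sketch of 5-day grouping, assuming a hypothetical DataFrame with date and price columns (names and values are made up for illustration):

```python
import pandas as pd

# Hypothetical data covering twelve days in January 2020
df = pd.DataFrame({'date': pd.to_datetime(['2020-01-01', '2020-01-03',
                                           '2020-01-07', '2020-01-12']),
                   'price': [10, 20, 30, 40]})

# Bins are 5 days wide and start at midnight of the first date
five_day = df.groupby(pd.Grouper(key='date', freq='5D'))['price'].sum()
print(five_day)
```

With this data the bins begin on 2020-01-01, 2020-01-06 and 2020-01-11, so the first two rows fall into the first bin and each remaining row falls into a bin of its own.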
Example 3: Group by year
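A minimal sketch of grouping by year, assuming a hypothetical DataFrame with date and price columns. The 'YS' (year start) alias is an assumption chosen because it is accepted across pandas versions; it labels each group with January 1 of that year:

```python
import pandas as pd

# Hypothetical data spanning three calendar years
df = pd.DataFrame({'date': pd.to_datetime(['2019-03-15', '2019-11-02',
                                           '2020-06-30', '2021-01-20']),
                   'price': [5, 15, 25, 35]})

# Group by calendar year and sum the price column
yearly = df.groupby(pd.Grouper(key='date', freq='YS'))['price'].sum()
print(yearly)
```

The two 2019 rows are combined into a single group, while 2020 and 2021 each contribute one row.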
In the example above, the DataFrame is grouped by the date column. Because we passed freq='2Y', which means two years, the data is grouped into 2-year intervals.
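A rough sketch of a 2-year grouping is shown below, under stated assumptions: the dates and sales values are hypothetical, and the column names date and sales are borrowed from the earlier DataFrame. Recent pandas releases spell the year-end alias 'YE' rather than 'Y', so the sketch tries '2YE' first and falls back to '2Y' on older versions.

```python
import pandas as pd

# Hypothetical data: one row per year over four years (not the original example's data)
df = pd.DataFrame({
    "date": pd.to_datetime(["2018-06-01", "2019-06-01", "2020-06-01", "2021-06-01"]),
    "sales": [10, 20, 30, 40],
})

# Newer pandas versions name the year-end frequency "YE"; older ones only accept "Y"
try:
    two_year = pd.Grouper(key="date", freq="2YE")
except ValueError:
    two_year = pd.Grouper(key="date", freq="2Y")

# Sum sales within each 2-year bin of the date column
two_year_totals = df.groupby(two_year)["sales"].sum()

print(two_year_totals)
```

The bins are anchored to year ends, so the group labels are year-end dates rather than the dates that appear in the data.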
Example 4: Group by minutes