Updating and combining two pandas DataFrames
I would like to update a pandas DataFrame by summation: where an ID exists in both DataFrames, the values should be added together, and where an ID exists in only one of them, its row should be included in the result as is. For example, let's say there are two DataFrames like this:
import pandas as pd
d1 = pd.DataFrame({'ID': ["A", "B", "C", "D"], "value": [2, 3, 4, 5]})
d2 = pd.DataFrame({'ID': ["B", "D", "E"], "value": [1, 3, 2]})
Then, the final output that I would like to produce is as follows:
ID value
0 A 2
1 B 4
2 C 4
3 D 8
4 E 2
Do you have any ideas on this? I have tried the update and concat functions, but neither of them on its own produces the result I want. Thanks in advance.
Use concat and aggregate with sum:
df = pd.concat([d1, d2]).groupby('ID', as_index=False).sum()
print (df)
ID value
0 A 2
1 B 4
2 C 4
3 D 8
4 E 2
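To see why this works, the one-liner can be split into its two steps. This is only a sketch using the question's data; the names stacked and result are mine, not from the original answer. concat stacks the rows of both frames, so IDs B and D appear twice, and groupby then collapses each ID to a single row by summing.

```python
import pandas as pd

d1 = pd.DataFrame({'ID': ["A", "B", "C", "D"], "value": [2, 3, 4, 5]})
d2 = pd.DataFrame({'ID': ["B", "D", "E"], "value": [1, 3, 2]})

# Step 1: stack the rows of both frames; IDs B and D now occur twice.
stacked = pd.concat([d1, d2], ignore_index=True)

# Step 2: collapse duplicate IDs into one row each by summing their values.
result = stacked.groupby('ID', as_index=False)['value'].sum()
print(result)
```

Because no missing values are introduced along the way, the value column stays integer here. The same pattern should also work if an ID is repeated inside a single DataFrame.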
Another idea, if the ID values are unique within each DataFrame: convert ID to the index and use DataFrame.add with fill_value=0:
df = d1.set_index('ID').add(d2.set_index('ID'), fill_value=0).reset_index()
print (df)
ID value
0 A 2.0
1 B 4.0
2 C 4.0
3 D 8.0
4 E 2.0
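The values come back as floats because aligning the two indexes introduces missing values before fill_value is applied. If integers are wanted, as in the question's expected output, one option is to cast back afterwards. This is a sketch that assumes every value is a whole number; the names summed and df_int are mine.

```python
import pandas as pd

d1 = pd.DataFrame({'ID': ["A", "B", "C", "D"], "value": [2, 3, 4, 5]})
d2 = pd.DataFrame({'ID': ["B", "D", "E"], "value": [1, 3, 2]})

# Align on ID and add; an ID missing from one frame is treated as 0.
summed = d1.set_index('ID').add(d2.set_index('ID'), fill_value=0)

# Cast the float result back to int (safe here: all values are whole numbers).
df_int = summed.astype(int).reset_index()
print(df_int)
```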