Updating and combining two pandas DataFrames

I would like to update a pandas DataFrame by summation: where an ID appears in both DataFrames, the values should be added together, and where an ID appears in only one of them, its corresponding row should be included in the result. For example, let's say there are two DataFrames like this:

import pandas as pd

d1 = pd.DataFrame({'ID': ["A", "B", "C", "D"], "value": [2, 3, 4, 5]})
d2 = pd.DataFrame({'ID': ["B", "D", "E"], "value": [1, 3, 2]})

Then, the final output that I would like to produce is as follows:

  ID  value
0  A      2
1  B      4
2  C      4
3  D      8
4  E      2

Do you have any ideas on this? I have tried the update and concat functions, but I could not get them to produce the result I want. Thanks in advance.


Use concat, then group by ID and aggregate with sum:

df = pd.concat([d1, d2]).groupby('ID', as_index=False).sum()
print(df)
  ID  value
0  A      2
1  B      4
2  C      4
3  D      8
4  E      2
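
Because concat only stacks the rows and groupby then sums every group of equal IDs, this approach should also cope with an ID that occurs more than once inside a single DataFrame. A minimal sketch, assuming a hypothetical frame d3 (not part of the question) in which "B" appears twice:

```python
import pandas as pd

d1 = pd.DataFrame({'ID': ["A", "B", "C", "D"], "value": [2, 3, 4, 5]})
# d3 is a made-up frame for illustration: "B" is duplicated on purpose
d3 = pd.DataFrame({'ID': ["B", "B", "E"], "value": [1, 10, 2]})

# Stack both frames, then sum all rows that share the same ID
stacked = pd.concat([d1, d3], ignore_index=True)
result = stacked.groupby('ID', as_index=False)['value'].sum()
print(result)
```

Here "B" collects 3 + 1 + 10, and IDs present in only one frame pass through unchanged.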

Another idea, if the IDs are unique within each DataFrame: convert ID to the index and use DataFrame.add with fill_value=0. Note that the summed values come back as floats:

df = d1.set_index('ID').add(d2.set_index('ID'), fill_value=0).reset_index()
print(df)
  ID  value
0  A    2.0
1  B    4.0
2  C    4.0
3  D    8.0
4  E    2.0
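
If the float output of DataFrame.add is unwanted, one option is to cast back to integers before resetting the index. This is only a sketch and assumes every value is a whole number with nothing missing after the fill, as in the example data:

```python
import pandas as pd

d1 = pd.DataFrame({'ID': ["A", "B", "C", "D"], "value": [2, 3, 4, 5]})
d2 = pd.DataFrame({'ID': ["B", "D", "E"], "value": [1, 3, 2]})

df = (
    d1.set_index('ID')
      .add(d2.set_index('ID'), fill_value=0)  # align on ID, treat a missing side as 0
      .astype(int)                            # assumption: all values are whole numbers
      .reset_index()
)
print(df)
```

The cast would raise an error if any NaN were left in the column, so it is only safe when the fill covers every gap.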