In a table, each row of a column holds several values in a single comma-separated string:
import pandas as pd
df = pd.DataFrame({'a': ['female, female, female, female, male, female', 'female, male, female, female', 'female, female, female', 'male, male, male']})
a
0 female, female, female, female, male, female
1 female, male, female, female
2 female, female, female
3 male, male, male
My attempt, using set():
f = lambda x: [set(y) for y in x.split('; ')]
df['b'] = df['a'].apply(f)
gives the following result:
a b
0 female, female, female, female, male, female [{m, , e, ,, a, f, l}]
1 female, male, female, female [{m, , e, ,, a, f, l}]
2 female, female, female [{m, , e, ,, a, f, l}]
3 male, male, male [{m, , e, ,, a, l}]
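The result above can be traced to two things: the separator '; ' never occurs in the data, so split() returns the whole string as a single element, and set() applied to a string collects its individual characters. A minimal sketch of that behaviour on one sample value (the variable names are only illustrative):

```python
s = 'female, male, female, female'

# '; ' is not present, so split() returns the whole string as one element.
parts = s.split('; ')
print(parts == [s])  # True

# set() on a string yields its distinct characters, not words.
chars = set(parts[0])
print(sorted(chars))  # [' ', ',', 'a', 'e', 'f', 'l', 'm']

# Splitting on the separator actually used, ', ', yields words instead.
words = set(s.split(', '))
print(sorted(words))  # ['female', 'male']
```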
The desired result is:
a b
0 female, female, female, female, male, female female, male
1 female, male, female, female female, male
2 female, female, female female
3 male, male, male male
Option 1:
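One possible approach, sketched with the pandas string accessor (an assumption about how this could be done, not a definitive implementation): split each string on ', ', deduplicate each resulting list, and join the pieces back together. sorted() is used only to make the order deterministic, since sets are unordered.

```python
import pandas as pd

df = pd.DataFrame({'a': ['female, female, female, female, male, female',
                         'female, male, female, female',
                         'female, female, female',
                         'male, male, male']})

# Split into lists, reduce each list to its sorted unique values, rejoin.
df['b'] = (df['a'].str.split(', ')
                  .apply(lambda parts: sorted(set(parts)))
                  .str.join(', '))
print(df['b'].tolist())
```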
Option 2:
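Another possibility, again only a sketch under the assumption that first-seen order should be kept: do everything inside a single apply(), using dict.fromkeys() to drop duplicates while preserving insertion order.

```python
import pandas as pd

df = pd.DataFrame({'a': ['female, female, female, female, male, female',
                         'female, male, female, female',
                         'female, female, female',
                         'male, male, male']})

# dict.fromkeys keeps one key per distinct word, in order of first appearance.
df['b'] = df['a'].apply(lambda s: ', '.join(dict.fromkeys(s.split(', '))))
print(df['b'].tolist())
```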
Option 3:
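A third variant, sketched as a plain list comprehension that bypasses apply() altogether (an illustrative assumption, not necessarily faster on every dataset):

```python
import pandas as pd

df = pd.DataFrame({'a': ['female, female, female, female, male, female',
                         'female, male, female, female',
                         'female, female, female',
                         'male, male, male']})

# Build the new column directly from the values of column 'a'.
df['b'] = [', '.join(sorted(set(s.split(', ')))) for s in df['a']]
print(df['b'].tolist())
```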
Result: