I am trying to work out how to count all the countries in a dataset.
import pandas as pd
file_url = "https://drive.google.com/file/d/1LTpeRRuLgts3MDOBzvuSI6idL0no24AW/view?usp=sharing"
file_path = 'https://drive.google.com/uc?export=download&id=' + file_url.split('/')[-2]
data = pd.read_csv(file_path,encoding = "ISO-8859-1")
data=data.drop(['IMDb Link', 'Awards Received', 'Awards Nominated For', 'Image', 'Poster','Production House','TMDb Trailer','Trailer Site'],axis=1)
I found this algorithm online.
Country = data['Country Availability'].values.tolist()
lst_no = ['.', ',', ':', '!', '"', "'", '[', ']', '-', '—', '(', ')', ' ']
lst = []
for i in range(0, len(Country)):
    for word in str(Country[i]).split():
        if not word in lst_no:
            _word = word
            if word[-1] in lst_no:
                _word = _word[:-1]
            if word[0] in lst_no:
                _word = _word[1:]
            lst.append(_word)
_dict = dict()
for word in lst:
    _dict[word] = _dict.get(word, 0) + 1
lst = []
for key, value in _dict.items():
    lst.append((value, key))
lst.sort(reverse=True)
dic = dict(lst)
print(dic)
But for some reason it does not work in this case. This is the output I get:
1: 'States,Hungary,Turkey,Canada,Argentina,Mexico,Malaysia,Brazil,Netherlands,Italy,Israel,Colombia',
2: 'States,Greece,Slovakia,Thailand,Turkey,Malaysia,Brazil,Italy,Iceland,Israel,India,Mexico,Colombia',
3: 'Kingdom,France,India,Russia,Greece,Slovakia,Singapore,Poland,Czech',
4: 'States,Canada,Australia,Mexico,Argentina,Sweden,France,United',
5: 'Republic,Lithuania,Israel,Iceland,Romania,South',
6: 'Belgium,Brazil,United',
7: 'States,Canada,Germany,Mexico,Argentina,Sweden,France,United',
8: 'States,Canada,Mexico,Argentina,Sweden,United',
9: 'States,Germany,Argentina,Mexico,Brazil,Spain,Portugal,India,Russia,Greece,South',
10: 'Kingdom,France,Australia,Belgium,Canada,Netherlands,Sweden,Switzerland,United',
11: 'Republic,Romania,Russia,Greece,Poland,South',
Please help me understand what is going wrong.
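In the "Country Availability" column the countries are separated by commas, but your code splits on whitespace with split(). Multi-word names such as "United States" are therefore broken apart, and the rest of each cell stays glued together as one long "word".
This is how you can get the list of countries and the number of times each one occurs, sorted with the most common first: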
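A minimal sketch of that idea, splitting each cell on commas instead of whitespace. The helper name count_countries and the sample rows are hypothetical and used only for illustration; the real column would come from data['Country Availability'].

```python
from collections import Counter


def count_countries(cells):
    """Count how often each country appears across comma-separated cells."""
    counts = Counter()
    for cell in cells:
        # Missing values (e.g. NaN read by pandas) are not strings; skip them.
        if not isinstance(cell, str):
            continue
        for name in cell.split(','):
            name = name.strip()
            if name:
                counts[name] += 1
    # most_common() returns (country, count) pairs, highest count first.
    return counts.most_common()


# Hypothetical sample rows standing in for the real column.
sample = [
    "United States,Canada,Mexico",
    "United Kingdom,France,Canada",
    "Canada,United States",
    float("nan"),
]
result = count_countries(sample)
print(result)
```

With the DataFrame from the question, the call would presumably be count_countries(data['Country Availability']), since iterating over a pandas Series yields its values. Keeping the result as a list of (country, count) pairs also avoids the problem in the original code, where dict(lst) used the count as the key, so countries with the same count overwrote each other.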