首页
学习
活动
专区
圈层
工具
发布
首页
学习
活动
专区
圈层
工具
MCP广场
社区首页 >问答首页 >通过Pandas从球员名单中创建一个奖学金花名册

通过Pandas从球员名单中创建一个奖学金花名册
EN

Stack Overflow用户
提问于 2021-07-23 05:21:30
回答 1查看 35关注 0票数 1

最近,我从the following website上找到了一份大学橄榄球的名单,我得到了下面的输出……

代码语言:javascript
运行
复制
    Number  Name    Position    Experience  Height  Weight  Hometown / High School  Previous School High School Hometown
0   0.0 Jonathan Sutherland S   R-Sr.   5-11    203 Ottawa, Ontario, Canada / Episcopal (Va.)   NaN Episcopal (Va.) Ottawa, Ontario, Canada
1   1.0 Jaquan Brisker  S   Sr.+    6-1 204 Pittsburgh, Pa. / Gateway   Lackawanna College  Gateway Pittsburgh, Pa.
2   2.0 Keaton Ellis    CB  Jr. 5-11    192 State College, Pa. / State College Area NaN State College Area  State College, Pa.
3   2.0 Ta'Quan Roberson    QB  R-So.   5-11    199 Orange, N.J. / DePaul Catholic  NaN DePaul Catholic Orange, N.J.
4   3.0 Johnny Dixon    CB  Jr. 6-0 188 Tampa, Fla. / Chamberlain   South Carolina  Chamberlain Tampa, Fla.
... ... ... ... ... ... ... ... ... ... ...
104 NaN Jaylen Reed S   Fr. 6-0 213 Detroit, Mich. / Martin Luther King High School NaN Martin Luther King High School  Detroit, Mich.
105 NaN Jordan van den Berg DT  So. 6-3 292 Johannesburg, South Africa / Providence Christ...   Iowa Western    Providence Christian Academy    Johannesburg, South Africa
106 NaN Harrison Wallace III    WR  Fr. 6-1 187 Montgomery, Ala. / Pike Road High School    NaN Pike Road High School   Montgomery, Ala.
107 NaN Zakee Wheatley  CB  Fr. 6-2 190 Crofton, Md. / Archbishop Spalding High School  NaN Archbishop Spalding High School Crofton, Md.
108 NaN Eric Wilson OL  Sr.+    6-4 320 Minnetrista, Minn. / Benilde-St. Margaret's Harvard Benilde-St. Margaret's  Minnetrista, Minn.
109 rows × 10 columns

对以下脚本执行...from命令。

代码语言:javascript
运行
复制
roster = pd.read_html('https://gopsusports.com/sports/football/roster/2021')

roster_final = roster[2]

roster_final.columns = ['Number', 'Name', 'Position', 'Experience', 'Height', 'Weight', 'Hometown / High School', 'Previous School']

roster_final['High School'] = roster_final['Hometown / High School'].str.split('/').str[-1]
roster_final['Hometown'] = roster_final['Hometown / High School'].str.split('/').str[0]

roster_final

我现在想做的是创建一个所谓的“奖学金网格”,其中花名册按职位和他们参与项目的时间进行了分解。奖学金网格的一个例子是from this webpage

到目前为止,我已经用以下代码进行了尝试:

代码语言:javascript
运行
复制
fifth_year = ['R-Sr.+', 'Sr.+', 'R-Sr.']
fourth_year = ['Sr.', 'R-Jr.']
third_year = ['Jr.', 'R-So.']
second_year = ['So.', 'R-Fr.']
first_year = ['Fr.']

roster_final.loc[(roster_final['Experience'] == 'R-Sr.+') | (roster_final['Experience'] == 'Sr.+') | (roster_final['Experience'] == 'R-Sr.'), 'Class'] = 'Fifth Year' 
roster_final.loc[(roster_final['Experience'] == 'Sr.') | (roster_final['Experience'] == 'R-Jr.'), 'Class'] = 'Fourth Year'
roster_final.loc[(roster_final['Experience'] == 'Jr.') | (roster_final['Experience'] == 'R-So.'), 'Class'] = 'Third Year'
roster_final.loc[(roster_final['Experience'] == 'So.') | (roster_final['Experience'] == 'R-Fr.'), 'Class'] = 'Second Year'
roster_final.loc[(roster_final['Experience'] == 'Fr.'), 'Class'] = 'First Year'

# roster_final['Class'] = roster_final['Experience'].apply(lambda x: 'fifth_year' if x.isin(fifth_year))


year5 = roster_final[['Number', 'Name', 'Position', 'Experience', 'Class']][roster_final['Class']=='Fifth Year']
year4 = roster_final[['Number', 'Name', 'Position', 'Experience', 'Class']][roster_final['Class']=='Fourth Year']
year3 = roster_final[['Number', 'Name', 'Position', 'Experience', 'Class']][roster_final['Class']=='Third Year']
year2 = roster_final[['Number', 'Name', 'Position', 'Experience', 'Class']][roster_final['Class']=='Second Year']
year1 = roster_final[['Number', 'Name', 'Position', 'Experience', 'Class']][roster_final['Class']=='First Year']

scholarship_roster = roster_final.groupby("Position")
scholarship_roster = scholarship_roster["Name"].apply(list)
scholarship_roster = scholarship_roster.reset_index()

year5[year5['Position'] == 'OL']

我可以使用groupbys和lists来分解第一个表,但我希望我的结果看起来与the aforementioned website.上的完全一样。如果有人可以帮助我获得这个输出,我将非常感激。

EN

回答 1

Stack Overflow用户

发布于 2021-07-23 07:55:48

下面的代码通常会回答这个问题吗?它应该会给你一个想法,或者至少是一些思考的食粮。

注意:

  • addYear(x)函数可能需要检查,因为我没有尝试获取每个字符串的正确年份,只是为了演示一下。

  • 刚刚发现:Year列的顺序是从第一到第五,但你提供的例子是左边的高年级学生(以及代表“大一新生”的单词)。但是,您可以轻松地颠倒代码第54行的顺序,并(如果需要)在查看时更改addYear函数中Year列中的值的措辞(请参阅上面的第一个项目符号)。

代码:

代码语言:javascript
运行
复制
import pandas as pd
import requests

#set how the names of the players will be joined in the final pivot.  Pipe seems to work best
join_string = r' | '

# I could not get https://gopsusports.com/sports/football/roster/2021 to load but I saw this...
url = 'https://www.roarlionsroar.com/penn-state-football/roster'

page = requests.get(url)
table = pd.read_html(page.text, header = 1)
df_roster = table[0]
del df_roster['No.']

# df_roster['Class/Eligibility'].value_counts()

def addYear(x):
    '''
    df_roster['Class/Eligibility'].value_counts() returns the following:
    FR/FR 35, SO/FR 19, SR/JR 16, JR/SO 14, 
    SO/SO 7, JR/JR 6, SR/SR 5, 5th/SR 2, GR/SR 1
    '''
    if x == 'FR/FR':
        return 'First'
    elif x == 'SO/FR':
        return 'Second'
    elif x == 'SR/JR':
        return 'Fourth'
    elif x == 'JR/SO':
        return 'Third'
    elif x == 'SO/SO':
        return 'Second'
    elif x == 'JR/JR':
        return 'Third'
    elif x == 'SR/SR':
        return 'Fifth'
    elif x == '5th/SR':
        return 'Fifth'
    elif x == 'GR/SR':
        return 'Fifth'
    else:
        return 'Unknown'

df_roster['Year'] = df_roster['Class/Eligibility'].apply(addYear)

df_roster = df_roster[['Position', 'Year', 'Name']]

df_pivot = df_roster.pivot_table(index="Position", columns="Year", values="Name", \
                                 margins = True, aggfunc=lambda x: join_string.join(x), fill_value='-')

df_pivot = df_pivot.reindex(['First', 'Second', 'Third', 'Fourth', 'Fifth'], axis=1)
df_pivot = df_pivot.reindex(['QB', 'RB', 'WR', 'TE', 'OL', 'DE', 'DT', 'LB', 'CB', 'S', 'K', 'P'], axis=0)

print(df_pivot)

输出示例:

票数 0
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/68491620

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档