Looping with Pandas in Python

Stack Overflow user
Asked 2018-05-25 02:45:26
Answers: 1 · Views: 55 · Followers: 0 · Votes: -1

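Here is a dataset.

The goal is to select the song IDs that have at least one composer and at least one publisher on at least one row. For example, song ID 4 has 2 rows with 2 different composers but no publisher, while song ID 1 has no composer. We want to use Python (pandas) to reject Excel sheets like this. Any suggestions?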

import pandas as pd
import numpy as np
import smtplib
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart

df_header = pd.read_csv('New York Yankees Twins at Yankees-FNG-042318.csv',header=None,skiprows=1)
cuesheetprepareremail = df_header.iloc[0,7]
print(cuesheetprepareremail)


df = pd.read_csv('New York Yankees Twins at Yankees-FNG-042318.csv',
                 names=['CUE','SONG TITLE','USAGE','RUNNING TIME','COMPOSER','COMPOSER PRO','COMPOSER % SHARE','PUBLISHER','PUBLISHER PRO','PUBLISHER % SHARE' ,'TRACK ID','LIBRARY','ARTIST','START TIME'
],skiprows=7)

#select all rows with same cue number
columns = ['CUE','COMPOSER','PUBLISHER']
df1 = pd.DataFrame(df,columns=columns)

df1 = df1.replace('', np.nan)
gp = df1.groupby('CUE').count()
fileToSend = 'New York Yankees Twins at Yankees-FNG-042318.csv'
emailfrom = ''
emailto = 'xyz@abc.com'
username= ''
password = ''

msg = MIMEMultipart()
msg['Subject'] = 'Enco error testing'

msg['From'] = emailfrom
msg['To'] = emailto
msg.preamble = 'Enco error testing'

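# A DataFrame has no single truth value, so test explicitly whether any CUE lacks a composer or a publisher
if not gp[(gp['COMPOSER'] == 0) | (gp['PUBLISHER'] == 0)].empty: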

    # Send the email via our own SMTP server.
    server = smtplib.SMTP('localhost')
    server.starttls()
    server.login(username,password)
    server.sendmail(emailfrom, emailto, msg.as_string())
    server.quit()

1 Answer

Stack Overflow user

Answered 2018-05-25 03:04:45

Given your DataFrame df:

   Song_Id        SONG TITLE *USAGE RUNNING COMPOSER(s)  COMPOSE PUBLISHER(s)
0        1    Testing Moment    BGI                        ASCAP        audio
1        2  Rented Dreams-JP    BGI              Andrew  ABRAMUS         Nova
2        2                                         Paul      UBC             
3        2                                        Molly      UBC             
4        3     Gridiron Rock    BGI               Brian    ASCAP       Client
5        3                                       Daniel    ASCAP             
6        4          Rock Run    BGI             Sharron    ASCAP             
7        4                                   Kyle Towns    ASCAP           

You should replace the empty strings with np.nan; then you can use groupby + count and apply your logic to the grouped object.

import numpy as np

df = df.replace('', np.nan)
gp = df.groupby('Song_Id').count()

gp[(gp['COMPOSER(s)'] > 0) & (gp['PUBLISHER(s)'] > 0)]
#         *USAGE  COMPOSE  COMPOSER(s)  PUBLISHER(s)  RUNNING  SONG TITLE
#Song_Id                                                                 
#2             1        3            3             1        0           1
#3             1        2            2             1        0           1
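
To reject a sheet rather than select the valid songs, the same counts can be inverted. The following is a minimal, self-contained sketch and not the asker's actual pipeline: the sample frame is a hand-typed copy of the relevant columns from the table above, and the names `bad`, `rejected_ids`, and `sheet_is_valid` are hypothetical. It also treats whitespace-only cells as missing, which is an assumption about how the real file may look.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring the table above (relevant columns only).
df = pd.DataFrame({
    "Song_Id": [1, 2, 2, 2, 3, 3, 4, 4],
    "COMPOSER(s)": ["", "Andrew", "Paul", "Molly",
                    "Brian", "Daniel", "Sharron", "Kyle Towns"],
    "PUBLISHER(s)": ["audio", "Nova", "", "", "Client", "", "", ""],
})

# Treat empty or whitespace-only cells as missing so count() skips them.
df = df.replace(r"^\s*$", np.nan, regex=True)

# Non-missing composer and publisher entries per song.
gp = df.groupby("Song_Id")[["COMPOSER(s)", "PUBLISHER(s)"]].count()

# Songs with no composer or no publisher on any of their rows.
bad = gp[(gp["COMPOSER(s)"] == 0) | (gp["PUBLISHER(s)"] == 0)]
rejected_ids = bad.index.tolist()

# A DataFrame cannot be used directly in `if`; test emptiness instead.
sheet_is_valid = bad.empty

print(rejected_ids)
print(sheet_is_valid)
```

On this sample, songs 1 and 4 are flagged, which matches the question's description (song 1 has no composer, song 4 has no publisher), so the sheet would be rejected. The `.empty` check is what an `if` statement should test, since evaluating a DataFrame as a boolean raises a `ValueError`.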
Votes: 0

Original page content provided by Stack Overflow; translation support provided by Tencent Cloud's Xiaowei IT-domain engine.
Original link: https://stackoverflow.com/questions/50516182
