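Below is a dataset.
The goal is to select the song ids that have at least one composer and at least one publisher on at least one of their rows. For example, song id 4 has 2 rows with 2 different composers but no publisher, while song id 1 has no composer. We want to reject Excel sheets like this using Python (pandas). Any suggestions?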
import pandas as pd
import numpy as np
import smtplib
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
df_header = pd.read_csv('New York Yankees Twins at Yankees-FNG-042318.csv',header=None,skiprows=1)
cuesheetprepareremail = df_header.iloc[0,7]
print(cuesheetprepareremail)
df = pd.read_csv('New York Yankees Twins at Yankees-FNG-042318.csv',
                 names=['CUE', 'SONG TITLE', 'USAGE', 'RUNNING TIME',
                        'COMPOSER', 'COMPOSER PRO', 'COMPOSER % SHARE',
                        'PUBLISHER', 'PUBLISHER PRO', 'PUBLISHER % SHARE',
                        'TRACK ID', 'LIBRARY', 'ARTIST', 'START TIME'],
                 skiprows=7)
#select all rows with same cue number
columns = ['CUE','COMPOSER','PUBLISHER']
df1 = pd.DataFrame(df,columns=columns)
df1 = df1.replace('', np.nan)  # np.NaN was removed in NumPy 2.0
gp = df1.groupby('CUE').count()
fileToSend = 'New York Yankees Twins at Yankees-FNG-042318.csv'
emailfrom = ''
emailto = 'xyz@abc.com'
username= ''
password = ''
msg = MIMEMultipart()
msg['Subject'] = 'Enco error testing'
msg['From'] = emailfrom
msg['To'] = emailto
msg.preamble = 'Enco error testing'
# The filtered result is a DataFrame, so test .empty rather than its truth value
if not gp[(gp['COMPOSER'] == 0) | (gp['PUBLISHER'] == 0)].empty:
    # Send the email via our own SMTP server.
    server = smtplib.SMTP('localhost')
    server.starttls()
    server.login(username, password)
    server.sendmail(emailfrom, emailto, msg.as_string())
    server.quit()
Posted on 2018-05-25 03:04:45
Given your DataFrame
df
   Song_Id        SONG TITLE  *USAGE  RUNNING  COMPOSER(s)  COMPOSE  PUBLISHER(s)
0        1    Testing Moment     BGI                          ASCAP         audio
1        2  Rented Dreams-JP     BGI                Andrew  ABRAMUS          Nova
2        2                                            Paul      UBC
3        2                                           Molly      UBC
4        3     Gridiron Rock     BGI                 Brian    ASCAP        Client
5        3                                          Daniel    ASCAP
6        4          Rock Run     BGI               Sharron    ASCAP
7        4                                      Kyle Towns    ASCAP
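You should replace the empty strings with NaN (np.nan). You can then use groupby + count, which counts only the non-missing values in each column, and apply your logic to the grouped object.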
import numpy as np
df = df.replace('', np.nan)  # np.NaN was removed in NumPy 2.0
gp = df.groupby('Song_Id').count()
gp[(gp['COMPOSER(s)'] > 0) & (gp['PUBLISHER(s)'] > 0)]
# *USAGE COMPOSE COMPOSER(s) PUBLISHER(s) RUNNING SONG TITLE
#Song_Id
#2 1 3 3 1 0 1
#3 1 2 2 1 0 1
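To go from "which songs pass" to "should this sheet be rejected", the same groupby + count can be turned around to list the failing ids. Below is a minimal sketch, assuming a small hand-built frame that mirrors the table above (the cell values are illustrative). The helper name find_incomplete_songs and the sheet_is_valid flag are hypothetical, not part of pandas or the original code. It also treats whitespace-only cells as missing, which is an assumption about how the sheet might be filled in.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring the table above; values are illustrative.
df = pd.DataFrame({
    'Song_Id': [1, 2, 2, 2, 3, 3, 4, 4],
    'COMPOSER(s)': ['', 'Andrew', 'Paul', 'Molly',
                    'Brian', 'Daniel', 'Sharron', 'Kyle Towns'],
    'PUBLISHER(s)': ['audio', 'Nova', '', '', 'Client', '', '', ''],
})


def find_incomplete_songs(frame, id_col='Song_Id',
                          composer_col='COMPOSER(s)',
                          publisher_col='PUBLISHER(s)'):
    """Return the sorted ids that have no composer or no publisher on any row.

    Hypothetical helper for illustration only.
    """
    # Assumption: empty and whitespace-only strings both mean "missing".
    cleaned = frame.replace(r'^\s*$', np.nan, regex=True)
    # count() ignores NaN, so a count of 0 means the field is never filled in.
    counts = cleaned.groupby(id_col)[[composer_col, publisher_col]].count()
    bad = counts[(counts[composer_col] == 0) | (counts[publisher_col] == 0)]
    return sorted(bad.index.tolist())


incomplete = find_incomplete_songs(df)
sheet_is_valid = not incomplete  # reject the sheet if any song is incomplete

print(incomplete)      # [1, 4]
print(sheet_is_valid)  # False
```

Testing a plain list (or the .empty attribute of the filtered frame) avoids the "truth value of a DataFrame is ambiguous" error that comes from putting a DataFrame directly in an if statement.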
https://stackoverflow.com/questions/50516182