
Sklearn2pmml does not seem to support custom feature transformation functions?

Stack Overflow user
Asked 2022-10-27 08:15:12

My pipeline uses custom transformation functions, and it cannot be converted successfully with sklearn2pmml.

Here is the code for my custom functions:

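import pandas as pd

def calc_modify_days(X):
    # Reformat 'YYYYMMDD' strings as 'YYYY-MM-DD'; empty or late dates fall back to 2022-12-30
    X['modify_date_new']  = X['modify_date'].apply(lambda x:x[:4]+'-'+x[4:6]+'-'+x[6:8] if x!='' and x<'20221230' else '2022-12-30' )
    # Number of days between day_id and the (reformatted) modify date
    X['modify_days'] = (pd.to_datetime(X['day_id']) - pd.to_datetime(X['modify_date_new'])).dt.days
    # Clamp negative differences to -1
    X['modify_days'] = X['modify_days'].apply(lambda x:-1 if x<0 else x)
    
    return X['modify_days']

def transform_channel_ty_cd(X):
    # all_cate_dict is defined elsewhere; unknown categories map to 0
    return X.apply(lambda x: all_cate_dict['channel_type_cd_3'].get(x) if x in all_cate_dict['channel_type_cd_3'] else 0)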

Below is the pipeline code used for prediction.

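from sklearn.preprocessing import FunctionTransformer
from sklearn_pandas import DataFrameMapper
from sklearn2pmml.pipeline import PMMLPipeline

mapper_encode = [
    (['day_id','modify_date'],FunctionTransformer(calc_modify_days),{'alias':'modify_days'}),
    ('channel_type_cd_3',FunctionTransformer(transform_channel_ty_cd))]

mapper = DataFrameMapper(mapper_encode, input_df=True, df_out=True)

# clf_1 is the classifier, defined elsewhere
pipeline_test = PMMLPipeline(
    steps=[("mapper", mapper),
           ("classifier", clf_1)])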

However, when I try to convert the pipeline to a PMML file, I get an error.

Standard output is empty
Standard error:
Oct 27, 2022 3:43:25 PM org.jpmml.sklearn.Main run
INFO: Parsing PKL..
Oct 27, 2022 3:43:25 PM org.jpmml.sklearn.Main run
INFO: Parsed PKL in 61 ms.
Oct 27, 2022 3:43:25 PM org.jpmml.sklearn.Main run
INFO: Converting..
Oct 27, 2022 3:43:25 PM sklearn2pmml.pipeline.PMMLPipeline initTargetFields
WARNING: Attribute 'sklearn2pmml.pipeline.PMMLPipeline.target_fields' is not set. Assuming y as the name of the target field
Oct 27, 2022 3:43:25 PM org.jpmml.sklearn.Main run
SEVERE: Failed to convert
java.lang.IllegalArgumentException: Attribute 'sklearn.preprocessing._function_transformer.FunctionTransformer.func' has an unsupported value (Java class net.razorvine.pickle.objects.ClassDictConstructor)
    at org.jpmml.sklearn.CastFunction.apply(CastFunction.java:45)
    at org.jpmml.sklearn.PyClassDict.get(PyClassDict.java:82)
    at org.jpmml.sklearn.PyClassDict.getOptional(PyClassDict.java:92)
    at sklearn.preprocessing.FunctionTransformer.getFunc(FunctionTransformer.java:63)
    at sklearn.preprocessing.FunctionTransformer.encodeFeatures(FunctionTransformer.java:43)
    at sklearn.Transformer.updateAndEncodeFeatures(Transformer.java:118)
    at sklearn_pandas.DataFrameMapper.initializeFeatures(DataFrameMapper.java:73)
    at sklearn.Initializer.encodeFeatures(Initializer.java:44)
    at sklearn.Transformer.updateAndEncodeFeatures(Transformer.java:118)
    at sklearn.Composite.encodeFeatures(Composite.java:129)
    at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:208)
    at org.jpmml.sklearn.Main.run(Main.java:228)
    at org.jpmml.sklearn.Main.main(Main.java:148)
Caused by: java.lang.ClassCastException: Cannot cast net.razorvine.pickle.objects.ClassDictConstructor to numpy.core.UFunc
    at java.lang.Class.cast(Class.java:3369)
    at org.jpmml.sklearn.CastFunction.apply(CastFunction.java:43)
    ... 12 more

Exception in thread "main" java.lang.IllegalArgumentException: Attribute 'sklearn.preprocessing._function_transformer.FunctionTransformer.func' has an unsupported value (Java class net.razorvine.pickle.objects.ClassDictConstructor)
    at org.jpmml.sklearn.CastFunction.apply(CastFunction.java:45)
    at org.jpmml.sklearn.PyClassDict.get(PyClassDict.java:82)
    at org.jpmml.sklearn.PyClassDict.getOptional(PyClassDict.java:92)
    at sklearn.preprocessing.FunctionTransformer.getFunc(FunctionTransformer.java:63)
    at sklearn.preprocessing.FunctionTransformer.encodeFeatures(FunctionTransformer.java:43)
    at sklearn.Transformer.updateAndEncodeFeatures(Transformer.java:118)
    at sklearn_pandas.DataFrameMapper.initializeFeatures(DataFrameMapper.java:73)
    at sklearn.Initializer.encodeFeatures(Initializer.java:44)
    at sklearn.Transformer.updateAndEncodeFeatures(Transformer.java:118)
    at sklearn.Composite.encodeFeatures(Composite.java:129)
    at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:208)
    at org.jpmml.sklearn.Main.run(Main.java:228)
    at org.jpmml.sklearn.Main.main(Main.java:148)
Caused by: java.lang.ClassCastException: Cannot cast net.razorvine.pickle.objects.ClassDictConstructor to numpy.core.UFunc
    at java.lang.Class.cast(Class.java:3369)
    at org.jpmml.sklearn.CastFunction.apply(CastFunction.java:43)
    ... 12 more

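I looked into it, and the lambda functions used with FunctionTransformer seem to be the problem.

How can I fix this?

I tried saving the pipeline to a pkl.z file first and then converting that to a PMML file, but a similar error occurred.

I also tried removing the lambda functions, but it still does not work as long as a custom feature-processing function is involved.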


1 Answer

Stack Overflow user

Answered 2022-10-29 19:05:41

This question has already been answered in jpmml/sklearn2pmml#354.

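In short, the inability to pickle FunctionTransformer instances that contain lambda functions (or that reference local functions) is a limitation of Python. The SkLearn2PMML package is merely complaining that the pipeline object it received is incomplete.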
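The pickling limitation mentioned above can be reproduced with the standard library alone. The following is a minimal sketch, not the converter's own code path: the helper names `can_pickle` and `make_local_function` are made up for illustration. It shows that a function reachable by its qualified name is pickled by reference, while a lambda or a locally defined function is rejected.

```python
import math
import pickle


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


def make_local_function():
    # A function defined inside another function has no importable name
    def local_identity(x):
        return x
    return local_identity


# A function importable by name is pickled as a reference to that name
print(can_pickle(math.sqrt))              # True
# A lambda has no name that pickle could look up again
print(can_pickle(lambda x: x))            # False
# The same holds for a function local to another function
print(can_pickle(make_local_function()))  # False
```

Even when a named function does pickle, only a reference to its name is stored, not its body, so a consumer outside the Python process has no logic to translate.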

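In the current case, the user can implement the datetime arithmetic business logic using standard PMML constructs, which are implemented as transformer classes in the sklearn2pmml.preprocessing module. There is no need to use lambda functions at all.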

Original content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/74218832
