marshmallow Quickstart

Author: 枇杷李子橙橘柚
Published: 2019-05-26 11:07:39
Column: 没有擅长的YcY

Quickstart

Declaring Schemas

First, create a basic user "model" (for demonstration only; it is not a real model):

import datetime as dt

class User(object):
    def __init__(self, name, email):
        self.name = name
        self.email = email
        self.created_at = dt.datetime.now()

    def __repr__(self):
        return '<User(name={self.name!r})>'.format(self=self)

Then create a schema by defining a class that maps attribute names to Field objects:

from marshmallow import Schema, fields

class UserSchema(Schema):
    name = fields.Str()
    email = fields.Email()
    created_at = fields.DateTime()

Serializing Objects ("Dumping")

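Pass an object to the schema's dump method. The returned result carries the serialized dictionary in its data attribute (and a dictionary of errors in its errors attribute, covered below):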

from marshmallow import pprint

user = User(name="Monty", email="monty@python.org")
schema = UserSchema()
result = schema.dump(user)
pprint(result.data)
# {"name": "Monty",
#  "email": "monty@python.org",
#  "created_at": "2014-08-17T14:54:16.049594+00:00"}

You can also use the dumps method to serialize an object to a JSON string:

json_result = schema.dumps(user)
pprint(json_result.data)
# '{"name": "Monty", "email": "monty@python.org", "created_at": "2014-08-17T14:54:16.049594+00:00"}'
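
To make the two steps concrete, here is a plain-Python sketch of what dumping amounts to: read each declared attribute from the object, convert it to a JSON-friendly value, and optionally encode the result with json.dumps. It does not use marshmallow and is not how the library is implemented; SketchUser, FIELD_CONVERTERS and sketch_dump are hypothetical names chosen for illustration.

```python
import datetime as dt
import json


class SketchUser:
    """Hypothetical stand-in for the User class above, with a fixed timestamp."""

    def __init__(self, name, email):
        self.name = name
        self.email = email
        self.created_at = dt.datetime(2014, 8, 17, 14, 54, 16)


# Hypothetical "schema": attribute name -> function producing a primitive value.
FIELD_CONVERTERS = {
    "name": str,
    "email": str,
    "created_at": lambda value: value.isoformat(),
}


def sketch_dump(obj):
    """Build a plain dict by reading and converting each declared attribute."""
    return {
        name: convert(getattr(obj, name))
        for name, convert in FIELD_CONVERTERS.items()
    }


user = SketchUser("Monty", "monty@python.org")
as_dict = sketch_dump(user)    # plays the role of schema.dump(user).data
as_json = json.dumps(as_dict)  # plays the role of schema.dumps(user).data
print(as_dict)
print(as_json)
```

The only difference between the two results is the final encoding step: the dictionary is still a Python object, while the JSON string is ready to be sent over the wire.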

Filtering output

Use the only parameter to specify which fields to include in the serialized output:

summary_schema = UserSchema(only=('name', 'email'))
summary_schema.dump(user).data
# {"name": "Monty", "email": "monty@python.org"}

Use the exclude parameter to specify fields that should be left out of the serialized output.
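
With marshmallow itself, exclude is passed to the schema constructor in the same way as only, for example UserSchema(exclude=('email',)). The plain-Python sketch below only illustrates how the two options relate: only is an allow-list and exclude is a deny-list applied to the same set of fields. filter_fields is a hypothetical helper, not a marshmallow function.

```python
def filter_fields(serialized, only=None, exclude=()):
    """Keep the keys named in `only` (all keys if None), then drop those in `exclude`."""
    if only is None:
        kept = dict(serialized)
    else:
        kept = {key: value for key, value in serialized.items() if key in only}
    return {key: value for key, value in kept.items() if key not in exclude}


full = {
    "name": "Monty",
    "email": "monty@python.org",
    "created_at": "2014-08-17T14:54:16",
}

summary = filter_fields(full, only=("name", "email"))
without_email = filter_fields(full, exclude=("email",))
print(summary)
print(without_email)
```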

Deserializing Objects ("Loading")

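The counterpart of dump is load, which deserializes a dictionary into Python data structures.

By default, load returns a dictionary that maps field names to deserialized values: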

from pprint import pprint

user_data = {
    'created_at': '2014-08-11T05:26:03.869245',
    'email': u'ken@yahoo.com',
    'name': u'Ken'
}
schema = UserSchema()
result = schema.load(user_data)
pprint(result.data)
# {'name': 'Ken',
#  'email': 'ken@yahoo.com',
#  'created_at': datetime.datetime(2014, 8, 11, 5, 26, 3, 869245)}
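
As a rough picture of what deserialization does with each value, the following plain-Python sketch parses the ISO 8601 string back into a datetime and leaves the strings as they are. It is an illustration that assumes Python 3.7+ (for datetime.fromisoformat); FIELD_PARSERS and sketch_load are hypothetical names and no marshmallow code is involved.

```python
import datetime as dt

# Hypothetical "schema": input key -> function that parses the raw value.
FIELD_PARSERS = {
    "name": str,
    "email": str,
    "created_at": dt.datetime.fromisoformat,
}


def sketch_load(raw):
    """Parse every declared key that is present in the raw input."""
    return {
        key: parse(raw[key])
        for key, parse in FIELD_PARSERS.items()
        if key in raw
    }


loaded = sketch_load({
    "created_at": "2014-08-11T05:26:03.869245",
    "email": "ken@yahoo.com",
    "name": "Ken",
})
print(loaded)
```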

Deserializing to Objects

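To deserialize to an object, define a method in the Schema subclass and decorate it with post_load. The method receives the dictionary of deserialized data and returns the original Python object: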

from marshmallow import Schema, fields, post_load

class UserSchema(Schema):
    name = fields.Str()
    email = fields.Email()
    created_at = fields.DateTime()

    @post_load
    def make_user(self, data):
        return User(**data)

Calling load now returns a User object:

user_data = {
    'name': 'Ronnie',
    'email': 'ronnie@stones.com'
}
schema = UserSchema()
result = schema.load(user_data)
result.data  # => <User(name='Ronnie')>

Handling Collections of Objects

Iterable collections of objects can also be serialized and deserialized. Just set many=True:

user1 = User(name="Mick", email="mick@stones.com")
user2 = User(name="Keith", email="keith@stones.com")
users = [user1, user2]
schema = UserSchema(many=True)
result = schema.dump(users)  # OR UserSchema().dump(users, many=True)
result.data
# [{'name': u'Mick',
#   'email': u'mick@stones.com',
#   'created_at': '2014-08-17T14:58:57.600623+00:00'},
#  {'name': u'Keith',
#   'email': u'keith@stones.com',
#   'created_at': '2014-08-17T14:58:57.600623+00:00'}]

Validation

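The second element of the value returned by Schema.load() and Schema.loads() is a dictionary of validation errors. Some fields, such as Email and URL, have built-in validators: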

data, errors = UserSchema().load({'email': 'foo'})
errors  # => {'email': ['"foo" is not a valid email address.']}
# OR, equivalently
result = UserSchema().load({'email': 'foo'})
result.errors  # => {'email': ['"foo" is not a valid email address.']}

When validating a collection, the errors dictionary is keyed by the index of each invalid item:

class BandMemberSchema(Schema):
    name = fields.String(required=True)
    email = fields.Email()

user_data = [
    {'email': 'mick@stones.com', 'name': 'Mick'},
    {'email': 'invalid', 'name': 'Invalid'},  # invalid email
    {'email': 'keith@stones.com', 'name': 'Keith'},
    {'email': 'charlie@stones.com'},  # missing "name"
]

result = BandMemberSchema(many=True).load(user_data)
result.errors
# {1: {'email': ['"invalid" is not a valid email address.']},
#  3: {'name': ['Missing data for required field.']}}
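
The shape of that errors dictionary is easy to reproduce by hand. The sketch below is plain Python with a deliberately crude email check; it is meant only to show how per-item errors end up keyed by list position, and validate_member and validate_many are hypothetical helpers rather than marshmallow APIs.

```python
def validate_member(member):
    """Return a dict of field errors for one item (empty when the item is valid)."""
    errors = {}
    if "name" not in member:
        errors["name"] = ["Missing data for required field."]
    # Crude check for illustration only: a present email must contain "@".
    if "email" in member and "@" not in member["email"]:
        errors["email"] = ['"{}" is not a valid email address.'.format(member["email"])]
    return errors


def validate_many(members):
    """Key each non-empty error dict by the position of the offending item."""
    collected = {}
    for index, member in enumerate(members):
        item_errors = validate_member(member)
        if item_errors:
            collected[index] = item_errors
    return collected


band = [
    {"email": "mick@stones.com", "name": "Mick"},
    {"email": "invalid", "name": "Invalid"},       # invalid email
    {"email": "keith@stones.com", "name": "Keith"},
    {"email": "charlie@stones.com"},               # missing "name"
]
errors_by_index = validate_many(band)
print(errors_by_index)
```

Valid items leave no entry at all, so a caller can map each key straight back to the position in the submitted list.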

Pass a callable to a field's validate parameter to perform additional validation:

class ValidatedUserSchema(UserSchema):
    # NOTE: This is a contrived example.
    # You could use marshmallow.validate.Range instead of an anonymous function here
    age = fields.Number(validate=lambda n: 18 <= n <= 40)

in_data = {'name': 'Mick', 'email': 'mick@stones.com', 'age': 71}
result = ValidatedUserSchema().load(in_data)
result.errors  # => {'age': ['Validator <lambda>(71.0) is False']}

A validation function can either return a boolean or raise a ValidationError. If it raises, the exception's message is stored in the errors dictionary:

from marshmallow import Schema, fields, ValidationError

def validate_quantity(n):
    if n < 0:
        raise ValidationError('Quantity must be greater than 0.')
    if n > 30:
        raise ValidationError('Quantity must not be greater than 30.')

class ItemSchema(Schema):
    quantity = fields.Integer(validate=validate_quantity)

in_data = {'quantity': 31}
result, errors = ItemSchema().load(in_data)
errors  # => {'quantity': ['Quantity must not be greater than 30.']}
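
The two validator styles can be compared side by side in a small plain-Python sketch. It assumes a simplified rule set (a raised error contributes its own message, a False return contributes a generic one, anything else passes); SketchValidationError and run_validator are hypothetical stand-ins and do not come from marshmallow.

```python
class SketchValidationError(Exception):
    """Hypothetical stand-in for marshmallow.ValidationError."""


def run_validator(validator, value):
    """Return the list of error messages produced by one validator call."""
    try:
        outcome = validator(value)
    except SketchValidationError as err:
        return [str(err)]          # a raised error contributes its own message
    if outcome is False:
        return ["Invalid value."]  # a False return contributes a generic message
    return []                      # True or None both count as success here


def at_most_30(n):
    """Raising style: the validator chooses the message."""
    if n > 30:
        raise SketchValidationError("Quantity must not be greater than 30.")


raised = run_validator(at_most_30, 31)
returned_false = run_validator(lambda n: 18 <= n <= 40, 71)
passed = run_validator(at_most_30, 5)
print(raised, returned_false, passed)
```

The raising style is usually the more useful one in practice because the caller sees a message that explains which rule was broken.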

Field Validators as Methods

Use the validates decorator to register a method as the validator for a field:

from marshmallow import fields, Schema, validates, ValidationError

class ItemSchema(Schema):
    quantity = fields.Integer()

    @validates('quantity')
    def validate_quantity(self, value):
        if value < 0:
            raise ValidationError('Quantity must be greater than 0.')
        if value > 30:
            raise ValidationError('Quantity must not be greater than 30.')
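
One way to picture how a decorator can "register" a method for a field is the tagging pattern sketched below in plain Python. This is an assumption-laden illustration of the general technique, not marshmallow's implementation; sketch_validates, SketchItemSchema and the _validates_field attribute are all hypothetical.

```python
def sketch_validates(field_name):
    """Hypothetical decorator: tag a method with the field it validates."""
    def decorator(method):
        method._validates_field = field_name
        return method
    return decorator


class SketchItemSchema:
    @sketch_validates("quantity")
    def validate_quantity(self, value):
        if value < 0:
            raise ValueError("Quantity must be greater than 0.")
        if value > 30:
            raise ValueError("Quantity must not be greater than 30.")

    def validate(self, data):
        """Run every tagged method whose field is present in the input."""
        errors = {}
        for attr in dir(self):
            method = getattr(self, attr)
            field = getattr(method, "_validates_field", None)
            if field is not None and field in data:
                try:
                    method(data[field])
                except ValueError as err:
                    errors.setdefault(field, []).append(str(err))
        return errors


too_many = SketchItemSchema().validate({"quantity": 31})
fine = SketchItemSchema().validate({"quantity": 5})
print(too_many, fine)
```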

strict Mode

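Set strict=True in the schema constructor or in class Meta to raise an exception when invalid data is encountered. The dictionary of validation errors is available through the ValidationError.messages attribute: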

from marshmallow import ValidationError

try:
    UserSchema(strict=True).load({'email': 'foo'})
except ValidationError as err:
    print(err.messages)  # => {'email': ['"foo" is not a valid email address.']}
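
The difference between the default mode and strict mode comes down to how the same errors dictionary is delivered: returned next to the data, or attached to a raised exception. The plain-Python sketch below illustrates that idea with a crude email check; SketchValidationError and sketch_load are hypothetical and stand in for the real classes.

```python
class SketchValidationError(Exception):
    """Hypothetical exception that carries the errors dict as `messages`."""

    def __init__(self, messages):
        super().__init__(messages)
        self.messages = messages


def sketch_load(data, strict=False):
    """Return (data, errors), or raise when strict is set and errors exist."""
    errors = {}
    # Crude check for illustration only: a present email must contain "@".
    if "email" in data and "@" not in data["email"]:
        errors["email"] = ['"{}" is not a valid email address.'.format(data["email"])]
    if errors and strict:
        raise SketchValidationError(errors)
    return data, errors


# Default mode: the caller has to remember to inspect the errors.
_, lenient_errors = sketch_load({"email": "foo"})

# Strict mode: invalid input cannot be ignored by accident.
strict_errors = None
try:
    sketch_load({"email": "foo"}, strict=True)
except SketchValidationError as err:
    strict_errors = err.messages

print(lenient_errors)
print(strict_errors)
```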

Required Fields

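Set required=True to make a field required. If the field's value is missing when Schema.load() is called, validation fails and an error message is stored.

Pass a dict as the error_messages parameter to customize the error message for a required field: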

class UserSchema(Schema):
    name = fields.String(required=True)
    age = fields.Integer(
        required=True,
        error_messages={'required': 'Age is required.'}
    )
    city = fields.String(
        required=True,
        error_messages={'required': {'message': 'City required', 'code': 400}}
    )
    email = fields.Email()

data, errors = UserSchema().load({'email': 'foo@bar.com'})
errors
# {'name': ['Missing data for required field.'],
#  'age': ['Age is required.'],
#  'city': {'message': 'City required', 'code': 400}}

Partial Loading

Pass a tuple of field names as the partial parameter to skip the required check for just those fields:

class UserSchema(Schema):
    name = fields.String(required=True)
    age = fields.Integer(required=True)

data, errors = UserSchema().load({'age': 42}, partial=('name',))
# OR UserSchema(partial=('name',)).load({'age': 42})
data, errors  # => ({'age': 42}, {})

Or set partial=True to skip the required check for all fields:

class UserSchema(Schema):
    name = fields.String(required=True)
    age = fields.Integer(required=True)

data, errors = UserSchema().load({'age': 42}, partial=True)
# OR UserSchema(partial=True).load({'age': 42})
data, errors  # => ({'age': 42}, {})
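
The two forms of partial differ only in scope, which the following plain-Python sketch tries to make explicit: a tuple exempts the named fields, while True exempts every field. REQUIRED_FIELDS and missing_required are hypothetical names for illustration, and the function models only the required check, nothing else.

```python
REQUIRED_FIELDS = ("name", "age")  # hypothetical schema with two required fields


def missing_required(data, partial=False):
    """Report required fields that are absent, honoring the partial setting."""
    if partial is True:
        return {}              # partial=True: skip the check for every field
    skipped = partial or ()    # tuple of names: skip the check for those only
    return {
        field: ["Missing data for required field."]
        for field in REQUIRED_FIELDS
        if field not in data and field not in skipped
    }


no_partial = missing_required({"age": 42})
skip_name = missing_required({"age": 42}, partial=("name",))
skip_all = missing_required({}, partial=True)
print(no_partial, skip_name, skip_all)
```

This kind of check is typically what makes a partial update (for example an HTTP PATCH request) possible with the same schema that a full create uses.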

Schema.validate

Use Schema.validate() to validate input data without deserializing it:

errors = UserSchema().validate({'name': 'Ronnie', 'email': 'invalid-email'})
errors  # {'email': ['"invalid-email" is not a valid email address.']}

Specifying Attribute Names

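By default, a schema serializes the object attribute that has the same name as the field. When the attribute name and the field name differ, use the attribute parameter to specify which attribute the field reads from: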

class UserSchema(Schema):
    name = fields.String()
    email_addr = fields.String(attribute="email")
    date_created = fields.DateTime(attribute="created_at")

user = User('Keith', email='keith@stones.com')
ser = UserSchema()
result, errors = ser.dump(user)
pprint(result)
# {'name': 'Keith',
#  'email_addr': 'keith@stones.com',
#  'date_created': '2014-08-17T14:58:57.600623+00:00'}

Specifying Deserialization Keys

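By default, a schema deserializes the dictionary key that has the same name as the field. Use the load_from parameter to specify an additional key to read the value from: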

class UserSchema(Schema):
    name = fields.String()
    email = fields.Email(load_from='emailAddress')

data = {
    'name': 'Mike',
    'emailAddress': 'foo@bar.com'
}
s = UserSchema()
result, errors = s.load(data)
# {'name': u'Mike',
#  'email': 'foo@bar.com'}

Specifying Serialization Keys

To serialize a field under a key other than the field name, use the dump_to parameter (the counterpart of load_from):

class UserSchema(Schema):
    name = fields.String(dump_to='TheName')
    email = fields.Email(load_from='CamelCasedEmail', dump_to='CamelCasedEmail')

data = {
    'name': 'Mike',
    'email': 'foo@bar.com'
}
s = UserSchema()
result, errors = s.dump(data)
# {'TheName': u'Mike',
#  'CamelCasedEmail': 'foo@bar.com'}
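
Taken together, load_from and dump_to let the external key names differ from the internal field names in both directions. The plain-Python sketch below models that as a small renaming table; it assumes the loader tries the field name first and then the alternative key, and FIELD_KEYS, sketch_load and sketch_dump are hypothetical names rather than marshmallow APIs.

```python
# Hypothetical field table: field name -> (extra key accepted on load, key written on dump).
FIELD_KEYS = {
    "name": ("name", "TheName"),
    "email": ("CamelCasedEmail", "CamelCasedEmail"),
}


def sketch_load(raw):
    """Read each field from its own name, falling back to the alternative key."""
    loaded = {}
    for field, (load_key, _dump_key) in FIELD_KEYS.items():
        if field in raw:
            loaded[field] = raw[field]
        elif load_key in raw:
            loaded[field] = raw[load_key]
    return loaded


def sketch_dump(data):
    """Write each field under its output key."""
    return {
        dump_key: data[field]
        for field, (_load_key, dump_key) in FIELD_KEYS.items()
        if field in data
    }


internal = sketch_load({"name": "Mike", "CamelCasedEmail": "foo@bar.com"})
external = sketch_dump(internal)
print(internal)
print(external)
```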

Refactoring: Implicit Field Creation

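When a schema has many attributes, specifying a field type for each one is repetitive, especially when most of the attributes are native Python data types.

class Meta lets you specify which attributes to serialize, and marshmallow chooses a suitable field type based on each attribute's type: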

# Refactored UserSchema
class UserSchema(Schema):
    uppername = fields.Function(lambda obj: obj.name.upper())

    class Meta:
        fields = ("name", "email", "created_at", "uppername")


user = User(name="erika", email="marshmallow@126.com")
schema = UserSchema()
result = schema.dump(user)
print(result.data)

# {'created_at': '2019-05-20T15:45:27.760000+00:00', 'uppername': 'ERIKA', 'name': 'erika', 'email': 'marshmallow@126.com'}

Use the additional option to specify which fields to include in addition to the explicitly declared ones. The following code is equivalent to the code above:

class UserSchema(Schema):
    uppername = fields.Function(lambda obj: obj.name.upper())
    class Meta:
        # No need to include 'uppername'
        additional = ("name", "email", "created_at")
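
The idea of choosing a field type from the attribute's Python type can be sketched as a lookup table. The plain-Python example below is only an illustration of that idea under simplifying assumptions (exact type match, unknown types passed through unchanged); TYPE_CONVERTERS, infer_and_dump and Account are hypothetical and do not reflect marshmallow's internal mapping.

```python
import datetime as dt

# Hypothetical lookup from Python type to a converter, standing in for field classes.
TYPE_CONVERTERS = {
    str: str,
    int: int,
    float: float,
    bool: bool,
    dt.datetime: lambda value: value.isoformat(),
}


def infer_and_dump(obj, attribute_names):
    """Serialize the named attributes, picking a converter from each value's type."""
    result = {}
    for name in attribute_names:
        value = getattr(obj, name)
        convert = TYPE_CONVERTERS.get(type(value), lambda v: v)  # unknown types pass through
        result[name] = convert(value)
    return result


class Account:
    """Hypothetical object whose attributes are all native Python types."""

    def __init__(self):
        self.name = "erika"
        self.logins = 3
        self.created_at = dt.datetime(2019, 5, 20, 15, 45, 27)


dumped = infer_and_dump(Account(), ("name", "logins", "created_at"))
print(dumped)
```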

Ordering Output

Set ordered=True to preserve field order in the serialized output. The serialized dictionary is then a collections.OrderedDict:

from collections import OrderedDict

class UserSchema(Schema):
    uppername = fields.Function(lambda obj: obj.name.upper())
    class Meta:
        fields = ("name", "email", "created_at", "uppername")
        ordered = True

u = User('Charlie', 'charlie@stones.com')
schema = UserSchema()
result = schema.dump(u)
assert isinstance(result.data, OrderedDict)
# marshmallow's pprint function maintains order
pprint(result.data, indent=2)
# {
#   "name": "Charlie",
#   "email": "charlie@stones.com",
#   "created_at": "2014-10-30T08:27:48.515735+00:00",
#   "uppername": "CHARLIE"
# }

"Read-only" and "Write-only" Fields

In the context of a web API, the dump_only and load_only parameters are conceptually similar to "read-only" and "write-only" fields, respectively:

class UserSchema(Schema):
    name = fields.Str()
    # password is "write-only"
    password = fields.Str(load_only=True)
    # created_at is "read-only"
    created_at = fields.DateTime(dump_only=True)
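
The effect of the two flags is easiest to see by following one record in both directions. The plain-Python sketch below models the behavior described above under the assumption that fields flagged for the other direction are silently skipped; FIELD_MODES, dumpable and loadable are hypothetical names and no marshmallow code is involved.

```python
# Hypothetical field table: field name -> direction in which the field takes part.
FIELD_MODES = {
    "name": "both",
    "password": "load_only",    # "write-only": accepted as input, never output
    "created_at": "dump_only",  # "read-only": output, never accepted as input
}


def dumpable(data):
    """Keep the fields that may appear in serialized output."""
    return {k: v for k, v in data.items() if FIELD_MODES.get(k) in ("both", "dump_only")}


def loadable(raw):
    """Keep the fields that may be supplied as input."""
    return {k: v for k, v in raw.items() if FIELD_MODES.get(k) in ("both", "load_only")}


record = {"name": "Mick", "password": "secret", "created_at": "2014-08-17T14:58:57"}
response_body = dumpable(record)  # the password never leaves the server
accepted_input = loadable(record)  # a client cannot set created_at
print(response_body)
print(accepted_input)
```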