
Colander
Colander is useful as a system for validating and deserializing data obtained via XML, JSON, an HTML form post or any other equally simple data serialization. Colander can be used to:
- Define a data schema
- Deserialize a data structure composed of strings, mappings, and lists into an arbitrary Python structure after validating the data structure against a data schema.
- Serialize an arbitrary Python structure to a data structure composed of strings, mappings, and lists.
Out of the box, Colander can serialize and deserialize various types of objects, including:
- A mapping object (e.g. dictionary)
- A variable-length sequence of objects (each object is of the same type).
- A fixed-length tuple of objects (each object is of a different type).
- A string or Unicode object.
- An integer.
- A float.
- A boolean.
- An importable Python object (to a dotted Python object path).
- A Python datetime.datetime object.
- A Python datetime.date object.
Colander allows additional data structures to be serialized and deserialized by allowing a developer to define new "types". Its internal error messages are internationalizable.
Defining A Colander Schema
Imagine you want to deserialize and validate a serialization of data you've obtained by reading a YAML document. An example of such a data serialization might look something like this:
{'name':'keith',
 'age':'20',
 'friends':[('1', 'jim'),('2', 'bob'), ('3', 'joe'), ('4', 'fred')],
 'phones':[{'location':'home', 'number':'555-1212'},
           {'location':'work', 'number':'555-8989'},],
}
Let's further imagine you'd like to make sure, on demand, that a particular serialization of this type read from this YAML document or another YAML document is "valid".
Notice that all the innermost values in the serialization are strings, even though some of them (such as age and the position of each friend) are more naturally integer-like. Let's define a schema which will attempt to convert a serialization to a data structure that has different types.
import colander

class Friend(colander.TupleSchema):
    rank = colander.SchemaNode(colander.Int(),
                               validator=colander.Range(0, 9999))
    name = colander.SchemaNode(colander.String())

class Phone(colander.MappingSchema):
    location = colander.SchemaNode(colander.String(),
                                   validator=colander.OneOf(['home', 'work']))
    number = colander.SchemaNode(colander.String())

class Friends(colander.SequenceSchema):
    friend = Friend()

class Phones(colander.SequenceSchema):
    phone = Phone()

class Person(colander.MappingSchema):
    name = colander.SchemaNode(colander.String())
    age = colander.SchemaNode(colander.Int(),
                              validator=colander.Range(0, 200))
    friends = Friends()
    phones = Phones()
For ease of reading, we've actually defined five schemas above, but we coalesce them all into a single Person schema. As the result of our definitions, a Person represents:
- A name, which must be a string.
- An age, which must be deserializable to an integer; after deserialization happens, a validator ensures that the integer is between 0 and 200 inclusive.
- A sequence of friend structures. Each friend structure is a two-element tuple. The first element represents an integer rank; it must be between 0 and 9999 inclusive. The second element represents a string name.
- A sequence of phone structures. Each phone structure is a mapping. Each phone mapping has two keys: location and number. The location must be one of work or home. The number must be a string.
Schema Node Objects
A schema is composed of one or more schema node objects, each typically of the class colander.SchemaNode, usually in a nested arrangement. Each schema node object has a required type, an optional deserialization validator, an optional default, an optional title, an optional description, and a slightly less optional name.
The type of a schema node indicates its data type (such as colander.Int or colander.String).
The validator of a schema node is called after deserialization; it makes sure the deserialized value matches a constraint. An example of such a validator is provided in the schema above: validator=colander.Range(0, 200). A validator is not called after serialization, only after deserialization.
The default of a schema node indicates its default value if a value for the schema node is not found in the input data during serialization and deserialization. It should be the deserialized representation. If a schema node does not have a default, it is considered required.
The name of a schema node appears in error reports.
The title of a schema node is metadata about a schema node that can be used by higher-level systems. By default, it is a capitalization of the name.
The description of a schema node is metadata about a schema node that can be used by higher-level systems. By default, it is empty.
The name of a schema node that is introduced as a class-level attribute of a colander.MappingSchema, colander.TupleSchema or a colander.SequenceSchema is its class attribute name. For example:
import colander

class Phone(colander.MappingSchema):
    location = colander.SchemaNode(colander.String(),
                                   validator=colander.OneOf(['home', 'work']))
    number = colander.SchemaNode(colander.String())

The name of the schema node defined via location = colander.SchemaNode(..) within the schema above is location. The title of the same schema node is Location.
Schema Objects
In the examples above, if you've been paying attention, you'll have noticed that we're defining classes which subclass from colander.MappingSchema, colander.TupleSchema and colander.SequenceSchema.
It's turtles all the way down: the result of creating an instance of any of colander.MappingSchema, colander.TupleSchema or colander.SequenceSchema is also a colander.SchemaNode object.
Instantiating a colander.MappingSchema creates a schema node which has a type value of colander.Mapping.
Instantiating a colander.TupleSchema creates a schema node which has a type value of colander.Tuple.
Instantiating a colander.SequenceSchema creates a schema node which has a type value of colander.Sequence.
Deserializing A Data Structure Using a Schema
Earlier we defined a schema:
import colander

class Friend(colander.TupleSchema):
    rank = colander.SchemaNode(colander.Int(),
                               validator=colander.Range(0, 9999))
    name = colander.SchemaNode(colander.String())

class Phone(colander.MappingSchema):
    location = colander.SchemaNode(colander.String(),
                                   validator=colander.OneOf(['home', 'work']))
    number = colander.SchemaNode(colander.String())

class Friends(colander.SequenceSchema):
    friend = Friend()

class Phones(colander.SequenceSchema):
    phone = Phone()

class Person(colander.MappingSchema):
    name = colander.SchemaNode(colander.String())
    age = colander.SchemaNode(colander.Int(),
                              validator=colander.Range(0, 200))
    friends = Friends()
    phones = Phones()
Let's now use this schema to try to deserialize some concrete data structures.
Deserializing A Valid Serialization
data = {
    'name':'keith',
    'age':'20',
    'friends':[('1', 'jim'),('2', 'bob'), ('3', 'joe'), ('4', 'fred')],
    'phones':[{'location':'home', 'number':'555-1212'},
              {'location':'work', 'number':'555-8989'},],
    }
schema = Person()
deserialized = schema.deserialize(data)
When schema.deserialize(data) is called, because every value in data is valid and the structure represented by data conforms to the schema, deserialized will be the following:
{'name':'keith',
 'age':20,
 'friends':[(1, 'jim'),(2, 'bob'), (3, 'joe'), (4, 'fred')],
 'phones':[{'location':'home', 'number':'555-1212'},
           {'location':'work', 'number':'555-8989'},],
}
Note that all the friend rankings have been converted to integers, as has the age.
Deserializing An Invalid Serialization
Below, the data structure has some problems. The age is a negative number. The rank for bob is t, which is not a valid integer. The location of the first phone is bar, which is not a valid location (it is not one of "work" or "home"). What happens when a data structure cannot be deserialized due to a data type error or a validation error?
import colander

data = {
    'name':'keith',
    'age':'-1',
    'friends':[('1', 'jim'),('t', 'bob'), ('3', 'joe'), ('4', 'fred')],
    'phones':[{'location':'bar', 'number':'555-1212'},
              {'location':'work', 'number':'555-8989'},],
    }
schema = Person()
schema.deserialize(data)
The deserialize method will raise a colander.Invalid exception. If the exception is not caught, the resulting error message will look something like:
Invalid: {'age':'-1 is less than minimum value 0',
          'friends.1.0':'"t" is not a number',
          'phones.0.location':'"bar" is not one of "home", "work"'}
The above error is telling us that:
- The top-level age variable failed validation.
- Bob's rank (the zeroth element of the Friend tuple named bob) is not a valid number.
- The zeroth phone number has a bad location: it should be one of "home" or "work".
We can optionally catch the exception raised and obtain the raw error dictionary:
import colander

data = {
    'name':'keith',
    'age':'-1',
    'friends':[('1', 'jim'),('t', 'bob'), ('3', 'joe'), ('4', 'fred')],
    'phones':[{'location':'bar', 'number':'555-1212'},
              {'location':'work', 'number':'555-8989'},],
    }
schema = Person()
try:
    schema.deserialize(data)
except colander.Invalid as e:
    errors = e.asdict()
    print(errors)
This will print something like:
{'age':'-1 is less than minimum value 0',
 'friends.1.0':'"t" is not a number',
 'phones.0.location':'"bar" is not one of "home", "work"'}
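The keys of the error dictionary are dotted paths, so they can be post-processed with ordinary string handling. The following is a small sketch, assuming a dictionary shaped like the one printed above; the group_by_field helper is hypothetical and not part of Colander.

```python
from collections import defaultdict


def group_by_field(errors):
    """Group messages from an asdict()-style dictionary by top-level key.

    Hypothetical helper: each value becomes a list of
    (remaining dotted path, message) pairs.
    """
    grouped = defaultdict(list)
    for path, msg in sorted(errors.items()):
        top, _, rest = path.partition('.')
        grouped[top].append((rest, msg))
    return dict(grouped)


# A dictionary shaped like the output of e.asdict() shown in the text.
errors = {
    'age': '-1 is less than minimum value 0',
    'friends.1.0': '"t" is not a number',
    'phones.0.location': '"bar" is not one of "home", "work"',
}
grouped = group_by_field(errors)
for field in sorted(grouped):
    print(field, grouped[field])
```

Grouping by the first path segment is only one possible choice; it happens to suit a display that lists problems per top-level field.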
colander.Invalid Exceptions
The exceptions raised by Colander during deserialization are instances of the colander.Invalid exception class. We saw previously that instances of this exception class have a colander.Invalid.asdict method which returns a dictionary of error messages. This dictionary is composed by Colander by walking the exception tree. The exception tree is composed entirely of colander.Invalid exceptions.
While the colander.Invalid.asdict method is useful for simple error reporting, a more complex application, such as a form library that uses Colander as an underlying schema system, may need to do error reporting in a different way. In particular, such a system may need to present the errors next to a field in a form. It may need to translate error messages to another language. To do these things effectively, it will almost certainly need to walk and introspect the exception graph manually.
The colander.Invalid exceptions raised by Colander validation are very rich. They contain detailed information about the circumstances of an error. If you write a system based on Colander that needs to display and format Colander exceptions specially, you will need to get comfy with the Invalid exception API.
When a validation-related error occurs during deserialization, each node in the schema that had an error (and any of its parents) will be represented by a corresponding colander.Invalid exception. To support this behavior, each colander.Invalid exception has a children attribute which is a list. Each element in this list (if any) will also be a colander.Invalid exception, recursively, representing the error circumstances for a particular schema deserialization.
Each exception in the graph has a msg attribute, which will either be the value None, a str or unicode object, or a translation string instance representing a freeform error value set by a particular type during an unsuccessful deserialization. Exceptions that exist purely for structure will have a msg attribute with the value None.
Each exception instance will also have an attribute named node, representing the schema node to which the exception is related.
Note
Translation strings are objects which behave like Unicode objects but have extra metadata associated with them for use in translation systems. See http://docs.repoze.org/translationstring/ for documentation about translation strings. All error messages used by Colander internally are translation strings, which means they can be translated to other languages. In particular, they are suitable for use as gettext message ids.
See the colander.Invalid API documentation for more information.
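As a rough illustration of walking the exception tree by hand, the sketch below collects a (path, message) pair for every exception whose msg is not None. It relies only on the msg, children and node attributes described above. The collect_errors and fake_invalid helpers are hypothetical, and the tree is built from stand-in objects rather than real colander.Invalid instances so that the example runs on its own. The dotted paths are built from node names and are not meant to reproduce the keys returned by asdict.

```python
from types import SimpleNamespace


def collect_errors(exc, prefix=()):
    """Return a list of (dotted node path, msg) pairs for an exception tree.

    Hypothetical helper: it assumes only that each exception has ``msg``,
    ``children`` and ``node`` attributes, and that each node has a ``name``.
    """
    path = prefix + (exc.node.name,) if exc.node.name else prefix
    found = []
    if exc.msg is not None:
        found.append(('.'.join(path), exc.msg))
    for child in exc.children:
        found.extend(collect_errors(child, path))
    return found


def fake_invalid(name, msg=None, children=()):
    """Build a stand-in for a colander.Invalid exception (illustration only)."""
    return SimpleNamespace(node=SimpleNamespace(name=name), msg=msg,
                           children=list(children))


# A structural root (msg is None) with one failing child and one nested failure.
tree = fake_invalid('', children=[
    fake_invalid('age', '-1 is less than minimum value 0'),
    fake_invalid('phones', children=[
        fake_invalid('location', '"bar" is not one of "home", "work"'),
    ]),
])

errors = collect_errors(tree)
for path, msg in errors:
    print(path, '->', msg)
```

A form library would typically do something richer at each step, such as looking up a widget for exc.node or translating exc.msg, but the recursion over children has the same shape.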
Serialization
Serializing a data structure is obviously the inverse operation from deserializing a data structure. The serialize method of a schema performs serialization of application data (aka an appstruct). If you pass the serialize method data that can be understood by the schema types in the schema you're calling it against, you will be returned a data structure of serialized values.
For example, given the following schema:
import colander

class Person(colander.MappingSchema):
    name = colander.SchemaNode(colander.String())
    age = colander.SchemaNode(colander.Int(),
                              validator=colander.Range(0, 200))
We can serialize a matching data structure:
data = {'age':20, 'name':'Bob'}
schema = Person()
serialized = schema.serialize(data)
The value for serialized above will be {'age':'20', 'name':'Bob'} (note the integer has become a string).
Serialization and deserialization are not completely symmetric, however. Although schema-driven data conversion happens during serialization, and defaults are injected as necessary, the default colander types are defined in such a way that the validation of values and structural validation does not happen as it does during deserialization. For example, the required argument of a schema is typically ignored, and none of the validators associated with the schema or any of its nodes is invoked.
This usually means you may "partially" serialize a data structure where some of the values are missing. For example, we can serialize partial data using the serialize method of the schema:
data = {'age':20}
schema = Person()
serialized = schema.serialize(data)
The value for serialized above will be {'age':'20'} (note the integer has become a string). Above, even though we did not include the name attribute in the data we fed to serialize, an error is not raised.
The corollary: it is the responsibility of the developer to ensure that "the right" data is serialized; colander will not raise an error when asked to serialize something that is partially nonsense.
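Since serialization does not check for missing values, an application that cares can run its own check before calling serialize. The following is a minimal sketch, assuming the appstruct is a plain dictionary; the missing_keys helper and the list of expected names are hypothetical and not part of Colander.

```python
def missing_keys(appstruct, expected):
    """Return the names in ``expected`` that are absent from ``appstruct``."""
    return [name for name in expected if name not in appstruct]


data = {'age': 20}
missing = missing_keys(data, ['name', 'age'])
if missing:
    # Decide here whether to fill in values, log, or refuse to serialize.
    print('not serializing, missing: %s' % ', '.join(missing))
```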
Defining A Schema Imperatively
The schema we defined above was defined declaratively via a set of class statements. It's often useful to create schemas more dynamically. For this reason, Colander offers an "imperative" mode of schema configuration. Here's our previous declarative schema:
import colander

class Friend(colander.TupleSchema):
    rank = colander.SchemaNode(colander.Int(),
                               validator=colander.Range(0, 9999))
    name = colander.SchemaNode(colander.String())

class Phone(colander.MappingSchema):
    location = colander.SchemaNode(colander.String(),
                                   validator=colander.OneOf(['home', 'work']))
    number = colander.SchemaNode(colander.String())

class Friends(colander.SequenceSchema):
    friend = Friend()

class Phones(colander.SequenceSchema):
    phone = Phone()

class Person(colander.MappingSchema):
    name = colander.SchemaNode(colander.String())
    age = colander.SchemaNode(colander.Int(),
                              validator=colander.Range(0, 200))
    friends = Friends()
    phones = Phones()
We can imperatively construct a completely equivalent schema like so:
import colander

friend = colander.SchemaNode(colander.Tuple())
friend.add(colander.SchemaNode(colander.Int(),
                               validator=colander.Range(0, 9999),
                               name='rank'))
friend.add(colander.SchemaNode(colander.String(), name='name'))

phone = colander.SchemaNode(colander.Mapping())
phone.add(colander.SchemaNode(colander.String(),
                              validator=colander.OneOf(['home', 'work']),
                              name='location'))
phone.add(colander.SchemaNode(colander.String(), name='number'))

schema = colander.SchemaNode(colander.Mapping())
schema.add(colander.SchemaNode(colander.String(), name='name'))
schema.add(colander.SchemaNode(colander.Int(), name='age',
                               validator=colander.Range(0, 200)))
schema.add(colander.SchemaNode(colander.Sequence(), friend, name='friends'))
schema.add(colander.SchemaNode(colander.Sequence(), phone, name='phones'))
Defining a schema imperatively is a lot uglier than defining a schema declaratively, but it's often more useful when you need to define a schema dynamically. Perhaps in the body of a function or method you may need to omit a particular schema field based on a business condition; when you define a schema imperatively, you have more opportunity to control the schema composition.
Serializing and deserializing using a schema created imperatively is done exactly the same way as you would serialize or deserialize using a schema created declaratively:
data = {
    'name':'keith',
    'age':'20',
    'friends':[('1', 'jim'),('2', 'bob'), ('3', 'joe'), ('4', 'fred')],
    'phones':[{'location':'home', 'number':'555-1212'},
              {'location':'work', 'number':'555-8989'},],
    }
deserialized = schema.deserialize(data)
Defining a New Type
A new type is a class with two methods: serialize and deserialize. The serialize method converts a Python data structure to a serialization. The deserialize method converts a value to a Python data structure.
Here's a type which implements boolean serialization and deserialization. It serializes a boolean to the string true or false; it deserializes a string (presumably true or false, but allows some wiggle room for t, on, yes, y, and 1) to a boolean value.
import colander

class Boolean(object):
    def deserialize(self, node, value):
        # basestring exists only on Python 2; use str on Python 3
        if not isinstance(value, basestring):
            raise colander.Invalid(node, '%r is not a string' % value)
        value = value.lower()
        if value in ('true', 'yes', 'y', 'on', 't', '1'):
            return True
        return False

    def serialize(self, node, value):
        if not isinstance(value, bool):
            raise colander.Invalid(node, '%r is not a boolean' % value)
        return 'true' if value else 'false'

    # partial (de)serialization makes no sense for a leaf type, so alias
    pdeserialize = deserialize
    pserialize = serialize
Here's how you would use the resulting class as part of a schema:
import colander

class Schema(colander.MappingSchema):
    interested = colander.SchemaNode(Boolean())

The above schema has a member named interested which will now be serialized and deserialized as a boolean, according to the logic defined in the Boolean type class.
Note that the only real constraint of a type class is that its serialize method must be able to make sense of a value generated by its deserialize method and vice versa.
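The round-trip constraint can be checked directly by calling the two methods by hand. The sketch below uses a trimmed copy of the Boolean type for illustration: it substitutes a stand-in Invalid exception for colander.Invalid and uses str in place of basestring so that it runs on Python 3 without Colander installed. Passing None as the node is likewise just a shortcut for the demonstration.

```python
class Invalid(Exception):
    """Stand-in for colander.Invalid (illustration only)."""
    def __init__(self, node, msg=None):
        Exception.__init__(self, node, msg)
        self.node = node
        self.msg = msg


class Boolean(object):
    def deserialize(self, node, value):
        if not isinstance(value, str):
            raise Invalid(node, '%r is not a string' % (value,))
        return value.lower() in ('true', 'yes', 'y', 'on', 't', '1')

    def serialize(self, node, value):
        if not isinstance(value, bool):
            raise Invalid(node, '%r is not a boolean' % (value,))
        return 'true' if value else 'false'


typ = Boolean()
for original in (True, False):
    cstruct = typ.serialize(None, original)
    # deserialize must make sense of whatever serialize produced
    assert typ.deserialize(None, cstruct) is original

print(typ.deserialize(None, 'Yes'))   # one of the tolerated spellings
try:
    typ.serialize(None, 'true')       # a string is not a bool
except Invalid as e:
    print(e.msg)
```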
The serialize and deserialize methods of a type accept two values: node, and value. node will be the schema node associated with this type. It is used when the type must raise a colander.Invalid error, which expects a schema node as its first constructor argument. value will be the value that needs to be serialized or deserialized.
The pdeserialize and pserialize methods are required on all types. These are called to "partially" deserialize or serialize a data structure. For most "leaf-level" types, partial serialization and deserialization does not make any sense, so these methods are aliased to deserialize and serialize respectively. However, for types representing mappings or sequences, they may end up being different.
For a more formal definition of the interface of a type, see colander.interfaces.Type.
Defining a New Validator
A validator is a callable which accepts two positional arguments: node and value. It returns None if the value is valid. It raises a colander.Invalid exception if the value is not valid. Here's a validator that checks if the value is a valid credit card number.
import colander

def luhnok(node, value):
    """ checks to make sure that the value passes a luhn mod-10 checksum """
    total = 0
    num_digits = len(value)
    oddeven = num_digits & 1

    for count in range(0, num_digits):
        digit = int(value[count])

        # double every second digit, counting from the right
        if not (( count & 1 ) ^ oddeven ):
            digit = digit * 2
        if digit > 9:
            digit = digit - 9

        total = total + digit

    if not (total % 10) == 0:
        raise colander.Invalid(node,
                               '%r is not a valid credit card number' % value)
Here's how the resulting luhnok validator might be used in a schema:
import colander

class Schema(colander.MappingSchema):
    cc_number = colander.SchemaNode(colander.String(), validator=luhnok)
Note that the validator doesn't need to check if the value is a string: this has already been done as the result of the type of the cc_number schema node being colander.String. Validators are always passed the deserialized value when they are invoked.
The node value passed to the validator is a schema node object; it must in turn be passed to the colander.Invalid exception constructor if one needs to be raised.
For a more formal definition of the interface of a validator, see colander.interfaces.Validator.
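Because a validator is just a callable taking node and value, a validator that needs configuration can be written as a class with a __call__ method. The following is a minimal sketch under that assumption; the MaxLength class is hypothetical, and a stand-in Invalid exception is used in place of colander.Invalid so the example runs on its own.

```python
class Invalid(Exception):
    """Stand-in for colander.Invalid (illustration only)."""
    def __init__(self, node, msg=None):
        Exception.__init__(self, node, msg)
        self.node = node
        self.msg = msg


class MaxLength(object):
    """Hypothetical validator: reject values longer than ``limit``."""
    def __init__(self, limit):
        self.limit = limit

    def __call__(self, node, value):
        if len(value) > self.limit:
            raise Invalid(node, '%r is longer than %d characters'
                          % (value, self.limit))
        # falling off the end returns None, which means "valid"


check = MaxLength(4)
result = check(None, 'abcd')          # valid, so result is None
try:
    check(None, 'abcde')
except Invalid as e:
    message = e.msg
    print(message)
```

An instance such as MaxLength(4) would then be passed as the validator argument of a schema node, in the same way luhnok is passed in the schema shown earlier.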
Interface and API Documentation
interfaces.rst api.rst