Object metadata is stored as a pickled hash: first the data is pickled, then split into strings of length <= 254, then stored in a series of extended attributes named "user.swift.metadata", "user.swift.metadata1", "user.swift.metadata2", and so forth.

The choice of length 254 is odd, undocumented, and dates back to the initial commit of Swift. From talking to people, I believe this was an attempt to fit the first xattr in the inode, thus avoiding a seek. However, it doesn't work. XFS _either_ stores all the xattrs together in the inode (local), _or_ it spills them all to blocks located outside the inode (extents or btree). Using short xattrs actually hurts us here; by splitting into more pieces, we end up with more names to store, thus reducing the metadata size that'll fit in the inode.

[Source: http://xfs.org/docs/xfsdocs-xml-dev/XFS_Filesystem_Structure//tmp/en-US/html/Extended_Attributes.html]

I did some benchmarking of read_metadata with various xattr sizes against an XFS filesystem on a spinning disk, no VMs involved.

Summary:

    name  | rank | runs | mean      | sd        | timesBaseline
    ------|------|------|-----------|-----------|--------------
    32768 |    1 | 2500 | 0.0001195 | 3.75e-05  | 1.0
    16384 |    2 | 2500 | 0.0001348 | 1.869e-05 | 1.12809122912
    8192  |    3 | 2500 | 0.0001604 | 2.708e-05 | 1.34210998858
    4096  |    4 | 2500 | 0.0002326 | 0.0004816 | 1.94623473988
    2048  |    5 | 2500 | 0.0003414 | 0.0001409 | 2.85674781189
    1024  |    6 | 2500 | 0.0005457 | 0.0001741 | 4.56648611635
    254   |    7 | 2500 | 0.001848  | 0.001663  | 15.4616067887

Here, "name" is the chunk size (in bytes) for the pickled metadata. A total metadata size of around 31.5 KiB was used, so the "32768" runs represent storing everything in one single xattr, while the "254" runs represent things as they are without this change.

Since bigger xattr chunks make things go faster, the new chunk size is 64 KiB. That's the biggest xattr that XFS allows. Reading of metadata from existing files is unaffected; the read_metadata() function already handles xattrs of any size.

On non-XFS filesystems, this is no worse than what came before: ext4 has a limit of one block (typically 4 KiB) for all xattrs (names and values) taken together [1], so this change slightly increases the amount of Swift metadata that can be stored on ext4. ZFS let me store an xattr with an 8 MiB value, so that's plenty. It'll probably go further, but I stopped there.

[1] https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout#Extended_Attributes

Change-Id: Ie22db08ac0050eda693de4c30d4bc0d620e7f7d4
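To make the scheme concrete, here is a minimal sketch of pickling a metadata dict and splitting it across numbered xattrs. It is not Swift's implementation: Swift's own read_metadata()/write_metadata() helpers work on open file descriptors via the xattr library, while this sketch uses the stdlib os.getxattr()/os.setxattr() (Linux only, on a filesystem with user xattrs enabled). The names XATTR_KEY_PREFIX, XATTR_CHUNK_SIZE, and _key are illustrative, not Swift's:

    # A minimal sketch of the chunked-xattr scheme described above, not
    # Swift's actual code. Assumes Linux and stdlib os.setxattr/os.getxattr.
    import os
    import pickle

    XATTR_KEY_PREFIX = 'user.swift.metadata'
    XATTR_CHUNK_SIZE = 65536  # 64 KiB, the largest xattr value XFS allows


    def _key(index):
        # First chunk is "user.swift.metadata", then "user.swift.metadata1", ...
        return XATTR_KEY_PREFIX + (str(index) if index else '')


    def write_metadata(path, metadata):
        """Pickle the metadata dict and spread it over numbered xattrs."""
        serialized = pickle.dumps(metadata)
        index = 0
        while serialized:
            os.setxattr(path, _key(index), serialized[:XATTR_CHUNK_SIZE])
            serialized = serialized[XATTR_CHUNK_SIZE:]
            index += 1


    def read_metadata(path):
        """Reassemble and unpickle the metadata, whatever chunk size wrote it."""
        serialized = b''
        index = 0
        while True:
            try:
                serialized += os.getxattr(path, _key(index))
            except OSError:
                break  # no such xattr: every chunk has been read
            index += 1
        return pickle.loads(serialized)

With a 64 KiB chunk, metadata of the size benchmarked above lands in a single xattr, which is the case the fastest row in the table measures; the reader loops over numbered keys either way, so it is indifferent to the chunk size that wrote the data.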
Swift
A distributed object storage system designed to scale from a single machine to thousands of servers. Swift is optimized for multi-tenancy and high concurrency. Swift is ideal for backups, web and mobile content, and any other unstructured data that can grow without bound.
Swift provides a simple, REST-based API fully documented at http://docs.openstack.org/.
Swift was originally developed as the basis for Rackspace's Cloud Files and was open-sourced in 2010 as part of the OpenStack project. It has since grown to include contributions from many companies and has spawned a thriving ecosystem of 3rd party tools. Swift's contributors are listed in the AUTHORS file.
Docs
To build the documentation, install sphinx (pip install sphinx), run
python setup.py build_sphinx, and then browse to doc/build/html/index.html.
These docs are auto-generated after every commit and available online at
http://docs.openstack.org/developer/swift/.
For Developers
The best place to get started is the "SAIO - Swift All In One". This document will walk you through setting up a development cluster of Swift in a VM. The SAIO environment is ideal for running small-scale tests against Swift and trying out new features and bug fixes.
You can run unit tests with .unittests and functional tests with
.functests.
If you would like to start contributing, check out these notes to help you get started.
Code Organization
- bin/: Executable scripts that are the processes run by the deployer
- doc/: Documentation
- etc/: Sample config files
- swift/: Core code
  - account/: account server
  - common/: code shared by different modules
    - middleware/: "standard", officially-supported middleware
    - ring/: code implementing Swift's ring
  - container/: container server
  - obj/: object server
  - proxy/: proxy server
- test/: Unit and functional tests
Data Flow
Swift is a WSGI application and uses eventlet's WSGI server. After the
processes are running, the entry point for new requests is the Application
class in swift/proxy/server.py. From there, a controller is chosen, and the
request is processed. The proxy may choose to forward the request to a
back-end server. For example, the entry point for requests to the object
server is the ObjectController class in swift/obj/server.py.
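As a rough illustration of that flow, the sketch below shows the eventlet WSGI pattern: a listening socket handed to eventlet.wsgi.server, an Application callable as the entry point, and a controller chosen per request. It is not Swift's actual code; the real Application and ObjectController parse the request path, consult the ring, and forward to back-end servers, all of which is elided here:

    # A minimal sketch of the request flow described above, assuming
    # eventlet is installed. The controller answers directly so that only
    # the dispatch pattern is visible.
    import eventlet
    from eventlet import wsgi


    class ObjectController(object):
        """Stand-in for a per-request controller."""

        def handle_request(self, env, start_response):
            start_response('200 OK', [('Content-Type', 'text/plain')])
            return [b'handled by ObjectController\n']


    class Application(object):
        """Entry point for new requests: choose a controller, then process."""

        def __call__(self, env, start_response):
            # A real proxy inspects the request path to pick an account,
            # container, or object controller; this sketch always picks one.
            controller = ObjectController()
            return controller.handle_request(env, start_response)


    if __name__ == '__main__':
        # eventlet's WSGI server drives the Application, handling each
        # connection in its own greenthread.
        wsgi.server(eventlet.listen(('127.0.0.1', 8080)), Application())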
For Deployers
Deployer docs are also available at http://docs.openstack.org/developer/swift/. A good starting point is the deployment guide at http://docs.openstack.org/developer/swift/deployment_guide.html.
You can run functional tests against a swift cluster with .functests. These
functional tests require /etc/swift/test.conf to run. A sample config file
can be found in this source tree in test/sample.conf.
For Client Apps
For client applications, official Python language bindings are provided at http://github.com/openstack/python-swiftclient.
Complete API documentation is available at http://docs.openstack.org/api/openstack-object-storage/1.0/content/.
For more information, come hang out in #openstack-swift on freenode.
Thanks,
The Swift Development Team