A simple Python interface for implementing erasure codes

PyECLib

This library provides a simple Python interface for implementing erasure codes and is known to work with Python 2.7 and 3.5 through 3.12. For the best possible performance, the library uses liberasurecode, a C-based erasure code library.

PyECLib supports a variety of erasure coding backends, including the standard Reed-Solomon implementations provided by Jerasure [1], liberasurecode [3], Intel's ISA-L [4] and Phazr.IO's libphazr. It also supports a flat XOR-based encoder and decoder (part of liberasurecode), a class of HD Combination Codes based on "Flat XOR-based erasure codes in storage systems: Constructions, efficient recovery, and tradeoffs" (IEEE MSST 2010) [2]. These codes are well suited to archival use cases, have a simple construction, and require a minimum number of participating disks during single-disk reconstruction (think XOR-based LRC code).
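To make the XOR idea concrete, here is a minimal sketch of single-parity XOR recovery in plain Python. It is not the flat_xor_hd construction from [2] (which uses several parity fragments, each covering a subset of the data fragments) and it does not use PyECLib; the helper names `xor_bytes` and `xor_parity` are hypothetical and exist only for illustration.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def xor_parity(fragments):
    """XOR all fragments together into one fragment of the same length."""
    return reduce(xor_bytes, fragments)


# Three equal-length data fragments and their single parity fragment.
data = [b"abcd", b"efgh", b"ijkl"]
parity = xor_parity(data)

# Pretend fragment 1 is lost: XOR-ing the survivors with the parity
# cancels out everything except the missing fragment.
rebuilt = xor_parity([data[0], data[2], parity])
assert rebuilt == data[1]
```

Because XOR is its own inverse, the same function serves for both encoding and recovery; the flat XOR codes build on this by choosing which data fragments each parity covers.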


Installation

Install pre-requisites:

  • Python 2.7 or 3.5+ (including development packages), argparse, setuptools
  • liberasurecode v1.4.0 or greater [3]
  • Erasure code backend libraries: gf-complete and Jerasure [1],[2], ISA-L [4], etc.

Install dependencies:

Debian/Ubuntu hosts:

$ sudo apt-get install build-essential python-dev python-pip liberasurecode-dev
$ sudo pip install -U bindep -r test-requirements.txt

RHEL/CentOS hosts:

$ sudo yum install -y redhat-lsb python2-pip python-devel liberasurecode-devel
$ sudo pip install -U bindep -r test-requirements.txt
$ tools/test-setup.sh

To confirm that all dependency packages installed successfully, run:

$ sudo bindep -f bindep.txt

On CentOS, install the latest OpenStack Cloud SIG repo first so that the latest available version of liberasurecode-devel can be installed.

Install PyECLib:

$ sudo python setup.py install

Run the included test suite:

$ ./.unittests

If the test suite fails because it cannot find any of the shared libraries, then you probably need to add /usr/local/lib to the path searched when loading libraries. The best way to do this (on Linux) is to add '/usr/local/lib' to:

/etc/ld.so.conf

and then make sure to run:

$ sudo ldconfig
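As a quick sanity check after running ldconfig, the following sketch asks Python's standard `ctypes.util.find_library` whether the loader can locate the relevant shared libraries. The short library names in `CANDIDATES` are assumptions about how the libraries are named on a typical Linux install, and `locate_libraries` is a hypothetical helper, not part of PyECLib.

```python
import ctypes.util

# Assumed short names (without the "lib" prefix and ".so" suffix);
# adjust them to match the backends installed on your system.
CANDIDATES = ["erasurecode", "Jerasure", "gf_complete", "isal"]


def locate_libraries(names=CANDIDATES):
    """Map each library name to what the loader finds, or None if not found."""
    return {name: ctypes.util.find_library(name) for name in names}


for name, found in locate_libraries().items():
    print(f"{name}: {found or 'not found'}")
```

A "not found" entry for erasurecode suggests the library directory is still missing from the loader's search path; optional backends may legitimately be absent.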

Getting started

Examples of using PyECLib are provided in the "tools" directory:

Command-line encoder:

tools/pyeclib_encode.py

Command-line decoder:

tools/pyeclib_decode.py

Utility to determine what is needed to reconstruct missing fragments:

tools/pyeclib_fragments_needed.py

A configuration utility to help compare available EC schemes in terms of performance and redundancy:

tools/pyeclib_conf_tool.py

PyECLib initialization:

from pyeclib.ec_iface import ECDriver

ec_driver = ECDriver(k=<num_encoded_data_fragments>,
                     m=<num_encoded_parity_fragments>,
                     ec_type=<ec_scheme>)
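When choosing k and m, it can help to see the raw storage cost they imply. The sketch below computes the ratio of stored fragments to data fragments; `storage_overhead` is a hypothetical helper, and the figure ignores per-fragment metadata and padding that a real backend adds.

```python
def storage_overhead(k: int, m: int) -> float:
    """Ratio of total fragments (k data + m parity) to data fragments."""
    if k <= 0 or m < 0:
        raise ValueError("k must be positive and m must be non-negative")
    return (k + m) / k


# Compare a few example (k, m) choices.
for k, m in [(4, 2), (10, 4), (12, 3)]:
    print(f"k={k:2d} m={m}: {storage_overhead(k, m):.2f}x raw storage")
```

For the Reed-Solomon backends, any k of the k + m fragments are enough to recover the data, so m is also the number of fragment losses tolerated. The flat XOR codes trade some of that tolerance for cheaper reconstruction, so the same rule does not apply to them.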

Supported ec_type values:

  • liberasurecode_rs_vand => Vandermonde Reed-Solomon encoding, software-only backend implemented by liberasurecode [3]
  • jerasure_rs_vand => Vandermonde Reed-Solomon encoding, based on Jerasure [1]
  • jerasure_rs_cauchy => Cauchy Reed-Solomon encoding (Jerasure variant), based on Jerasure [1]
  • flat_xor_hd_3, flat_xor_hd_4 => Flat-XOR based HD combination codes, liberasurecode [3]
  • isa_l_rs_vand => Intel Storage Acceleration Library (ISA-L) - SIMD accelerated Erasure Coding backends [4]
  • isa_l_rs_cauchy => Cauchy Reed-Solomon encoding (ISA-L variant) [4]
  • shss => NTT Lab Japan's Erasure Coding Library [5]
  • libphazr => Phazr.IO's erasure code library with built-in privacy [6]
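A minimal round-trip sketch using one of the ec_type values above is shown here. It assumes the `encode` and `decode` methods of `ECDriver`, with `encode` returning a list of k + m fragments; the `roundtrip` helper and the chosen k, m, and payload are illustrative only. The demo is skipped when pyeclib is not importable.

```python
def roundtrip(driver, data, lost_indexes):
    """Encode data, drop the fragments at lost_indexes, decode the rest."""
    fragments = driver.encode(data)
    survivors = [f for i, f in enumerate(fragments) if i not in lost_indexes]
    return driver.decode(survivors)


if __name__ == "__main__":
    try:
        from pyeclib.ec_iface import ECDriver
    except ImportError:
        print("pyeclib is not installed; skipping the demo")
    else:
        # liberasurecode_rs_vand needs no extra backend library.
        driver = ECDriver(k=4, m=2, ec_type="liberasurecode_rs_vand")
        payload = b"hello erasure coding " * 100
        # Lose one data fragment and one parity fragment.
        assert roundtrip(driver, payload, {0, 5}) == payload
        print("recovered the payload after losing 2 of 6 fragments")
```

Dropping more than m fragments should make decoding fail, which is a useful negative test when evaluating a new backend.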

Code Maintenance

This library is currently maintained mainly by the OpenStack Swift community. For questions or other help, ask in #openstack-swift on OFTC.


References

[1] Jerasure, C library that supports erasure coding in storage applications, http://jerasure.org

[2] Greenan, Kevin M et al, "Flat XOR-based erasure codes in storage systems", http://www.kaymgee.com/Kevin_Greenan/Publications_files/greenan-msst10.pdf

[3] liberasurecode, C API abstraction layer for erasure coding backends, https://opendev.org/openstack/liberasurecode

[4] Intel(R) Storage Acceleration Library (Open Source Version), https://01.org/intel%C2%AE-storage-acceleration-library-open-source-version

[5] Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>, "NTT SHSS Erasure Coding backend"

[6] Jim Cheung <support@phazr.io>, "Phazr.IO libphazr erasure code backend with built-in privacy"