Doc improvements

This commit is contained in:
Ryan Williams
2010-01-18 21:26:05 -08:00
parent f2b831c028
commit 6c215654de
4 changed files with 87 additions and 34 deletions


@@ -1,11 +1,12 @@
Basic Usage
=============

Eventlet is built around the concept of green threads (i.e. coroutines; we use the terms interchangeably) that are launched to do network-related work. Green threads differ from normal threads in two main ways:

* Green threads are so cheap they are nearly free. You do not have to conserve green threads like you would normal threads. In general, there will be at least one green thread per network connection.
* Green threads cooperatively yield to each other instead of being preemptively scheduled. The major advantage of this behavior is that shared data structures don't need locks, because another green thread can only gain access to a data structure after an explicit yield (see the short sketch below). It is also possible to inspect primitives such as queues to see if they have any pending data.

There are a bunch of basic patterns that Eventlet usage falls into. Here are a few examples that show their basic structure.
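Here is a tiny sketch of the cooperative-yield point above (the ``worker`` function and shared list are purely illustrative): two green threads append to a shared list without any locking, because control only changes hands at the explicit ``sleep`` call::

    import eventlet

    results = []

    def worker(name):
        for i in range(3):
            # No lock is needed: the only point at which another green
            # thread can run is the explicit cooperative yield below.
            results.append((name, i))
            eventlet.sleep(0)

    a = eventlet.spawn(worker, 'a')
    b = eventlet.spawn(worker, 'b')
    a.wait()
    b.wait()
    print(results)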
Client-side pattern
--------------------
@@ -23,7 +24,7 @@ The canonical client-side example is a web crawler. This use case is given a li
def fetch(url):
    return urllib2.urlopen(url).read()

pool = eventlet.GreenPool()
for body in pool.imap(fetch, urls):
    print "got body", len(body)
@@ -31,9 +32,9 @@ There is a slightly more complex version of this in the file ``examples/webcrawl
``from eventlet.green import urllib2`` is how you import a cooperatively-yielding version of urllib2. It is the same in all respects as the standard version, except that it uses green sockets for its communication.

``pool = eventlet.GreenPool()`` constructs a :class:`GreenPool <eventlet.greenpool.GreenPool>` of a thousand green threads. Using a pool is good practice because it provides an upper limit on the amount of work that this crawler will be doing simultaneously, which comes in handy when the input data changes dramatically.

``for body in pool.imap(fetch, urls):`` iterates over the results of calling the fetch function in parallel. :meth:`imap <eventlet.greenpool.GreenPool.imap>` makes the function calls in parallel, and the results are returned in the same order as the arguments that produced them, even if the calls complete out of order.
Server-side pattern
@@ -64,13 +65,58 @@ The file ``examples/echoserver.py`` contains a somewhat more robust and complex
``pool = eventlet.GreenPool(10000)`` creates a pool of green threads that could handle ten thousand clients.
``pool.spawn_n(handle, new_sock)`` launches a green thread to handle the new client. The accept loop doesn't care about the return value of the ``handle`` function, so it uses :meth:`spawn_n <eventlet.greenpool.GreenPool.spawn_n>`, instead of :meth:`spawn <eventlet.greenpool.GreenPool.spawn>`.
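For reference, a minimal sketch of the accept loop described here might look like this (the port number and buffer size are arbitrary illustration choices; a fuller version lives in ``examples/echoserver.py``)::

    import eventlet

    def handle(sock):
        # Echo everything the client sends until it disconnects.
        while True:
            chunk = sock.recv(1024)
            if not chunk:
                break
            sock.sendall(chunk)

    server = eventlet.listen(('0.0.0.0', 6000))
    pool = eventlet.GreenPool(10000)
    while True:
        new_sock, address = server.accept()
        pool.spawn_n(handle, new_sock)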
Primary API
===========

The design goal for Eventlet's API is simplicity and readability. You should be able to read its code and understand what it's doing. Fewer lines of code are preferred over excessively clever implementations. Like Python itself, there should be only one right way to do something with Eventlet!

Though Eventlet has many modules, much of the most-used stuff is accessible simply by doing ``import eventlet``.
.. function:: eventlet.spawn(func, *args, **kw)

   This launches a greenthread to call *func*. Spawning off multiple greenthreads gets work done in parallel. The return value from ``spawn`` is a :class:`greenthread.GreenThread` object, which can be used to retrieve the return value of *func*. See :func:`greenthread.spawn` for more details.
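A minimal sketch of spawning a greenthread and retrieving its result (the ``add`` function is made up for illustration)::

    import eventlet

    def add(a, b):
        return a + b

    gt = eventlet.spawn(add, 2, 3)
    # wait() blocks the current greenthread until add() has finished,
    # then returns its result.
    print(gt.wait())   # 5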
.. function:: eventlet.spawn_n(func, *args, **kw)

   The same as :func:`spawn`, but it's not possible to retrieve the return value. This makes execution faster. See :func:`greenthread.spawn_n` for more details.
.. function:: eventlet.sleep(seconds)

   Suspends the current greenthread and allows others a chance to process. See :func:`greenthread.sleep` for more details.
.. class:: eventlet.GreenPool

   Pools control concurrency. It's very common in applications to want to consume only a finite amount of memory, or to restrict the amount of connections that one part of the code holds open so as to leave more for the rest, or to behave consistently in the face of unpredictable input data. GreenPools provide this control. See :class:`greenpool.GreenPool` for more on how to use these.
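A sketch of the typical pattern (the ``handle`` function and the sizes are illustrative)::

    import eventlet

    pool = eventlet.GreenPool(100)   # at most 100 greenthreads at once

    def handle(item):
        eventlet.sleep(0.01)         # stand-in for real network work
        return item * 2

    # imap goes through the pool, so no more than 100 items are in
    # flight at any moment.
    for result in pool.imap(handle, range(1000)):
        pass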
.. class:: eventlet.GreenPile

   Sister class to the GreenPool, GreenPile objects represent chunks of work. In essence a GreenPile is an iterator that can be stuffed with work, and the results read out later. See :class:`greenpool.GreenPile` for more details.
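A sketch of stuffing a pile with work and reading the results back (the URLs are only examples)::

    import eventlet
    from eventlet.green import urllib2

    pool = eventlet.GreenPool()
    pile = eventlet.GreenPile(pool)   # the pile draws greenthreads from the pool

    for url in ('http://eventlet.net', 'http://python.org'):
        pile.spawn(urllib2.urlopen, url)

    # Iterating over the pile yields each call's return value, in the
    # order the calls were spawned.
    bodies = [response.read() for response in pile]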
.. class:: eventlet.Queue

   Queues are a fundamental construct for communicating data between execution units. Eventlet's Queue class is used to communicate between greenthreads, and provides a bunch of useful features for doing that. See :class:`queue.Queue` for more details.
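A sketch of two greenthreads communicating through a queue (the producer/consumer functions are illustrative)::

    import eventlet

    q = eventlet.Queue()

    def producer():
        for i in range(5):
            q.put(i)

    def consumer():
        for _ in range(5):
            # get() cooperatively blocks until an item is available.
            print(q.get())

    eventlet.spawn(producer)
    eventlet.spawn(consumer).wait()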
These are the basic primitives of Eventlet; there are many more available in the other Eventlet modules. See :doc:`modules`.
Green Libraries
----------------
The package ``eventlet.green`` contains libraries that have the same interfaces as common standard ones, but they are modified to behave well with green threads. This can be preferable to monkeypatching in many circumstances, because it may be necessary to interoperate with some module that needs the standard libraries unmolested, or simply because it's good engineering practice to be able to understand how a file behaves based simply on its contents.
To use green libraries, simply import the desired module from ``eventlet.green``::

    from eventlet.green import socket
    from eventlet.green import threading
    from eventlet.green import asyncore

That's all there is to it!
.. _using_standard_library_with_eventlet:

Monkeypatching the Standard Library
-----------------------------------
.. automethod:: eventlet.util::wrap_socket_with_coroutine_socket
@@ -78,13 +124,9 @@ Using the Standard Library with Eventlet
Eventlet's socket object, whose implementation can be found in the
:mod:`eventlet.greenio` module, is designed to match the interface of the
standard library :mod:`socket` object. However, it is often useful to be able to
use existing code which uses :mod:`socket` directly without modifying it to use the Eventlet APIs. To do this, one must call :func:`~eventlet.util.wrap_socket_with_coroutine_socket`. It is only necessary
to do this once, at the beginning of the program, and it should be done before
any socket objects which will be used are created.
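For illustration, a sketch of doing this at startup (assuming the function's default arguments are acceptable)::

    from eventlet import util

    # Do this once, at program startup, before any sockets are created.
    # (Sketch only; assumes the default arguments suffice.)
    util.wrap_socket_with_coroutine_socket()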
.. automethod:: eventlet.util::wrap_select_with_coroutine_select
@@ -93,10 +135,4 @@ such as calling :mod:`select` with only one file descriptor and a timeout to
prevent the operation from being unbounded. For this specific situation there
is :func:`~eventlet.util.wrap_select_with_coroutine_select`; however, it's
always a good idea when trying any new library with eventlet to perform some
tests to ensure eventlet is properly able to multiplex the operations.
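A corresponding sketch for select (again assuming the default arguments suffice)::

    from eventlet import util

    # Replace the blocking select() with a cooperative version; do this
    # once, before using code that calls select directly.  (Sketch only;
    # assumes the default arguments are acceptable.)
    util.wrap_select_with_coroutine_select()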


@@ -14,7 +14,7 @@ Eventlet has multiple hub implementations, and when you start using it, it tries
**pyevent**
This is a libevent-based backend and is thus the fastest. It's disabled by default, because it does not support native threads, but you can enable it yourself if your use case doesn't require them.
If the selected hub is not ideal for the application, another can be selected.
.. function:: eventlet.hubs.use_hub(hub=None)
@@ -35,5 +35,14 @@ There is one function that is of interest as regards hubs.
Supplying None as the argument to :func:`eventlet.hubs.use_hub` causes it to select the default hub.
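For example (a sketch; ``'selects'`` names one of the built-in hub modules and may not be the best choice on every platform)::

    import eventlet.hubs

    # Must be called before the application does any I/O, because the hub
    # is instantiated lazily on first use.
    eventlet.hubs.use_hub('selects')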
.. autofunction:: eventlet.hubs.get_hub
.. autofunction:: eventlet.hubs.get_default_hub
How the Hubs Work
-----------------
The hub has a main greenlet, MAINLOOP. When one of the running coroutines needs
to do some I/O, it registers a listener with the hub (so that the hub knows when to wake it up again), and then switches to MAINLOOP (via ``get_hub().switch()``). If there are other coroutines that are ready to run, MAINLOOP switches to them, and when they complete or need to do more I/O, they switch back to the MAINLOOP. In this manner, MAINLOOP ensures that every coroutine gets scheduled when it has some work to do.
MAINLOOP is launched only when the first I/O operation happens, and it is not the same greenlet that __main__ is running in. This lazy launching is why it's not necessary to explicitly call a dispatch() method like other frameworks, which in turn means that code can start using Eventlet without needing to be substantially restructured.
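A small sketch of this lazy launching (illustrative only)::

    import eventlet

    def worker():
        print("the worker runs once MAINLOOP gets control")

    eventlet.spawn(worker)
    print("spawned, but the worker has not run yet")
    eventlet.sleep(0)   # first cooperative yield: the hub is launched and
                        # switches to any greenthreads that are ready to run
    print("back in __main__ after the worker has run")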


@@ -53,8 +53,15 @@ class GreenPool(object):
    def spawn(self, function, *args, **kwargs):
        """Run the *function* with its arguments in its own green thread.
        Returns the :class:`GreenThread <eventlet.greenthread.GreenThread>`
        object that is running the function, which can be used to retrieve the
        results.

        If the pool is currently at capacity, ``spawn`` will block until one of
        the running greenthreads completes its task and frees up a slot.

        This function is reentrant; *function* can call ``spawn`` on the same
        pool without risk of deadlocking the whole thing.
        """
        # if reentering an empty pool, don't try to wait on a coroutine freeing
        # itself -- instead, just execute in the current coroutine
@@ -88,19 +95,20 @@ class GreenPool(object):
coro = greenthread.getcurrent()
self._spawn_done(coro)
    def spawn_n(self, function, *args, **kwargs):
        """Create a greenthread to run the *function*, the same as
        :meth:`spawn`.  The difference is that :meth:`spawn_n` returns
        None; the results of *function* are not retrievable.
        """
        # if reentering an empty pool, don't try to wait on a coroutine freeing
        # itself -- instead, just execute in the current coroutine
        current = greenthread.getcurrent()
        if self.sem.locked() and current in self.coroutines_running:
            self._spawn_n_impl(function, args, kwargs)
        else:
            self.sem.acquire()
            g = greenthread.spawn_n(self._spawn_n_impl,
                                    function, args, kwargs, coro=True)
            if not self.coroutines_running:
                self.no_coros_running = event.Event()
            self.coroutines_running.add(g)


@@ -207,7 +207,7 @@ class GreenThread(greenlet.greenlet):
def func(gt, [curried args/kwargs]):
When the GreenThread finishes its run, it calls *func* with itself
and with the `curried arguments <http://en.wikipedia.org/wiki/Currying>`_ supplied at link-time. If the function wants
to retrieve the result of the GreenThread, it should call wait()
on its first argument.
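For illustration, a short sketch of using ``link`` with a curried argument (the names here are made up)::

    import eventlet

    def on_done(gt, tag):
        # gt is the finished GreenThread; wait() returns its result
        # (or re-raises its exception).
        print('%s %s' % (tag, gt.wait()))

    gt = eventlet.spawn(lambda: 42)
    gt.link(on_done, 'result:')
    eventlet.sleep(0)   # let the greenthread run and fire the link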