Documentation sweep -- slightly improved documentation for a bunch of things, and beefed up the examples to keep up with the parlance of our times.

This commit is contained in:
Ryan Williams
2009-12-31 22:30:08 -08:00
parent 3ddbba23de
commit 4aa200caee
10 changed files with 143 additions and 110 deletions

View File

@@ -1,40 +1,71 @@
Basic Usage
===========
Most of the APIs required for basic eventlet usage are exported by the eventlet.api module.
Eventlet is built around the concept of green threads (i.e. coroutines) that are launched to do network-related work. Green threads differ from normal threads in two main ways:
* Green threads are so cheap they are nearly free. You do not have to conserve green threads like you would normal threads. In general, there will be at least one green thread per network connection. Switching between them is quite efficient.
* Green threads cooperatively yield to each other instead of being preemptively scheduled. The major advantage of this behavior is that shared data structures don't need locks, because another green thread can only access a data structure after the current one explicitly yields; see the sketch below. It is also possible to inspect communication primitives such as queues to see whether they have any data or waiting green threads, something that is not possible with preemptive threads.
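To make the cooperative behavior concrete, here is a minimal sketch (the counter and the ``increment`` function are made up for illustration, not part of eventlet). Two green threads perform an unlocked read-modify-write on a shared counter; control can only transfer at the explicit ``eventlet.sleep(0)`` call, so the update can never be interleaved::

    import eventlet

    counter = {'value': 0}

    def increment(label, times):
        for i in xrange(times):
            current = counter['value']   # no other green thread can run here...
            counter['value'] = current + 1
            eventlet.sleep(0)            # ...until this explicit yield
            print label, counter['value']

    t1 = eventlet.spawn(increment, 'a', 3)
    t2 = eventlet.spawn(increment, 'b', 3)
    t1.wait()
    t2.wait()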
Here are some basic functions that manipulate coroutines.
Eventlet usage falls into a few basic patterns. One is the client pattern, which makes a bunch of requests to servers and processes the responses. Another is the server pattern, where an application holds open a socket and processes requests as they come in. These two patterns involve somewhat different usage of Eventlet's primitives, so here are a few examples to show them off.
.. automethod:: eventlet.api::spawn
Client-side pattern
--------------------
.. automethod:: eventlet.api::sleep
.. automethod:: eventlet.api::call_after
.. automethod:: eventlet.api::exc_after
Socket Functions
-----------------
.. |socket| replace:: ``socket.socket``
.. _socket: http://docs.python.org/library/socket.html#socket-objects
.. |select| replace:: ``select.select``
.. _select: http://docs.python.org/library/select.html
The canonical client-side example is a web crawler: given a list of urls, it retrieves their bodies for later processing. Here is a very simple example::
Eventlet provides convenience functions that return green sockets. The green
socket objects have the same interface as the standard library |socket|_
object, except they will automatically cooperatively yield control to other
eligible coroutines instead of blocking. Eventlet also has the ability to
monkey patch the standard library |socket|_ object so that code which uses
it will also automatically cooperatively yield; see
:ref:`using_standard_library_with_eventlet`.
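For instance, here is a minimal sketch of a green socket used exactly like its standard counterpart (the host and request are arbitrary examples); the ``connect``, ``sendall``, and ``recv`` calls yield to other eligible coroutines instead of blocking the whole process::

    from eventlet.green import socket

    sock = socket.socket()
    sock.connect(('eventlet.net', 80))
    sock.sendall('GET / HTTP/1.0\r\nHost: eventlet.net\r\n\r\n')
    print sock.recv(1024)
    sock.close()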
urls = ["http://www.google.com/intl/en_ALL/images/logo.gif",
"https://wiki.secondlife.com/w/images/secondlife.jpg",
"http://us.i1.yimg.com/us.yimg.com/i/ww/beta/y3.gif"]
import eventlet
from eventlet.green import urllib2
.. automethod:: eventlet.api::tcp_listener
def fetch(url):
return urllib2.urlopen(url).read()
pool = eventlet.GreenPool(200)
for body in pool.imap(fetch, urls):
print "got body", len(body)
.. automethod:: eventlet.api::connect_tcp
There is a slightly more complex version of this in the file ``examples/webcrawler.py`` in the source distribution. Here's a tour of the interesting lines in this crawler.
``from eventlet.green import urllib2`` is how you import a cooperatively-yielding version of urllib2. It is the same in all respects as the standard version, except that it uses green sockets for its communication.
``pool = eventlet.GreenPool(200)`` constructs a pool of 200 green threads. Using a pool is good practice because it provides an upper limit on the amount of work that this crawler will be doing simultaneously, which comes in handy when the input data changes dramatically.
``for body in pool.imap(fetch, urls):`` iterates over the results of calling the fetch function in parallel. :meth:`imap <eventlet.parallel.GreenPool.imap>` makes the function calls in parallel, and the results are returned in the same order as their arguments, even if the calls complete out of order.
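Here is a small sketch of that ordering guarantee (the ``double`` function is made up for illustration): even though the later inputs finish sooner, the results come back in argument order::

    import eventlet

    def double(x):
        eventlet.sleep(0.1 - x * 0.01)  # later inputs finish sooner
        return x * 2

    pool = eventlet.GreenPool(4)
    print list(pool.imap(double, range(4)))  # prints [0, 2, 4, 6]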
Server-side pattern
--------------------
Here's a simple server-side example, an echo server::
import eventlet
from eventlet.green import socket
def handle(client):
    while True:
        c = client.recv(1)
        if not c: break
        client.sendall(c)
server = socket.socket()
server.bind(('0.0.0.0', 6000))
server.listen(50)
pool = eventlet.GreenPool(10000)
while True:
    new_sock, address = server.accept()
    pool.spawn_n(handle, new_sock)
The file ``examples/echoserver.py`` contains a somewhat more robust and complex version of this example.
``from eventlet.green import socket`` imports eventlet's socket module, which is just like the regular socket module, but cooperatively yielding.
``pool = eventlet.GreenPool(10000)`` creates a pool of green threads that could handle ten thousand clients.
``pool.spawn_n(handle, new_sock)`` launches a green thread to handle the new client. The accept loop doesn't care about the return value of the handle function, so it uses :meth:`spawn_n <eventlet.parallel.GreenPool.spawn_n>`, instead of :meth:`spawn <eventlet.parallel.GreenPool.spawn>`. This is a little bit more efficient.
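To make the distinction concrete, here's a minimal sketch (the ``shout`` function is made up for illustration): :meth:`spawn <eventlet.parallel.GreenPool.spawn>` hands back a GreenThread whose result can be retrieved with ``wait``, while :meth:`spawn_n <eventlet.parallel.GreenPool.spawn_n>` returns None and simply runs the function::

    import eventlet

    def shout(text):
        return text.upper()

    pool = eventlet.GreenPool(2)
    gt = pool.spawn(shout, 'hi')   # keep a handle on the result
    print gt.wait()                # prints HI
    pool.spawn_n(shout, 'bye')     # fire and forget; the result is discarded
    eventlet.sleep(0)              # yield so the fire-and-forget thread runs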
.. automethod:: eventlet.api::ssl_listener
.. _using_standard_library_with_eventlet:
@@ -46,8 +77,8 @@ Using the Standard Library with Eventlet
Eventlet's socket object, whose implementation can be found in the
:mod:`eventlet.greenio` module, is designed to match the interface of the
standard library |socket|_ object. However, it is often useful to be able to
use existing code which uses |socket|_ directly without modifying it to use the
standard library :mod:`socket` object. However, it is often useful to be able to
use existing code which uses :mod:`socket` directly without modifying it to use the
eventlet apis. To do this, one must call
:func:`~eventlet.util.wrap_socket_with_coroutine_socket`. It is only necessary
to do this once, at the beginning of the program, and it should be done before
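Here is a minimal sketch of that setup (the url is an arbitrary example); the wrapping happens once, before any other code touches :mod:`socket`::

    from eventlet import util
    util.wrap_socket_with_coroutine_socket()

    # the standard library urllib2 now runs on green sockets under the hood
    import urllib2
    print len(urllib2.urlopen('http://eventlet.net').read())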
@@ -58,7 +89,7 @@ whether this is a good or a bad idea, please let us know.
.. automethod:: eventlet.util::wrap_select_with_coroutine_select
Some code which is written in a multithreaded style may perform some tricks,
such as calling |select|_ with only one file descriptor and a timeout to
such as calling :mod:`select` with only one file descriptor and a timeout to
prevent the operation from being unbounded. For this specific situation there
is :func:`~eventlet.util.wrap_select_with_coroutine_select`; however it's
always a good idea when trying any new library with eventlet to perform some
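A rough sketch of the situation described above, assuming both the socket and select wrappers are applied at program start (the host and timeout are arbitrary examples)::

    from eventlet import util
    util.wrap_socket_with_coroutine_socket()
    util.wrap_select_with_coroutine_select()

    import select, socket

    sock = socket.socket()
    sock.connect(('eventlet.net', 80))
    sock.sendall('GET / HTTP/1.0\r\nHost: eventlet.net\r\n\r\n')
    # a bounded wait on a single descriptor; with the wrapper in place this
    # yields to other coroutines instead of blocking the whole process
    readable, _, _ = select.select([sock], [], [], 5.0)
    if readable:
        print sock.recv(100)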

View File

@@ -3,38 +3,26 @@ Eventlet
Eventlet is a networking library written in Python. It achieves high scalability by using `non-blocking io <http://en.wikipedia.org/wiki/Asynchronous_I/O#Select.28.2Fpoll.29_loops>`_ while at the same time retaining high programmer usability by using `coroutines <http://en.wikipedia.org/wiki/Coroutine>`_ to make the non-blocking io operations appear blocking at the source code level.
Eventlet is different from all the other event-based frameworks out there because it doesn't require you to restructure your code to use it. You don't have to rewrite your code to use callbacks, and you don't have to replace your main() method with some sort of dispatch method. You can just sprinkle eventlet on top of your normal-looking code.
Eventlet is different from other event-based frameworks out there because it doesn't require you to restructure your code to use it. You don't have to rewrite your code to use callbacks, and you don't have to replace your main() method with some sort of dispatch method. You can just sprinkle eventlet on top of your code.
Web Crawler Example
-------------------
This is a simple web "crawler" that fetches a bunch of urls using a coroutine pool. It has as much concurrency (i.e. pages being fetched simultaneously) as coroutines in the pool (in our example, 4).
This is a simple web crawler that fetches a bunch of urls using a coroutine pool. It has as much concurrency (i.e. pages being fetched simultaneously) as there are coroutines in the pool::
::
urls = ["http://www.google.com/intl/en_ALL/images/logo.gif",
"http://wiki.secondlife.com/w/images/secondlife.jpg",
urls = ["http://www.google.com/intl/en_ALL/images/logo.gif",
"https://wiki.secondlife.com/w/images/secondlife.jpg",
"http://us.i1.yimg.com/us.yimg.com/i/ww/beta/y3.gif"]
import time
from eventlet import coros
# this imports a special version of the urllib2 module that uses non-blocking IO
from eventlet.green import urllib2
def fetch(url):
    print "%s fetching %s" % (time.asctime(), url)
    data = urllib2.urlopen(url)
    print "%s fetched %s" % (time.asctime(), data)
pool = coros.CoroutinePool(max_size=4)
waiters = []
for url in urls:
    waiters.append(pool.execute(fetch, url))
# wait for all the coroutines to come back before exiting the process
for waiter in waiters:
    waiter.wait()
import eventlet
from eventlet.green import urllib2
def fetch(url):
    return urllib2.urlopen(url).read()
pool = eventlet.GreenPool(200)
for body in pool.imap(fetch, urls):
    print "got body", len(body)
Contents

View File

@@ -10,6 +10,7 @@ Module Reference
modules/coros
modules/db_pool
modules/greenio
modules/parallel
modules/pool
modules/pools
modules/processes

View File

@@ -3,4 +3,4 @@
.. automodule:: eventlet.backdoor
:members:
:undoc-members:

View File

@@ -1,5 +1,5 @@
:mod:`greenio` -- Greenlet file objects
========================================
:mod:`greenio` -- Cooperative network primitives
=================================================
.. automodule:: eventlet.greenio
:members:

View File

@@ -68,9 +68,10 @@ class SocketConsole(greenlets.greenlet):
print "backdoor closed to %s:%s" % self.hostport
def backdoor_server(server, locals=None):
""" Runs a backdoor server on the socket, accepting connections and
running backdoor consoles for each client that connects.
def backdoor_server(sock, locals=None):
""" Blocking function that runs a backdoor server on the socket *sock*,
accepting connections and running backdoor consoles for each client that
connects.
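    A sketch of typical use (the port number is an arbitrary example)::

        from eventlet.green import socket
        sock = socket.socket()
        sock.bind(('localhost', 3000))
        sock.listen(5)
        backdoor_server(sock)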
"""
print "backdoor server listening on %s:%s" % server.getsockname()
try:
@@ -87,17 +88,18 @@ def backdoor_server(server, locals=None):
def backdoor((conn, addr), locals=None):
    """Sets up an interactive console on a socket with a connected client.
    This does not block the caller, as it spawns a new greenlet to handle
    the console.
    """Sets up an interactive console on a socket with a single connected
    client. This does not block the caller, as it spawns a new greenlet to
    handle the console. This is meant to be called from within an accept loop
    (such as backdoor_server).
    """
    host, port = addr
    print "backdoor to %s:%s" % (host, port)
    fl = conn.makeGreenFile("rw")
    fl.newlines = '\n'
    greenlet = SocketConsole(fl, (host, port), locals)
    console = SocketConsole(fl, (host, port), locals)
    hub = hubs.get_hub()
    hub.schedule_call_global(0, greenlet.switch)
    hub.schedule_call_global(0, console.switch)
if __name__ == '__main__':

View File

@@ -3,7 +3,7 @@ import sys
from eventlet import hubs
from eventlet.support import greenlets as greenlet
__all__ = ['getcurrent', 'sleep', 'spawn', 'spawn_n', 'GreenThread', 'Event']
__all__ = ['getcurrent', 'sleep', 'spawn', 'spawn_n', 'call_after_global', 'call_after_local', 'GreenThread', 'Event']
getcurrent = greenlet.getcurrent
@@ -28,8 +28,8 @@ def sleep(seconds=0):
def spawn(func, *args, **kwargs):
    """Create a green thread to run func(*args, **kwargs). Returns a GreenThread
    object which you can use to get the results of the call.
    """Create a green thread to run func(*args, **kwargs). Returns a
    GreenThread object which you can use to get the results of the call.
    """
    hub = hubs.get_hub()
    g = GreenThread(hub.greenlet)

View File

@@ -42,13 +42,16 @@ class GreenPool(object):
        return len(self.coroutines_running)
    def free(self):
        """ Returns the number of coroutines available for use."""
        """ Returns the number of coroutines available for use.
        If zero or less, the next call to :meth:`spawn` will block the calling
        coroutine until a slot becomes available."""
        return self.sem.counter
    def spawn(self, func, *args, **kwargs):
        """Run func(*args, **kwargs) in its own green thread. Returns the
        GreenThread object that is running the function, which can be used
        to retrieve the results.
    def spawn(self, function, *args, **kwargs):
        """Run the *function* with its arguments in its own green thread.
        Returns the GreenThread object that is running the function, which can
        be used to retrieve the results.
        """
        # if reentering an empty pool, don't try to wait on a coroutine freeing
        # itself -- instead, just execute in the current coroutine
@@ -56,11 +59,11 @@ class GreenPool(object):
        if self.sem.locked() and current in self.coroutines_running:
            # a bit hacky to use the GT without switching to it
            gt = greenthread.GreenThread(current)
            gt.main(func, args, kwargs)
            gt.main(function, args, kwargs)
            return gt
        else:
            self.sem.acquire()
            gt = greenthread.spawn(func, *args, **kwargs)
            gt = greenthread.spawn(function, *args, **kwargs)
            if not self.coroutines_running:
                self.no_coros_running = greenthread.Event()
            self.coroutines_running.add(gt)
@@ -84,9 +87,8 @@ class GreenPool(object):
                self._spawn_done(coro=coro)
    def spawn_n(self, func, *args, **kwargs):
        """ Create a coroutine to run func(*args, **kwargs).
        Returns None; the results of the function are not retrievable.
        """ Create a coroutine to run the *function*. Returns None; the results
        of the function are not retrievable.
        """
        # if reentering an empty pool, don't try to wait on a coroutine freeing
        # itself -- instead, just execute in the current coroutine
@@ -128,8 +130,8 @@ class GreenPool(object):
    def imap(self, function, *iterables):
        """This is the same as itertools.imap, except that *func* is
        executed in separate green threads, with the specified concurrency
        control. Using imap consumes a constant amount of memory,
        executed in separate green threads, with the concurrency controlled by
        the pool. In operation, imap consumes a constant amount of memory,
        proportional to the size of the pool, and is thus suited for iterating
        over extremely long input lists.
        """
@@ -147,6 +149,14 @@ def raise_stop_iteration():
class GreenPile(object):
"""GreenPile is an abstraction representing a bunch of I/O-related tasks.
Construct a GreenPile with an existing GreenPool object. The GreenPile will
then use that pool's concurrency as it processes its jobs. There can be
many GreenPiles associated with a single GreenPool.
A GreenPile can also be constructed standalone, not associated with any
GreenPool. To do this, construct it with an integer size parameter instead
of a GreenPool
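    A sketch of typical use (the fetch function and urls are hypothetical)::

        pool = GreenPool(100)
        pile = GreenPile(pool)
        for url in urls:
            pile.spawn(fetch, url)
        # iterating over the pile yields each result in the order spawned
        bodies = list(pile)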
"""
def __init__(self, size_or_pool):
if isinstance(size_or_pool, GreenPool):

View File

@@ -10,24 +10,30 @@ You terminate your connection by terminating telnet (typically Ctrl-]
and then 'quit')
"""
from eventlet import api
import eventlet
from eventlet.green import socket
def handle_socket(reader, writer):
def handle(reader, writer):
    print "client connected"
    while True:
        # pass through every non-eof line
        x = reader.readline()
        if not x: break
        writer.write(x)
        print "echoed", x
        writer.flush()
        print "echoed", x,
    print "client disconnected"
print "server socket listening on port 6000"
server = api.tcp_listener(('0.0.0.0', 6000))
server = socket.socket()
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(('0.0.0.0', 6000))
server.listen(50)
pool = eventlet.GreenPool(10000)
while True:
    try:
        new_sock, address = server.accept()
    except KeyboardInterrupt:
        print "accepted", address
        pool.spawn_n(handle, new_sock.makefile('r'), new_sock.makefile('w'))
    except (SystemExit, KeyboardInterrupt):
        break
    # handle every new connection with a new coroutine
    api.spawn(handle_socket, new_sock.makefile('r'), new_sock.makefile('w'))

View File

@@ -2,32 +2,27 @@
"""\
@file webcrawler.py
This is a simple web "crawler" that fetches a bunch of urls using a coroutine pool. It fetches as
many urls at time as coroutines in the pool.
This is a simple web "crawler" that fetches a bunch of urls using a pool to
control the number of outbound connections. It has as many simultaneously open
connections as coroutines in the pool.
The prints in the body of the fetch function are there to demonstrate that the
requests are truly made in parallel.
"""
urls = ["http://www.google.com/intl/en_ALL/images/logo.gif",
"http://us.i1.yimg.com/us.yimg.com/i/ww/beta/y3.gif",
"http://eventlet.net"]
"https://wiki.secondlife.com/w/images/secondlife.jpg",
"http://us.i1.yimg.com/us.yimg.com/i/ww/beta/y3.gif"]
import time
from eventlet.green import urllib2
from eventlet import coros
import eventlet
from eventlet.green import urllib2
def fetch(url):
    # we could do something interesting with the result, but this is
    # example code, so we'll just report that we did it
    print "%s fetching %s" % (time.asctime(), url)
    req = urllib2.urlopen(url)
    print "%s fetched %s (%s)" % (time.asctime(), url, len(req.read()))
pool = coros.CoroutinePool(max_size=4)
waiters = []
for url in urls:
    waiters.append(pool.execute(fetch, url))
# wait for all the coroutines to come back before exiting the process
for waiter in waiters:
    waiter.wait()
    print "opening", url
    body = urllib2.urlopen(url).read()
    print "done with", url
    return url, body
pool = eventlet.GreenPool(200)
for url, body in pool.imap(fetch, urls):
    print "got body from", url, "of length", len(body)