Update sitemap tests

- add py3 to tox.ini (gate already tests py3)
- move all tests to $GITROOT/test so they can all run
  through testr
- add scrapy to test-requirements.txt to support sitemap tests
- move tests from test_items.py to test_sitemap_file.py
- fix broken sitemap tests
- add newton to list of old releases in sitemap_file.py
- ignore flake8 H101 as it returns false positives for Sphinx conf.py
- Use openstackdocstheme for docs
- Update sitemap README
- Restructure repo docs
- fix minor style issues

Change-Id: I22c018149b2eefde6ca5c38c22ac06886fe9a7a8
Brian Moss 2017-04-06 14:40:51 +10:00
parent 53de7deb1c
commit d3bc42483a
23 changed files with 156 additions and 119 deletions

.gitignore

@@ -12,6 +12,7 @@ eggs
 sdist
 
 # Unit test / coverage reports
+.cache
 .coverage
 .tox
 .testrepository

@@ -8,7 +8,7 @@ Team and repository tags
 .. Change things from this point on
 
 OpenStack Doc Tools
-*******************
+~~~~~~~~~~~~~~~~~~~
 
 This repository contains tools used by the OpenStack Documentation
 project.
@@ -16,8 +16,12 @@ project.
 For more details, see the `OpenStack Documentation Contributor Guide
 <http://docs.openstack.org/contributor-guide/>`_.
 
+* License: Apache License, Version 2.0
+* Source: https://git.openstack.org/cgit/openstack/openstack-doc-tools
+* Bugs: https://bugs.launchpad.net/openstack-doc-tools
+
 Prerequisites
-=============
+-------------
 
 You need to have Python 2.7 installed for using the tools.
@@ -57,12 +61,7 @@ On Ubuntu::
 
    $ apt-get install libxml2-dev libxslt-dev
 
-* License: Apache License, Version 2.0
-* Source: https://git.openstack.org/cgit/openstack/openstack-doc-tools
-* Bugs: https://bugs.launchpad.net/openstack-doc-tools
-
 Regenerating config option tables
-=================================
+---------------------------------
 
 See :ref:`autogenerate_config_docs`.

@@ -14,6 +14,8 @@
 import os
 import sys
 
+import openstackdocstheme
+
 sys.path.insert(0, os.path.abspath('../..'))
 
 # -- General configuration ----------------------------------------------------
@@ -37,7 +39,7 @@ master_doc = 'index'
 
 # General information about the project.
 project = u'openstack-doc-tools'
-copyright = u'2014, OpenStack Foundation'
+copyright = u'2017, OpenStack Foundation'
 
 # If true, '()' will be appended to :func: etc. cross-reference text.
 add_function_parentheses = True
@@ -51,10 +53,13 @@ pygments_style = 'sphinx'
 
 # -- Options for HTML output --------------------------------------------------
 
-# The theme to use for HTML and HTML Help pages. Major themes that come with
-# Sphinx are currently 'default' and 'sphinxdoc'.
-# html_theme_path = ["."]
-# html_theme = '_theme'
+# The theme to use for HTML and HTML Help pages. See the documentation for
+# a list of builtin themes.
+html_theme = 'openstackdocs'
+
+# Add any paths that contain custom themes here, relative to this directory.
+html_theme_path = [openstackdocstheme.get_html_theme_path()]
 # html_static_path = ['static']
 
 # Output file base name for HTML help builder.

@@ -1,3 +1,4 @@
+==============================================
 Welcome to openstack-doc-tool's documentation!
 ==============================================
@@ -6,16 +7,17 @@ Contents:
 .. toctree::
    :maxdepth: 2
 
-   readme
-   autogenerate_config_docs
-   release_notes
+   doc-tools-readme
    installation
    usage
+   autogenerate_config_docs
+   man/openstack-doc-test
+   sitemap-readme
+   release_notes
 
 Indices and tables
-==================
+~~~~~~~~~~~~~~~~~~
 
 * :ref:`genindex`
 * :ref:`modindex`
 * :ref:`search`

@@ -2,11 +2,15 @@
 Installation
 ============
-At the command line::
+At the command line:
+
+.. code-block:: console
 
    $ pip install openstack-doc-tools
 
-Or, if you have virtualenvwrapper installed::
+Or, if you have virtualenvwrapper installed:
+
+.. code-block:: console
 
    $ mkvirtualenv openstack-doc-tools
   $ pip install openstack-doc-tools

@@ -114,4 +114,4 @@ Bugs
 * openstack-doc-tools is hosted on Launchpad so you can view current
   bugs at
-  `Bugs : openstack-manuals <https://bugs.launchpad.net/openstack-manuals/>`__
+  `Bugs : openstack-doc-tools <https://bugs.launchpad.net/openstack-doc-tools/>`__

@@ -1,2 +1 @@
 .. include:: ../../RELEASE_NOTES.rst

@@ -0,0 +1 @@
+.. include:: ../../sitemap/README.rst

@@ -1,7 +1,9 @@
-========
+=====
 Usage
-========
+=====
 
-To use openstack-doc-tools in a project::
+To use openstack-doc-tools in a project:
+
+.. code-block:: python
 
    import os_doc_tools

@@ -2,46 +2,80 @@
 Sitemap Generator
 =================
 
-This script crawls all available sites on http://docs.openstack.org and extracts
-all URLs. Based on the URLs the script generates a sitemap for search engines
-according to the protocol described at http://www.sitemaps.org/protocol.html.
+This script crawls all available sites on http://docs.openstack.org and
+extracts all URLs. Based on the URLs the script generates a sitemap for search
+engines according to the `sitemaps protocol
+<http://www.sitemaps.org/protocol.html>`_.
 
 Installation
-============
+~~~~~~~~~~~~
 
-To install the needed modules you can use pip or the package management system included
-in your distribution. When using the package management system maybe the name of the
-packages differ. Installation in a virtual environment is recommended.
+To install the needed modules you can use pip or the package management system
+included in your distribution. When using the package management system maybe
+the name of the packages differ. Installation in a virtual environment is
+recommended.
+
+.. code-block:: console
 
    $ virtualenv venv
    $ source venv/bin/activate
    $ pip install -r requirements.txt
 
-When using pip it's maybe necessary to install some development packages.
-For example on Ubuntu 16.04 install the following packages.
+When using pip, you may also need to install some development packages. For
+example, on Ubuntu 16.04 install the following packages:
+
+.. code-block:: console
 
    $ sudo apt install gcc libssl-dev python-dev python-virtualenv
 
 Usage
-=====
+~~~~~
 
-To generate a new sitemap file simply run the spider using the
-following command. It will take several minutes to crawl all available sites
-on http://docs.openstack.org. The result will be available in the file
-``sitemap_docs.openstack.org.xml``.
+To generate a new sitemap file, change into your local clone of the
+``openstack/openstack-doc-tools`` repository and run the following commands:
 
-   $ scrapy crawl sitemap
+.. code-block:: console
+
+   $ cd sitemap
+   $ scrapy crawl sitemap
 
-It's also possible to crawl other sites using the attribute ``domain``.
+The script takes several minutes to crawl all available
+sites on http://docs.openstack.org. The result is available in the
+``sitemap_docs.openstack.org.xml`` file.
 
-For example to crawl http://developer.openstack.org use the following command.
-The result will be available in the file ``sitemap_developer.openstack.org.xml``.
+Options
+~~~~~~~
 
-   $ scrapy crawl sitemap -a domain=developer.openstack.org
+domain=URL
+   Sets the ``domain`` to crawl. Default is ``docs.openstack.org``.
 
-To write log messages into a file append the parameter ``-s LOG_FILE=scrapy.log``.
-It is possible to define a set of additional start URLs using the attribute
-``urls``. Separate multiple URLs with ``,``.
+   For example, to crawl http://developer.openstack.org use the following
+   command:
 
-   $ scrapy crawl sitemap -a domain=developer.openstack.org -a urls="http://developer.openstack.org/de/api-guide/quick-start/"
+   .. code-block:: console
+
+      $ scrapy crawl sitemap -a domain=developer.openstack.org
+
+   The result is available in the ``sitemap_developer.openstack.org.xml`` file.
+
+urls=URL
+   You can define a set of additional start URLs using the ``urls`` attribute.
+   Separate multiple URLs with ``,``.
+
+   For example:
+
+   .. code-block:: console
+
+      $ scrapy crawl sitemap -a domain=developer.openstack.org -a urls="http://developer.openstack.org/de/api-guide/quick-start/"
+
+LOG_FILE=FILE
+   Write log messages to the specified file.
+
+   For example, to write to ``scrapy.log``:
+
+   .. code-block:: console
+
+      $ scrapy crawl sitemap -s LOG_FILE=scrapy.log
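The sitemaps protocol referenced above is a small XML vocabulary (``urlset``, ``url``, ``loc``, ``lastmod``, ``priority``, ``changefreq``). A quick sketch of inspecting a generated file with the stdlib parser — the document content here is illustrative, not actual generator output:

```python
import xml.etree.ElementTree as ET

# A minimal document in the shape a sitemaps.org file takes (illustrative).
doc = '''<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://docs.openstack.org/ocata/index.html</loc>
    <lastmod>2017-04-06T00:00:00</lastmod>
    <priority>1.0</priority>
  </url>
</urlset>'''

# The sitemap namespace must be given explicitly when querying.
ns = {'sm': 'http://www.sitemaps.org/schemas/sitemap/0.9'}
root = ET.fromstring(doc)
locs = [url.find('sm:loc', ns).text for url in root.findall('sm:url', ns)]
print(locs)
```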

@@ -69,7 +69,7 @@ class ExportSitemap(object):
     def spider_opened(self, spider):
         output = open(os.path.join(os.getcwd(), 'sitemap_%s.xml'
-                      % spider.domain), 'w')
+                                   % spider.domain), 'w')
         self.files[spider] = output
         self.exporter = SitemapItemExporter(output, item_element='url',
                                             root_element='urlset')
@@ -80,7 +80,7 @@ class ExportSitemap(object):
         output = self.files.pop(spider)
         output.close()
         tree = lxml.etree.parse(os.path.join(os.getcwd(), "sitemap_%s.xml"
-                % spider.domain))
+                                % spider.domain))
         with open(os.path.join(os.getcwd(), "sitemap_%s.xml" % spider.domain),
                   'w') as pretty:
             pretty.write(lxml.etree.tostring(tree, pretty_print=True))
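The close hook above re-parses the compact file the exporter wrote and rewrites it pretty-printed. A minimal sketch of that round-trip, using the stdlib's ``minidom`` as a stand-in for lxml's ``pretty_print`` (the sample XML is illustrative):

```python
import xml.dom.minidom

# ExportSitemap.spider_closed re-reads the exported sitemap and rewrites it
# with indentation; minidom's toprettyxml plays the role of
# lxml.etree.tostring(tree, pretty_print=True) here.
compact = '<urlset><url><loc>https://docs.openstack.org/</loc></url></urlset>'

pretty = xml.dom.minidom.parseString(compact).toprettyxml(indent='  ')
print(pretty)
```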

@@ -11,7 +11,10 @@
 # under the License.
 
 import time
-import urlparse
+try:
+    import urlparse
+except ImportError:
+    import urllib.parse as urlparse
 
 from scrapy import item
 from scrapy.linkextractors import LinkExtractor
@@ -41,7 +44,8 @@ class SitemapSpider(spiders.CrawlSpider):
         'juno',
         'kilo',
         'liberty',
-        'mitaka'
+        'mitaka',
+        'newton'
     ]])
 
     rules = [
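The try/except import above is the standard way to keep the module importable on both Python 2 and 3: the same ``urlparse`` name resolves to ``urllib.parse`` under Python 3. A self-contained sketch of the calls the spider relies on:

```python
try:
    import urlparse                  # Python 2
except ImportError:
    import urllib.parse as urlparse  # Python 3

# urlsplit and SplitResult behave the same under either import.
parts = urlparse.urlsplit('https://docs.openstack.org/mitaka/index.html')
print(parts.netloc)  # docs.openstack.org
print(parts.path)    # /mitaka/index.html
```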

@@ -1 +0,0 @@
-scrapy>=1.0.0

@@ -1,37 +0,0 @@
-# Licensed under the Apache License, Version 2.0 (the "License"); you may
-# not use this file except in compliance with the License. You may obtain
-# a copy of the License at
-#
-#    http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
-# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
-# License for the specific language governing permissions and limitations
-# under the License.
-
-import mock
-
-from sitemap.generator import items
-import unittest
-
-
-class TestSitemapItem(unittest.TestCase):
-
-    def test_class_type(self):
-        self.assertTrue(type(items.SitemapItem) is items.scrapy.item.ItemMeta)
-
-    def test_class_supports_fields(self):
-        with mock.patch.object(items.scrapy.item, 'Field'):
-            a = items.SitemapItem()
-
-        supported_fields = ['loc', 'lastmod', 'priority', 'changefreq']
-        for field in supported_fields:
-            a[field] = field
-
-        not_supported_fields = ['some', 'random', 'fields']
-        for field in not_supported_fields:
-            with self.assertRaises(KeyError):
-                a[field] = field
-
-
-if __name__ == '__main__':
-    unittest.main()

@@ -11,8 +11,8 @@ doc8 # Apache-2.0
 pylint==1.4.5 # GPLv2
 reno>=1.8.0 # Apache-2.0
+openstackdocstheme>=1.5.0 # Apache-2.0
 oslosphinx>=4.7.0 # Apache-2.0
 testrepository>=0.0.18 # Apache-2.0/BSD
 
 # mock object framework

@@ -78,26 +78,22 @@ class TestExportSitemap(unittest.TestCase):
     def test_spider_opened_calls_open(self):
         with mock.patch.object(pipelines, 'open',
                                return_value=None) as mocked_open:
-            with mock.patch.object(pipelines,
-                                   'SitemapItemExporter'):
+            with mock.patch.object(pipelines, 'SitemapItemExporter'):
                 self.export_sitemap.spider_opened(self.spider)
 
         self.assertTrue(mocked_open.called)
 
     def test_spider_opened_assigns_spider(self):
         prev_len = len(self.export_sitemap.files)
-        with mock.patch.object(pipelines, 'open',
-                               return_value=None):
-            with mock.patch.object(pipelines,
-                                   'SitemapItemExporter'):
+        with mock.patch.object(pipelines, 'open', return_value=None):
+            with mock.patch.object(pipelines, 'SitemapItemExporter'):
                 self.export_sitemap.spider_opened(self.spider)
 
         after_len = len(self.export_sitemap.files)
         self.assertTrue(after_len - prev_len, 1)
 
     def test_spider_opened_instantiates_exporter(self):
-        with mock.patch.object(pipelines, 'open',
-                               return_value=None):
+        with mock.patch.object(pipelines, 'open', return_value=None):
             with mock.patch.object(pipelines,
                                    'SitemapItemExporter') as mocked_exporter:
                 self.export_sitemap.spider_opened(self.spider)
@@ -105,8 +101,7 @@ class TestExportSitemap(unittest.TestCase):
         self.assertTrue(mocked_exporter.called)
 
     def test_spider_opened_exporter_starts_exporting(self):
-        with mock.patch.object(pipelines, 'open',
-                               return_value=None):
+        with mock.patch.object(pipelines, 'open', return_value=None):
             with mock.patch.object(pipelines.SitemapItemExporter,
                                    'start_exporting') as mocked_start:
                 self.export_sitemap.spider_opened(self.spider)
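The collapsed ``with`` statements above still nest two patches: ``mock.patch.object`` swaps an attribute for a ``MagicMock`` only inside the block, then restores it. A self-contained sketch of the pattern — the ``Pipeline`` class here is illustrative, not the real module:

```python
try:
    import mock                # the standalone package used by these tests
except ImportError:
    from unittest import mock  # equivalent stdlib module on Python 3

class Pipeline(object):
    # Stand-in for a collaborator the test wants to replace.
    exporter_factory = staticmethod(lambda: 'real exporter')

    def open_spider(self):
        return self.exporter_factory()

pipeline = Pipeline()
with mock.patch.object(Pipeline, 'exporter_factory',
                       return_value='fake exporter') as factory:
    result = pipeline.open_spider()  # uses the mock, not the real attribute

print(result)          # fake exporter
print(factory.called)  # True
```

After the ``with`` block exits, ``Pipeline.exporter_factory`` is restored, which is why the tests can patch ``open`` and ``SitemapItemExporter`` per test without cross-test leakage.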

@@ -11,10 +11,30 @@
 # under the License.
 
 import mock
+import scrapy
 
 from sitemap.generator.spiders import sitemap_file
 import unittest
 
 
+class TestSitemapItem(unittest.TestCase):
+
+    def test_class_type(self):
+        self.assertTrue(type(sitemap_file.SitemapItem) is scrapy.item.ItemMeta)
+
+    def test_class_supports_fields(self):
+        with mock.patch.object(scrapy.item, 'Field'):
+            a = sitemap_file.SitemapItem()
+
+        supported_fields = ['loc', 'lastmod', 'priority', 'changefreq']
+        for field in supported_fields:
+            a[field] = field
+
+        not_supported_fields = ['some', 'random', 'fields']
+        for field in not_supported_fields:
+            with self.assertRaises(KeyError):
+                a[field] = field
+
+
 class TestSitemapSpider(unittest.TestCase):
 
     def setUp(self):
@@ -38,16 +58,18 @@ class TestSitemapSpider(unittest.TestCase):
     def test_parse_items_inits_sitemap(self):
         response = mock.MagicMock()
 
-        with mock.patch.object(sitemap_file.items,
+        with mock.patch.object(sitemap_file,
                                'SitemapItem') as mocked_sitemap_item:
-            with mock.patch.object(sitemap_file, 'time'):
-                self.spider.parse_item(response)
+            with mock.patch.object(sitemap_file.urlparse,
+                                   'urlsplit'):
+                with mock.patch.object(sitemap_file, 'time'):
+                    self.spider.parse_item(response)
 
         self.assertTrue(mocked_sitemap_item.called)
 
     def test_parse_items_gets_path(self):
         response = mock.MagicMock()
 
-        with mock.patch.object(sitemap_file.items, 'SitemapItem'):
+        with mock.patch.object(sitemap_file, 'SitemapItem'):
             with mock.patch.object(sitemap_file.urlparse,
                                    'urlsplit') as mocked_urlsplit:
                 with mock.patch.object(sitemap_file, 'time'):
@@ -60,7 +82,7 @@ class TestSitemapSpider(unittest.TestCase):
         path = sitemap_file.urlparse.SplitResult(
             scheme='https',
             netloc='docs.openstack.com',
-            path='/kilo',
+            path='/mitaka',
             query='',
             fragment=''
         )
@@ -77,7 +99,7 @@ class TestSitemapSpider(unittest.TestCase):
         path = sitemap_file.urlparse.SplitResult(
             scheme='https',
             netloc='docs.openstack.com',
-            path='/mitaka',
+            path='/ocata',
             query='',
             fragment=''
         )
@@ -94,7 +116,7 @@ class TestSitemapSpider(unittest.TestCase):
         path = sitemap_file.urlparse.SplitResult(
             scheme='https',
             netloc='docs.openstack.com',
-            path='/mitaka',
+            path='/ocata',
             query='',
             fragment=''
         )

tox.ini

@@ -1,6 +1,6 @@
 [tox]
 minversion = 2.0
-envlist = py27,pep8
+envlist = py3,py27,pep8
 skipsdist = True
 
 [testenv]
@@ -9,7 +9,10 @@ install_command = {toxinidir}/tools/tox_install.sh {env:UPPER_CONSTRAINTS_FILE:h
 setenv =
    VIRTUAL_ENV={envdir}
    CLIENT_NAME=openstack-doc-tools
-deps = -r{toxinidir}/test-requirements.txt
+# Install also sitemap scraping tool, not installed by default
+# therefore not in requirements file
+deps = scrapy>=1.0.0
+       -r{toxinidir}/test-requirements.txt
        -r{toxinidir}/requirements.txt
 commands = python setup.py testr --slowest --testr-args='{posargs}'
@@ -27,11 +30,14 @@ commands =
     cleanup/remove_trailing_whitespaces.sh
 
 [testenv:pylint]
-commands = pylint os_doc_tools cleanup
+commands = pylint os_doc_tools cleanup sitemap
 
 [testenv:releasenotes]
 commands = sphinx-build -a -E -W -d releasenotes/build/doctrees -b html releasenotes/source releasenotes/build/html
 
+[testenv:sitemap]
+# commands = functional test command goes here
+
 [testenv:venv]
 commands = {posargs}
@@ -44,3 +50,4 @@ builtins = _
 exclude=.venv,.git,.tox,dist,*lib/python*,*egg,build,*autogenerate_config_docs/venv,*autogenerate_config_docs/sources
 # 28 is currently the most complex thing we have
 max-complexity=29
+ignore = H101