The logic here is adapted from Beautiful Soup (http://bazaar.launchpad.net/~leonardr/beautifulsoup/bs4/view/head:/bs4/builder/_htmlparser.py#L20), combined with inspecting the stdlib code in different Python versions, and then simplified. I haven't included Beautiful Soup's monkey patch for Python 3.2.2 because I don't think that bug affects django-compressor's parsing of script tags.
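
For anyone who wants the gist without opening the link, here is a rough sketch of the kind of version check involved. It is not the actual patch: the helper name `html_parser_kwargs` is hypothetical, and the version thresholds are my reading of the Beautiful Soup source and the stdlib docs, so treat them as assumptions.

```python
import sys
from html.parser import HTMLParser


def html_parser_kwargs(version_info=sys.version_info):
    """Pick HTMLParser constructor kwargs for a given Python version.

    Hypothetical helper for illustration only; the thresholds below are
    assumptions based on the Beautiful Soup builder and the stdlib docs.
    """
    version = tuple(version_info[:3])
    if version >= (3, 4):
        # convert_charrefs was added in 3.4; pass it explicitly so the
        # parser reports character references instead of converting them.
        return {"convert_charrefs": False}
    if (3, 2, 3) <= version < (3, 3):
        # Assumed: 3.2.3+ in the 3.2 series accepts strict=False to get
        # the tolerant parser.
        return {"strict": False}
    # Python 2.x, 3.3 (strict deprecated), and 3.2.2 or earlier:
    # pass nothing. The 3.2.2 monkey patch is deliberately left out.
    return {}


# Build a parser with whatever kwargs suit the running interpreter.
parser = HTMLParser(**html_parser_kwargs())
```

Passing `version_info` as a parameter only makes the branches easy to exercise without running several interpreters.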