Perl regex code -> Python

Aug 9th, 2008 02:22
Sek Tea, Nathan Wallace, unknown unknown, Hans Nowak, Snippet 17, Michael P. Reilly


"""
Packages: text.regular_expressions
"""
"""
: I have been learning Perl in order to write some CGI scripts and various
: parsing scripts. I was wondering how the equivalent code would look in
: Python?
: Given an HTML file with content data delimited via HTML comments.  For
: example:
: <HTML>
: <BODY>
: <!--version-->6.4<!--/version-->
: <B><!--product-->OpenGL<!--/product--></B>some more stuff...
: <!--description-->line1
: line2
: line3
: line4<!--/description-->
: </BODY>
: </HTML>
: In Perl, I use the following regex to parse the content from between
: the HTML comments:
: sub kbExtractContent ($text, "description") {
:    @_[0] =~ /<!--@_[1]-->(.+)<!--\/@_[1]-->/s;
:    return $1;
: }
: where $text contains the entire contents of an HTML file and
: "description" is the comment pattern that I am looking for.  The regex
: in the above case will look for any text between <!--description--> and
: <!--/description-->.  The regex will span multiple lines via the 's'
: option.  The content text is returned via the $1.
: How would you do the equivalent in Python?
How about this?
"""
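def find_pattern(text, tag="description"):
  # Return the text between <!--tag--> and <!--/tag-->, or None if absent.
  from re import search, S
  # S (DOTALL) lets '.' match newlines, like Perl's /s modifier.
  m = search(r'<!--%s-->(.+)<!--/%s-->' % (tag, tag), text, S)
  if m:
    return m.group(1)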
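"""
A quick way to check this against the sample HTML from the question is
sketched below, assuming Python 3.  The names extract_content and SAMPLE
are made up for this illustration.  Two small variations are assumptions
of this sketch, not part of the answer above: re.escape protects against
regex metacharacters in the tag name, and the non-greedy '.+?' stops at
the first closing comment, whereas the greedy '.+' would run on to the
last one if the same marker appeared twice in the file.
"""
```python
import re

def extract_content(text, tag="description"):
    """Return the text between <!--tag--> and <!--/tag-->, or None."""
    escaped = re.escape(tag)  # treat the tag name as literal text
    # re.S makes '.' match newlines; '.+?' stops at the first closing marker.
    m = re.search(r'<!--%s-->(.+?)<!--/%s-->' % (escaped, escaped), text, re.S)
    return m.group(1) if m else None

# Sample document copied from the question.
SAMPLE = """<HTML>
<BODY>
<!--version-->6.4<!--/version-->
<B><!--product-->OpenGL<!--/product--></B>some more stuff...
<!--description-->line1
line2
line3
line4<!--/description-->
</BODY>
</HTML>"""

if __name__ == "__main__":
    for name in ("version", "product", "description"):
        print(name, "=>", repr(extract_content(SAMPLE, name)))
```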
"""
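You could also do this with the htmllib module (html.parser in Python 3,
where htmllib no longer exists); the parser reports each comment as it
reads the document, so you wouldn't have to worry about how many lines
lie between the comments.  I cannot see a reason why you want to use
Perl for CGI scripting though.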
"""
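"""
The parser-based alternative could look roughly like the sketch below.
It assumes Python 3 and uses html.parser in place of htmllib; the class
CommentSectionParser and the function extract_sections are hypothetical
names invented for this example.  Note one difference from the regex
version: only text is collected, so any markup nested between the
comment markers is dropped rather than returned verbatim.
"""
```python
from html.parser import HTMLParser

class CommentSectionParser(HTMLParser):
    """Collect text found between <!--name--> and <!--/name--> markers."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.sections = {}   # marker name -> list of text pieces
        self.open_names = [] # markers seen but not yet closed

    def handle_comment(self, data):
        name = data.strip()
        if name.startswith("/"):
            # Closing marker: stop collecting for that name, if it was open.
            if name[1:] in self.open_names:
                self.open_names.remove(name[1:])
        else:
            # Opening marker: start collecting text for this name.
            self.open_names.append(name)
            self.sections.setdefault(name, [])

    def handle_data(self, data):
        # Text belongs to every marker that is currently open.
        for name in self.open_names:
            self.sections[name].append(data)

def extract_sections(text):
    """Return a dict mapping each marker name to the text it encloses."""
    parser = CommentSectionParser()
    parser.feed(text)
    parser.close()  # flush any text still buffered by the parser
    return {name: "".join(parts) for name, parts in parser.sections.items()}

# Sample document copied from the question.
SAMPLE = """<HTML>
<BODY>
<!--version-->6.4<!--/version-->
<B><!--product-->OpenGL<!--/product--></B>some more stuff...
<!--description-->line1
line2
line3
line4<!--/description-->
</BODY>
</HTML>"""

if __name__ == "__main__":
    for name, content in extract_sections(SAMPLE).items():
        print(name, "=>", repr(content))
```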