I have a really big XML file, but only need to read a small part. Do I have to read it all into memory to parse it?

Jul 22nd, 2002 12:15
Michael Chermside, Henrik Motakef, Fredrik Lundh


Normally, using the DOM approach to XML processing (instead of the SAX
approach) requires reading the entire document into memory. But if you
only need to process a small portion of the document, Python's
xml.dom.pulldom module provides a version of the DOM that works on a
"pull" basis: it hands you parse events one at a time and builds a full
DOM subtree only for the elements you ask it to expand. Here is a
snippet of sample code that Fredrik Lundh posted to comp.lang.python:
>>> from xml.dom import pulldom
>>> source = pulldom.parse("somefile.xml")
>>> for event, node in source:
...     # node is a DOM node whose child elements have not been read yet
...     if event == pulldom.START_ELEMENT and node.tagName == "record":
...         # read the rest of this element so node has all its children
...         source.expandNode(node)
...         # process() stands for your own handler function
...         process(node)
...
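
The same pattern can be tried without a file on disk. The following is
a minimal sketch, not part of the original post: the sample document,
the "name" child element, the "id" attribute, and the record_names()
helper are all made up for illustration. It uses pulldom.parseString()
on a short in-memory string; for a genuinely large file you would pass
a filename or an open file object to pulldom.parse() instead.

```python
from xml.dom import pulldom

# Hypothetical sample data standing in for a large XML file.
SAMPLE = """<catalog>
  <header><title>ignored</title></header>
  <record id="1"><name>alpha</name></record>
  <record id="2"><name>beta</name></record>
</catalog>"""


def record_names(xml_text):
    """Yield (id, name) for each <record>, expanding only those elements."""
    events = pulldom.parseString(xml_text)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == "record":
            # Build the DOM subtree for this one element only.
            events.expandNode(node)
            name = node.getElementsByTagName("name")[0]
            # Join the text children in case the text arrives in pieces.
            text = "".join(
                child.data
                for child in name.childNodes
                if child.nodeType == child.TEXT_NODE
            )
            yield node.getAttribute("id"), text


results = list(record_names(SAMPLE))
print(results)
```

Elements that are never expanded, such as the header above, are seen
only as individual events and are never assembled into a subtree.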