How many files have been modified in the last 7/30/60/90/etc. days? How large are they?

Jan 23rd, 2008 07:21
stephen brown, matt wilkie,


An example of the kind of answer I'm looking for is "on G:\Corp in the
last 7 days there have been 13,456 files modified totalling 5.73GB. Those
files are ...<big long list of files and paths>..."
--------------------------------
You need the os.walk() function to traverse a directory tree, and
os.path.getmtime() and os.path.getsize() to get modification times and
sizes.  The rest is pretty simple:
====
#!/usr/bin/python
import sys
import os
from os.path import join, getmtime, getsize
import time
# get the directory names from the command line (default: current directory)
if len(sys.argv) <= 1:
    args = ['.']
else:
    args = sys.argv[1:]
# we are interested in files with a modification time newer than this
threshold_time = time.time() - 7*24*60*60
for basedir in args:
    # start a fresh list for each directory so that the count and total
    # are not carried over from the previous one
    newest_files = []
    for rootdir, dirs, files in os.walk(basedir, topdown=False):
        paths = [join(rootdir, x) for x in files]
        new_files = [(x, getsize(x))
                     for x in paths if getmtime(x) > threshold_time]
        newest_files.extend(new_files)
    # print(...) with a single argument works in both Python 2 and Python 3
    print('On %s in the last 7 days there have been %d files modified '
          'totalling %d bytes:' %
          (basedir, len(newest_files), sum(y for x, y in newest_files)))
    for name, size in newest_files:
        print('%-24s\t%s' % (name, size))
====
Note the innermost loop (over the files in each directory) is handled
with a list comprehension.
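You can also get modification times and sizes with os.stat(). In fact
os.path.getmtime() and os.path.getsize() are thin wrappers around it, so
both report the modification time in seconds since the epoch, the same
units that time.time() uses for now. Calling os.stat() directly saves a
system call per file when you need both values. Just be aware that a few
other fields of its result (st_ctime, for example) mean different things
on different operating systems.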
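If you do want to go through os.stat(), a minimal sketch might look like
the following. It reads st_mtime and st_size from a single os.stat() call
per file. st_mtime is in seconds since the epoch, so it can be compared
directly with a value derived from time.time(). The function name
recent_files is made up for this illustration, and skipping files that
cannot be read is an assumption about what you would want.

```python
import os
import time


def recent_files(basedir, threshold_time):
    """Return (path, size) pairs for files under basedir that were
    modified after threshold_time (seconds since the epoch)."""
    found = []
    for rootdir, dirs, files in os.walk(basedir):
        for name in files:
            path = os.path.join(rootdir, name)
            try:
                info = os.stat(path)  # one call gives both mtime and size
            except OSError:
                continue  # file vanished or is unreadable, so skip it
            if info.st_mtime > threshold_time:
                found.append((path, info.st_size))
    return found
```

For example, recent_files('.', time.time() - 7*24*60*60) should give the
same list the script above builds for the current directory.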
If you want to make the threshold variable (30, 60 or 90 days instead of
7), you should probably use the datetime module, subtracting a timedelta
from a datetime, rather than subtracting a hand-computed number of
seconds from time.time(). This would also require a more complex command
line interface, for which I highly recommend the optparse module (or
argparse, which superseded it in Python 2.7 and later).
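Here is a rough sketch of that idea, not a finished tool. It uses
argparse, the newer standard library module, in place of optparse, and
computes the cutoff with datetime and timedelta. The option name --days
and the function names parse_args, modified_since and main are
assumptions made for this example.

```python
import argparse
import os
from datetime import datetime, timedelta


def parse_args(argv=None):
    """Parse the command line; argv=None means use sys.argv[1:]."""
    parser = argparse.ArgumentParser(
        description='List recently modified files.')
    parser.add_argument('dirs', nargs='*', default=['.'],
                        help='directories to scan (default: current one)')
    parser.add_argument('-d', '--days', type=int, default=7,
                        help='how many days back to look (default: 7)')
    return parser.parse_args(argv)


def modified_since(basedir, cutoff):
    """Yield (path, size) for files under basedir modified after cutoff,
    where cutoff is a naive datetime in local time."""
    for rootdir, dirs, files in os.walk(basedir):
        for name in files:
            path = os.path.join(rootdir, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable, so skip it
            if datetime.fromtimestamp(info.st_mtime) > cutoff:
                yield path, info.st_size


def main(argv=None):
    options = parse_args(argv)
    cutoff = datetime.now() - timedelta(days=options.days)
    for basedir in options.dirs:
        found = list(modified_since(basedir, cutoff))
        total = sum(size for path, size in found)
        print('On %s in the last %d days there have been %d files '
              'modified totalling %d bytes:'
              % (basedir, options.days, len(found), total))
        for path, size in found:
            print('%-24s\t%d' % (path, size))
```

Calling main(['--days', '30', 'G:\\Corp']) would report on the last 30
days for G:\Corp. To run it as a script, call main() at the bottom of the
file.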