Entry
How do I extract a substring from a string?
Nov 12th, 2002 05:18
Michael Chermside, Daniela Zu
Getting a substring can be done in numerous ways, but curiously
enough the string module is rarely involved. The simplest and most
commonly used way is "slicing" (a range of subscripts):
>>> s = "dog cat aardvark"
>>> s[0:3]
'dog'
That returned a new string containing everything from the character
at index 0 up to, but not including, the character at index 3.
Often, you won't know exactly what indexes to use, and you will
calculate them with a formula. Here, for instance, we search for all
text from the first letter 'c' to the first letter 't' (adding 1 to
INCLUDE the 't' in our substring):
>>> s[ s.index('c') : s.index('t')+1 ]
'cat'
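One thing to watch for: s.index() raises ValueError when the text is
not found, and searching for the end marker from the start of the
string can find one that sits before the start marker. Here is a
minimal sketch of a more defensive version. The helper name
slice_between is made up for this example (it is not part of Python);
it relies on str.find(), which returns -1 instead of raising:

```python
def slice_between(text, first, last):
    """Return text from the first `first` through the next `last`.

    Hypothetical helper for illustration; returns None if either
    marker is missing.
    """
    start = text.find(first)        # -1 if `first` is absent
    if start == -1:
        return None
    end = text.find(last, start)    # search only from `start` onward
    if end == -1:
        return None
    return text[start:end + len(last)]  # include the end marker itself


s = "dog cat aardvark"
print(slice_between(s, "c", "t"))   # cat
print(slice_between(s, "c", "z"))   # None
```

Returning None for a missing marker is just one possible choice; you
could equally let the ValueError from s.index() propagate.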
If you want the substring to run from the beginning or to the end of
the string, just omit that index. You can also measure backwards from
the end of the string by using negative numbers. For instance, this
creates a substring of the last 3 characters of the string:
>>> s[-3: ]
'ark'
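To make the omitted and negative indexes concrete, here are a few
more slices of the same string. The variable names are only for
illustration:

```python
s = "dog cat aardvark"

head = s[:3]         # start omitted: from the beginning
tail = s[8:]         # end omitted: through the end of the string
last_word = s[-8:]   # negative start: counts back from the end
middle = s[4:-9]     # positive and negative indexes can be mixed
clipped = s[10:100]  # slice bounds past the end are clipped, no error

print(head)       # dog
print(tail)       # aardvark
print(last_word)  # aardvark
print(middle)     # cat
print(clipped)    # rdvark
```

Note that only slices are forgiving about out-of-range bounds; a
single subscript such as s[100] raises IndexError.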
All of these techniques work on ANY sequence (lists, tuples, and
other things in addition to strings). However, there's one VERY
powerful technique which is limited to strings. Substrings can be
selected using regular expressions by importing the re module:
>>> import re
>>> re.search('aa[a-z]*', s).group()
'aardvark'
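Be aware that re.search() returns None when nothing matches, so
calling .group() directly on the result fails with AttributeError in
that case. A small sketch of guarding against this follows; the
helper name first_match is an assumption made for this example, not
something provided by the re module:

```python
import re


def first_match(pattern, text):
    """Return the first substring matching `pattern`, or None.

    Hypothetical helper for illustration only.
    """
    match = re.search(pattern, text)
    return match.group() if match else None


s = "dog cat aardvark"
print(first_match(r"aa[a-z]*", s))   # aardvark
print(first_match(r"zz[a-z]*", s))   # None

# Parentheses capture just part of the match; group(1) is the first one.
after_dog = re.search(r"dog (\w+)", s)
if after_dog:
    print(after_dog.group(1))        # cat
```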
This module can also be used to perform more powerful transformations
than simple substrings... look at its documentation for more details.