Fun with Strings

We quickly review basic string manipulations. Strings can be enclosed in single quotes, double quotes or triple quotes: 'string', "string" and '''string''' are all valid strings. Strings enclosed in single or double quotes can span at most one line in your code. Strings enclosed in triple quotes can span multiple lines.
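
As a small illustration of the three quoting styles, here is a minimal sketch. The variable names are made up for this example and are not part of the tutorial:

```python
# Three ways to write a string literal; all names here are illustrative.
single = 'string'
double = "string"
triple = '''first line
second line'''

# Single- and double-quoted literals with the same content are equal.
print(single == double)

# The line break typed inside the triple quotes is kept in the string,
# so the string consists of two lines.
print(len(triple.splitlines()))
print(triple)
```

Printing triple shows the text on two separate lines, exactly as it was typed in the code.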

1. String I/O

Strings can be printed:

python code
            >>> print("This is a string")
This is a string

We can print multiple strings using commas:

python code
            >>> print("This is a string", "and another one")
This is a string and another one

When printing sequences of strings using commas, a space is added in between each string element of the sequence. The same considerations apply for other literals:

python code
            >>> print("This is a number", 1, "and this is a Boolean", True)
This is a number 1 and this is a Boolean True

By default, print ends its output with a newline. We can use the end argument to make multiple print calls print on one line:

python code
print("this is", end=" ")
print("all on one line")

The output of the above code would be:

python code
            this is all on one line
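
To see exactly which characters print writes, the sketch below captures the output in memory instead of sending it to the screen. The use of io.StringIO and the file argument is an addition for illustration only; the tutorial itself prints straight to the screen:

```python
import io

# Capture what print() writes so the effect of `end` can be inspected.
buffer = io.StringIO()
print("this is", end=" ", file=buffer)   # end=" " replaces the default newline
print("all on one line", file=buffer)    # default end is a newline

captured = buffer.getvalue()
print(repr(captured))   # repr() makes the trailing newline visible
```

The captured text is a single line followed by one newline, which comes from the second print call.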

Strings can also be read from user input using input (called raw_input in Python 2):

python code
typed = input("Type something: ")

input always reads the input as a string, regardless of what characters the input contains.
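
Because the value read from the user is always a string, numbers must be converted explicitly before doing arithmetic. In this minimal sketch the variable typed is hard-coded to stand in for what a hypothetical user typed, so that the example runs without waiting for input:

```python
# `typed` stands in for the value returned by input(); we assume the user typed 42.
typed = "42"

print(type(typed))     # the value is a str, even though it looks like a number
number = int(typed)    # convert explicitly before doing arithmetic
print(number + 1)
print(typed + "1")     # on the str itself, + joins strings instead of adding
```

If the text cannot be interpreted as an integer (for example "abc"), int raises a ValueError, so real programs usually check or catch this.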

2. Special Characters

A quote character inside a string must be preceded by a backslash \ if it is the same kind of quote that encloses the string, or else Python will think you are closing the string:

python code
"This is a string with a \" quote in it" #valid
"This is a string with a " quote in it"  #not valid, Python will think you closed the string with the second set of quotes
'This is a string with a " quote in it'  #valid, because the string is enclosed in a different type of quote

Evidently, the backslash removes the special significance of the quote character. The backslash can also add special significance to some characters:

python code
'\t' #prints a horizontal tab
'\v' #prints a vertical tab
'\n' #starts a new line inside your string when printed, even though in your code it is all one line

A symbol preceded by a backslash to give it some special significance counts as one character. For example, '\n' is just one character. What if we want our string to actually contain a backslash followed by the letter n, without Python thinking we want a new line? The backslash also removes its own special significance:

python code
            >>> print("This string has '\\n' in it")
This string has '\n' in it
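
The character counts can be checked with the built-in len function. This is a minimal sketch; the variable names are illustrative only:

```python
newline = "\n"     # an escape sequence: a single newline character
literal = "\\n"    # an escaped backslash followed by the letter n

print(len(newline))   # the escape sequence counts as one character
print(len(literal))   # a backslash and an n: two characters
print(len("\""))      # the escaped quote is also a single character
```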

3. Formatted Printing

We now illustrate a very useful construction with strings. Suppose we want to print out a string which contains numbers in it, to which we assign some arbitrary values. We could do something like this:

python code
x = 10; y = 4; z = 8
print("The value of x is", x, ", the value of y is", y, ", the value of z is", z)

However, this is tedious, and it leaves an unwanted space before each comma in the output. It is much simpler to write:

python code
x = 10; y = 4; z = 8
print("The value of x is %d, the value of y is %d, the value of z is %d" % (x, y, z))

Everywhere Python sees '%d' in the format string, it inserts the corresponding element of the tuple (x, y, z). The 'd' indicates that we are inserting an integer. There is another advantage to this. Suppose we want our integer to take up 10 spaces in the string, regardless of how many digits it has. Then we would type '%10d'. The integer will be printed right justified, preceded by the correct number of blank spaces (a value longer than 10 characters is not truncated; it simply takes up more room). Suppose we want a float to take up 10 spaces and be rounded to 2 decimal places. Then we would type '%10.2f'. If we want a string to take up 10 spaces, we similarly type '%10s'. The string will be right justified, preceded by the appropriate number of blank spaces. This is very useful for making tables. We summarize and give an example:

python code
'%xd' % (integer)     #prints integer, right justified, taking up a total of x spaces
'%-xd' % (integer)    #prints integer, left justified, taking up a total of x spaces
'%x.yd' % (integer)   #prints integer, right justified, taking up a total of x spaces, with at least y digits (padded with leading zeroes)
'%-x.yd' % (integer)  #prints integer, left justified, taking up a total of x spaces, with at least y digits (padded with leading zeroes)
'%x.yf' % (float)     #prints float, right justified, taking up a total of x spaces, precise to y decimal places
'%-x.yf' % (float)    #prints float, left justified, taking up a total of x spaces, precise to y decimal places
'%xs' % (string)      #prints string, right justified, taking up a total of x spaces
'%-xs' % (string)     #prints string, left justified, taking up a total of x spaces

Of course, in the above code x and y must be replaced by numbers, and the variables integer, float and string must be defined, before it will work. Example:

python code
#we tabulate some velocity-time data for a particle
print("%10s%10s" % ('Time', 'Velocity'))
time = [1, 2, 3, 4, 5]
velocity = [1.233, 1.239, 3.434, 2.323, 12.232]
for i in range(5):
    print("%10d%10.2f" % (time[i], velocity[i]))

The output is:

python code
      Time  Velocity
         1      1.23
         2      1.24
         3      3.43
         4      2.32
         5     12.23
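
To make the padding produced by each format code visible, the sketch below substitutes concrete numbers for x and y and wraps every result in square brackets. The widths and values are arbitrary choices for illustration:

```python
# Each result is wrapped in [] so the padding is visible.
examples = [
    "[%5d]" % 42,          # right justified in 5 spaces
    "[%-5d]" % 42,         # left justified in 5 spaces
    "[%5.3d]" % 7,         # at least 3 digits, padded with leading zeroes
    "[%8.2f]" % 3.14159,   # 2 decimal places, right justified in 8 spaces
    "[%-8.2f]" % 3.14159,  # 2 decimal places, left justified in 8 spaces
    "[%6s]" % "ab",        # string right justified in 6 spaces
    "[%-6s]" % "ab",       # string left justified in 6 spaces
]
for line in examples:
    print(line)
```

The first three lines print as [   42], [42   ] and [  007].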

4. String Methods and String Manipulations

Strings are very much like lists of characters (some languages have no separate string class at all, so lists of characters must be used instead). Thus, we refer to a particular character in a string just as we would refer to an element of a list. For example, if we have a string called string, we can access its first character by typing string[0], the second character by typing string[1], the last character by typing string[-1], the second-to-last character by typing string[-2], and so on. We can access the part of the string from index i to index j (inclusive) by typing string[i : j + 1] or string[i : j + 1 : 1]. The final 1 is the step: it indicates that every character between the two bounds is to be included. With a step of 2, every other character is included; with a step of 3, every third character; and so on. We can type string[i :] to get every character from index i onwards, and string[: j] to get every character up to but not including index j. Finally, we can type string[i :: n] to get every nth character starting from index i, and string[: j : n] to get every nth character up to but not including index j. Here is an example:

python code
>>> string = "etiquette"
>>> string[0] #should return 'e'
>>> string[1] #should return 't'
>>> string[-1] #should return 'e'
>>> string[-2] #should return 't'
>>> string[1:7] #should return 'tiquet'
>>> string[1:7:1] #should return 'tiquet'
>>> string[1:7:2] #should return 'tqe'
>>> string[1:] #should return 'tiquette'
>>> string[:7] #should return 'etiquet'
>>> string[1::2] #should return 'tqet'
>>> string[:7:2] #should return 'eiut'
>>> string #should return 'etiquette'; the original string is unchanged

It is important to remember, however, that a string is not precisely a list: strings are immutable, so we cannot edit individual characters by accessing them as we would elements of a list. Note that slicing and character selection keep the original string intact. We can concatenate strings using the + operator. We can also concatenate several copies of the same string by multiplying the string by an integer. Multiplying by a nonpositive integer creates an empty string:

python code
>>> string1 = 'ab'
>>> string2 = 'cd'
>>> string = string1 + string2
>>> string #should return 'abcd'
>>> string * 3 #should return 'abcdabcdabcd'
>>> string * -1 #should return ''
>>> string * 0 #should return ''
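
Since individual characters of a string cannot be edited in place, a changed string has to be built from pieces of the old one. The following is a minimal sketch of one way to do that with slicing and concatenation; the variable names are illustrative:

```python
word = "etiquette"

try:
    word[0] = "E"          # item assignment is not supported by strings
except TypeError as error:
    print("TypeError:", error)

# Build a new string from a replacement character and a slice instead.
changed = "E" + word[1:]
print(changed)
print(word)                # the original string is unchanged
```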

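We can use comparison operators on strings. Strings are compared character by character: first the first characters, then, if these are the same, the second characters, and so on. If one string runs out of characters first, so that it is a prefix of the other, the shorter string is the smaller one. Two strings are equal if and only if they have the same length and all the corresponding characters are equal. One character is less than another if its character code is less than that of the other. Python 3 uses Unicode code points, which coincide with the ASCII codes for the characters in the ASCII table. ASCII is a standard used in many programming languages; it assigns an integer to each of 128 common characters. Digits come before uppercase letters, which come before lowercase letters, and within each case the letters are listed in alphabetical order. We give an example: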

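python code
>>> 'abc' == 'abc' #should return True
>>> 'ab' > 'bc' #should return False, because 'a' comes before 'b'
>>> 'ab' >= 'bc' #should return False
>>> 'ab' < 'abc' #should return True, because 'ab' is a prefix of 'abc'
>>> '123' < 'abc' #should return True, because digits come before letters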
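
The character codes behind these comparisons can be inspected with the built-in ord function, and the built-in sorted function orders strings using the same comparison. Neither function is introduced in the text above; they are used here only as a sketch, with a made-up word list:

```python
print(ord("a"), ord("b"))     # character codes of the two letters
print(ord("B") < ord("a"))    # uppercase letters come before lowercase ones
print("Bc" < "ab")            # so a string starting with "B" sorts before "ab"

words = ["banana", "Apple", "cherry", "apple"]
ordered = sorted(words)       # sorted() uses the same string comparison
print(ordered)
```

Note that the capitalized word ends up first, which is why case is often normalized before sorting text alphabetically.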

Finally, we give some examples of string methods:

python code
>>> string = 'abc \' def ghi'
>>> string.isupper() # is the string in uppercase letters?
False
>>> string.capitalize() #return the same string but with the first letter capitalized
"Abc ' def ghi"
>>> string.split() #split the string at a separator and return the list of substrings; by default the string is split at runs of whitespace
['abc', "'", 'def', 'ghi']
>>> string.split('\'') #split the string at the '\'' character and return the list of substrings
['abc ', ' def ghi']
>>> string.find('def') #find substring 'def' in the string and return the index at which it begins, or -1 if the substring is not found
6
>>> string.find('bla')
-1
>>> string.find('def', 2, 5) #search for 'def' only within string[2:5]; it is not found there
-1
>>> string
"abc ' def ghi"

Note that in each case, the altered form of the string is returned separately and the string itself is unchanged. There are other string methods available. You can access their descriptions by typing help(str). Here are some practice exercises. Have fun!