String Formatting

ICS 31 • David G. Kay • UC Irvine

Are you tired of printing dollars-and-cents amounts in Python that look like $ 12.0 ? Do you want to have precise control over what the text results of your programs look like? You're ready to learn string formatting.

Python, like most programming languages, provides a rich set of features for specifying the format of text. They make it possible to format things into nicely aligned tables, or smoothly flowing sentences, or even rudimentary text-based graphics. Formatting specifications aren't conceptually difficult; they're not like mutable vs. immutable or navigating through lists of namedtuples containing other lists. But they are intricate: They control the character-by-character arrangement on the page or screen. Since even one extra space can mess up your results, string formatting requires us to pay close, meticulous attention.

Printing constants and expressions

Suppose we define the Dish namedtuple and some Dish objects as follows:

from collections import namedtuple
Dish = namedtuple('Dish', 'name price calories')
d1 = Dish("Paht Woon Sen", 12.50, 340)
d2 = Dish("Mee Krob", 9.00, 355)
d3 = Dish("Escargots", 24.50, 95)
DL = [d1, d2, d3]

Now suppose we want to display a Dish in this form:

Paht Woon Sen ($12.50): 340 cal.

This text, and any text we plan to print, consists of constant parts (that are the same every time we print the results) and variable parts (that may change every time, depending on the data).

Here are the variable parts of this string (underlined):

 Paht Woon Sen ($12.50): 340 cal.
 -------------   -----   ---
 VARIABLE        VAR     VAR

The variable parts are the data values: 'Paht Woon Sen', 12.50, and 340.

Here are the constant parts of the same string (underlined below):

 Paht Woon Sen ($12.50): 340 cal.
              ---     ---   -----
            CONST.    CON.  CONSTANT

The constant parts are the three strings ' ($' (space-parenthesis-dollar sign), '): ' (parenthesis-colon-space), and ' cal.' (space-c-a-l-period).

Next we list the names of the variable/data parts. These might be Python variable names, or more complicated Python expressions. In this case, the names are d1.name, d1.price, and d1.calories.

Next we decide whether we need any precise formatting (a specific number of digits or other precise spacing). We will want these eventually, but for now, let's say no so we can cover some other issues first.

Copy the code below, paste it into a Python file, and run it in IDLE:

from collections import namedtuple
Dish = namedtuple('Dish', 'name price calories')
d1 = Dish("Paht Woon Sen", 12.50, 340)
d2 = Dish("Mee Krob", 9.00, 355)
d3 = Dish("Escargots", 24.50, 95)
DL = [d1, d2, d3]

As we discuss each of the lines of code below, copy the line into IDLE and run it.

Controlling spacing between items (arguments to print(), concatenation, sep=, end=)

First we print the variable and constant parts as usual:

print(d1.name, " ($", d1.price, "): ", d1.calories, " cal.")

This gives us our results, but without fancy formatting. The print() function automatically prints one space to separate each of its arguments; that gives us the extra spaces we see when we run the code above.

Next we try to eliminate the extra spaces using concatenation (+). [This will give us an error when we try to concatenate a number into a string. Once you run this code and see the message, comment out this line or remove it so the subsequent examples run.]

print(d1.name + " ($" + d1.price + "): " + d1.calories + " cal.")

With concatenation and calls to str(), we can control the horizontal spacing precisely:

print(d1.name + " ($" + str(d1.price) + "): " + str(d1.calories) + " cal.")
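
The failing and working versions can also be put side by side in one script. The sketch below is not part of the original examples; it assumes try/except (which you may not have met yet) only so that the program keeps running after the error. Note that str(d1.price) gives 12.5, not 12.50; getting the trailing zero takes the format specifications covered later.

```python
from collections import namedtuple

Dish = namedtuple('Dish', 'name price calories')
d1 = Dish("Paht Woon Sen", 12.50, 340)

# Concatenating a str with a float is not allowed; Python raises TypeError.
try:
    broken = d1.name + " ($" + d1.price + "): "
    concat_failed = False
except TypeError as err:
    concat_failed = True
    print("TypeError:", err)

# Converting each number with str() first makes the concatenation legal.
line = d1.name + " ($" + str(d1.price) + "): " + str(d1.calories) + " cal."
print(line)
```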

As noted above, print() puts one space between its arguments; we can change that using the sep= keyword parameter. First we separate the items with the empty string instead of a space:

print(d1.name, " ($", d1.price, "): ", d1.calories, " cal.", sep="")

This gets the horizontal spacing right.

As another illustration, we can separate the items with any other separator string we care to specify:

print(d1.name, " ($", d1.price, "): ", d1.calories, " cal.", sep="---")
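
Extra spaces are hard to see on the screen. One way to inspect them, sketched here as an assumption rather than as part of the course material, is to send print()'s output to an in-memory text buffer (using print()'s file= parameter and io.StringIO) and then display each line with repr(), which shows the surrounding quotes.

```python
import io
from collections import namedtuple

Dish = namedtuple('Dish', 'name price calories')
d1 = Dish("Paht Woon Sen", 12.50, 340)

# Collect the output of three print() calls instead of showing it directly.
buffer = io.StringIO()
print(d1.name, " ($", d1.price, "): ", d1.calories, " cal.", file=buffer)
print(d1.name, " ($", d1.price, "): ", d1.calories, " cal.", sep="", file=buffer)
print(d1.name, " ($", d1.price, "): ", d1.calories, " cal.", sep="---", file=buffer)

# One captured line per call; repr() makes every space visible.
lines = buffer.getvalue().splitlines()
for line in lines:
    print(repr(line))
```

The first line shows the doubled spaces that the default separator adds; the second has none.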

By default (i.e., without our having to give specific instructions), print() prints a newline at the end of each call. That's why calling print() with no arguments gives us a blank line. In the code below, we see that each call to print() prints its arguments followed by a newline:

print('Huey', 'Dewey', 'Louie')
print('Donald')
print('Scrooge')
print()
print('Daisy')
print("\n")
print('Daffy')

Notice especially the two blank lines between Daisy and Daffy: One is for the explicit "\n" and the second is what print() automatically provides.

We can specify different behavior at the end of each call to print() by using the end= keyword parameter. It says what to print (instead of the usual newline) after the call to print() has printed its arguments. Saying end=" ", for example, says, "Keep whatever is printed next on the same line as what we just printed."

print("--------------------------")
print('Huey', 'Dewey', 'Louie', end=" ")
print('Donald', end=" ")
print('Scrooge')
print()

As with sep=, the value of the end= parameter can be any string:

print('Donald', end="Zot! Zot! Zot!")
print('---> This follows the end= string in the previous line <---')
print("\n")
print('Huey', 'Dewey', 'Louie', sep=" ** ", end="End of the line.\n")
print("Hey, Uncle Donald!")
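
To see exactly which characters sep= and end= produce, including the newlines, we can again capture the output in a string. This is only a sketch: the file= parameter and io.StringIO are not discussed elsewhere on this page and are assumed here purely for inspection.

```python
import io

buffer = io.StringIO()
print('Huey', 'Dewey', 'Louie', end=" ", file=buffer)   # ends with a space, no newline
print('Donald', end=" ", file=buffer)                   # still on the same line
print('Scrooge', file=buffer)                           # default end: a newline
print('Huey', 'Dewey', 'Louie', sep=" ** ", end="End of the line.\n", file=buffer)

captured = buffer.getvalue()
print(repr(captured))   # newlines show up as \n
```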

F-strings (not in the textbook)

Starting with Python 3.6, we have an alternative way to combine constants and variables. It's called "f-strings" and it's not available in versions of Python before 3.6. So far, when using print(), we've built up our output part by part, separated by commas. We've also created one big part by concatenating subparts with the + operator. With an f-string, we can lay out the constant part of the text we want to print, inserting the variable/expression parts where needed in the string, designated by surrounding the expression with curly braces.

If d is a Dish created with Dish('Chicken Pot Pie', 23.95, 2200), we can print it in a form like this:

The dish Chicken Pot Pie has 2200 calories and costs $23.95.

using an f-string as follows:

print(f'The dish {d.name} has {d.calories} calories and costs ${d.price}.')

We can print the Dish d1 in the briefer format shown earlier, using a different f-string as follows:

print(f'{d1.name} (${d1.price}): {d1.calories} cal.')

Note the f right before the first apostrophe; that's where f-strings get their name, of course. More importantly, note that the three expressions to be printed are enclosed in curly braces and everything else in the f-string is the surrounding constant text information. Each expression (in curly braces) is evaluated and its printable value is inserted in place of the brace-enclosed expression; the result is the value of the f-string (which we're talking about printing, but could just as well be assigned to a variable or passed to a function).

[Details: f-strings can use F or f. They can use apostrophes, double-quotes, triple-apostrophes, or triple double-quotes. To include a curly-brace character in an f-string, you don't escape it with a backslash; you enter two curly braces in a row: The value of
f'There are {2+2+2} {{curly braces}} in this f-string but just 2 in its value'
is
'There are 6 {curly braces} in this f-string but just 2 in its value'

An f-string is a string; it can be used anywhere any other string can be used, e.g., in string functions and methods.]
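
Because an f-string is just a string, we can store it, measure it, and put any expression (not only a variable name) inside the braces. The following is a small sketch of those points; the names sentence, shout, and braces are ours, chosen for illustration.

```python
from collections import namedtuple

Dish = namedtuple('Dish', 'name price calories')
d = Dish('Chicken Pot Pie', 23.95, 2200)

# The f-string's value is an ordinary str that we can assign to a variable.
sentence = f'The dish {d.name} has {d.calories} calories and costs ${d.price}.'
print(sentence)
print(len(sentence))

# Any expression can go inside the braces: a method call, arithmetic, etc.
shout = f'{d.name.upper()} has {d.calories // 100} hundred calories'
print(shout)

# Doubled braces produce literal braces around the inserted value.
braces = f'{{{d.calories}}}'
print(braces)
```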

Format specifications

We can do a lot with the techniques we already know, but one more technique, format specifications, is useful in these two situations:

Specify a particular number of digits to the right of a decimal point (in the example below, two digits, for dollars-and-cents amounts):
     Paht Woon Sen ($12.50): 340 cal.

Place values into fixed-sized "fields" to line the values up:

     Paht Woon Sen  12.50 340
     Mee Krob        9.00 355
     Escargots      24.50  95

A format specification is a few extra characters placed inside the curly braces in an f-string (or in a call to the str.format() method, which we will cover later). The syntax for each format specification has this form:

{ EXPRESSION : FORMAT-CODE }

To the left of the colon is the expression, as before, to print in that space. What can we put to the right of the colon? That's where we put the field width specifications, the instructions to Python for how many characters in the formatted string to devote to each data item. The syntax of a field width specification (what can go to the right of the colon in a format specification) depends on the type of data being formatted. Here are the three main types, for strings, ints, and floats respectively:

FIELD-WIDTHs
FIELD-WIDTHd
FIELD-WIDTH.NUMBER-OF-DECIMAL-PLACESf

For example, if x is a variable holding a number, a format specification of {x:5.2f} says:

Reserve a 5-character field in the result string.
In that field, place a float number (the value of x) with two digits to the right of the decimal point.
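
Here is a minimal sketch of the three kinds of field width specification, one each for a string, an int, and a float. The sample values and the vertical bars (added so the padding spaces are visible) are our own choices for illustration.

```python
x = 3.14159

string_field = f'{"abc":6s}'   # string in a 6-character field
int_field = f'{42:6d}'         # int in a 6-character field
float_field = f'{x:5.2f}'      # float in a 5-character field, 2 decimal places

# Surround each field with bars so we can see where the spaces went.
for field in (string_field, int_field, float_field):
    print(f'|{field}|')
```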

Try running this example:

print(f'{d1.name} (${d1.price:5.2f}): {d1.calories} cal.')

Notice that the 5-character field counts one character for the decimal point itself.

What happens if we use a larger field width than our data requires?

print(f'{d1.name} (${d1.price:7.2f}): {d1.calories} cal.')

We use 5 of the 7 characters for the number, with the two extra spaces after the dollar sign and before the first digit of the number.

What if we specify a narrower field than we need?

print(f'{d1.name} (${d1.price:3.2f}): {d1.calories} cal.')

Python has three choices in this situation: It could give us an error message; it could chop off the value somehow to make it fit in the specified-width field; or it could take as many digits as it needs, even if that exceeds the specified field width. Python takes the last choice, with the reasoning that it's better to see the actual value with messed-up formatting than to see only part of the value or not to see it at all.

So in Python, if the field width is too small, Python still takes the number of characters it needs.
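
The effect of a too-wide, exact, and too-narrow field can be checked by looking at the length of each result. This is a small sketch using the price 12.50 from d1; the variable names are ours.

```python
price = 12.50

wide = f'{price:7.2f}'     # two extra spaces on the left
exact = f'{price:5.2f}'    # fits the field exactly
narrow = f'{price:3.2f}'   # field too small, so Python uses 5 characters anyway

for requested, text in ((7, wide), (5, exact), (3, narrow)):
    print(requested, repr(text), len(text))
```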

In fact, if we always want to take up exactly the space we need for the value, with no extra spaces, we use a field width of zero:

print(f'{d1.name} (${d1.price:0.2f}): {d1.calories} cal.')
print(f'{d1.name} (${3.50:0.2f}): {d1.calories} cal.')
print(f'{d1.name} (${53453453:0.2f}): {d1.calories} cal.')
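
Leaving the width out altogether, as in {d1.price:.2f}, gives the same result as writing 0 for these values. The sketch below compares the two spellings; the list names are hypothetical and exist only for the comparison.

```python
values = [12.50, 3.50, 53453453]

zero_width = []
no_width = []
for value in values:
    zero_width.append(f'{value:0.2f}')   # width written as 0
    no_width.append(f'{value:.2f}')      # width omitted

print(zero_width)
print(no_width)
print(zero_width == no_width)
```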

Another use of field width specifications is to line things up in columns like this:

     Paht Woon Sen  12.50 340
     Mee Krob        9.00 355
     Escargots      24.50  95

To do this, we choose a field width that's large enough to accommodate the largest value we expect in a given column:

print(f'{d1.name:20s} (${d1.price:6.2f}): {d1.calories:4d} cal.')

This says to place the dish name in a 20-character field, the price in a six-character field (with two digits to the right of the decimal point), and the number of calories in a 4-character field. (By default, strings are aligned with the left edge of their field and numbers are aligned with their rightmost digit; this reflects the typical practice in typesetting data in tables.)

We can put this in a loop through our list of dishes:

for d in DL:
    print(f'{d.name:20s} (${d.price:6.2f}): {d.calories:4d} cal.')
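
One possible way to reproduce the plain three-column table shown at the start of this part (without the parentheses, dollar sign, and "cal.") is sketched below. The widths 15, 5, and 3 are our own guesses, chosen to fit this particular data; nothing above prescribes them.

```python
from collections import namedtuple

Dish = namedtuple('Dish', 'name price calories')
DL = [Dish("Paht Woon Sen", 12.50, 340),
      Dish("Mee Krob", 9.00, 355),
      Dish("Escargots", 24.50, 95)]

rows = []
for d in DL:
    # Name left-aligned in 15 characters, price right-aligned in 5,
    # one constant space, then calories right-aligned in 3.
    rows.append(f'{d.name:15s}{d.price:5.2f} {d.calories:3d}')

for row in rows:
    print(row)
```

Every row comes out the same length, which is what makes the columns line up.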

Finally, we can write a function that prints our dish information in tabular format with titles:

def print_dishlist_info(DL: [Dish]) -> None:  # Just prints
    """ Print a table with a row for each dish """
    print("Name                 Price Calories")
    print("----                 ----- --------")
    for d in DL:
        # Field widths match the headings: 20 for name, 5 for price, 8 for calories
        print(f'{d.name:20s} {d.price:5.2f} {d.calories:8d}')
    return
print()
print_dishlist_info(DL)

The format() method

The most powerful tool for formatting text is the format() method. We won't use it to do any more than we can do with f-strings and format specifications, but we cover it here because it's in the textbook, it's available before Python 3.6, and it could show up on exams or other course materials.

Here is the syntax of the format() method on strings:

FORMAT-STRING.format(SERIES-OF-EXPRESSIONS-TO-PRINT)

Here is one form of the Dish-printing example above:

print('{:20s} (${:6.2f}): {:4d} cal.'.format(d1.name, d1.price, d1.calories))

We wrote above about the constant parts and the variable parts of what we want to print. With the format() method, the constant parts go in the format string; the variable parts are the arguments to the method (i.e., they go in the series of expressions).

Here are the semantics: The format() method returns a string, which we usually print out (but we could use the string returned by format() in any other context where a string makes sense, e.g., by assigning it to a variable). The string is formatted according to the instructions in the format-string, following this pattern:

print( FORMAT-STRING . format(d1.name, d1.price, d1.calories) )

The format string looks like the desired output. It contains constant parts and variable parts; in the format string each variable part is a placeholder or "format specification" (shown below as a dashed line) for the eventual data value that will appear in that place.

"-------- ($--------): -------- cal."
 FMT-SPEC   FMT-SPEC   FMT-SPEC

Each placeholder (dashed line) is a place where we put a format specification, which can tell Python which of the variable parts to print and how to print it. Actual format specifications in Python don't use dashed lines. Instead they use curly braces: { }

We can combine what we've learned so far into this working Python code, which of course you should run:

print("{} (${}): {} cal.".format(d1.name, d1.price, d1.calories))

This code says to print the value of d1.name where the first set of braces appears (at the start of the format string), then to print space-parenthesis-dollar-sign, then to print the value of d1.price where the second set of braces appears, then to print a few more characters, then the value of d1.calories where the third set of braces appears, and then the last few characters in the format string. There are three arguments to format(); they correspond with the three format specifications ({ }) in the format string.
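
Since format() returns a string, we can compare its result directly with the corresponding f-string. A short sketch (the variable names are ours):

```python
from collections import namedtuple

Dish = namedtuple('Dish', 'name price calories')
d1 = Dish("Paht Woon Sen", 12.50, 340)

# format() fills each {} with the next argument, in order.
with_format = "{} (${}): {} cal.".format(d1.name, d1.price, d1.calories)

# The f-string version names each expression inside its braces instead.
with_fstring = f'{d1.name} (${d1.price}): {d1.calories} cal.'

print(with_format)
print(with_format == with_fstring)
```

With no format codes, the price appears as 12.5 in both versions.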

To control the formatting further, we can say things inside the curly braces. The syntax for each format specification has this form:

{ ARGUMENT-SELECTOR : FORMAT-CODE }

To the left of the colon is a value that indicates which of the arguments to format() to print in that space. Usually we just take them in order, but Python allows us to specify them in a different order:

print("{2:} (${1:}): {0:} cal.".format(d1.calories, d1.price, d1.name))

It's not that common to want to reorder the appearance of the arguments in the format string; normally we'll leave the left side of the colon empty. (But if we didn't at least mention what could go there, format strings would seem even stranger than they do already.) The code below shows nothing on either side of the colon in the format specifications. It behaves just the same as if we'd used { } without the colons.

print("{:} (${:}): {:} cal.".format(d1.name, d1.price, d1.calories))
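
The three spellings discussed above (numbered selectors, empty selectors with a colon, and bare braces) can be checked against each other. This is a sketch with hypothetical variable names; arguments are numbered starting from 0.

```python
from collections import namedtuple

Dish = namedtuple('Dish', 'name price calories')
d1 = Dish("Paht Woon Sen", 12.50, 340)

# Selectors 2, 1, 0 pick the arguments in reverse of the order given.
reordered = "{2:} (${1:}): {0:} cal.".format(d1.calories, d1.price, d1.name)

# Nothing on either side of the colon: arguments are taken in order.
with_colons = "{:} (${:}): {:} cal.".format(d1.name, d1.price, d1.calories)

# Bare braces behave the same way.
bare = "{} (${}): {} cal.".format(d1.name, d1.price, d1.calories)

print(reordered)
print(reordered == with_colons == bare)
```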

If we rarely put anything to the left of the colon in a format specification, what can we put to the right of the colon? That's where we put the field width specifications. These have the same syntax and semantics with the format() method as they do with f-strings.

Here's a version of the table-printing code using format():

print("{:20s}{:6.2f}{:4d}".format(d1.name, d1.price, d1.calories))
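
To print the whole table rather than one dish, the same format string can go in a loop. The sketch below also builds each row with an f-string to confirm that the two approaches agree; the list names are ours.

```python
from collections import namedtuple

Dish = namedtuple('Dish', 'name price calories')
DL = [Dish("Paht Woon Sen", 12.50, 340),
      Dish("Mee Krob", 9.00, 355),
      Dish("Escargots", 24.50, 95)]

format_rows = []
fstring_rows = []
for d in DL:
    format_rows.append("{:20s}{:6.2f}{:4d}".format(d.name, d.price, d.calories))
    fstring_rows.append(f'{d.name:20s}{d.price:6.2f}{d.calories:4d}')

for row in format_rows:
    print(row)
print(format_rows == fstring_rows)
```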

There are many more features to the format() method than we have covered here. String formatting is almost a sub-language of its own within Python. You are welcome to explore at python.org or in other reference materials. But for the problems or exams in this class, you will not need anything beyond what's on this page.

David G. Kay, kay@uci.edu