ICS 45C Spring 2022
Notes and Examples: Type Conversions


Implicit conversions between built-in types

In C++, values of many of the built-in types can be implicitly converted to other built-in types. By implicitly, I mean that it is not necessary for you to ask for the conversion to take place; it simply happens, with the compiler deducing automatically that it's necessary based on context. The most obvious example arises in the various integral types, which we've seen before are implicitly convertible to one another.

int numi = 66;
long numl = numi;
short nums = numi;
char numc = numi;
bool numb = numi;

std::cout << numi << std::endl;   // prints 66
std::cout << numl << std::endl;   // prints 66
std::cout << nums << std::endl;   // prints 66
std::cout << numc << std::endl;   // prints B
std::cout << numb << std::endl;   // prints 1

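This entire block of code is legal, including two seemingly nonsensical conversions: the one from int to char, which stores the character whose code is 66 (B, in ASCII), and the one from int to bool, which turns any non-zero value into true (printed as 1).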

Similarly, there are implicit conversions between integral types and floating-point types, even in cases where there is the potential for information loss.

double numd = 3.5;
int numi = numd;
float numf = numd;

std::cout << numd << std::endl;   // prints 3.5
std::cout << numi << std::endl;   // prints 3
std::cout << numf << std::endl;   // prints 3.5

Some compilers can be configured to warn you in cases like the conversion from numd to numi, which will lose the fractional part of numd's value, but this conversion is nonetheless technically legal in C++.
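
To make the results of these implicit conversions concrete, here is a small sketch. The function names (toInt, toBool, toChar, checkImplicitConversions) are made up for this illustration, and the char result assumes an ASCII-compatible character set, which is what you'll find on most systems.

```cpp
#include <cassert>

// Hypothetical helpers, named only for this illustration.

// double -> int: the fractional part is discarded (truncation toward zero).
int toInt(double d)
{
    int i = d;      // implicit conversion; some compilers can warn here
    return i;
}

// int -> bool: zero becomes false, every other value becomes true.
bool toBool(int n)
{
    bool b = n;
    return b;
}

// int -> char: the number is treated as a character code.
char toChar(int n)
{
    char c = n;
    return c;
}

void checkImplicitConversions()
{
    assert(toInt(3.5) == 3);
    assert(toInt(-3.5) == -3);     // toward zero, not rounded down to -4
    assert(toBool(66) == true);
    assert(toBool(0) == false);
    assert(toChar(66) == 'B');     // assumes ASCII
}
```

Note that the negative case truncates toward zero, so -3.5 becomes -3 rather than -4.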

Of course, as we've seen, not every pair of built-in types supports an implicit conversion. For example, even though pointers are technically numbers (in the sense that they store a memory address, which is generally stored, behind the scenes, as a number), we can't willy-nilly convert integers to pointers, regardless of the pointer's type. For example, this code will give us a compile-time error:

int i = 3;
int* p1 = i;   // not legal
char* p2 = i;  // also not legal

This restriction is a sensible one, as there is very rarely a case where you would be manipulating integers and then want to treat them as memory addresses (or vice versa), whereas the idea that you might have an int value and want to store it in a double variable isn't nearly so far-fetched.


Explicit conversions using casts

In C++, it is possible to ask for an explicit conversion, even in cases where an implicit one is not allowed. The idea is that there are times when you might want to make these conversions, but they're rare enough that you should have to ask for them specifically; in most cases where they appear implicitly in the code you wrote, you probably made a mistake.

One way to perform an explicit conversion in C++ is to use a cast, which is an explicit conversion of a value from one type to another. C++ inherits an old-style casting syntax from C, which we've intentionally avoided (for reasons I'll explain here), but there are clearer notations that have been added to C++ that we'll prefer instead.

C-style casts (and why they fall short)

In C, it is legal to cast values of most types to values of most other types, simply by asking for the conversion to be done. We'll call these C-style casts. An example will drive the discussion:

int i = 3;
int* p = (int*) i;

std::cout << i << std::endl;   // prints 3
std::cout << p << std::endl;   // prints 0x3

Syntactically, a C-style cast is included in an expression by prefixing that expression with the name of the target type, surrounded by parentheses. In this case, for example, we've said (int*) i, which is to say that we want to take the value of i and then cast it to int*. This is a blunt instrument: It says "Please take this integer and give me a pointer that points to the address represented by that number." There's no good way for you to know, in general, what will be stored in that address, or even what type of value is stored there; nonetheless, C++ will happily give you a pointer to it and allow you to treat the value stored there as an integer, regardless of what it actually is.

Not all C-style casts are so insidious, though. For example, if you have two int values and you want to divide one by another and get back a double result, it's not as easy as it sounds.

int i = 3;
int j = 4;
double k = i / j;

std::cout << k << std::endl;   // prints 0

The reason that k's value is 0 is that division of two integers in C++ is defined as integer division, which is to say that we divide the two integers, discard whatever fractional part there might be, and return the result. So, rather than k having the value 0.75 that you might expect, it has the value 0 instead.

One solution to this problem is to explicitly convert one value or the other to a double. As long as one of them is a double, even if the other is an int, C++ will perform floating-point division and give you a double as a result. So one way to write that would be like this:

int i = 3;
int j = 4;
double k = i / (double) j;

std::cout << k << std::endl;   // prints 0.75

Note, too, that parenthesization matters here, and in ways you might not expect. C-style casting has a higher precedence than most other operators, meaning that it binds more tightly to operands than, say, a division operator does. So, for example: (double) i / j technically means "Cast i to double, then divide that by the int j." Meanwhile, (double) (i / j) means "Divide the int i by the int j (yielding an integer), then cast the result to a double."
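
The difference between those two parenthesizations can be sketched with a pair of small functions. The names castThenDivide, divideThenCast, and checkPrecedence are hypothetical, chosen only for this illustration.

```cpp
#include <cassert>

// The cast binds more tightly than the division: i becomes a double
// first, so the division is a floating-point division.
double castThenDivide(int i, int j)
{
    return (double) i / j;
}

// The parentheses force the integer division to happen first; only
// the already-truncated result is converted to double.
double divideThenCast(int i, int j)
{
    return (double) (i / j);
}

void checkPrecedence()
{
    assert(castThenDivide(3, 4) == 0.75);
    assert(divideThenCast(3, 4) == 0.0);
}
```

With 3 and 4 as arguments, the first function yields 0.75 and the second yields 0, even though the two expressions differ only in their parentheses.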

Finally, you can use C-style casts to cast pointers from one type to another, which is occasionally useful when you want to do downcasting in an inheritance hierarchy (i.e., you have a Shape* pointing to an object that you know is a Circle and want to cast the pointer to Circle* instead). And C-style casts will do that, too.

Shape* ps = new Circle{3.0};
Circle* pc = (Circle*) ps;

Note, though, that this cast will succeed whether the pointer really points to a Circle object or not. What you'll get back is a pointer that you'll now be able to treat as though it points to a Circle; if it points to something else, using it leads to undefined behavior, which very often means a program crash. In fact, you can even use C-style casts to convert between pointers of completely unrelated types.

struct A
{
    int a;
};

struct B
{
    double b;
};

A* a = new A{3};
B* b = (B*) a;
std::cout << b->b << std::endl;   // on the ICS 45C VM, prints 1.4822e-323

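C-style casts, ultimately, have two shortcomings that make them best avoided in modern C++ programs: their terse syntax makes them easy to overlook when reading code, and they say nothing about which kind of conversion was intended, so the compiler has no way to tell a sensible cast from a mistaken one.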

For many years, the C++ standard has included a more targeted set of casts. Their syntax is a bit more cumbersome, which has its own benefit: it's obvious when there's a cast. The main advantage, though, is that each kind of cast states its purpose. A reader of the code has a better chance of understanding what's going on, and the compiler can verify that the cast is sensible given what you said you're using it for, so many mistakes that would go unnoticed with C-style casts become compile-time errors with the newer-style C++ casts.

dynamic_cast

A dynamic cast is one that casts a pointer or reference of one type to a pointer or reference of a related type. In particular, by related type, I mean that they're class types that are related by inheritance, and that your goal is to downcast in an inheritance hierarchy (which is to say that you have a pointer of a base class type and you want to cast it to a derived class type). Consider the Shape example we've seen previously; this would be a legal use of dynamic_cast.

void foo(Shape* s)
{
    Circle* c = dynamic_cast<Circle*>(s);
    // ...
}

Note the syntax here; the use of dynamic_cast looks very much like a call to a function template. As it turns out, dynamic_cast isn't really a function, but it doesn't hurt to think of it that way.

We've seen previously that the word dynamic is quite often used to describe things that happen while the program runs, while the word static quite often refers to things that happen beforehand (e.g., things that are done by the compiler). A compiler could analyze whether it's possible for the cast above to succeed, by checking whether Circle indeed inherits from Shape. So what's so dynamic about a dynamic_cast?

The answer to that question lies in the fact that the cast is checked at run-time. The compiler will emit code that checks whether s is really pointing to a Circle object. If so, the cast succeeds and c will be a Circle* pointing to it; if not, the cast fails and c will instead be nullptr.

Now, you may be saying to yourself "But does that mean I have to check every dynamic_cast to see if it returns nullptr?" You probably should, though nothing forces you to. If you don't check and you turn out to be wrong, dereferencing the resulting nullptr is undefined behavior, and your program will most likely crash.

There's one more small wrinkle: a downcast using dynamic_cast will only compile when the type being cast from is polymorphic, which is to say that it has (or inherits) at least one virtual member function. That's generally the only circumstance where you'd want to use it, anyway.
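
Here is a minimal sketch of a checked downcast. The Shape, Circle, and Square classes below are stand-ins written for this example (the real Shape hierarchy may look different), and radiusOrNegative, isCircle, and checkDynamicCasts are hypothetical names. The sketch also shows the reference form of dynamic_cast, which has no null value to fall back on and so throws std::bad_cast on failure.

```cpp
#include <cassert>
#include <typeinfo>   // std::bad_cast

// Stand-in classes for this sketch only.
class Shape
{
public:
    virtual ~Shape() = default;   // a virtual function makes Shape polymorphic
};

class Circle : public Shape
{
public:
    explicit Circle(double radius): radius{radius} { }
    double getRadius() const { return radius; }

private:
    double radius;
};

class Square : public Shape { };

// Returns the radius if s points to a Circle, or -1.0 otherwise.
double radiusOrNegative(const Shape* s)
{
    const Circle* c = dynamic_cast<const Circle*>(s);

    if (c == nullptr)
    {
        return -1.0;   // the cast failed: s points to some other kind of Shape
    }

    return c->getRadius();
}

// With references, a failed dynamic_cast throws std::bad_cast instead.
bool isCircle(const Shape& s)
{
    try
    {
        static_cast<void>(dynamic_cast<const Circle&>(s));
        return true;
    }
    catch (const std::bad_cast&)
    {
        return false;
    }
}

void checkDynamicCasts()
{
    Circle c{3.0};
    Square sq;

    assert(radiusOrNegative(&c) == 3.0);
    assert(radiusOrNegative(&sq) == -1.0);
    assert(isCircle(c));
    assert(!isCircle(sq));
}
```

Checking the pointer against nullptr before using it is what turns a potential crash into an ordinary branch in the code.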

static_cast

A static cast is one that is entirely resolved at compile time. That means there is an inherent risk involved, since the compiler can't necessarily be sure what the run-time type of an object is, given only a pointer to it. For example, if we rewrote our foo() function above this way:

void foo(Shape* s)
{
    Circle* c = static_cast<Circle*>(s);
    // ...
}

the cast will succeed regardless of what kind of object s points to, though subsequent use of the pointer c will lead to undefined behavior if the object isn't really a Circle. In a case like this, one reason you might choose static_cast over dynamic_cast is a performance reason (i.e., avoiding the cost of the type check at run time). As is often the case, there is a tradeoff here between safety and speed.

But, at the very least, the compiler can check whether there's some chance of the cast succeeding (e.g., by checking whether there is an inheritance relationship between the types) and report an error at compile time if you try to use it for a conversion between unrelated types. And, additionally, you can use static_cast to invoke built-in conversions between non-pointer types, such as this one.

double d = 3.5;
int i = static_cast<int>(d);   // legal, because such a conversion exists
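
As a further sketch, static_cast can replace the C-style cast in the earlier division example. The names ratio, wholePart, and checkStaticCasts are made up for this illustration, as are the struct names in the commented-out lines.

```cpp
#include <cassert>

// Converts i to double before dividing, so the division is
// floating-point division rather than integer division.
double ratio(int i, int j)
{
    return static_cast<double>(i) / j;
}

// Makes the (lossy) conversion from double to int visible in the code.
int wholePart(double d)
{
    return static_cast<int>(d);
}

// By contrast, a static_cast between unrelated pointer types is rejected
// by the compiler. Given two unrelated structs A and B:
//
//     A* a = new A{3};
//     B* b = static_cast<B*>(a);   // compile-time error

void checkStaticCasts()
{
    assert(ratio(3, 4) == 0.75);
    assert(wholePart(3.5) == 3);
}
```

The numeric results match the C-style versions; what changes is that the intent is spelled out and the compiler can reject conversions that make no sense.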

reinterpret_cast

A reinterpret cast is a very scary instrument, indeed. You can use reinterpret_cast to convert just about anything into just about anything else, similar to the C-style casts we saw previously. For example, a reinterpret_cast will allow you to convert between pointers of unrelated types, between integers and pointers, and so on.

It's not often you need something like this, and it's best avoided if you don't have the absolute need, since it's so dangerous in practice, but it does come in handy once in a while to be able to make conversions in a completely unrestricted way. One related scenario involves void*, an untyped pointer: an address without a type. A void* doesn't let you do anything with the object it points to (since it's unknown what the object's type is), but a cast lets you convert it back to a typed pointer, provided that you know what the right type is. (For void* specifically, static_cast is sufficient; reinterpret_cast is what you'd need for conversions such as pointer-to-integer or between unrelated pointer types.) This is the kind of thing that's done in extremely low-level code, such as memory allocators, but if you find yourself doing this often in higher-level programs, your design is probably lacking.
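
The following sketch shows two round trips of the low-level kind described above. The function names are hypothetical, and the first function assumes the platform provides std::uintptr_t (it is technically optional, though mainstream platforms have it). Note that the void* round trip needs only static_cast on the way back, since every object pointer converts to void* implicitly.

```cpp
#include <cassert>
#include <cstdint>   // std::uintptr_t

// Pointer -> integer -> pointer. Converting back to the original
// pointer type yields the original pointer value.
int* throughInteger(int* p)
{
    std::uintptr_t address = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<int*>(address);
}

// Typed pointer -> void* -> typed pointer.
int* throughVoid(int* p)
{
    void* untyped = p;                   // implicit conversion to void*
    return static_cast<int*>(untyped);   // we must already know the right type
}

void checkRoundTrips()
{
    int x = 42;

    assert(throughInteger(&x) == &x);
    assert(throughVoid(&x) == &x);
    assert(*throughVoid(&x) == 42);
}
```

In both cases, the cast is only safe because the code converts back to the type the pointer had originally; nothing checks that for you.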

const_cast

All of the C++ casts we've seen so far, including reinterpret_cast, are respectful of const, in the sense that none of those casts can be used to change a const type to a corresponding non-const type (e.g., to cast from const std::string& to std::string&). (On the other hand, C-style casts will let you do this, whether you meant to do it or not; this is one of many reasons they're best avoided.)

In general, this is a good thing; we don't want const protections to be thrown away indiscriminately. But sometimes we have to remove it temporarily; for example, if we need to interoperate with someone else's C++ code and they didn't make proper use of const (e.g., they have a class with member functions that should be marked const but aren't), we may not have the ability to change their code, so we'll have to work around the problem if we want to make use of it. A const cast can be used to remove the const from a type, while introducing no other changes to it. (If you also wanted to change the type in some other way, you'd have to do two casts: a const_cast to remove the const and another one such as a static_cast or dynamic_cast to change the type.) A simple example:

void blah(const Blah& b)
{
    Blah& bb = const_cast<Blah&>(b);
    bb.someMemberFunctionThatIsNotConst();
}

As a rule, const_cast is something that should make you feel uncomfortable to use, because it represents a way to subvert invariants about how a program is supposed to behave. Once in a while, you have no choice, but it's very much a tool of last resort.
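
A self-contained version of that scenario might look like the sketch below. LegacyBuffer is a hypothetical stand-in for third-party code whose author forgot to mark a member function const; sizeOf and checkConstCast are likewise made-up names.

```cpp
#include <cassert>

// Hypothetical third-party class: size() does not modify the object,
// but it was never marked const.
class LegacyBuffer
{
public:
    explicit LegacyBuffer(int size): sz{size} { }
    int size() { return sz; }   // should have been: int size() const

private:
    int sz;
};

int sizeOf(const LegacyBuffer& b)
{
    // b.size() would not compile here, because size() is not const.
    LegacyBuffer& writable = const_cast<LegacyBuffer&>(b);
    return writable.size();
}

void checkConstCast()
{
    LegacyBuffer b{16};
    assert(sizeOf(b) == 16);
}
```

This is only safe because size() doesn't actually change anything. Using a const_cast to modify an object that was originally declared const is undefined behavior.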


Implicit conversions between user-defined types

Suppose we have a class called Transaction, where a Transaction has an ID that is initialized when it's constructed. (There might well be other interesting things about transactions in the context of the program we're writing, but we'll leave them out of this example to keep things simple.) We might write such a class this way.

class Transaction
{
public:
    Transaction(int id): id{id} { }
    int getId() const { return id; }

private:
    int id;
};

Now suppose that you had the following two functions that take a Transaction as a parameter, one by value and the other by reference.

void val(Transaction t) { ... }
void ref(Transaction& t) { ... }

Which of the following lines of code would you expect to be legal? (Take a moment to decide what you think before proceeding with reading this section.)

Transaction t1{14};
Transaction t2 = 17;
int i3 = 555; Transaction t3 = i3;
val(72);
ref(89);

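Let's consider them individually:

Transaction t1{14}; is legal. It calls the constructor directly, passing it 14.
Transaction t2 = 17; is also legal, perhaps surprisingly. The compiler converts 17 to a Transaction by calling the constructor that takes an int.
Transaction t3 = i3; is legal for the same reason; it makes no difference that the int comes from a variable rather than a literal.
val(72); is legal. The parameter t is initialized from 72, just as t2 was initialized from 17.
ref(89); is not legal, but only because a temporary Transaction can't be bound to a non-const reference. If the parameter's type were const Transaction& instead, this call would compile, too.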

From this discussion, we see that a general rule emerges. If you have a constructor in a class that can be called with exactly one argument (either because it's declared to take exactly one, or because it has default arguments that allow it to be called with only one), then it is automatically a converting constructor, which is to say that it will be used by the compiler to automatically perform an implicit type conversion from the type of the argument to the type of the class.
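
The rule can be confirmed with a short sketch. The Transaction class is the one from above; the helper functions idByValue, idByConstRef, and checkConvertingConstructor are hypothetical names added for this illustration, and std::is_convertible is the standard library's way of asking the compiler whether an implicit conversion exists.

```cpp
#include <cassert>
#include <type_traits>

class Transaction
{
public:
    Transaction(int id): id{id} { }   // not explicit: a converting constructor
    int getId() const { return id; }

private:
    int id;
};

int idByValue(Transaction t) { return t.getId(); }
int idByConstRef(const Transaction& t) { return t.getId(); }

// The compiler agrees that an int converts implicitly to a Transaction.
static_assert(std::is_convertible<int, Transaction>::value,
              "int should convert implicitly to Transaction");

void checkConvertingConstructor()
{
    Transaction t2 = 17;               // implicit conversion from int
    assert(t2.getId() == 17);

    assert(idByValue(72) == 72);       // a Transaction is built from 72
    assert(idByConstRef(89) == 89);    // a temporary binds to a const reference
}
```

Passing 89 to a function that takes a non-const Transaction& would still be rejected, since the temporary created by the conversion can't bind to a non-const reference.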

One could certainly argue that this is not a sensible default — and plenty of folks much more well-known in the C++ community than me think so! — but it's long been the rule in C++, so it's not a rule that can be changed now without breaking a lot of existing code. Still, we don't have to live with this default; we just have to know how to turn it off.


Explicit constructors

A constructor that is marked with the explicit keyword at the beginning of its signature is one that can only be called explicitly, which is to say that it can never be used to invoke an implicit type conversion. Constructors are where you'll most often see the keyword (since C++11, it can also be applied to conversion operators, which are beyond our scope here), and it matters most on a constructor that can be called with a single argument, since that's the circumstance where it's otherwise easy to make a mistake like passing an argument of one type and having it be implicitly converted to another type without realizing it. So when you write a constructor that can be called with one argument, you should consider whether you want the implicit conversion to be a possibility; in my experience, the answer is more often than not "No!", but it's worth thinking about each time.

So we could refine our Transaction example above like this:

class Transaction
{
public:
    explicit Transaction(int id): id{id} { }
    int getId() const { return id; }

private:
    int id;
};

The one-word change, adding explicit to the constructor's signature, is enough to clean up our design, so that the things that seemed crazy now become illegal.

Transaction t1{14};
Transaction t2 = 17;
int i3 = 555; Transaction t3 = i3;
val(72);
ref(89);

Of these examples, the only one that will now be legal is the first one. For a design like this, I would argue that it's the only one that made sense! One of the hallmarks of a good C++ design is one in which a program doesn't compile when you do something you shouldn't; much of what's been added to the language over the years has been added with the goal of making it possible for compilers to catch more of our mistakes for us automatically. But compilers can't do that unless we can communicate our intent; explicit is another in a long list of C++ features we've seen that allow us to do that.
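
As a closing sketch, the standard type traits can confirm what explicit changes. The helper names idByValue and checkExplicitConstructor are hypothetical; the Transaction class is the explicit version from above.

```cpp
#include <cassert>
#include <type_traits>

class Transaction
{
public:
    explicit Transaction(int id): id{id} { }
    int getId() const { return id; }

private:
    int id;
};

int idByValue(Transaction t) { return t.getId(); }

// No implicit conversion from int exists anymore...
static_assert(!std::is_convertible<int, Transaction>::value,
              "int should no longer convert implicitly to Transaction");

// ...but a Transaction can still be constructed from an int on request.
static_assert(std::is_constructible<Transaction, int>::value,
              "Transaction should still be constructible from an int");

void checkExplicitConstructor()
{
    Transaction t1{14};
    assert(t1.getId() == 14);

    // idByValue(72);   // would not compile now
    assert(idByValue(Transaction{72}) == 72);   // the conversion must be spelled out
}
```

Callers can still pass an int where a Transaction is needed; they just have to say so by writing Transaction{72}, which makes the intent visible at the call site.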