Standard format specification

From cppreference.com
Utilities library
General utilities
Relational operators (deprecated in C++20)

For basic types and string types, the format specification is based on the format specification in Python .

The syntax of format specifications is:

fill-and-align  (optional) sign  (optional) # (optional) 0 (optional) width  (optional) precision  (optional) L (optional) type  (optional)

The sign , # and 0 options are only valid when an integer or floating-point presentation type is used.

Fill and align

fill-and-align is an optional fill character (which can be any character other than { or } ), followed by one of the align options < , > , ^ .

If no fill character is specified, it defaults to the space character. For a format specification in a Unicode encoding, the fill character must correspond to a single Unicode scalar value.

The meaning of align options is as follows:

  • < : Forces the formatted argument to be aligned to the start of the available space by inserting n fill characters after the formatted argument. This is the default when a non-integer non-floating-point presentation type is used.
  • > : Forces the formatted argument to be aligned to the end of the available space by inserting n fill characters before the formatted argument. This is the default when an integer or floating-point presentation type is used.
  • ^ : Forces the formatted argument to be centered within the available space by inserting
    n
    2
    characters before and
    n
    2
    characters after the formatted argument.

In each case, n is the difference of the minimum field width (specified by width ) and the estimated width of the formatted argument, or 0 if the difference is less than 0.

char c = 120;
assert(std::format("{:6}", 42)    == "    42");
assert(std::format("{:6}", 'x')   == "x     ");
assert(std::format("{:*<6}", 'x') == "x*****");
assert(std::format("{:*>6}", 'x') == "*****x");
assert(std::format("{:*^6}", 'x') == "**x***");
assert(std::format("{:6d}", c)    == "   120");
assert(std::format("{:6}", true)  == "true  ");

Sign, #, and 0

The sign option can be one of following:

  • + : Indicates that a sign should be used for both non-negative and negative numbers. The + sign is inserted before the output value for non-negative numbers.
  • - : Indicates that a sign should be used for negative numbers only (this is the default behavior).
  • space: Indicates that a leading space should be used for non-negative numbers, and a minus sign for negative numbers.

Negative zero is treated as a negative number.

The sign option applies to floating-point infinity and NaN.

double inf = std::numeric_limits<double>::infinity();
double nan = std::numeric_limits<double>::quiet_NaN();
assert(std::format("{0:},{0:+},{0:-},{0: }", 1)   == "1,+1,1, 1");
assert(std::format("{0:},{0:+},{0:-},{0: }", -1)  == "-1,-1,-1,-1");
assert(std::format("{0:},{0:+},{0:-},{0: }", inf) == "inf,+inf,inf, inf");
assert(std::format("{0:},{0:+},{0:-},{0: }", nan) == "nan,+nan,nan, nan");

The # option causes the alternate form to be used for the conversion.

  • For integral types, when binary, octal, or hexadecimal presentation type is used, the alternate form inserts the prefix ( 0b , 0 , or 0x ) into the output value after the sign character (possibly space) if there is one, or add it before the output value otherwise.
  • For floating-point types, the alternate form causes the result of the conversion of finite values to always contain a decimal-point character, even if no digits follow it. Normally, a decimal-point character appears in the result of these conversions only if a digit follows it. In addition, for g and G conversions, trailing zeros are not removed from the result.

The 0 option pads the field with leading zeros (following any indication of sign or base) to the field width, except when applied to an infinity or NaN. If the 0 character and an align option both appear, the 0 character is ignored.

char c = 120;
assert(std::format("{:+06d}", c)   == "+00120");
assert(std::format("{:#06x}", 0xa) == "0x000a");
assert(std::format("{:<06}", -42)  == "-42   "); // 0 is ignored because of < alignment

Width and precision

width is either a positive decimal number, or a nested replacement field ( {} or { n } ). If present, it specifies the minimum field width.

precision is a dot ( . ) followed by either a non-negative decimal number or a nested replacement field. This field indicates the precision or maximum field size. It can only be used with floating-point and string types.

  • For floating-point types, this field specifies the formatting precision.
  • For string types, it provides an upper bound for the estimated width (see below ) of the prefix of the string to be copied to the output. For a string in a Unicode encoding, the text to be copied to the output is the longest prefix of whole extended grapheme clusters whose estimated width is no greater than the precision.

If a nested replacement field is used for width or precision , and the corresponding argument is not of integral type (until C++23) standard signed or unsigned integer type (since C++23) , or is negative, an exception of type std::format_error is thrown.

float pi = 3.14f;
assert(std::format("{:10f}", pi)           == "  3.140000"); // width = 10
assert(std::format("{:{}f}", pi, 10)       == "  3.140000"); // width = 10
assert(std::format("{:.5f}", pi)           == "3.14000");    // precision = 5
assert(std::format("{:.{}f}", pi, 5)       == "3.14000");    // precision = 5
assert(std::format("{:10.5f}", pi)         == "   3.14000"); // width = 10, precision = 5
assert(std::format("{:{}.{}f}", pi, 10, 5) == "   3.14000"); // width = 10, precision = 5
 
auto b1 = std::format("{:{}f}", pi, 10.0); // throws: width is not of integral type 
auto b2 = std::format("{:{}f}", pi, -10);  // throws: width is negative
auto b3 = std::format("{:.{}f}", pi, 5.0); // throws: precision is not of integral type

The width of a string is defined as the estimated number of column positions appropriate for displaying it in a terminal.

For the purpose of width computation, a string is assumed to be in an implementation-defined encoding. The method of width computation is unspecified, but for a string in a Unicode encoding, implementation should estimate the width of the string as the sum of estimated widths of the first code points in its extended grapheme clusters . The estimated width is 2 for the following code points, and is 1 otherwise:

  • Any code point whose Unicode property East_Asian_Width has value Fullwidth ( F ) or Wide ( W )
  • U+4DC0 - U+4DFF (Yijing Hexagram Symbols)
  • U+1F300 – U+1F5FF (Miscellaneous Symbols and Pictographs)
  • U+1F900 – U+1F9FF (Supplemental Symbols and Pictographs)
assert(std::format("{:.^5s}",   "🐱")    == ".🐱..");
assert(std::format("{:.5s}",    "🐱🐱🐱") == "🐱🐱");
assert(std::format("{:.<5.5s}", "🐱🐱🐱") == "🐱🐱.");

L (locale-specific formatting)

The L option causes the locale-specific form to be used. This option is only valid for arithmetic types.

  • For integral types, the locale-specific form inserts the appropriate digit group separator characters according to the context's locale.
  • For floating-point types, the locale-specific form inserts the appropriate digit group and radix separator characters according to the context's locale.
  • For the textual representation of bool , the locale-specific form uses the appropriate string as if obtained with std::numpunct::truename or std::numpunct::falsename .

Type

The type option determines how the data should be presented.

The available string presentation types are:

  • none, s : Copies the string to the output.
  • ? : Copies the escaped string (see below ) to the output.
(since C++23)

The available integer presentation types for integral types other than char , wchar_t , and bool are:

  • b : Binary format. Produces the output as if by calling std:: to_chars ( first, last, value, 2 ) . The base prefix is 0b .
  • B : same as b , except that the base prefix is 0B .
  • c : Copies the character static_cast < CharT > ( value ) to the output, where CharT is the character type of the format string. Throws std::format_error if value is not in the range of representable values for CharT .
  • d : Decimal format. Produces the output as if by calling std:: to_chars ( first, last, value ) .
  • o : Octal format. Produces the output as if by calling std:: to_chars ( first, last, value, 8 ) . The base prefix is 0 if the corresponding argument value is non-zero and is empty otherwise.
  • x : Hex format. Produces the output as if by calling std:: to_chars ( first, last, value, 16 ) . The base prefix is 0x .
  • X : same as x , except that it uses uppercase letters for digits above 9 and the base prefix is 0X .
  • none: same as d .

The available char and wchar_t presentation types are:

  • none, c : Copies the character to the output.
  • b , B , d , o , x , X : Uses integer presentation types with the value static_cast < unsigned char > ( value ) or static_cast < std:: make_unsigned_t < wchar_t >> ( value ) respectively.
  • ? : Copies the escaped character (see below ) to the output.
(since C++23)

The available bool presentation types are:

  • none, s : Copies textual representation ( true or false , or the locale-specific form) to the output.
  • b , B , d , o , x , X : Uses integer presentation types with the value static_cast < unsigned char > ( value ) .

The available floating-point presentation types are:

  • a : If precision is specified, produces the output as if by calling std:: to_chars ( first, last, value, std :: chars_format :: hex , precision ) where precision is the specified precision; otherwise, the output is produced as if by calling std:: to_chars ( first, last, value, std :: chars_format :: hex ) .
  • A : same as a , except that it uses uppercase letters for digits above 9 and uses P to indicate the exponent.
  • e : Produces the output as if by calling std:: to_chars ( first, last, value, std :: chars_format :: scientific , precision ) where precision is the specified precision, or 6 if precision is not specified.
  • E : same as e , except that it uses E to indicate the exponent.
  • f , F : Produces the output as if by calling std:: to_chars ( first, last, value, std :: chars_format :: fixed , precision ) where precision is the specified precision, or 6 if precision is not specified.
  • g : Produces the output as if by calling std:: to_chars ( first, last, value, std :: chars_format :: general , precision ) where precision is the specified precision, or 6 if precision is not specified.
  • G : same as g , except that it uses E to indicate the exponent.
  • none: If precision is specified, produces the output as if by calling std:: to_chars ( first, last, value, std :: chars_format :: general , precision ) where precision is the specified precision; otherwise, the output is produced as if by calling std:: to_chars ( first, last, value ) .

For lower-case presentation types, infinity and NaN are formatted as inf and nan , respectively. For upper-case presentation types, infinity and NaN are formatted as INF and NAN , respectively.

The available pointer presentation types (also used for std::nullptr_t ) are:

  • none, p : If std::uintptr_t is defined, produces the output as if by calling std:: to_chars ( first, last, reinterpret_cast < std:: uintptr_t > ( value ) , 16 ) with the prefix 0x added to the output; otherwise, the output is implementation-defined.
  • P : same as p , except that it uses uppercase letters for digits above 9 and the base prefix is 0X .
(since C++26)


Formatting escaped characters and strings

A character or string can be formatted as escaped to make it more suitable for debugging or for logging.

Escaping is done as follows:

  • For each well-formed code unit sequence that encodes a character C :
  • If C is one of the characters in the following table, the corresponding escape sequence is used.
Character Escape sequence Notes
horizontal tab (byte 0x09 in ASCII encoding) \t
line feed - new line (byte 0x0a in ASCII encoding) \n
carriage return (byte 0x0d in ASCII encoding) \r
double quote (byte 0x22 in ASCII encoding) \" Used only if the output is a double-quoted string
single quote (byte 0x27 in ASCII encoding) \' Used only if the output is a single-quoted string
backslash (byte 0x5c in ASCII encoding) \\
  • Otherwise, if C is not the space character (byte 0x20 in ASCII encoding), and either
  • the associated character encoding is a Unicode encoding and
  • C corresponds to a Unicode scalar value whose Unicode property General_Category has a value in the groups Separator ( Z ) or Other ( C ), or
  • C is not immediately preceded by a non-escaped character, and C corresponds to a Unicode scalar value which has the Unicode property Grapheme_Extend=Yes , or
  • the associated character encoding is not a Unicode encoding and C is one of an implementation-defined set of separator or non-printable characters
the escape sequence is \u{ hex-digit-sequence } , where hex-digit-sequence is the shortest hexadecimal representation of C using lower-case hexadecimal digits.
  • Otherwise, C is copied as is.
  • A code unit sequence that is a shift sequence has unspecified effect on the output and further decoding of the string.
  • Other code units (i.e. those in ill-formed code unit sequences) are each replaced with \x{ hex-digit-sequence } , where hex-digit-sequence is the shortest hexadecimal representation of the code unit using lower-case hexadecimal digits.

The escaped string representation of a string is constructed by escaping the code unit sequences in the string, as described above, and quoting the result with double quotes.

The escaped representation of a character is constructed by escaping it as described above, and quoting the result with single quotes.

Compiler Explorer demo:

#include <print>
 
int main()
{
    std::println("[{:?}]", "h\tllo");             // prints: ["h\tllo"]
    std::println("[{:?}]", "Спасибо, Виктор ♥!"); // prints: ["Спасибо, Виктор ♥!"]
    std::println("[{:?}] [{:?}]", '\'', '"');     // prints: ['\'', '"']
 
    // The following examples assume use of the UTF-8 encoding
    std::println("[{:?}]", std::string("\0 \n \t \x02 \x1b", 9));
                                             // prints: ["\u{0} \n \t \u{2} \u{1b}"]
    std::println("[{:?}]", "\xc3\x28");      // invalid UTF-8
                                             // prints: ["\x{c3}("]
    std::println("[{:?}]", "\u0301");        // prints: ["\u{301}"]
    std::println("[{:?}]", "\\\u0301");      // prints: ["\\\u{301}"]
    std::println("[{:?}]", "e\u0301\u0323"); // prints: ["ẹ́"]
}
(since C++23)

Notes

In most of the cases the syntax is similar to the old % -formatting, with the addition of the {} and with : used instead of % . For example, "%03.2f" can be translated to "{:03.2f}" .

Feature-test macro Value Std Feature
__cpp_lib_format_uchar 202311L (C++20)
(DR)
Formatting of code units as unsigned integers

Defect reports

The following behavior-changing defect reports were applied retroactively to previously published C++ standards.

DR Applied to Behavior as published Correct behavior
LWG 3721 C++20 zero is not allowed for the width field
in standard format specification
zero is permitted if specified
via a replacement field
P2909R4 C++20 char or wchar_t might be formatted as
out-of-range unsigned integer values
code units are converted to the corresponding
unsigned type before such formatting