String Functions

Introduction

The header <utilities/string.h> supplies several utility functions that work on strings.

Many of the functions come in two flavours. One version alters the input string in place, while the other returns a new string that is a copy of the input appropriately converted, leaving the original untouched.

For example, utilities::upper_case(str) converts str to upper-case in place. On the other hand, utilities::upper_cased(str) returns a fresh string that is a copy of str converted to upper-case. As you will see below, this is the typical naming style used.

There are other functions where this distinction is unnecessary, such as utilities::starts_with(...).

Case Conversions

void utilities::upper_case(std::string&);                // (1)
void utilities::lower_case(std::string&);                // (2)

std::string utilities::upper_cased(std::string_view);    // (3)
std::string utilities::lower_cased(std::string_view);    // (4)

(1) Converts a string to uppercase in place.
(2) Converts a string to lowercase in place.
(3) Returns a new string, an uppercase copy of the input string.
(4) Returns a new string, a lowercase copy of the input string.

Our case conversions rely on the std::tolower and std::toupper functions, which only work for simple character types.
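
To make the in-place versus copying distinction concrete, here is a minimal sketch of how such a pair could be written on top of std::toupper. It is not the library's source: the namespace sketch is hypothetical, and the cast through unsigned char is simply the usual precaution when calling the <cctype> functions.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <string_view>

namespace sketch { // hypothetical namespace, not part of the library

// In-place flavour: overwrites every character of str with its upper-case form.
inline void upper_case(std::string& str)
{
    // The lambda takes unsigned char because passing a negative value
    // to std::toupper is undefined behaviour.
    std::transform(str.begin(), str.end(), str.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
}

// Copying flavour: leaves the input untouched and returns the converted copy.
inline std::string upper_cased(std::string_view input)
{
    std::string result{input};
    upper_case(result);
    return result;
}

} // namespace sketch
```

Calling sketch::upper_case(s) changes s itself, whereas sketch::upper_cased(s) leaves s alone and hands back the converted copy.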

Trimming Spaces

void utilities::trim_left(std::string&);                   // (1)
void utilities::trim_right(std::string&);                  // (2)
void utilities::trim(std::string&);                        // (3)

std::string utilities::trimmed_left(std::string_view);     // (4)
std::string utilities::trimmed_right(std::string_view);    // (5)
std::string utilities::trimmed(std::string_view);          // (6)

(1) Removes any leading whitespace from the input string.
(2) Removes any trailing whitespace from the input string.
(3) Removes leading and trailing whitespace from the input string.
(4) Returns a new string, a left-trimmed copy of the input string.
(5) Returns a new string, a right-trimmed copy of the input string.
(6) Returns a new string, a copy of the input string trimmed on both sides.

Our trimming functions rely on the std::isspace function to identify whitespace characters.
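
As an illustration of trimming driven by std::isspace, the sketch below shows one way to implement it. The namespace sketch and the helper is_space are hypothetical names introduced here; the library's own implementation may differ.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <string_view>

namespace sketch { // hypothetical namespace, not part of the library

// Wrapper that makes std::isspace safe to call on plain char values.
inline bool is_space(unsigned char c) { return std::isspace(c) != 0; }

// Erase everything before the first non-whitespace character.
inline void trim_left(std::string& str)
{
    str.erase(str.begin(), std::find_if_not(str.begin(), str.end(), is_space));
}

// Erase everything after the last non-whitespace character.
inline void trim_right(std::string& str)
{
    str.erase(std::find_if_not(str.rbegin(), str.rend(), is_space).base(), str.end());
}

// Trim both ends in place.
inline void trim(std::string& str)
{
    trim_left(str);
    trim_right(str);
}

// Copying flavour: returns a trimmed copy and leaves the input alone.
inline std::string trimmed(std::string_view input)
{
    std::string result{input};
    trim(result);
    return result;
}

} // namespace sketch
```

A string made up entirely of whitespace ends up empty, and interior whitespace is left as it was.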

Replacing Substrings

void
utilities::replace_left(std::string &str,
                        std::string_view target,
                        std::string_view replacement);    // (1)
void
utilities::replace_right(std::string &str,
                         std::string_view target,
                         std::string_view replacement);   // (2)
void
utilities::replace(std::string &str,
                   std::string_view target,
                   std::string_view replacement);         // (3)
std::string
utilities::replaced_left(std::string_view str,
                         std::string_view target,
                         std::string_view replacement);   // (4)
std::string
utilities::replaced_right(std::string_view str,
                          std::string_view target,
                          std::string_view replacement);  // (5)
std::string
utilities::replaced(std::string_view str,
                    std::string_view target,
                    std::string_view replacement);        // (6)

(1) Replaces the first occurrence of target in str with replacement.
(2) Replaces the final occurrence of target in str with replacement.
(3) Replaces all occurrences of target in str with replacement.
(4) Returns a new string, a copy of str with the first occurrence of target changed to replacement.
(5) Returns a new string, a copy of str with the final occurrence of target changed to replacement.
(6) Returns a new string, a copy of str with all occurrences of target changed to replacement.
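
The difference between the first, final, and all-occurrence variants can be sketched with std::string::find and std::string::rfind. This is an illustrative sketch in a hypothetical namespace sketch, not the library's code; in particular, ignoring an empty target is an assumption made here.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

namespace sketch { // hypothetical namespace, not part of the library

// Replace only the first occurrence of target (searching from the left).
inline void replace_left(std::string& str, std::string_view target, std::string_view replacement)
{
    if (target.empty()) return; // assumption: an empty target is ignored
    if (const auto pos = str.find(target); pos != std::string::npos)
        str.replace(pos, target.size(), replacement);
}

// Replace only the final occurrence of target (searching from the right).
inline void replace_right(std::string& str, std::string_view target, std::string_view replacement)
{
    if (target.empty()) return;
    if (const auto pos = str.rfind(target); pos != std::string::npos)
        str.replace(pos, target.size(), replacement);
}

// Replace every occurrence of target, scanning left to right.
inline void replace(std::string& str, std::string_view target, std::string_view replacement)
{
    if (target.empty()) return;
    std::size_t pos = 0;
    while ((pos = str.find(target, pos)) != std::string::npos) {
        str.replace(pos, target.size(), replacement);
        pos += replacement.size(); // step over the inserted text so it is never rescanned
    }
}

} // namespace sketch
```

Stepping past the inserted text matters when the replacement contains the target: replacing "a" with "aa" then terminates instead of looping forever.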

We also have functions to replace all contiguous white space sequences in a string:

void
utilities::replace_space(std::string &str,
                         const std::string &with = " ",
                         bool also_trim = true);          // (1)
void
utilities::condense(std::string &str,
                    bool also_trim = true);               // (2)
std::string
utilities::replaced_space(std::string_view str,
                          const std::string &with = " ",
                          bool also_trim = true);         // (3)
std::string
utilities::condensed(std::string_view str,
                     bool also_trim = true);              // (4)

(1) Replaces all contiguous white space sequences in str with a single white space character or, optionally, something else. By default, the string is also trimmed of white space on both the left and right.
(2) Replaces all contiguous white space sequences in str with a single white space character. By default, the string is also trimmed of white space on both the left and right.
(3) Returns a new string, a copy of str with all contiguous white space sequences replaced with a single white space character or, optionally, something else. By default, the output string is also trimmed of white space on both the left and right.
(4) Returns a new string, a copy of str with all contiguous white space sequences replaced with a single white space character. By default, the output string is also trimmed of white space on both the left and right.
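
The effect of the with and also_trim arguments is easiest to see in code. Below is a minimal single-pass sketch of the copying variants; the namespace sketch is hypothetical and the library may well implement this differently.

```cpp
#include <cctype>
#include <string>
#include <string_view>

namespace sketch { // hypothetical namespace, not part of the library

// Copy str, replacing each run of whitespace with `with`.
// When also_trim is true, leading and trailing runs are dropped entirely.
inline std::string replaced_space(std::string_view str,
                                  const std::string& with = " ",
                                  bool also_trim = true)
{
    std::string result;
    bool in_run = false; // true while we are inside a run of whitespace
    for (unsigned char c : str) {
        if (std::isspace(c)) {
            in_run = true;
            continue;
        }
        // A run has just ended: emit one separator unless it was a trimmed leading run.
        if (in_run && (!result.empty() || !also_trim)) result += with;
        in_run = false;
        result.push_back(static_cast<char>(c));
    }
    // A trailing run survives only when trimming is switched off.
    if (in_run && !also_trim) result += with;
    return result;
}

// Condensing is the special case where the separator is a single space.
inline std::string condensed(std::string_view str, bool also_trim = true)
{
    return replaced_space(str, " ", also_trim);
}

} // namespace sketch
```

With the defaults, "  a \t\n b  c " condenses to "a b c". Passing also_trim = false keeps one separator at each end that originally had whitespace.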

Erasing Substrings

void
utilities::erase_left(std::string &str,
                      std::string_view target);           // (1)
void
utilities::erase_right(std::string &str,
                       std::string_view target);          // (2)
void
utilities::erase(std::string &str,
                 std::string_view target);                // (3)
std::string
utilities::erased_left(std::string_view str,
                       std::string_view target);          // (4)
std::string
utilities::erased_right(std::string_view str,
                        std::string_view target);         // (5)
std::string
utilities::erased(std::string_view str,
                  std::string_view target);               // (6)

(1) Erases the first occurrence of the target substring in str.
(2) Erases the final occurrence of the target substring in str.
(3) Erases all occurrences of the target substring in str.
(4) Returns a new string, a copy of str with the first occurrence of target erased.
(5) Returns a new string, a copy of str with the final occurrence of target erased.
(6) Returns a new string, a copy of str with all occurrences of target erased.
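
Erasing behaves like replacing with an empty string. The short sketch below illustrates that reading with std::string::find and std::string::erase; the namespace sketch is hypothetical, and treating an empty target as a no-op is an assumption made here.

```cpp
#include <string>
#include <string_view>

namespace sketch { // hypothetical namespace, not part of the library

// Erase only the first occurrence of target.
inline void erase_left(std::string& str, std::string_view target)
{
    if (const auto pos = str.find(target); pos != std::string::npos)
        str.erase(pos, target.size());
}

// Erase only the final occurrence of target.
inline void erase_right(std::string& str, std::string_view target)
{
    if (const auto pos = str.rfind(target); pos != std::string::npos)
        str.erase(pos, target.size());
}

// Erase every occurrence of target, scanning left to right.
inline void erase(std::string& str, std::string_view target)
{
    if (target.empty()) return; // assumption: nothing to erase
    for (auto pos = str.find(target); pos != std::string::npos; pos = str.find(target, pos))
        str.erase(pos, target.size());
}

// Copying flavour of erase.
inline std::string erased(std::string_view str, std::string_view target)
{
    std::string result{str};
    erase(result, target);
    return result;
}

} // namespace sketch
```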

“Standardizing” Strings

We often need to parse free-form input while looking for a keyword or phrase. Having a facility that converts strings to some standard form is helpful.

void utilities::remove_surrounds(std::string&);               // (1)
void utilities::standardize(std::string&);                    // (2)

std::string utilities::removed_surrounds(std::string_view);   // (3)
std::string utilities::standardized(std::string_view);        // (4)

(1) Strips any “surrounds” from the input string. For example, the string “(text)” becomes “text”. Multiples also work, so “[[[text]]]” becomes “text”. Only correctly balanced surrounds are ever removed.
(2) Standardizes the input string (see below).
(3) Returns a new string, a copy of the input with any “surrounds” removed.
(4) Returns a new string, a standardized copy of the input.

The standardize functions give you a string stripped of extraneous brackets, etc. In addition, each run of interior white space is replaced by a single space character, all leading and trailing white space is removed, and, as the example shows, the result is converted to upper case. So a string like “< Ace of Clubs >” becomes “ACE OF CLUBS”.

It is a lot easier to parse standardized strings.
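
The steps described above can be sketched as follows. This is only an illustration under stated assumptions: the namespace sketch and the helpers closer_for and trim_view are hypothetical, the set of recognized surrounds is limited to the (), [], and <> pairs seen in the examples, and white space just inside a removed pair is trimmed before looking for another pair.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>

namespace sketch { // hypothetical namespace, not part of the library

// Closing character for an opener, or '\0' if c is not a recognized opener (assumed set).
inline char closer_for(char c)
{
    switch (c) {
        case '(': return ')';
        case '[': return ']';
        case '<': return '>';
        default:  return '\0';
    }
}

// Drop leading and trailing whitespace from a view.
inline std::string_view trim_view(std::string_view s)
{
    while (!s.empty() && std::isspace(static_cast<unsigned char>(s.front()))) s.remove_prefix(1);
    while (!s.empty() && std::isspace(static_cast<unsigned char>(s.back()))) s.remove_suffix(1);
    return s;
}

// Peel off outer pairs as long as the first character is closed by the very last one.
inline std::string removed_surrounds(std::string_view s)
{
    s = trim_view(s);
    while (s.size() >= 2) {
        const char open  = s.front();
        const char close = closer_for(open);
        if (close == '\0' || s.back() != close) break;
        int  depth = 0;
        bool wraps = true;
        for (std::size_t i = 0; i < s.size(); ++i) {
            if (s[i] == open) ++depth;
            else if (s[i] == close) --depth;
            // Closing before the end means the pair does not wrap the text, e.g. "(a)(b)".
            if (depth == 0 && i + 1 < s.size()) { wraps = false; break; }
        }
        if (!wraps || depth != 0) break;
        s.remove_prefix(1);
        s.remove_suffix(1);
        s = trim_view(s);
    }
    return std::string{s};
}

// Remove surrounds, collapse interior whitespace to one space, and upper-case the result.
inline std::string standardized(std::string_view input)
{
    const std::string core = removed_surrounds(input);
    std::string result;
    bool pending_space = false;
    for (unsigned char c : core) {
        if (std::isspace(c)) { pending_space = !result.empty(); continue; }
        if (pending_space) result.push_back(' ');
        pending_space = false;
        result.push_back(static_cast<char>(std::toupper(c)));
    }
    return result;
}

} // namespace sketch
```

The depth check is what keeps a string such as “(a)(b)” intact: its first bracket closes before the end, so the outer characters are not a matching surround.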

Searching

bool utilities::starts_with(std::string_view str, std::string_view prefix);   // (1)
bool utilities::ends_with(std::string_view str, std::string_view suffix);     // (2)

(1) Returns true if str starts with prefix.
(2) Returns true if str ends with suffix.
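
For readers who want to see the exact semantics, the following sketch expresses both checks with std::string_view::compare. It lives in a hypothetical namespace sketch and is not taken from the library.

```cpp
#include <string_view>

namespace sketch { // hypothetical namespace, not part of the library

// True if the first prefix.size() characters of str equal prefix.
constexpr bool starts_with(std::string_view str, std::string_view prefix)
{
    return str.size() >= prefix.size() && str.compare(0, prefix.size(), prefix) == 0;
}

// True if the last suffix.size() characters of str equal suffix.
constexpr bool ends_with(std::string_view str, std::string_view suffix)
{
    return str.size() >= suffix.size() &&
           str.compare(str.size() - suffix.size(), suffix.size(), suffix) == 0;
}

} // namespace sketch
```

In this sketch an empty prefix or suffix always matches, and the comparison is case-sensitive.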

Tokenizing

We often want to convert a stream of text into tokens. Here are some functions to help with that:

template<std::input_iterator InIter, std::forward_iterator FwdIter, typename Func>
constexpr void
for_each_token(InIter  input_begin, InIter  input_end,
               FwdIter delims_begin, FwdIter delims_end, Func token_func);   // (1)

template<typename Container_t>
constexpr void
tokenize(std::string_view input, Container_t &output_container,
         std::string_view delimiters = "\t,;: ", bool skip = true);          // (2)

std::vector<std::string>
split(std::string_view input,
      std::string_view delimiters = "\t,;: ", bool skip = true);             // (3)

(1) Given iterators that bracket the input text and others that bracket the possible token delimiters, this function processes the text and passes each token to a user-supplied function.
(2) Tokenizes the input text string and places the tokens into output_container.
(3) Tokenizes the input text string and returns the tokens as a std::vector of strings.

We have based the for_each_token function on the excellent discussion here.

Function Arguments

Argument          Description
input_begin       To tokenize the string stored in text, input_begin should be std::cbegin(text).
input_end         To tokenize the string stored in text, input_end should be std::cend(text).
delims_begin      If the possible delimiters for the tokens are in the string delims, which might be "\t,;: ", then delims_begin should be std::cbegin(delims).
delims_end        If the possible delimiters for the tokens are in the string delims, which might be "\t,;: ", then delims_end should be std::cend(delims).
token_func        This will be called for each token: token_func(token.cbegin(), token.cend()).
output_container  This container needs to be dynamically resizable and support emplace_back(token.cbegin(), token.cend()).
skip              If true, we ignore empty tokens (e.g., two spaces in a row).
delimiters        These are the characters that should delimit our tokens. By default, tokens break on white space, commas, semi-colons, and colons.
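
To show how these arguments fit together, here is a minimal sketch of a for_each_token built on std::find_first_of, with a split layered on top of it. It is an illustration rather than the library's code: the namespace sketch is hypothetical, and the iterator concepts and constexpr from the declarations above are left out so that the sketch also compiles as C++17.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <string_view>
#include <vector>

namespace sketch { // hypothetical namespace, not part of the library

// Calls token_func(first, last) for each stretch of input between delimiters.
// Empty stretches (two delimiters in a row) are passed along too.
template<typename InIter, typename FwdIter, typename Func>
void for_each_token(InIter input_begin, InIter input_end,
                    FwdIter delims_begin, FwdIter delims_end, Func token_func)
{
    while (input_begin != input_end) {
        // The token ends at the next delimiter or at the end of the input.
        const auto token_end = std::find_first_of(input_begin, input_end, delims_begin, delims_end);
        token_func(input_begin, token_end);
        if (token_end == input_end) break;
        input_begin = std::next(token_end); // step over the delimiter
    }
}

// Collects the tokens into a vector, optionally skipping the empty ones.
inline std::vector<std::string> split(std::string_view input,
                                      std::string_view delimiters = "\t,;: ",
                                      bool skip = true)
{
    std::vector<std::string> tokens;
    for_each_token(std::cbegin(input), std::cend(input),
                   std::cbegin(delimiters), std::cend(delimiters),
                   [&](auto first, auto last) {
                       if (first != last || !skip) tokens.emplace_back(first, last);
                   });
    return tokens;
}

} // namespace sketch
```

With the default delimiters, sketch::split("a, b;c") yields the three tokens "a", "b", and "c". The empty token between the comma and the space is dropped because skip is true.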

Extracting Values

We also have a function that attempts to parse a value from a string.

template<typename T>
constexpr std::optional<T> possible(std::string_view str);   // (1)

(1) Tries to read a value of a particular type from a string.

This function uses std::from_chars to try to read a value of a simple type from a string. It returns std::nullopt if it fails to parse the input.

Example

auto x = possible<double>(str);
if(x) std::cout << str << ": parsed as the double value " << *x << '\n';

This tries to parse a double from str. If the parse succeeds, x holds the value, and it is printed on std::cout; otherwise x is empty, and nothing is printed.
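
For reference, a function with this interface can be sketched in a few lines around std::from_chars. The version below is an assumption-laden illustration in a hypothetical namespace sketch: it is not constexpr, and it reports success whenever std::from_chars parses a leading value, without checking for trailing characters, which may not match the library's behaviour.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

namespace sketch { // hypothetical namespace, not part of the library

// Try to parse a T from the start of str; empty optional on failure.
template<typename T>
std::optional<T> possible(std::string_view str)
{
    T value{};
    const auto result = std::from_chars(str.data(), str.data() + str.size(), value);
    if (result.ec != std::errc{}) return std::nullopt; // no valid value, or out of range
    return value;
}

} // namespace sketch
```

Because std::from_chars does not skip leading white space or accept a leading plus sign, inputs such as " 42" or "+42" fail to parse in this sketch. Floating-point overloads of std::from_chars are also missing from some older standard library implementations.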
