Introduction

Iterating through the words of a string separated by whitespace in C++ is a common task in many programming applications. Whether you are working with natural language processing, text analysis, or any other application that involves working with text, the ability to split a string into its individual words can be very useful. In this tutorial, we will learn how to accomplish this using different methods.

Method 1: Using the stringstream class

The stringstream class lets us treat a string as a stream of characters. This means that we can use the same extraction operators that we would use to read from std::cin or a file stream to read from a string.

Example

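#include <iostream>
#include <sstream>
#include <string>

int main() {
    std::string input = "This is a test string";
    std::stringstream ss(input);  // wrap the string in a stream
    std::string word;

    // operator>> skips leading whitespace and reads up to the next whitespace
    while (ss >> word) {
        std::cout << word << std::endl;
    }

    return 0;
}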

Output:

This
is
a
test
string

In the example above, we create a stringstream object and initialize it with the input string. Then, we use the >> operator to extract words from the stream, one at a time, until we reach the end of the string.
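When the words are needed later rather than printed immediately, the same extraction loop can fill a container. The following is a minimal sketch under that assumption; the helper name split_words is hypothetical and does not come from the standard library.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of the original example): collects the
// whitespace-separated words of `text` into a vector.
std::vector<std::string> split_words(const std::string& text) {
    std::istringstream stream(text);  // input-only string stream
    std::vector<std::string> words;
    std::string word;

    // operator>> skips leading whitespace (spaces, tabs, newlines)
    // and stops reading at the next whitespace character.
    while (stream >> word) {
        words.push_back(word);
    }
    return words;
}
```

Calling split_words("This is a test string") should yield five elements. Because the extraction operator treats any run of spaces, tabs, or newlines as one separator, repeated whitespace does not produce empty words.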

Method 2: Using the find, substr, and erase methods

Another way to split a string into words is to use the member functions of std::string itself: find locates the next delimiter, substr extracts the text before it, and erase removes the processed part. In this case, we use a single space as the delimiter.

Example

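#include <iostream>
#include <string>

int main() {
    std::string input = "This is a test string";
    std::string word;
    std::string delimiter = " ";
    std::string::size_type pos = 0;

    // Repeatedly cut the text before the next delimiter off the front of input
    while ((pos = input.find(delimiter)) != std::string::npos) {
        word = input.substr(0, pos);
        std::cout << word << std::endl;
        input.erase(0, pos + delimiter.length());  // drop the word and the delimiter
    }

    // Whatever remains after the last delimiter is the final word
    std::cout << input << std::endl;

    return 0;
}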

Output:

This
is
a
test
string

In the example above, we use the find method to locate the position of the first occurrence of the delimiter (a single space) in the input string. We then use the substr method to extract the word before the delimiter and the erase method to remove the word and the delimiter from the input string. We repeat this process until there are no more occurrences of the delimiter in the input string. Finally, we output the last word, which is the remaining part of the input string after all the other words have been removed. Note that this approach consumes the input string, does not treat tabs or newlines as delimiters, and prints an empty line for each extra space when words are separated by more than one space.
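For comparison, std::getline can also split text on a single delimiter character when the string is wrapped in a stream. The sketch below assumes a hypothetical helper named split_on and skips the empty tokens that repeated delimiters would otherwise produce; it is one possible way to write this, not the only one.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` on one delimiter character using
// std::getline, leaving the caller's string unmodified.
std::vector<std::string> split_on(const std::string& text, char delimiter) {
    std::istringstream stream(text);
    std::vector<std::string> tokens;
    std::string token;

    // The three-argument std::getline reads up to (and discards) the delimiter.
    while (std::getline(stream, token, delimiter)) {
        if (!token.empty()) {  // skip empty tokens from repeated delimiters
            tokens.push_back(token);
        }
    }
    return tokens;
}
```

With a space as the delimiter, split_on("This is a test string", ' ') should return the same five words as the loop above. Unlike the >> operator, std::getline only recognizes the one delimiter character it is given, so tabs and newlines would remain inside the tokens.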

Method 3: Using the strtok function

Another option to split a string into words is to use the strtok function, which separates a C-style string into tokens by overwriting each delimiter in the original string with a null character.

Example

#include <cstring>
#include <iostream>

int main() {
    char input[] = "This is a test string";  // must be a writable array, not a string literal
    char* word = std::strtok(input, " ");

    while (word != nullptr) {
        std::cout << word << std::endl;
        word = std::strtok(nullptr, " ");  // nullptr continues with the same string
    }

    return 0;
}

Output:

This
is
a
test
string

In the example above, we use the strtok function to extract the first word from the input string, using a space as the delimiter. We then use strtok(nullptr, " ") to continue extracting subsequent words until the function returns nullptr, indicating that there are no more words in the input string.
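Because strtok writes into the buffer it scans, it cannot be applied directly to a const std::string. One way to work around this, sketched below with a hypothetical helper named split_with_strtok, is to tokenize a private copy. The delimiter set " \t\n" is an assumption chosen so that tabs and newlines also separate words.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: tokenizes a copy of `text` so that the caller's
// string is left untouched.
std::vector<std::string> split_with_strtok(const std::string& text) {
    // Copy into a mutable, null-terminated buffer, because std::strtok
    // overwrites delimiters with '\0'.
    std::vector<char> buffer(text.begin(), text.end());
    buffer.push_back('\0');

    const char* delimiters = " \t\n";  // assumed delimiter set
    std::vector<std::string> words;

    for (char* token = std::strtok(buffer.data(), delimiters);
         token != nullptr;
         token = std::strtok(nullptr, delimiters)) {
        words.emplace_back(token);  // copy the token out of the buffer
    }
    return words;
}
```

Note that std::strtok keeps its position in hidden static state between calls, so two tokenizations cannot be interleaved and the function is not safe to call from several threads at once.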

Conclusion

In this tutorial, we have learned three different methods for iterating through the words of a string separated by whitespace in C++: using the stringstream class, using the find, substr, and erase methods of std::string, and using the strtok function. Each method has trade-offs. The stringstream approach handles any mix of spaces, tabs, and newlines and leaves the input intact. The find-based approach splits on one specific delimiter and, as written here, consumes the input string. The strtok approach works on C-style character arrays, modifies the buffer in place, and keeps internal state between calls. The best approach will depend on the specific requirements of your application.

References

Read more about the std::stringstream class and its constructors here: https://cplusplus.com/reference/sstream/stringstream/stringstream/

Read more about the std::strtok function here: https://en.cppreference.com/w/cpp/string/byte/strtok