
Understanding strtok() in C

February 14, 2026

C

1. What is strtok?

strtok (string tokenize) is a C standard library function that splits a string into tokens based on delimiter characters.

#include <string.h>
char *strtok(char *str, const char *delim);

Parameters:

  • str - String to tokenize (first call) or NULL (subsequent calls)
  • delim - String containing delimiter characters

Returns:

  • Pointer to the next token
  • NULL when no more tokens are found

2. Key Characteristic: Stateful Behavior

strtok is stateful - it remembers its position between calls using an internal static variable. This is why we use NULL in subsequent calls.

3. How It Works

3.1. The Two-Parameter Purpose

  1. First parameter (str/NULL): WHERE to search

    • Pass the string on first call to initialize
    • Pass NULL on subsequent calls to continue from saved position
  2. Second parameter (delim): WHAT to search for

    • Specifies which characters are delimiters
    • Must be provided on every call (can be different each time)

3.2. What strtok Does

  1. Skips any leading delimiter characters to find the start of the next token
  2. Searches for the next delimiter character after that start
  3. Replaces that delimiter with '\0' (modifies original string!)
  4. Returns pointer to the start of the current token
  5. Saves position internally for next call
  6. Returns NULL when no more tokens exist
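
The steps above can be sketched as a small function. This is only a simplified model under stated assumptions: my_strtok is a hypothetical name, and real library implementations may differ in detail. It uses the standard strspn and strcspn functions to skip delimiters and to find the end of a token.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name: a simplified model of strtok, not the library's source. */
char *my_strtok(char *str, const char *delim) {
    static char *saved = NULL;              /* position remembered between calls */

    if (str == NULL) {
        str = saved;                        /* continue where the last call stopped */
    }
    if (str == NULL) {
        return NULL;                        /* nothing left to continue from */
    }

    str += strspn(str, delim);              /* skip any leading delimiters */
    if (*str == '\0') {
        saved = NULL;
        return NULL;                        /* only delimiters (or nothing) left */
    }

    char *end = str + strcspn(str, delim);  /* first delimiter after the token */
    if (*end == '\0') {
        saved = NULL;                       /* token runs to the end of the string */
    } else {
        *end = '\0';                        /* overwrite the delimiter */
        saved = end + 1;                    /* next call starts just past it */
    }
    return str;
}
```

The static variable is what makes the function stateful: it is the only thing that connects one call to the next.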

4. Step-by-Step Example

4.1. Input String

char str[] = "James,Hong Kong,20";

4.2. First Call - Initialize

char* name = strtok(str, ",");

What happens:

  1. Searches str for first ','
  2. Replaces ',' with '\0'
  3. Returns pointer to "James"
  4. Saves internal position after the '\0'

Memory after:

"James\0Hong Kong,20"
      ↑ replaced
        ↑ saved position

Result: name points to "James"

4.3. Second Call - Continue

char* addr = strtok(NULL, ",");

What happens:

  1. NULL means "use saved position"
  2. Continues from saved position
  3. Searches for next ','
  4. Replaces it with '\0'
  5. Returns pointer to "Hong Kong"
  6. Updates saved position

Memory after:

"James\0Hong Kong\020"
                 ↑ replaced
                   ↑ new saved position

Result: addr points to "Hong Kong"

4.4. Third Call - Continue

char* hours = strtok(NULL, ",");

What happens:

  1. Continues from saved position
  2. No more ',' found
  3. Returns pointer to "20"
  4. Reaches end of string

Memory after:

"James\0Hong Kong\020\0"
                      ↑ end of original string

Result: hours points to "20"

4.5. Fourth Call - End

char* next = strtok(NULL, ",");

Result: next is NULL (no more tokens)
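
To watch the buffer change between calls, one option is to print its raw bytes after each call. The sketch below assumes two hypothetical helpers, render_bytes and trace_strtok, which are not part of any library. render_bytes spells every '\0' byte as the two visible characters \0.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies len bytes of buf into out, writing each '\0'
   byte as the two characters \0. Stops early if out is too small. */
void render_bytes(const char *buf, size_t len, char *out, size_t out_size) {
    size_t used = 0;
    if (out_size == 0) {
        return;
    }
    /* Keep room for a two-character escape plus the terminator. */
    for (size_t i = 0; i < len && used + 3 <= out_size; i++) {
        if (buf[i] == '\0') {
            out[used++] = '\\';
            out[used++] = '0';
        } else {
            out[used++] = buf[i];
        }
    }
    out[used] = '\0';
}

/* Hypothetical helper: tokenizes the example string and prints the buffer
   after every successful strtok call. Returns the number of tokens. */
int trace_strtok(void) {
    char str[] = "James,Hong Kong,20";
    size_t len = sizeof str - 1;            /* bytes before the final '\0' */
    char view[64];
    int count = 0;

    for (char *tok = strtok(str, ","); tok != NULL; tok = strtok(NULL, ",")) {
        render_bytes(str, len, view, sizeof view);
        printf("token %-10s buffer %s\n", tok, view);
        count++;
    }
    return count;
}
```

Calling trace_strtok() prints one line per token. The buffer shown for the third token is identical to the one shown for the second, because the last token ends at the string's own terminator and nothing needs to be overwritten.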

5. Visual Flow Diagram

Original string: "James,Hong Kong,20"

Call 1: strtok(str, ",")
        "James\0Hong Kong,20"
         ↑ returns pointer to "James"

Call 2: strtok(NULL, ",")
        "James\0Hong Kong\020"
                ↑ returns pointer to "Hong Kong"

Call 3: strtok(NULL, ",")
        "James\0Hong Kong\020\0"
                           ↑ returns pointer to "20"

Call 4: strtok(NULL, ",")
        returns NULL (no more tokens)

6. Complete Working Example

#include <stdio.h>
#include <string.h>

int main(void) {
    char str[] = "James,Hong Kong,20";
    
    char* name  = strtok(str, ",");   // "James"
    char* addr  = strtok(NULL, ",");  // "Hong Kong"
    char* hours = strtok(NULL, ",");  // "20"
    
    printf("Name: %s\n", name);       // Name: James
    printf("Address: %s\n", addr);    // Address: Hong Kong
    printf("Hours: %s\n", hours);     // Hours: 20
    
    return 0;
}

7. Why Use NULL in Subsequent Calls?

If we DON'T use NULL:

char* name  = strtok(str, ",");   // "James"
char* addr  = strtok(str, ",");   // "James" again!
char* hours = strtok(str, ",");   // "James" again!

Passing the string again resets to the beginning.

With NULL (correct):

char* name  = strtok(str, ",");   // "James"
char* addr  = strtok(NULL, ",");  // "Hong Kong" ✓
char* hours = strtok(NULL, ",");  // "20" ✓

NULL tells strtok to continue from where it stopped.

8. Multiple Delimiter Characters

We can specify multiple delimiters in one string:

char str[] = "name=James;addr=Hong Kong;hours=20";
char* token = strtok(str, "=;");  // Split on '=' OR ';'; returns "name"

// Further calls to strtok(NULL, "=;") return, in order:
// "James", "addr", "Hong Kong", "hours", "20"

Common use case:

char str[] = "one  two\tthree\nfour";
char* token = strtok(str, " \t\n");  // Split on space, tab, or newline

while (token != NULL) {
    printf("%s\n", token);
    token = strtok(NULL, " \t\n");
}

9. Using Different Delimiters Per Call

Each call can use different delimiters:

char str[] = "name:James,addr:Hong Kong,hours:20";

char* field1 = strtok(str, ":");    // "name"
char* value1 = strtok(NULL, ",");   // "James"
char* field2 = strtok(NULL, ":");   // "addr"
char* value2 = strtok(NULL, ",");   // "Hong Kong"
// And so on...
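
The "and so on" can be turned into a loop that alternates the two delimiters. This is a minimal sketch: print_pairs is a hypothetical helper name, and it assumes the buffer has the shape "field:value,field:value".

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints each field/value pair found in a modifiable
   buffer and returns how many complete pairs it found. */
int print_pairs(char *buf) {
    int pairs = 0;
    char *field = strtok(buf, ":");         /* field name ends at ':' */

    while (field != NULL) {
        char *value = strtok(NULL, ",");    /* value ends at ',' or end of string */
        if (value == NULL) {
            break;                          /* field without a value: stop */
        }
        printf("%s = %s\n", field, value);
        pairs++;
        field = strtok(NULL, ":");          /* next field name */
    }
    return pairs;
}
```

With the string from this section, the loop finds three pairs. A trailing field that has no value is ignored.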

10. Loop Pattern

Common pattern to process all tokens:

char str[] = "apple,banana,cherry,date";
char* token = strtok(str, ",");

while (token != NULL) {
    printf("%s\n", token);
    token = strtok(NULL, ",");
}

// Output:
// apple
// banana
// cherry
// date

11. Critical Warnings

11.1. Modifies Original String

strtok destroys the original string by replacing delimiters with '\0'.

char str[] = "a,b,c";
printf("%s\n", str);        // "a,b,c"

strtok(str, ",");
printf("%s\n", str);        // "a" (rest is still there but separated by \0)

Solution: Make a copy if we need the original

char original[] = "James,Hong Kong,20";
char copy[sizeof original];       // Same size as original, so the copy always fits
strcpy(copy, original);

char* token = strtok(copy, ",");  // Work with copy
// original is preserved
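
When the input length is not known at compile time, a fixed-size buffer may be too small. One possible approach, sketched here with a hypothetical helper named copy_for_tokenizing, is to allocate the copy on the heap:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of src that strtok may modify.
   The caller must free() the result. Returns NULL if allocation fails. */
char *copy_for_tokenizing(const char *src) {
    size_t size = strlen(src) + 1;          /* +1 for the terminating '\0' */
    char *copy = malloc(size);

    if (copy != NULL) {
        memcpy(copy, src, size);
    }
    return copy;
}
```

Because the helper takes a const char*, it also accepts string literals safely: only the heap copy is modified.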

11.2. Not Thread-Safe

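strtok keeps its position in hidden shared state (typically a static variable), so it is not thread-safe: threads that call strtok concurrently interfere with each other. For the same reason, two tokenizing loops cannot be nested or interleaved, even in a single thread.

Solution: Use strtok_r (the reentrant version, defined by POSIX rather than standard C), which stores its position in a pointer supplied by the caller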

char* saveptr;
char* token = strtok_r(str, ",", &saveptr);
char* next  = strtok_r(NULL, ",", &saveptr);
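
Because each strtok_r call sequence carries its own save pointer, one tokenizing loop can run inside another. The sketch below assumes a POSIX system and a hypothetical helper named count_fields. The input shape "a=1,b=2;c=3" is also just an illustration.

```c
#define _POSIX_C_SOURCE 200809L   /* ask the C library to declare strtok_r */
#include <string.h>

/* Hypothetical helper: counts the fields in a modifiable buffer where ';'
   separates records and ',' separates the fields inside a record.
   Each level keeps its position in its own save pointer. */
int count_fields(char *buf) {
    int fields = 0;
    char *record_state;
    char *field_state;

    for (char *record = strtok_r(buf, ";", &record_state);
         record != NULL;
         record = strtok_r(NULL, ";", &record_state)) {
        for (char *field = strtok_r(record, ",", &field_state);
             field != NULL;
             field = strtok_r(NULL, ",", &field_state)) {
            fields++;
        }
    }
    return fields;
}
```

The same nesting written with plain strtok would not work, since the inner loop would overwrite the single saved position that the outer loop depends on.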

11.3. Cannot Use on String Literals

char* str = "a,b,c";         // String literal (often stored in read-only memory)
strtok(str, ",");            // ❌ Undefined behavior, usually a crash: literals must not be modified

Solution: Use array instead

char str[] = "a,b,c";        // Modifiable array
strtok(str, ",");            // ✓ Works

11.4. Empty Tokens

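strtok treats a run of consecutive delimiters as a single separator and also skips delimiters at the start and end of the string, so it never returns an empty token: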

char str[] = "a,,c";         // Two commas
strtok(str, ",");            // "a"
strtok(NULL, ",");           // "c" (skips empty token!)

If we need to preserve empty tokens, use strsep or a manual strchr loop instead (see section 13).
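
As one illustration of an approach that keeps empty tokens, here is a minimal sketch built on the standard strcspn function. next_field is a hypothetical helper, not a library function. Its position lives in a caller-supplied pointer instead of a static variable.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: like strtok, but returns empty tokens too.
   *cursor must point into a modifiable string. It is advanced past each
   token and set to NULL once the last token has been returned. */
char *next_field(char **cursor, const char *delim) {
    char *start = *cursor;
    if (start == NULL) {
        return NULL;                        /* the last field was already returned */
    }

    char *end = start + strcspn(start, delim);
    if (*end == '\0') {
        *cursor = NULL;                     /* this is the last field */
    } else {
        *end = '\0';                        /* terminate the field */
        *cursor = end + 1;                  /* resume just past the delimiter */
    }
    return start;
}
```

For "a,,c" this yields "a", then an empty string, then "c", where strtok would return only "a" and "c".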

12. Common Use Cases

12.1. Parsing CSV Data

char line[] = "John,Doe,30,Engineer";
char* first = strtok(line, ",");
char* last  = strtok(NULL, ",");
char* age   = strtok(NULL, ",");
char* job   = strtok(NULL, ",");
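
In real input a field may be missing, in which case strtok returns NULL and the pointer must not be used. The sketch below adds those checks and converts the age with strtol. The struct person type and the parse_person helper are hypothetical names introduced only for this example. Since strtok skips empty fields, this approach assumes every column is non-empty.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type for the four-column line used above. */
struct person {
    const char *first;
    const char *last;
    long age;
    const char *job;
};

/* Fills *out from a modifiable line. The string members point into line.
   Returns 0 on success, -1 if a field is missing or age is not a number. */
int parse_person(char *line, struct person *out) {
    char *first = strtok(line, ",");
    char *last  = strtok(NULL, ",");
    char *age   = strtok(NULL, ",");
    char *job   = strtok(NULL, ",");

    if (first == NULL || last == NULL || age == NULL || job == NULL) {
        return -1;                          /* fewer than four fields */
    }

    char *end;
    long value = strtol(age, &end, 10);
    if (end == age || *end != '\0') {
        return -1;                          /* no digits, or trailing characters */
    }

    out->first = first;
    out->last = last;
    out->age = value;
    out->job = job;
    return 0;
}
```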

12.2. Parsing Command-Line Input

char input[] = "add 5 10";
char* cmd  = strtok(input, " ");   // "add"
char* arg1 = strtok(NULL, " ");    // "5"
char* arg2 = strtok(NULL, " ");    // "10"
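
The tokens can then drive a small command dispatcher. This sketch assumes a hypothetical helper named run_command that understands only the "add" command shown above. It converts the arguments with strtol, which yields 0 for text that does not start with a number.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: evaluates a line such as "add 5 10" held in a
   modifiable buffer. Stores the sum in *result and returns 0, or returns -1
   for an unknown command or a missing argument. */
int run_command(char *input, long *result) {
    char *cmd  = strtok(input, " ");
    char *arg1 = strtok(NULL, " ");
    char *arg2 = strtok(NULL, " ");

    if (cmd == NULL || arg1 == NULL || arg2 == NULL) {
        return -1;                          /* too few words */
    }
    if (strcmp(cmd, "add") != 0) {
        return -1;                          /* only "add" is supported here */
    }

    *result = strtol(arg1, NULL, 10) + strtol(arg2, NULL, 10);
    return 0;
}
```

Because strtok collapses runs of delimiters, extra spaces between the words do not change the result.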

12.3. Splitting Path Components

char path[] = "/usr/local/bin";
char* token = strtok(path, "/");

while (token != NULL) {
    printf("Component: %s\n", token);
    token = strtok(NULL, "/");
}
// Output:
// Component: usr
// Component: local
// Component: bin

13. Alternatives to strtok

13.1. strtok_r (Thread-Safe)

char* saveptr;
char* token = strtok_r(str, ",", &saveptr);
while (token != NULL) {
    printf("%s\n", token);
    token = strtok_r(NULL, ",", &saveptr);
}

13.2. strsep (BSD/Linux)

char* token;
char* ptr = str;
while ((token = strsep(&ptr, ",")) != NULL) {
    printf("%s\n", token);  // Handles empty tokens
}

13.3. Manual Parsing

char* start = str;
char* end;
while ((end = strchr(start, ',')) != NULL) {
    *end = '\0';
    printf("%s\n", start);
    start = end + 1;
}
printf("%s\n", start);  // Last token

14. Summary Table

Feature           Behavior
First parameter   String pointer (first call) or NULL (subsequent calls)
Second parameter  Delimiter characters (required every call)
Modifies string   Yes - replaces delimiters with '\0'
Thread-safe       No - use strtok_r instead
Empty tokens      Skipped (consecutive delimiters ignored)
String literals   Cannot use - undefined behavior, usually a crash
Returns           Pointer to token, or NULL when done

15. Quick Reference

// Basic usage
char str[] = "a,b,c";
char* tok1 = strtok(str, ",");     // First token
char* tok2 = strtok(NULL, ",");    // Second token
char* tok3 = strtok(NULL, ",");    // Third token

// Loop pattern
char str[] = "a,b,c";
char* token = strtok(str, ",");
while (token != NULL) {
    // Process token
    token = strtok(NULL, ",");
}

// Preserve original
char original[] = "a,b,c";
char copy[sizeof original];         // always large enough for the copy
strcpy(copy, original);
strtok(copy, ",");                  // original unchanged

// Thread-safe version
char* saveptr;
strtok_r(str, ",", &saveptr);

16. Key Takeaways

  1. First call: Pass the string to initialize
  2. Subsequent calls: Pass NULL to continue
  3. Delimiter: Can be different for each call
  4. Modifies string: Always make a copy if we need the original
  5. Loop pattern: Common idiom is while (token != NULL)
  6. Thread safety: Use strtok_r for multithreaded code
  7. Null-terminates tokens: The function replaces the delimiter after each token with '\0'
  8. Stateful: Maintains internal position between calls