1. The Problem: Different Byte Orders (Endianness)
1.1. What is Endianness?
Endianness refers to the order in which bytes are stored in memory for multi-byte data types.
Consider the 32-bit integer: 0x12345678
Big-Endian (Most significant byte first):
Little-Endian (Least significant byte first):
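The two layouts can be made concrete with a small sketch. The helper names `store_big_endian` and `store_little_endian` are our own, chosen for illustration; they write the bytes explicitly with shifts, so the result does not depend on the machine running the code.

```c
#include <stdint.h>

/* Big-endian layout: most significant byte at the lowest address.
   For 0x12345678 the bytes are 12 34 56 78. */
void store_big_endian(uint32_t value, uint8_t out[4]) {
    out[0] = (uint8_t)(value >> 24);
    out[1] = (uint8_t)(value >> 16);
    out[2] = (uint8_t)(value >> 8);
    out[3] = (uint8_t)value;
}

/* Little-endian layout: least significant byte at the lowest address.
   For 0x12345678 the bytes are 78 56 34 12. */
void store_little_endian(uint32_t value, uint8_t out[4]) {
    out[0] = (uint8_t)value;
    out[1] = (uint8_t)(value >> 8);
    out[2] = (uint8_t)(value >> 16);
    out[3] = (uint8_t)(value >> 24);
}
```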
1.2. Why This Matters
Different CPU architectures use different byte orders:
| Architecture | Byte Order |
|---|---|
| x86/x64 (Intel, AMD) | Little-Endian |
| ARM | Bi-endian (usually Little-Endian) |
| PowerPC | Bi-endian (traditionally Big-Endian) |
| Network Protocols | Big-Endian (Network Byte Order) |
The problem: If we write multi-byte data in native byte order on one system and read it on another with different endianness, the bytes are interpreted in reverse order and the values come out wrong.
1.3. Example of the Problem
We write the integer 305,419,896 (0x12345678) on an x86 machine to a file:
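A minimal sketch of what goes wrong, assuming the value was written in native order with no conversion. The array `X86_FILE_BYTES` and both reader functions are illustrative names; the readers simulate how each kind of CPU would interpret the same four bytes.

```c
#include <stdint.h>

/* The four bytes an x86 (little-endian) machine writes for 0x12345678. */
const uint8_t X86_FILE_BYTES[4] = { 0x78, 0x56, 0x34, 0x12 };

/* How a big-endian CPU interprets four bytes: first byte is most significant. */
uint32_t read_as_big_endian(const uint8_t b[4]) {
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* How a little-endian CPU interprets them: first byte is least significant. */
uint32_t read_as_little_endian(const uint8_t b[4]) {
    return ((uint32_t)b[3] << 24) | ((uint32_t)b[2] << 16) |
           ((uint32_t)b[1] << 8)  |  (uint32_t)b[0];
}
```

Read back on another little-endian machine, the bytes give 0x12345678 again. Read on a big-endian machine, the same bytes give 0x78563412 (2,018,915,346), which is a different number altogether.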
2. The Solution: Network Byte Order
2.1. Standard Convention
To ensure compatibility across different systems, protocols and file formats define a standard byte order:
- Network Byte Order = Big-Endian
- The core Internet protocols (IP, TCP, UDP) use big-endian for their multi-byte header fields
- Many binary file formats also use big-endian for consistency
2.2. Conversion Functions
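POSIX systems provide four functions, declared in <arpa/inet.h>, that convert between host byte order (our machine) and network byte order (the big-endian standard). They are not part of the ISO C standard library itself, but they are available on practically every Unix-like platform.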
3. The Four Functions
3.1. htons() - Host to Network Short
- Converts 16-bit value from host to network byte order
- Use when writing 16-bit values to network/file
3.2. htonl() - Host to Network Long
- Converts 32-bit value from host to network byte order
- Use when writing 32-bit values to network/file
3.3. ntohs() - Network to Host Short
- Converts 16-bit value from network to host byte order
- Use when reading 16-bit values from network/file
3.4. ntohl() - Network to Host Long
- Converts 32-bit value from network to host byte order
- Use when reading 32-bit values from network/file
4. Function Name Breakdown
The other three names follow the same pattern as ntohs (Network TO Host Short, 16-bit):
- ntohl: Network TO Host Long (32-bit)
- htons: Host TO Network Short (16-bit)
- htonl: Host TO Network Long (32-bit)
5. When to Use These Functions
5.1. Always Use When:
- Writing/reading binary data to files that might be used on different architectures
- Sending/receiving data over networks
- Implementing binary protocols (like our database file format)
- Working with portable binary formats
5.2. Don't Need When:
- Data stays within same program (never written to disk/network)
- Using text-based formats (JSON, XML, CSV)
- Using standard serialization libraries that handle it
6. Complete Example: Our Database Header
6.1. Writing to File (Host → Network)
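A sketch of the write path, under stated assumptions: the `struct db_header` layout below (fields `magic`, `version`, `count`, `filesize`) is hypothetical and only stands in for whatever header our database format actually defines. Each field is converted and written separately, which avoids depending on struct padding.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical header layout; field names and sizes are illustrative only. */
struct db_header {
    uint32_t magic;
    uint16_t version;
    uint16_t count;
    uint32_t filesize;
};

/* Write the header in network byte order, one field at a time.
   Returns 0 on success, -1 if any write falls short. */
int write_header(FILE *fp, const struct db_header *h) {
    assert(fp != NULL && h != NULL);

    uint32_t magic    = htonl(h->magic);     /* 32-bit -> htonl */
    uint16_t version  = htons(h->version);   /* 16-bit -> htons */
    uint16_t count    = htons(h->count);
    uint32_t filesize = htonl(h->filesize);

    if (fwrite(&magic, sizeof magic, 1, fp) != 1) return -1;
    if (fwrite(&version, sizeof version, 1, fp) != 1) return -1;
    if (fwrite(&count, sizeof count, 1, fp) != 1) return -1;
    if (fwrite(&filesize, sizeof filesize, 1, fp) != 1) return -1;
    return 0;
}
```

The in-memory header is left untouched: the conversions are made on local copies, so the caller can keep using `h` in host byte order after the write.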
6.2. Reading from File (Network → Host)
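A matching sketch of the read path. As before, the `struct db_header` layout is a hypothetical stand-in for our real header; the point is only that every multi-byte field passes through `ntohl()` or `ntohs()` before the program uses it.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical header layout; field names and sizes are illustrative only. */
struct db_header {
    uint32_t magic;
    uint16_t version;
    uint16_t count;
    uint32_t filesize;
};

/* Read a header stored in network byte order and convert it to host order.
   Returns 0 on success, -1 if the file is too short. */
int read_header(FILE *fp, struct db_header *h) {
    assert(fp != NULL && h != NULL);

    uint32_t magic, filesize;
    uint16_t version, count;

    if (fread(&magic, sizeof magic, 1, fp) != 1) return -1;
    if (fread(&version, sizeof version, 1, fp) != 1) return -1;
    if (fread(&count, sizeof count, 1, fp) != 1) return -1;
    if (fread(&filesize, sizeof filesize, 1, fp) != 1) return -1;

    h->magic    = ntohl(magic);      /* 32-bit -> ntohl */
    h->version  = ntohs(version);    /* 16-bit -> ntohs */
    h->count    = ntohs(count);
    h->filesize = ntohl(filesize);
    return 0;
}
```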
7. Size Guide: Which Function to Use?
| Data Type | Size | Function (Write) | Function (Read) |
|---|---|---|---|
| short, uint16_t | 16-bit | htons() | ntohs() |
| int, uint32_t | 32-bit | htonl() | ntohl() |
| long long, uint64_t | 64-bit | Manual or htobe64() | Manual or be64toh() |
| char, uint8_t | 8-bit | None needed | None needed |
Note: For 64-bit integers, some systems provide htobe64() and be64toh(), but they're not as universally available as the 16/32-bit versions.
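One way to do the "manual" conversion is to build the big-endian bytes with shifts, which works the same on every host. The helper names `put_u64_be` and `get_u64_be` are our own; this is a sketch of the technique, not a standard API.

```c
#include <stdint.h>

/* Write a 64-bit value as 8 big-endian bytes, independent of host order. */
void put_u64_be(uint8_t out[8], uint64_t value) {
    for (int i = 0; i < 8; i++) {
        /* i = 0 takes the top byte (shift 56), i = 7 the bottom byte (shift 0). */
        out[i] = (uint8_t)(value >> (56 - 8 * i));
    }
}

/* Rebuild the 64-bit value from 8 big-endian bytes. */
uint64_t get_u64_be(const uint8_t in[8]) {
    uint64_t value = 0;
    for (int i = 0; i < 8; i++) {
        value = (value << 8) | in[i];
    }
    return value;
}
```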
8. What Happens on Big-Endian Systems?
On big-endian systems, these functions do nothing (they're typically macros that expand to the identity):
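To illustrate the idea, here is a stand-in for `htonl()` written by hand. It is not how any particular C library implements the function; `sketch_htonl` is a hypothetical name, and the compile-time check assumes the `__BYTE_ORDER__` macros that GCC and Clang predefine.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Illustrative stand-in for htonl(); assumes GCC/Clang predefined macros. */
uint32_t sketch_htonl(uint32_t x) {
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    /* Host order already equals network order: nothing to do. */
    return x;
#elif defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    /* Reverse the four bytes. */
    return ((x & 0x000000FFu) << 24) |
           ((x & 0x0000FF00u) << 8)  |
           ((x & 0x00FF0000u) >> 8)  |
           ((x & 0xFF000000u) >> 24);
#else
#error "byte order not detected; this sketch assumes GCC or Clang"
#endif
}

/* Cross-check against the system's real htonl(). */
int sketch_matches_system(uint32_t x) {
    return sketch_htonl(x) == htonl(x);
}
```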
This means:
- No performance penalty on big-endian systems
- Automatic conversion on little-endian systems
- Our code works everywhere regardless of architecture
9. Real-World Example: IP Addresses
IP addresses in network programming must use network byte order:
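A minimal sketch of filling in a `struct sockaddr_in` for a listening socket. The helper name `make_listen_addr` is our own; the struct fields, `INADDR_ANY`, and the conversion calls are the standard sockets API.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Build an IPv4 address for binding to any local interface on `port`. */
struct sockaddr_in make_listen_addr(uint16_t port) {
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);

    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);               /* 16-bit port -> htons */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);  /* 32-bit address -> htonl */
    return addr;
}
```

With `make_listen_addr(8080)`, the two bytes of `sin_port` are `1F 90` on every host, which is what the kernel expects when the struct is passed to `bind()`.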
If we forget htons(8080):
- On little-endian: tries to bind to port 36895 (0x901F, the byte-swapped form of 0x1F90) instead!
- On big-endian: works correctly, but not portable
10. Common Mistakes
10.1. Forgetting to Convert
10.1.1. Problem
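A sketch of the mistake, using a hypothetical helper `store_count_unconverted`. The value is copied out exactly as it sits in memory, so the bytes that end up in the file depend on the CPU that wrote them.

```c
#include <stdint.h>
#include <string.h>

/* BUG: copies the value in host byte order with no conversion.
   A little-endian host emits 78 56 34 12 for 0x12345678,
   a big-endian host emits 12 34 56 78. */
void store_count_unconverted(uint8_t out[4], uint32_t count) {
    memcpy(out, &count, sizeof count);
}
```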
10.1.2. Correct
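A sketch of the fix, with `store_count` as an illustrative helper name: convert with `htonl()` first, then copy the converted value.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Convert to network byte order before the bytes leave the program.
   0x12345678 is always stored as 12 34 56 78. */
void store_count(uint8_t out[4], uint32_t count) {
    uint32_t net = htonl(count);
    memcpy(out, &net, sizeof net);
}
```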
10.2. Wrong Function
10.2.1. Problem
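A sketch of what using the 16-bit function on a 32-bit value does. The function name `convert_filesize_wrong` is hypothetical. Because `htons()` takes a 16-bit argument, the upper half of the value is lost before any byte swapping happens; the cast below only makes that truncation visible.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* BUG: htons() operates on 16 bits, so the upper 16 bits of
   filesize are discarded. 0x12345678 is reduced to 0x5678. */
uint32_t convert_filesize_wrong(uint32_t filesize) {
    return htons((uint16_t)filesize);
}
```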
10.2.2. Correct
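A sketch of matching the function to the width of the field. Both helper names are illustrative; the rule they show is simply that 32-bit fields go through `htonl()` and 16-bit fields through `htons()`.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* 32-bit field: use the "l" (long) function. */
uint32_t convert_filesize(uint32_t filesize) {
    return htonl(filesize);
}

/* 16-bit field: use the "s" (short) function. */
uint16_t convert_version(uint16_t version) {
    return htons(version);
}
```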
10.3. Converting Text Data
10.3.1. Problem
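Text is a sequence of single bytes, so it has no byte order to convert. The sketch below contrasts a buggy helper that pushes four characters through `htonl()` with one that copies them unchanged; both function names are hypothetical.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* BUG: treats four characters as an integer and converts it.
   On a little-endian host "ABCD" is stored as "DCBA". */
void store_tag_wrong(uint8_t out[4], const char tag[4]) {
    uint32_t as_int;
    memcpy(&as_int, tag, sizeof as_int);
    as_int = htonl(as_int);
    memcpy(out, &as_int, sizeof as_int);
}

/* Correct: characters are copied byte for byte, with no conversion. */
void store_tag(uint8_t out[4], const char tag[4]) {
    memcpy(out, tag, 4);
}
```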
11. Testing for Endianness
We can detect our system's byte order at run time by storing a known multi-byte value and inspecting which byte sits at the lowest address.
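A common way to write this check is sketched below. The function names are our own. The second helper ties the result back to the conversion functions: `htons()` leaves a value unchanged only on a big-endian host.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Return 1 if the host stores the least significant byte first, else 0. */
int is_little_endian(void) {
    uint16_t probe = 1;
    /* Look at the byte at the lowest address of the 16-bit value. */
    return *(const unsigned char *)&probe == 1;
}

/* Return 1 if htons() leaves values unchanged (big-endian host), else 0. */
int htons_is_noop(void) {
    return htons(1) == 1;
}
```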
12. Summary
12.1. The Rule of Thumb
Always use byte order conversion functions when:
- Data crosses machine boundaries (network, files, IPC)
- We want portable binary formats
- Working with multi-byte integers (16-bit, 32-bit, etc.)
Key Points:
- Network byte order = Big-endian
- Host byte order = Our CPU's native order
- Convert when writing: Use htons()/htonl()
- Convert when reading: Use ntohs()/ntohl()
- Choose by size: 16-bit → s functions, 32-bit → l functions
- No penalty: Functions are no-ops on big-endian systems
12.2. Quick Reference Card
By consistently using these functions, our code will work correctly on any architecture, ensuring data portability and compatibility across different systems.
13. Understanding Pack and Unpack
13.1. What Does "Packing" and "Unpacking" Mean?
In this guide, packing and unpacking refer to the two directions of byte order conversion:
- Packing = Converting data from host byte order to network byte order (for storage/transmission)
- Unpacking = Converting data from network byte order to host byte order (for use)
13.2. The Process
When data is in a file (packed):
This is the "packed" format - standardized for storage/transmission.
After reading into memory (still packed):
Unpacking (converting to host byte order):
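The three stages can be sketched with one 32-bit value. The helper names are illustrative, and the file contents are represented here by a plain byte array holding 0x12345678 in network byte order (`12 34 56 78`).

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Stages 1 and 2: copy the file's bytes into an integer with no conversion.
   The result is still packed: on a little-endian host it does not yet
   hold the intended value. */
uint32_t load_packed(const uint8_t file_bytes[4]) {
    uint32_t packed;
    memcpy(&packed, file_bytes, sizeof packed);
    return packed;
}

/* Stage 3: unpack by converting from network to host byte order. */
uint32_t unpack_u32(uint32_t packed) {
    return ntohl(packed);
}
```

After `unpack_u32(load_packed(bytes))` the program holds 0x12345678 on any host, while the intermediate packed value equals `htonl(0x12345678)`.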
13.3. Pack vs Unpack in Practice
Pack (before writing):
Unpack (after reading):
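Both directions can be sketched as a pair of in-place helpers. The `struct db_header` fields are hypothetical, as are the function names `pack_header` and `unpack_header`; they only illustrate the pattern of converting every multi-byte field.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical header layout; field names and sizes are illustrative only. */
struct db_header {
    uint32_t magic;
    uint16_t version;
    uint16_t count;
    uint32_t filesize;
};

/* Pack: host -> network byte order, in place, before writing. */
void pack_header(struct db_header *h) {
    h->magic    = htonl(h->magic);
    h->version  = htons(h->version);
    h->count    = htons(h->count);
    h->filesize = htonl(h->filesize);
}

/* Unpack: network -> host byte order, in place, after reading. */
void unpack_header(struct db_header *h) {
    h->magic    = ntohl(h->magic);
    h->version  = ntohs(h->version);
    h->count    = ntohs(h->count);
    h->filesize = ntohl(h->filesize);
}
```

Converting in place is compact, but a packed header must not be used for arithmetic or comparisons until it has been unpacked again. Keeping separate variables for the packed copy is a reasonable alternative.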
13.4. Why the Terminology?
The "pack/unpack" terminology comes from the idea that:
- Packed = data converted into a standard format for storage/transmission (like packing a suitcase for travel); no compression is involved, only a fixed byte order
- Unpacked = data expanded/converted into a format our system can directly use (like unpacking the suitcase at our destination)
13.5. Visual Workflow
13.6. Key Takeaways
- Packing happens before writing/sending data
- Unpacking happens after reading/receiving data
- Both operations ensure data is correctly interpreted regardless of CPU architecture
- The file/network always stores the packed (network byte order) format
- Our program works with unpacked (host byte order) data
- Always pack before writing, always unpack after reading











