11.3. Working with Binary Data Record Layouts | TutorialsJet

11.3. Working with Binary Data Record Layouts

Working with binary record layouts in Python usually involves the struct module, which converts between Python values and packed bytes. Its core functions are:

🡆struct.pack(format, v1, v2, ...): Packs Python values into a bytes object according to the format string.

🡆struct.unpack(format, buffer): Unpacks a bytes object into a tuple of Python values according to the format string.

🡆struct.calcsize(format): Returns the size, in bytes, of the packed data described by the format string.

Together, these functions pack and unpack binary data according to a specified format string.
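As a quick illustration of how the three functions fit together, the sketch below packs two values in memory and unpacks them again. The format '<HI' and the values 1 and 2 are chosen purely for illustration.

```python
import struct

fmt = '<HI'  # little-endian: unsigned short (2 bytes) + unsigned int (4 bytes)

packed = struct.pack(fmt, 1, 2)
size = struct.calcsize(fmt)

print(packed)   # b'\x01\x00\x02\x00\x00\x00'
print(size)     # 6

# unpack() always returns a tuple, even for a single field.
values = struct.unpack(fmt, packed)
print(values)   # (1, 2)

# The buffer length must equal calcsize(fmt); otherwise struct.error is raised.
error_message = None
try:
    struct.unpack(fmt, packed[:-1])
except struct.error as exc:
    error_message = str(exc)
    print("unpack failed:", error_message)
```

Because unpack() rejects buffers of the wrong length, calcsize() is the natural way to decide how many bytes to read before unpacking.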

Example

Consider a binary file with a header containing an unsigned short for version and an unsigned int for data length, both in little-endian.

import struct

# Simulate the content of a binary file.
# Version (2 bytes, H), Length (4 bytes, I)
mock_binary_header = struct.pack('<HI', 101, 512)

with open('mock_file.bin', 'wb') as f:
    f.write(mock_binary_header)

with open('mock_file.bin', 'rb') as f:
    header_data = f.read(struct.calcsize('<HI'))

version, length = struct.unpack('<HI', header_data)
print(f"Version: {version}, Length: {length}")
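In layouts like this one, the length field typically tells the reader how many payload bytes follow the header. The sketch below assumes exactly that arrangement (a header immediately followed by that many payload bytes). The helper name read_record and the payload contents are hypothetical, and io.BytesIO stands in for a real file.

```python
import io
import struct

HEADER_FORMAT = '<HI'                      # version (H), payload length (I)
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def read_record(stream):
    """Read one header plus its payload from a binary stream (hypothetical helper)."""
    header = stream.read(HEADER_SIZE)
    if len(header) != HEADER_SIZE:
        raise ValueError("truncated header")
    version, length = struct.unpack(HEADER_FORMAT, header)

    payload = stream.read(length)
    if len(payload) != length:
        raise ValueError("truncated payload")
    return version, payload


# Build a sample record in memory: header followed by the payload bytes.
sample_payload = b'hello world'
raw = struct.pack(HEADER_FORMAT, 101, len(sample_payload)) + sample_payload

version, payload = read_record(io.BytesIO(raw))
print(version, payload)   # 101 b'hello world'
```

Checking the number of bytes actually read before unpacking turns a short or corrupted file into a clear error instead of a struct.error deep inside the parsing code.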

Opening Binary Files

Binary files are opened using the 'rb' (read binary), 'wb' (write binary), or 'ab' (append binary) modes in the open() function.

with open('data.bin', 'rb') as f:
    binary_data = f.read()
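To show the three modes side by side, the sketch below writes one record with 'wb', appends a second with 'ab', and reads both back with 'rb' in fixed-size chunks. The record layout '<HI' and the file name records.bin are assumptions for illustration; a temporary directory keeps the example from leaving files behind.

```python
import os
import struct
import tempfile

RECORD_FORMAT = '<HI'
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)

records = []
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'records.bin')

    # 'wb' creates the file, or truncates it if it already exists.
    with open(path, 'wb') as f:
        f.write(struct.pack(RECORD_FORMAT, 1, 100))

    # 'ab' keeps the existing content and writes at the end.
    with open(path, 'ab') as f:
        f.write(struct.pack(RECORD_FORMAT, 2, 200))

    # 'rb' returns bytes objects; read one fixed-size record at a time.
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(RECORD_SIZE)
            if len(chunk) < RECORD_SIZE:
                break  # end of file, or a trailing partial record
            records.append(struct.unpack(RECORD_FORMAT, chunk))

print(records)   # [(1, 100), (2, 200)]
```

Reading in chunks of calcsize() bytes avoids loading the whole file at once, which matters more as files grow.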

Format strings define the layout of the binary data. An optional first character sets the byte order, size, and alignment, and the remaining characters specify the data types.

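🡆'<': Little-endian, standard sizes, no padding (the byte order of x86 and most ARM systems)

🡆'>' or '!': Big-endian, standard sizes, no padding ('!' denotes network byte order)

🡆'=': Native byte order, standard sizes, no padding

🡆'@': Native byte order, native sizes and alignment (default)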
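The effect of each prefix is easiest to see on actual bytes. The sketch below packs the same value with different prefixes and compares the resulting sizes; the value 0x0102 is arbitrary, and the size under '@' depends on the platform, so no fixed result is claimed for it.

```python
import struct
import sys

value = 0x0102

little = struct.pack('<H', value)
big = struct.pack('>H', value)
network = struct.pack('!H', value)

print(little)          # b'\x02\x01'  (least significant byte first)
print(big)             # b'\x01\x02'  (most significant byte first)
print(network == big)  # True

# '=' follows the machine's byte order but keeps standard sizes and no padding.
native_standard = struct.pack('=H', value)
expected_native = little if sys.byteorder == 'little' else big
print(native_standard == expected_native)  # True

# Standard layouts never insert padding: 2 + 4 = 6 bytes.
size_little = struct.calcsize('<HI')
size_standard = struct.calcsize('=HI')
print(size_little, size_standard)  # 6 6

# '@' may insert padding so the int is aligned; the result is platform-dependent.
size_native = struct.calcsize('@HI')
print(size_native)
```

For file formats and network protocols, an explicit '<', '>', or '!' prefix is usually the safer choice, since the default '@' layout can differ between machines.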

Data types include 'b' (signed char), 'H' (unsigned short), 'I' (unsigned int), 'f' (float), 'd' (double), and 's' (bytes, with a preceding count giving the length, as in '10s').
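A short sketch can make these type codes concrete. The field names (tag, name, ratio, total) and the values are made up for illustration, and '<' is used so the sizes are the standard ones.

```python
import struct

fmt = '<b4sfd'   # signed char, 4-byte string, float, double
packed = struct.pack(fmt, -5, b'ab', 1.5, 2.25)

record_size = struct.calcsize(fmt)
print(record_size)   # 17  (1 + 4 + 4 + 8, no padding with '<')

tag, name, ratio, total = struct.unpack(fmt, packed)
print(tag)     # -5
print(name)    # b'ab\x00\x00'  ('s' pads short values with null bytes)
print(ratio)   # 1.5
print(total)   # 2.25

# Strip the padding to recover the original bytes.
print(name.rstrip(b'\x00'))   # b'ab'

# Values outside a type's range are rejected rather than silently truncated.
out_of_range = False
try:
    struct.pack('<b', 200)    # a signed char holds -128..127
except struct.error:
    out_of_range = True
print(out_of_range)   # True
```

Note that 'f' stores a 32-bit float, so values that are not exactly representable in single precision come back slightly different after a round trip; 1.5 was chosen here because it is exact.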

The struct module is particularly useful when interacting with files or network protocols that use fixed-size binary structures.
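As a sketch of the network case, the example below walks a buffer that holds several fixed-size messages back to back. The message layout (a type code and a sequence number) is hypothetical, and the buffer is built in memory rather than received from a socket.

```python
import struct

# Hypothetical message layout in network byte order:
#   message type (unsigned short), sequence number (unsigned int)
MESSAGE_FORMAT = '!HI'
MESSAGE_SIZE = struct.calcsize(MESSAGE_FORMAT)

# Two messages back to back, as they might arrive in one receive buffer.
buffer = struct.pack(MESSAGE_FORMAT, 1, 1000) + struct.pack(MESSAGE_FORMAT, 2, 1001)

messages = []
for offset in range(0, len(buffer), MESSAGE_SIZE):
    # unpack_from() reads at an offset without slicing the buffer.
    messages.append(struct.unpack_from(MESSAGE_FORMAT, buffer, offset))

print(messages)   # [(1, 1000), (2, 1001)]
```

This loop assumes the buffer length is an exact multiple of the message size; real network code would also need to handle partial messages at the end of a read.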
