How do i open a binary image in python?

I have a very simple script in Matlab that opens a 'raw' binary image file and displays it. Is this easily reproducible using numpy in python? I've come across various posts discussing unpacking, dealing with endian, specifying buffers, etc. But this seems like it should be simple, based on how simple the matlab interface is

>> fileID = fopen('sampleX3.raw','rb')

fileID =

     1

>> A = fread(fileID,[1024,1024],'int16');
size(A)

ans =

        1024        1024

>> max(max(A))

ans =

       12345

>> close all; figure; imagesc(A);

asked Jul 4, 2013 at 23:39

1

This will do the same thing using numpy and matplotlib:

import numpy as np
from matplotlib import pylab as plt

A = np.fromfile(filename, dtype='int16', sep="")
A = A.reshape([1024, 1024])
plt.imshow(A)

I feel obligated to mention that using raw binary files to store data is generally a bad idea.

answered Jul 4, 2013 at 23:57

Bi RicoBi Rico

24.6k3 gold badges49 silver badges72 bronze badges

2

Binary files are files that are not normal text files. Example: An Image File. These files are also stored as a sequence of bytes in the computer hard disk. These types of binary files cannot be opened in the normal mode and read as text.

You can read binary file by opening the file in binary mode using the open('filename', 'rb').

When working with the problems like image classification in Machine learning, you may need to open the file in binary mode and read the bytes to create ML models. In this situation, you can open the file in binary mode, and read the file as bytes. In this case, decoding of bytes to the relevant characters will not be attempted. On the other hand, when you open a normal file in the normal read mode, the bytes will be decoded to string or the other relevant characters based on the file encoding.

If You’re in Hurry…

You can open the file using open() method by passing b parameter to open it in binary mode and read the file bytes.

open('filename', "rb") opens the binary file in read mode.

r– To specify to open the file in reading mode
b – To specify it’s a binary file. No decoding of bytes to string attempt will be made.

Example

The below example reads the file one byte at a time and prints the byte.

try:
    with open("c:\temp\Binary_File.jpg", "rb") as f:
        byte = f.read(1)
        while byte:
            # Do stuff with byte.
            byte = f.read(1)
            print(byte)
except IOError:
     print('Error While Opening the file!')  

If You Want to Understand Details, Read on…

In this tutorial, you’ll learn how to read binary files in different ways.

  • Read binary file byte by byte
  • Python Read Binary File into Byte Array
  • Python read binary file into numpy array
  • Read binary file Line by Line
  • Read Binary File Fully in One Shot
  • Python Read Binary File and Convert to Ascii
  • Read binary file into dataframe
  • Read binary file skip header
  • Readind Binary file using Pickle
  • Conclusion

Read binary file byte by byte

In this section, you’ll learn how to read a binary file byte by byte and print it. This is one of the fastest ways to read the binary file.

The file is opened using the open() method and the mode is mentioned as “rb” which means opening the file in reading mode and denoting it’s a binary file. In this case, decoding of the bytes to string will not be made. It’ll just be read as bytes.

The below example shows how the file is read byte by byte using the file.read(1) method.

The parameter value 1 ensures one byte is read during each read() method call.

Example

try:
    with open("c:\temp\Binary_File.jpg", "rb") as f:
        byte = f.read(1)
        while byte:
            # Do stuff with byte.
            byte = f.read(1)
            print(byte)
except IOError:
     print('Error While Opening the file!')  

Output

    b'\xd8'
    b'\xff'
    b'\xe0'
    b'\x00'
    b'\x10'
    b'J'
    b'F'
    b'I'
    b'F'
    b'\x00'
    b'\x01'
    b'\x01'
    b'\x00'
    b'\x00'
    b'\x01'
    b'\x00'
    b'\x01'
    b'\x00'
    b'\x00'
    b'\xff'
    b'\xed'
    b'\x00'
    b'|'
    b'P'
    b'h'
    b'o'
    b't'
    b'o'
    b's'
    b'h'
    b'o'
    b'p'
    b' '
    b'3'
    b'.'
    b'0'
    b'\xc6'
    b'\xb3'
    b'\xff'
    b'\xd9'
    b''

Python Read Binary File into Byte Array

In this section, you’ll learn how to read the binary files into a byte array.

First, the file is opened in therb mode.

A byte array called mybytearray is initialized using the bytearray() method.

Then the file is read one byte at a time using f.read(1) and appended to the byte array using += operator. Each byte is appended to the bytearray.

At last, you can print the bytearray to display the bytes that are read.

Example

try:
    with open("c:\temp\Binary_File.jpg", "rb") as f:

        mybytearray = bytearray()

        # Do stuff with byte.
        mybytearray+=f.read(1)
        mybytearray+=f.read(1)
        mybytearray+=f.read(1)
        mybytearray+=f.read(1)
        mybytearray+=f.read(1)

        print(mybytearray)

except IOError:
    print('Error While Opening the file!')    

Output

    bytearray(b'\xff\xd8\xff\xe0\x00\x10')

Python read binary file into numpy array

In this section, you’ll learn how to read the binary file into a NumPy array.

First, import numpy as np to import the numpy library.

Then specify the datatype as bytes for the np object using np.dtype('B')

Next, open the binary file in reading mode.

Now, create the NumPy array using the fromfile() method using the np object.

Parameters are the file object and the datatype initialized as bytes. This will create a NumPy array of bytes.

numpy_data = np.fromfile(f,dtype)

Example

import numpy as np

dtype = np.dtype('B')
try:
    with open("c:\temp\Binary_File.jpg", "rb") as f:
        numpy_data = np.fromfile(f,dtype)
    print(numpy_data)
except IOError:
    print('Error While Opening the file!')    

Output

[255 216 255 ... 179 255 217]


The bytes are read into the numpy array and the bytes are printed.

Read binary file Line by Line

In this section, you’ll learn how to read binary file line by line.

You can read the file line by line using the readlines() method available in the file object.

Each line will be stored as an item in the list. This list can be iterated to access each line of the file.

rstrip() method is used to remove the spaces in the beginning and end of the lines while printing the lines.

Example

f = open("c:\temp\Binary_File.jpg",'rb')

lines = f.readlines()

for line in lines:

    print(line.rstrip())

Output

    b'\x07\x07\x07\x07'
    b''
    b''
    b''
    b''
    b''
    b'\x0c\x0f\x0c\x0c\x0c\x0c\x0c\x0c\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x12\x12\x12\x12\x12\x12\x15\x15\x15\x15\x15\x17\x17\x17\x17\x17\x17\x17\x17\x17\x17\xff\xdb\x00C\x01\x04\x04\x04\x06\x06\x06'
    b'\x06\x06'

Read Binary File Fully in One Shot

In this section, you’ll learn how to read binary file in one shot.

You can do this by passing -1 to the file.read() method. This will read the binary file fully in one shot as shown below.

Example

try:
    f = open("c:\temp\Binary_File.jpg", 'rb')
    while True:
        binarycontent = f.read(-1)  
        if not binarycontent:
            break
        print(binarycontent)
except IOError:
    print('Error While Opening the file!')

Output

 b'\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01\x00\x00\x01\x00\x01\x00\x00\xff\xed\x00|Photoshop 3.0\x008BIM\x04\x04\x00\x00\x00\x00\x00\x1c\x02(\x00ZFBMD2300096c010000fe0e000032160000051b00003d2b000055300000d6360000bb3c0000ce4100008b490000\x00\xff\xdb\x00C\x00\x03\x03\x03\x03\x03\x03\x05\x03\x03\x05\x07\x05\x05\x05\x07\n\x07\x07\x07\x07\n\x0c\n\n\n\n\n\x0c\x0f\x0c\x0c\x0c\x0c\x0c\x0c\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x12\x12\x12\x12\x12\x12\x15\x15\x15\x15\x15\x17\x17\x17\x17\x17\x17\x17\x17\x17\x17\xff\xdb\x00C\x01\x04\x04\x04\x06\x06\x06\n\x06\x06\n\x18\x11\x0e\x11\x18\x18\x18\x18\x18\x18\x18\x18\x18\x18\x18\x18\x18\x18\x18

Python Read Binary File and Convert to Ascii

In this section, you’ll learn how to read a binary file and convert to ASCII using the binascii library. This will convert all the bytes into ASCII characters.

Read the file as binary as explained in the previous section.

Next, use the method binascii.b2a_uu(bytes). This will convert the bytes into ascii and return an ascii value.

Then you can print this to check the ascii characters.

Example

import binascii

try:
    with open("c:\temp\Binary_File.jpg", "rb") as f:

        mybytes = f.read(45)

        data_bytes2ascii = binascii.b2a_uu(mybytes)

        print("Binary String to Ascii")

        print(data_bytes2ascii)

except IOError:

    print("Error While opening the file!")

Output

 Binary String to Ascii
 b'M_]C_X  02D9)[email protected] ! 0   0 !  #_[0!\\4&AO=&]S:&]P(#,N,  X0DE-! 0 \n'

Read binary file into dataframe

In this section, you’ll learn how to read the binary file into pandas dataframe.

First, you need to read the binary file into a numpy array. Because there is no method available to read the binary file to dataframe directly.

Once you have the numpy array, then you can create a dataframe with the numpy array.

Pass the NumPy array data into the pd.DataFrame(). Then you’ll have the dataframe with the bytes read from the binary file.

Example

import numpy as np

import pandas as pd

# Create a dtype with the binary data format and the desired column names
try:

    dt = np.dtype('B')

    data = np.fromfile("c:\temp\Binary_File.jpg", dtype=dt)

    df = pd.DataFrame(data)

    print(df)

except IOError:

    print("Error while opening the file!")

Output

             0
    0      255
    1      216
    2      255
    3      224
    4        0
    ...    ...
    18822    0
    18823  198
    18824  179
    18825  255
    18826  217

    [18827 rows x 1 columns]

This is how you can read a binary file using NumPy and use that NumPy array to create the pandas dataframe.

With the NumPy array, you can also read the bytes into the dictionary.

Read binary file skip header

In this section, you’ll learn how to read binary file, skipping the header line in the binary file. Some binary files will be having the ASCII header in them.

This skip header method can be useful when reading the binary files with the ASCII headers.

You can use the readlines() method available in the File object and specify [1:] as an additional parameter. This means the line from index 1 will be read.

The ASCII header line 0 will be ignored.

Example

f = open("c:\temp\Binary_File.jpg",'rb')

lines = f.readlines()[1:]
for line in lines:
    print(line.rstrip())

Output

    b'\x07\x07\x07\x07'
    b''
    b''
    b''
    b''
    b''
    b'\x0c\x0f\x0c\x0c\x0c\x0c\x0c\x0c\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x12\x12\x12\x12\x12\x12\x15\x15\x15\x15\x15\x17\x17\x17\x17\x17\x17\x17\x17\x17\x17\xff\xdb\x00C\x01\x04\x04\x04\x06\x06\x06'
    b'\x06\x06'

    b"\x93\x80\x18\x98\xc9\xdc\x8bm\x90&'\xc5U\xb18\x81\xc7y\xf0\x80\x00\x14\x1c\xceQd\x83\x13\xa0\xbf-D9\xe0\xae;\x8f\\LK\xb8\xc3\x8ae\xd4\xd1C\x10\x7f\x02\x02\xa6\x822K&D\x9a\x04\xd4\xc8\xfbC\x87\xf2\x8d\xdcN\xdes)rq\xbbI\x92\xb6\xeeu8\x1d\xfdG\xabv\xe8q\xa5\xb6\xb56\xe0\xa1\x06\x84n#\xf0\x1c\x86\xb0\x83\xee\x99\xe7\xc6\xaaN\xafY\xdf\xd9\xcfe\xd5\x84"

    b'\xd9\x0b\xc2\x1b0\xa1Q\x17\x88\xb4et\x81u8\xed\xf5\xe8\xd9#c\t\xf9\xc0\xa7\x06\xa2/={\x87l\x01K\x870\xe3\xa1\x024\xdc^\x11\x96\x96\xba\[email protected]\x91A\xd6U\xea\xe1\xbb\xb733'

Readind Binary file using Pickle

In this section, you’ll learn how to read binary files in python using the Pickle.

This is really tricky as all the types of binary files cannot be read in this mode. You may face problems while pickling a binary file. As invalid load key errors may occur.

Hence it’s not recommended to use this method.

Example

import pickle


file_to_read = open("c:\temp\Binary_File.jpg", "rb")

loaded_dictionary = pickle.load(file_to_read)

print(loaded_dictionary)

Output

    ---------------------------------------------------------------------------

    UnpicklingError                           Traceback (most recent call last)

     in 
          7 file_to_read = open("E:\Vikram_Blogging\Stack_Vidhya\Python_Notebooks\Read_Binary_File_Python\Binary_File.jpg", "rb")
          8 
    ----> 9 loaded_dictionary = pickle.load(file_to_read)
         10 
         11 print(loaded_dictionary)


    UnpicklingError: invalid load key, '\xff'.

Conclusion

Reading a binary file is an important functionality. For example, reading the bytes of an image file is very useful when you are working with image classification problems. In this case, you can read the image file as binary and read the bytes to create the model.

In this tutorial, you’ve learned the different methods available to read binary files in python and the different libraries available in it.

If you have any questions, feel free to comment below.

How do I read a binary image in Python?

You can open the file using open() method by passing b parameter to open it in binary mode and read the file bytes. open('filename', "rb") opens the binary file in read mode.

How do I open a binary file in Python?

To open a file in binary format, add 'b' to the mode parameter. Hence the "rb" mode opens the file in binary format for reading, while the "wb" mode opens the file in binary format for writing. Unlike text files, binary files are not human-readable.

How do you open an image in binary?

Binary files can range from image files like JPEGs or GIFs, audio files like MP3s or binary document formats like Word or PDF. In Python, files are opened in text mode by default. To open files in binary mode, when specifying a mode, add 'b' to it.

How do you convert a binary image to Python?

Approach: Read the image from the location. As a colored image has RGB layers in it and is more complex, convert it to its Grayscale form first. Set up a Threshold mark, pixels above the given mark will turn white, and below the mark will turn black.