How do I convert a txt file to a dataframe in Python?

I have a .TX0 file (some sort of CSV text file) and have converted it to a .txt file in Python using .readlines(), open(filename, 'w'), etc. I now have the new txt file saved, but when I try to convert it to a dataframe it gives me only one column. The txt file is shown below:

Empty DataFrame
Columns: [ '"Software Version:", 6.3.2.0646, Date:, 19/08/2015 09:26:04\n',  '"Reprocess Number:", vma2:  261519, Unnamed: 7, \n',  '"Sample Name:",  , Data Acquisition Time:, 18/08/2015 17:23:23\n',  '"Instrument Name:", natural gas (PE ASXL-TCD/FID), Channel:, B\n',  '"Rack/Vial:", 0, 0.1, Operator:, joey.walker\n',  '"Sample Amount:", 1.000000, Dilution Factor:, 1.000000\n',  '"Cycle:", 1, Result File :, \\\\vma2\\TotalChrom\11170_he_tcd001.rst \n',  '"Sequence File :", \\\\vma\C1_C2_binary.seq \n',  '"===================================================================================================================================="\n',  '""\n',  '""\n'.1,  '"condensate analysis (HP4890 Optic - FID)"\n',  '"Peak", Component, Time, Area, Height, BL\n',  '"#", Name, [min], [uV*sec], [uV], \n'.1,  '------, ------, ------.1, ------.2, ------.3, ------\n',  '1, Unnamed: 55, 0.810, 706.42, 304.38, *BB\n',  '2, CH4, 0.900, 1113518.24, 495918.41, *BB\n'.1,  '3, C2H6, 1.373, 901670.23, 295381.12, *BB\n'.2,  '"", Unnamed: 73, Unnamed: 74, ------.4, ------.5, \n'.2,  '"".1, Unnamed: 79, Unnamed: 80, 2015894.89, 791603.91, \n'.3,  '"Missing Component Report"\n',  '"Component", Expected Retention (Calibration File)\n',  '------.1, ------\n'.1,  '"All components were found"\n',  '"Report stored in ASCII file :", C:\\Shared Folders\\TotalChrom\\11170_he_tcd001.TX0 \n']]
Index: []

for easier reading:

Empty DataFrame

Columns: [ '"Software Version:", 6.3.2.0646, Date:, 19/08/2015 09:26:04\n', '"Reprocess Number:", vma2: 261519, Unnamed: 7, \n', '"Sample Name:", , Data Acquisition Time:, 18/08/2015 17:23:23\n', '"Instrument Name:", natural gas (PE ASXL-TCD/FID), Channel:, B\n', '"Rack/Vial:", 0, 0.1, Operator:, joey.walker\n', '"Sample Amount:", 1.000000, Dilution Factor:, 1.000000\n', '"Cycle:", 1, Result File :, \\vma2\TotalChrom\data\Joey\Binary_Mixtures\Std1\11170_he_tcd001.rst \n', '"Sequence File :", \\vma2\TotalChrom\sequences\Joey\C1_C2_binary.seq \n', '"===================================================================================================================================="\n', '""\n', '""\n'.1, '"condensate analysis (HP4890 Optic - FID)"\n', '"Peak", Component, Time, Area, Height, BL\n', '"#", Name, [min], [uV*sec], [uV], \n'.1, '------, ------, ------.1, ------.2, ------.3, ------\n', '1, Unnamed: 55, 0.810, 706.42, 304.38, *BB\n', '2, CH4, 0.900, 1113518.24, 495918.41, *BB\n'.1, '3, C2H6, 1.373, 901670.23, 295381.12, *BB\n'.2, '"", Unnamed: 73, Unnamed: 74, ------.4, ------.5, \n'.2, '"".1, Unnamed: 79, Unnamed: 80, 2015894.89, 791603.91, \n'.3, '"Missing Component Report"\n', '"Component", Expected Retention (Calibration File)\n', '------.1, ------\n'.1, '"All components were found"\n', '"Report stored in ASCII file :", C:\Shared Folders\TotalChrom\data\Joey\Binary_Mixtures\Std1\11170_he_tcd001.TX0 \n']] Index: []

As you can see, this is comma separated. Is there any way of transferring this text into a comma-delimited dataframe?

Thanks.

J
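
One possible approach, offered as a rough sketch rather than a definitive answer: the column labels in the printout (suffixes such as .1 and Unnamed: 7) suggest that the saved txt file holds the printed Python list of lines, which read_csv then treats as a single header row. Reading the original report with the csv module and keeping only the peak table avoids that. The sample text below is retyped from the printout above, the assumption being that the real .TX0 file is plain comma-separated text; read_peak_table is a hypothetical helper name.

```python
import csv
import io

import pandas as pd

# A few lines retyped from the report in the question. A real script would
# pass open('report.TX0', newline='') instead (file name is an assumption).
sample = '''"Software Version:",6.3.2.0646,Date:,19/08/2015 09:26:04
"Sample Amount:",1.000000,Dilution Factor:,1.000000
"Peak",Component,Time,Area,Height,BL
"#",Name,[min],[uV*sec],[uV],
------,------,------,------,------,------
1,,0.810,706.42,304.38,*BB
2,CH4,0.900,1113518.24,495918.41,*BB
3,C2H6,1.373,901670.23,295381.12,*BB
"",,,------,------,
"",,,2015894.89,791603.91,
"Missing Component Report"
'''


def read_peak_table(handle):
    """Return the peak table of the report as a DataFrame."""
    rows = list(csv.reader(handle))
    # The table starts at the row whose first cell is "Peak".
    start = next(i for i, r in enumerate(rows) if r and r[0].strip() == "Peak")
    header = [cell.strip() for cell in rows[start]]
    data = []
    for row in rows[start + 1:]:
        first = row[0].strip() if row else ""
        if first.isdigit():
            # Peak rows begin with the peak number.
            data.append([cell.strip() for cell in row])
        elif data:
            # The first non-peak row after the peaks ends the table.
            break
    frame = pd.DataFrame(data, columns=header)
    for col in ["Time", "Area", "Height"]:
        frame[col] = pd.to_numeric(frame[col])
    return frame


peaks = read_peak_table(io.StringIO(sample))
print(peaks)
```

The metadata lines at the top of the report have a different number of fields from the peak table, which is why the sketch picks out the table rows instead of passing the whole file to read_csv.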

In this post, we will see how to convert a text file into a pandas DataFrame, with examples, using the reader functions built into pandas. To run the programs in this post, first install the pandas library with "pip install pandas" and import it in the program with "import pandas as pd".

Methods to convert text file to DataFrame


  • read_csv() method
  • read_table() function
  • read_fwf() function

Pandas read_csv() Method


The pandas library has a built-in read_csv() function that reads a CSV (comma-separated values) text file, so we can use it to load a text file into a DataFrame. It reads the file at the given path and loads its contents into the DataFrame. It uses a comma as the default separator; a different delimiter, or a regular expression, can be passed instead.

Syntax

pandas.read_csv(filepath_or_buffer, sep=',', delimiter=None, header='infer', names=None, index_col=None)

Parameters

  • filepath_or_buffer: The path of the file, or an open file-like object.
  • sep: The delimiter used to split each line into fields while reading the file.
  • header: The row to use for the column names; by default the first row is treated as the header.
  • names: A list of column names to use.
  • index_col: The column or columns to use as the row index.
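
As a small illustration of how these parameters work together, here is a sketch that reads inline text through io.StringIO instead of a file on disk. The sample rows and the Id column are made up for the example.

```python
import io

import pandas as pd

# Hypothetical sample data with no header row, standing in for a text file.
text = "1,Alex,Phy,100\n2,Ben,Chem,100\n3,Jack,Math,100\n"

df = pd.read_csv(
    io.StringIO(text),
    sep=",",                                 # field delimiter
    header=None,                             # the sample has no header row
    names=["Id", "Name", "Subjs", "Marks"],  # column names supplied by hand
    index_col="Id",                          # use the Id column as the row index
)

print(df)
```

Because header=None is passed, the first line is read as data, and the labels come from names instead.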

1. read_csv() to convert text file to DataFrame


File contents

Name:Subjs:Marks
Alex:Phy:100
Ben:Chem:100
Jack:Math:100

In this example, we read a text file into a DataFrame using a custom delimiter, the colon (:), with the read_csv() function. The file is in the current directory, so we pass only the file name rather than the full path.

Program Example

import pandas as pd
 
# Read a text file to a dataframe using colon delimiters
student_csv =  pd.read_csv('students.txt', sep=':', engine='python')

print(student_csv)

Output

   Name Subjs  Marks
0  Alex   Phy    100
1   Ben  Chem    100
2  Jack  Math    100

2. Regular expression in read_csv() with multiple delimiters


These are the file contents used in the program example below. The file is in the current directory.

File content

Name,Subjs;Marks
Alex:Phy|100
Ben;Chem_100
Jack,Math|100

In this example, we read a text file whose fields are separated by several different delimiters (: , ; | _) into a DataFrame, using a regular expression with the read_csv() function. The regular expression matches any one of the delimiters, so each of them is treated as a field separator. A regular-expression separator requires the Python parsing engine, which is why engine='python' is passed. Let us understand with the help of the Python program below.

Program example

import pandas as pd
 
# Read a text file to a dataframe using multiple delimiters
student_csv =  pd.read_csv('students.txt', sep='[:,;|_]', engine='python')

print(student_csv)

Output

   Name Subjs  Marks
0  Alex   Phy    100
1   Ben  Chem    100
2  Jack  Math    100

3. read_table() to convert text file to Dataframe


The read_table() function reads the contents of a delimited text file as a table. It uses a tab (\t) as the delimiter by default; a different one can be passed with the sep or delimiter argument. Let us understand by example how to use it.

File Contents

Name Subjs Marks
Alex Phy  100
Ben  Chem 100
Jack Math 100

Program Example

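import pandas as pd

# Read a text file to a dataframe using the read_table function.
# The columns are padded with a varying number of spaces, so split on
# any run of whitespace rather than on a single space.
student_csv = pd.read_table('students.txt', sep=r'\s+')

print(student_csv)

Output

   Name Subjs  Marks
0  Alex   Phy    100
1   Ben  Chem    100
2  Jack  Math    100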
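
For comparison, here is a short sketch of the default behavior, where no delimiter is passed and read_table() splits on tabs. The tab-separated sample text is made up for the example and is read through io.StringIO instead of a file.

```python
import io

import pandas as pd

# Hypothetical tab-separated sample standing in for a .txt file on disk.
text = "Name\tSubjs\tMarks\nAlex\tPhy\t100\nBen\tChem\t100\nJack\tMath\t100\n"

# No sep or delimiter argument: read_table splits on tabs by default.
df = pd.read_table(io.StringIO(text))

print(df)
```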

4. read_fwf() to convert text file to Dataframe


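The read_fwf() function reads a file of fixed-width formatted lines into a DataFrame. It splits each line by character position rather than by a delimiter, so every column must start at the same position on every line. By default the column boundaries are inferred from the first rows of the file.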

Program Example

import pandas as pd

student_csv =  pd.read_fwf('students.txt')
print(student_csv)

Output

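  Name Subjs Marks
0     Alex Phy 100
1     Ben Chem 100
2    Jack Math 100

The output here appears to be a single column named "Name Subjs Marks" rather than three columns. This is what read_fwf() returns when the fields are separated by single spaces instead of being padded to fixed positions, because there is then no blank character position shared by every line.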
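
To get separate columns, the data has to be laid out at fixed positions. The sketch below assumes a padded version of the same data (made up for the example) and passes the column boundaries explicitly through colspecs as half-open (start, end) character ranges.

```python
import io

import pandas as pd

# Hypothetical fixed-width sample: every column starts at the same
# character position on every line.
text = (
    "Name  Subjs Marks\n"
    "Alex  Phy   100\n"
    "Ben   Chem  100\n"
    "Jack  Math  100\n"
)

# Name occupies characters 0-5, Subjs 6-11, Marks 12-16.
df = pd.read_fwf(io.StringIO(text), colspecs=[(0, 6), (6, 12), (12, 17)])

print(df)
```

Passing widths=[6, 6, 5] instead of colspecs would describe the same layout as consecutive field widths.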

Summary

We have seen how to convert a text file into a pandas DataFrame using the built-in read_csv(), read_table(), and read_fwf() functions, with examples.
