Save sparse matrix to csv python

How to save scipy.sparse.csr.csr_matrix as csv or other format to handle it with R?

Hello,

I am new to Python and have a basic question about sparse matrices. How can I save a scipy.sparse.csr.csr_matrix as CSV, or in another format, so that I can work with the matrix in R?


I think you can use pandas or numpy to do this. If you're using scipy, you should already have numpy installed (and you're probably using it now).

As an example:

# save numpy array as csv file
from numpy import asarray
from numpy import savetxt
# define data
data = asarray([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
# save to csv file
savetxt('data.csv', data, delimiter=',')

Reference documentation: https://numpy.org/doc/stable/reference/generated/numpy.savetxt.html

SciPy sparse matrices have a method for converting them to a dense matrix, called todense:

import pandas as pd

# todense() is a method of the sparse matrix object, not an importable function
df = pd.DataFrame(data=your_sparse_matrix_here.todense())
df.to_csv('your_dense_matrix_name_here.csv', index=False)

Note that you may need large memory for this conversion depending on matrix dimensions.
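If memory is a concern and the goal is to get the matrix into R, one option that avoids the dense conversion is the Matrix Market format, which stores only the non-zero entries. Here is a minimal sketch using scipy.io.mmwrite; the example matrix and the file name matrix.mtx are made up for illustration.

```python
import os
import tempfile

import numpy as np
from scipy.io import mmread, mmwrite
from scipy.sparse import csr_matrix

# Small example matrix; the values are illustrative only.
X = csr_matrix(np.array([[1, 0, 0], [0, 0, 2], [0, 3, 0]]))

# Write into a temporary directory (hypothetical file name).
path = os.path.join(tempfile.mkdtemp(), "matrix.mtx")
mmwrite(path, X)  # only the non-zero entries are stored

# Read the file back and check that the round trip preserved the matrix.
X_back = mmread(path).tocsr()
roundtrip_ok = np.array_equal(X.toarray(), X_back.toarray())
print(roundtrip_ok)
```

On the R side, the Matrix package should be able to read such a file with readMM("matrix.mtx"), though I have not tested that against this exact output.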


Since you want to write both 1s and 0s to the CSV file, you can first call the todense() method to convert the sparse matrix to a dense matrix. Then you can convert the dense matrix to a pandas DataFrame and write it to a CSV file.

If you have a very large CSR matrix, this approach may be slow and may need a lot of memory. Here is the Python code to save a CSR matrix to a CSV file.

import numpy as np
from scipy.sparse import csr_matrix
import pandas as pd

# create a test CSR matrix
r = np.array([0, 0, 1, 1, 2, 2, 2, 3, 4, 4, 5, 6, 6])
c = np.array([0, 3, 4, 1, 3, 5, 6, 3, 1, 6, 0, 1, 3])
data = np.array([1]*len(r))
X = csr_matrix((data, (r, c)), shape=(7, 7))

# save CSR matrix as csv
df = pd.DataFrame(X.todense())
csv_file = "test_csv_file.csv"
print("Write data to a CSV file", csv_file)
df.to_csv(csv_file, index=False, header=None)
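If the zeros are not actually needed in the file, a lighter alternative is to write only the non-zero entries as (row, col, value) triplets. The sketch below reuses the test matrix from above; the column names row, col and value are my own choice, and an in-memory buffer stands in for a real file.

```python
import io

import numpy as np
import pandas as pd
from scipy.sparse import coo_matrix, csr_matrix

# Same test matrix as above.
r = np.array([0, 0, 1, 1, 2, 2, 2, 3, 4, 4, 5, 6, 6])
c = np.array([0, 3, 4, 1, 3, 5, 6, 3, 1, 6, 0, 1, 3])
X = csr_matrix((np.ones(len(r), dtype=int), (r, c)), shape=(7, 7))

# COO format exposes the row index, column index and value of each non-zero entry.
coo = X.tocoo()
triplets = pd.DataFrame({"row": coo.row, "col": coo.col, "value": coo.data})

# Write to an in-memory buffer; pass a file path to to_csv() for a real file.
buffer = io.StringIO()
triplets.to_csv(buffer, index=False)

# Rebuild the matrix from the CSV. The shape is not stored in the file,
# so it has to be supplied separately.
buffer.seek(0)
loaded = pd.read_csv(buffer)
X_back = coo_matrix(
    (loaded["value"].to_numpy(), (loaded["row"].to_numpy(), loaded["col"].to_numpy())),
    shape=X.shape,
).tocsr()
```

Note that the indices are 0-based here, while R indexes from 1, so they would need shifting by one before building a sparse matrix on the R side.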

I am a newbie to Python. I have been given a script which generates random numbers and puts them in a 126x81 sparse matrix. I would like to generate a csv file with: Cell_ID; Cell_X; Cell_Y; Val. Cell X and Y are of course the coordinates for each cell. The script I have has a loop which generates an "outputs.csv" file, but in it data are not displayed the way I want them (there are square brackets at the beginning/end of each line and there are ellipsis [...] in place of some values). To sum up, I am not able to read the whole content of the matrix.

If I could, I would upload a picture to show you how I would like these data to look when read in Notepad or Excel, but I am not allowed to do so. However, the data should look like a typical CSV file, with each value aligned under its column. Thank you for your help guys! :)

asked Jul 28, 2015 at 16:03 by FaCoffee

Since the standard library supports csv, let's use it:

import numpy
import csv

N = 126
M = 81

g = numpy.random.rand(N, M)

with open('test.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['x', 'y', 'value'])
    for (n, m), val in numpy.ndenumerate(g):
        writer.writerow([n, m, val])
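For the layout asked for in the question (Cell_ID; Cell_X; Cell_Y; Val), the same idea can be adapted to a sparse matrix with a semicolon delimiter. This is only a sketch under a few assumptions of mine: the random matrix is a stand-in for the one the script produces, Cell_X is taken to be the row index and Cell_Y the column index, and Cell_ID is simply the row-major position of the cell. Only non-zero cells are written.

```python
import csv
import io

import numpy as np
from scipy.sparse import coo_matrix

N, M = 126, 81

# Stand-in data: a random matrix in which roughly 95% of the cells are zeroed out.
rng = np.random.default_rng(0)
dense = rng.random((N, M))
dense[dense < 0.95] = 0.0
S = coo_matrix(dense)

# Write to an in-memory buffer; use open("outputs.csv", "w", newline="") for a file.
out = io.StringIO()
writer = csv.writer(out, delimiter=";")
writer.writerow(["Cell_ID", "Cell_X", "Cell_Y", "Val"])
for row, col, val in zip(S.row, S.col, S.data):
    cell_id = int(row) * M + int(col)  # assumed ID scheme: row-major position
    writer.writerow([cell_id, int(row), int(col), val])
```

Writing values one by one like this also sidesteps the square brackets and the "..." from the question, which typically appear when a whole NumPy array is converted to a string, because long arrays are abbreviated when printed.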

answered Jul 28, 2015 at 18:20 by chthonicdaemon



Related questions: Creating a sparse matrix from pandas data frame using scipy.sparse; Creating a sparse matrix from csv file data; Creating a very large sparse matrix csv from a list of condensed data; Python Pandas - Creating a timeseries from a csv file

As noticed by Can Kavaklıoğlu, as_matrix() is deprecated as of pandas version 0.23.0. Changed to values.

import pandas as pd

df = pd.read_csv('csv_file.csv', names = ['user_id', 'group_id', 'group_value'])
df = df.pivot(index = 'user_id', columns = 'group_id', values = 'group_value')
mat = df.values
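The pivot above produces a dense array. If the result should stay sparse, the three columns can be fed to scipy.sparse.coo_matrix directly. The following is a rough sketch with invented sample data; it assumes the IDs are arbitrary labels, so it maps them to consecutive row and column positions via pandas categoricals.

```python
import io

import numpy as np
import pandas as pd
from scipy.sparse import coo_matrix

# Invented sample data in the same three-column layout as csv_file.csv.
csv_text = "10,7,1.5\n10,9,2.0\n42,7,3.0\n"
df = pd.read_csv(io.StringIO(csv_text), names=["user_id", "group_id", "group_value"])

# Map arbitrary IDs to consecutive 0-based positions (categories are sorted).
users = df["user_id"].astype("category")
groups = df["group_id"].astype("category")
rows = np.asarray(users.cat.codes, dtype=np.int64)
cols = np.asarray(groups.cat.codes, dtype=np.int64)

mat = coo_matrix(
    (df["group_value"].to_numpy(), (rows, cols)),
    shape=(len(users.cat.categories), len(groups.cat.categories)),
).tocsr()

print(mat.toarray())
```

Here users.cat.categories and groups.cat.categories keep the original IDs, so row i of mat corresponds to users.cat.categories[i].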


bsr_matrix(arg1[, shape, dtype, copy, blocksize])
bsr_array(arg1[, shape, dtype, copy, blocksize])
coo_matrix(arg1[, shape, dtype, copy])
csr_matrix(arg1[, shape, dtype, copy])

>>> import numpy as np
>>> from scipy.sparse import csr_matrix
>>> A = csr_matrix([[1, 2, 0], [0, 0, 3], [4, 0, 5]])
>>> v = np.array([1, 0, -1])
>>> A.dot(v)
array([ 1, -3, -1], dtype=int64)

>>> np.dot(A.toarray(), v)
array([ 1, -3, -1], dtype=int64)

>>> from scipy.sparse import lil_matrix
>>> from scipy.sparse.linalg import spsolve
>>> from numpy.linalg import solve, norm
>>> from numpy.random import rand

>>> A = lil_matrix((1000, 1000))
>>> A[0, :100] = rand(100)
>>> A[1, 100:200] = A[0, :100]
>>> A.setdiag(rand(1000))

>>> A = A.tocsr()
>>> b = rand(1000)
>>> x = spsolve(A, b)


Thank you, the problem is that the CSV file is very large; for example, in Python I use *.npz, which is 2-5 MB, but here it can go up to 500 MB.

How to save a sparse matrix in Julia? I want the saved file to be small since my matrix size is about (2^26, 2^26). Thanks.

Alternatively, you can save a bit of space by using the CSC representation (i.e. export directly from the fields), but I don't think it is worth it.

I don't know this format, but you can also compress the CSV, e.g. with gzip.

I would just save nonzero indexes and values as CSV:

using SparseArrays, DataFrames, CSV
M = sprand(10, 10, 0.2)
I, J, V = findnz(M)
df = DataFrame([:I => I, :J => J, :V => V])
CSV.write("/tmp/spmatrix.csv", df)

Very true!

using SparseArrays, DataFrames, CSV, LinearAlgebra

function save_sparse_matrix(M)
    I, J, V = findnz(M)
    println("I = ", I)
    df = DataFrame([:I => I, :J => J, :V => V])
    CSV.write("M.csv", df)
end

julia> M = sprand(10, 10, 0.2)

julia> save_sparse_matrix(M)
I = [1, 7, 5, 9, 10, 1, 3, 4, 6, 3, 4, 9, 3, 7, 10, 5, 6, 7, 9, 10, 7]

julia> I
UniformScaling{Bool}
true*I


How do I save a sparse matrix as a CSV in Python?

Since you want to write both 1s and 0s to the CSV file, you can use todense() function first to convert the sparse matrix to a dense matrix. Then you can convert the dense matrix to a pandas dataframe to write to a CSV file.

How do I save a sparse matrix in python?

Use scipy.sparse.save_npz to save a sparse matrix to a file in .npz format. Its file argument is either the file name (string) or an open file (file-like object) where the data will be saved.
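A short sketch of the .npz round trip with scipy.sparse.save_npz and load_npz; the matrix contents and the file name matrix.npz are made up for illustration.

```python
import os
import tempfile

import numpy as np
from scipy.sparse import csr_matrix, load_npz, save_npz

# Example matrix; the values are illustrative only.
X = csr_matrix(np.array([[0, 0, 1], [2, 0, 0]]))

# Save into a temporary directory (hypothetical file name).
path = os.path.join(tempfile.mkdtemp(), "matrix.npz")
save_npz(path, X)  # compressed by default

# Load it again; the sparse format (CSR here) is restored as well.
X_back = load_npz(path)
print(X_back.format, X_back.shape)
```

This format is NumPy-specific, so it is mainly useful for exchanging data between Python programs rather than with other languages.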

How do you convert sparse to matrix in python?

Approach:
Create an empty list which will represent the sparse matrix list.
Iterate through the 2D matrix to find non-zero elements.
If an element is non-zero, create a temporary empty list.
Append the row value, column value, and the non-zero element itself into the temporary list.
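The steps above can be sketched in plain Python as follows. The function name to_triplets and the sample matrix are my own, and I am assuming the temporary list is then appended to the main list, which the steps imply but do not state.

```python
def to_triplets(matrix):
    """Return a list of [row, column, value] entries for the non-zero cells."""
    sparse_list = []  # the list that represents the sparse matrix
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value != 0:
                entry = [i, j, value]      # temporary list for this element
                sparse_list.append(entry)  # assumed final step: collect the entry
    return sparse_list


dense = [
    [0, 0, 3],
    [4, 0, 0],
    [0, 5, 0],
]
sparse_list = to_triplets(dense)
print(sparse_list)  # [[0, 2, 3], [1, 0, 4], [2, 1, 5]]
```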

How does Python handle sparse matrix?

Sparse matrices in Python:

import numpy as np
from scipy.sparse import csr_matrix

# create a 2-D representation of the matrix
A = np.array([[1, 0, 0, 0, 0, 0], [0, 0, 2, 0, 0, 1],
              [0, 0, 0, 2, 0, 0]])
print("Dense matrix representation: \n", A)
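The snippet above stops before the conversion itself. As a sketch of what presumably comes next, the dense array can be passed to csr_matrix, and the three arrays that make up the CSR representation can then be inspected. The variable name S is my own.

```python
import numpy as np
from scipy.sparse import csr_matrix

A = np.array([[1, 0, 0, 0, 0, 0], [0, 0, 2, 0, 0, 1],
              [0, 0, 0, 2, 0, 0]])

# Convert the dense array to CSR (compressed sparse row) format.
S = csr_matrix(A)

# CSR stores the non-zero values, their column indices, and row pointers
# marking where each row starts within the first two arrays.
print(S.data.tolist())     # [1, 2, 1, 2]
print(S.indices.tolist())  # [0, 2, 5, 3]
print(S.indptr.tolist())   # [0, 1, 3, 4]
```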