This is more of a warning than an answer.
Having seen my_list = [None] * 10 in the other answers, I was tempted to set up an array like this: speakers = [['','']] * 10. I came to regret it immensely, as the resulting list did not behave as I thought it should.
I resorted to:
speakers = []
for i in range(10):
    speakers.append(['',''])
The reason is that [['','']] * 10 creates a list whose ten elements all refer to the same inner list rather than to independent copies. For example:
>>> n=[['','']]*10
>>> n
[['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', '']]
>>> n[0][0] = "abc"
>>> n
[['abc', ''], ['abc', ''], ['abc', ''], ['abc', ''], ['abc', ''], ['abc', ''], ['abc', ''], ['abc', ''], ['abc', ''], ['abc', '']]
>>> n[0][1] = "True"
>>> n
[['abc', 'True'], ['abc', 'True'], ['abc', 'True'], ['abc', 'True'], ['abc', 'True'], ['abc', 'True'], ['abc', 'True'], ['abc', 'True'], ['abc', 'True'], ['abc', 'True']]
With the .append option, by contrast:
>>> n=[]
>>> for i in range(10):
...     n.append(['',''])
...
>>> n
[['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', '']]
>>> n[0][0] = "abc"
>>> n
[['abc', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', '']]
>>> n[0][1] = "True"
>>> n
[['abc', 'True'], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['', '']]
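A quick way to confirm what is going on is to compare object identities. The sketch below (variable names are illustrative, not taken from the other answers) shows that the * form stores ten references to one inner list, while a list comprehension builds ten separate inner lists:

```python
# Repetition copies the reference, so every slot points at one inner list.
shared = [['', '']] * 10
shared_ids = {id(row) for row in shared}

# A list comprehension evaluates ['', ''] once per iteration,
# producing ten distinct inner lists.
independent = [['', ''] for _ in range(10)]
independent_ids = {id(row) for row in independent}

shared[0][0] = "abc"
independent[0][0] = "abc"

print(len(shared_ids))            # 1
print(len(independent_ids))       # 10
print(shared[1], independent[1])  # ['abc', ''] ['', '']
```

The list comprehension is a one-line alternative to the explicit loop with .append.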
I'm sure that the accepted answer by ninjagecko does attempt to mention this; sadly, I was too thick to understand.
Wrapping up, take care!
Created: November-09, 2019 | Updated: December-10, 2020
- Preallocate Storage for Lists
- Preallocate Storage for Other Sequential Data Structures
Preallocating storage for lists or arrays is a typical pattern among programmers when they know the number of elements ahead of time.
Unlike in C++ and Java, in Python you have to initialize all of your preallocated storage with some value. Developers usually use falsy values for that purpose, such as None, '', False, and 0.
Python offers several ways to create a list of a fixed size, each with different performance characteristics.
To compare the performance of the different approaches, we will use Python’s standard module timeit. It provides a handy way to measure the run time of small chunks of Python code.
Preallocate Storage for Lists
The first and fastest way is to use the * operator, which repeats a list a specified number of times.
>>> [None] * 10
[None, None, None, None, None, None, None, None, None, None]
A million iterations (the default number of iterations in timeit) take approximately 117 ms.
>>> from timeit import timeit
>>> timeit("[None] * 10")
0.11655918900214601
Another approach is to use the range built-in function with a list comprehension.
>>> [None for _ in range(10)]
[None, None, None, None, None, None, None, None, None, None]
It’s roughly five times slower and takes about 612 ms per million iterations.
>>> timeit("[None for _ in range(10)]")
0.6115895550028654
The third approach is to use a simple for loop together with list.append().
>>> a = []
>>> for _ in range(10):
...     a.append(None)
...
>>> a
[None, None, None, None, None, None, None, None, None, None]
Using loops is the slowest method and takes 842 ms to complete a million iterations.
>>> timeit("for _ in range(10): a.append(None)", setup="a=[]")
0.8420009529945673
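To reproduce the comparison as a script rather than in the REPL, something like the following sketch may help. The function names are hypothetical, and each function builds a fresh list on every call, so the loop version is not measured against a single list that keeps growing across runs (which is what setup="a=[]" does). Absolute timings will vary with the machine and Python version.

```python
from timeit import timeit

N = 10  # size of the list to preallocate

def by_repetition():
    # Repeat a one-element list N times.
    return [None] * N

def by_comprehension():
    # Build the list element by element in a comprehension.
    return [None for _ in range(N)]

def by_append():
    # Start empty and append N times.
    a = []
    for _ in range(N):
        a.append(None)
    return a

if __name__ == "__main__":
    # timeit accepts a callable as well as a source string.
    for func in (by_repetition, by_comprehension, by_append):
        seconds = timeit(func, number=100_000)
        print(f"{func.__name__}: {seconds:.4f} s per 100,000 calls")
```

Wrapping each approach in a function adds a small, roughly constant call overhead, so the ratios between approaches may differ somewhat from the string-based measurements above.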
Preallocate Storage for Other Sequential Data Structures
Since you’re preallocating storage for a sequential data structure, it may make sense to use the array type from the standard library's array module instead of a list.
>>> from array import array
>>> array('i',[0,]*10)
array('i', [0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
As we see below, this approach is the second fastest after [None] * 10.
>>> timeit("array('i',[0,]*10)", setup="from array import array")
0.4557597979946877
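An array can also be repeated directly with *, which avoids building a temporary ten-element list first. A small sketch, assuming the same signed-int typecode 'i' as in the example above (the variable name is illustrative):

```python
from array import array

# Repeat a one-element array instead of converting a ten-element list.
zeros = array('i', [0]) * 10

# Unlike a list, an array only accepts values matching its typecode.
zeros[3] = 42
try:
    zeros[4] = "text"
except TypeError as exc:
    print("rejected:", exc)

print(zeros)  # array('i', [0, 0, 0, 42, 0, 0, 0, 0, 0, 0])
```

The type restriction is the main trade-off: an array is more compact than a list, but it cannot hold None or mixed values.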
Let’s compare the above pure Python approaches to the NumPy Python package for scientific computing.
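>>> from numpy import empty
>>> empty(10)
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
The NumPy way takes about 589 ms per million iterations. Note that empty() does not initialize the memory it returns, so the zeros shown above are not guaranteed.
>>> timeit("empty(10)", setup="from numpy import empty")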
0.5890094790011062
However, the NumPy way will be much faster for larger lists.
>>> timeit("[None]*10000")
16.059584009999526
>>> timeit("empty(10000)", setup="from numpy import empty")
1.1065983309963485
The conclusion is that it’s best to stick to [None] * 10 for small lists, but switch to NumPy’s empty() when dealing with larger sequential data.
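When the initial contents matter, NumPy's full() or zeros() may be a safer choice than empty(), which returns uninitialized memory. The helper below is a hypothetical illustration (make_buffer is not part of NumPy); it assumes NumPy may or may not be installed and falls back to a plain list if it is missing:

```python
try:
    import numpy as np  # third-party package; may not be installed
except ImportError:
    np = None

def make_buffer(n, fill=0.0):
    """Return a length-n buffer filled with `fill` (hypothetical helper)."""
    if np is not None:
        return np.full(n, fill)  # initialized, unlike np.empty(n)
    return [fill] * n            # plain-list fallback

buf = make_buffer(10_000)
print(len(buf), buf[0])
```

Initializing the buffer costs a little more than empty(), so empty() remains reasonable when every element is going to be overwritten anyway.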