```python
# Python native modules
import os
from copy import deepcopy
# Third party libs
from fastcore.all import *
from torch import Tensor
import numpy as np
# Local modules
```
# Speed

Some obvious / not-so-obvious notes on speed.

## Numpy to Tensor Performance
```python
img = np.random.randint(0, 255, size=(240, 320, 3))
```

```python
%timeit img = np.random.randint(0, 255, size=(240, 320, 3))
```

    1.61 ms ± 23.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```python
%timeit deepcopy(img)
```

    240 µs ± 4.64 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```python
%timeit Tensor(img)
```

    79.2 µs ± 3.19 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```python
%timeit Tensor([img])
```

    /opt/conda/lib/python3.7/site-packages/ipykernel_launcher.py:2: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at ../torch/csrc/utils/tensor_new.cpp:230.)

    135 ms ± 3.56 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
Notice that wrapping a numpy array in a list completely kills performance. The solution is to add a batch dimension to the existing array and pass it directly.
```python
%timeit Tensor(np.expand_dims(img, 0))
```

    85.6 µs ± 4.48 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
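The batch-dimension trick works because `np.expand_dims(img, 0)` (or the equivalent `img[None]` slicing shorthand) produces a single `(1, H, W, C)` view of the existing array, so the tensor constructor never sees a Python list of arrays. A minimal numpy-only sketch of what each form produces:

```python
import numpy as np

img = np.random.randint(0, 255, size=(240, 320, 3))

# Both forms add a leading batch axis without building a Python list
batched_a = np.expand_dims(img, 0)
batched_b = img[None]  # equivalent slicing shorthand

assert batched_a.shape == (1, 240, 320, 3)
assert batched_b.shape == (1, 240, 320, 3)
# These are views over the original buffer, not copies
assert np.shares_memory(batched_a, img)
assert np.shares_memory(batched_b, img)
```

Since both expressions return views, adding the batch dimension itself is essentially free; the cost measured above is just the array-to-tensor conversion.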
In fact, we can test this with plain Python lists…
```python
%timeit Tensor([[1]])
```

    6.75 µs ± 95.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
```python
test_arr = [[1]*270000]
%timeit Tensor(test_arr)
```

    9.55 ms ± 221 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
```python
test_arr = np.array([[1]*270000])
%timeit Tensor(test_arr)
```

    88 µs ± 5.93 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
It is striking how large a performance hit this causes… so we will be avoiding Python list inputs to Tensors from now on.
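The same conversion cost can be measured reproducibly outside a notebook. The sketch below uses `timeit` with plain numpy as a stand-in for the `Tensor(...)` calls above (the expensive step in both cases is walking the Python list; converting the list to an ndarray once removes it). The exact ratio will vary by machine:

```python
import timeit
import numpy as np

test_list = [[1] * 270000]

# Slow path: every call walks the nested Python list element by element
list_time = timeit.timeit(lambda: np.array(test_list), number=20)

# Fast path: convert once up front, then hand over the existing buffer
test_arr = np.array(test_list)
arr_time = timeit.timeit(lambda: np.asarray(test_arr), number=20)  # no-op view

print(f"from list:  {list_time:.4f}s for 20 conversions")
print(f"from array: {arr_time:.6f}s for 20 conversions")
assert list_time > arr_time
```

This mirrors the notebook timings: the two-order-of-magnitude gap comes from the list traversal, not from the tensor machinery itself.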