python - Why does scipy linear interpolation run faster than nearest neighbor interpolation? -


i've written routine interpolates point data onto regular grid. however, find scipy's implementation of nearest neighbor interpolation performs twice slow radial basis function i'm using linear interpolation (scipy.interpolate.rbf)

relevant code includes how interpolators constructed

if interpolation_mode == 'linear':     interpolator = scipy.interpolate.rbf(         point_array[:, 0], point_array[:, 1], value_array,         function='linear', smooth=.01) elif interpolation_mode == 'nearest':     interpolator = scipy.interpolate.nearestndinterpolator(         point_array, value_array) 

and when interpolation called

result = interpolator(col_coords.ravel(), row_coords.ravel()) 

the sample i'm running on has 27 input interpolant value points , i'm interpolating across 20000 x 20000 grid. (i'm doing in memory block sizes i'm not exploding computer btw.)

below result of 2 cprofiles i've run on relevant code. note nearest neighbor scheme runs in 406 seconds while linear scheme runs in 256 seconds. nearest scheme dominated calls scipy's kdtree, seems reasonable, except rbf outperforms significant amount of time. ideas why or make nearest scheme run faster linear?

linear run:

     25362 function calls in 225.886 seconds     ordered by: internal time    list reduced 328 10 due restriction <10>     ncalls  tottime  percall  cumtime  percall filename:lineno(function)       253  169.302    0.669  207.516    0.820 c:\python27\lib\site-packages\scipy\interpolate\rbf.py:112( _euclidean_norm)       258   38.211    0.148   38.211    0.148 {method 'reduce' of 'numpy.ufunc' objects}       252    6.069    0.024    6.069    0.024 {numpy.core._dotblas.dot}         1    5.077    5.077  225.332  225.332 c:\python27\lib\site-packages\pygeoprocessing-0.3.0a8.post2 8+n5b1ee2de0d07-py2.7-win32.egg\pygeoprocessing\geoprocessing.py:333(interpolate_points_uri)       252    1.849    0.007    2.137    0.008 c:\python27\lib\site-packages\numpy\lib\function_base.py:32 85(meshgrid)       507    1.419    0.003    1.419    0.003 {method 'flatten' of 'numpy.ndarray' objects}      1268    1.368    0.001    1.368    0.001 {numpy.core.multiarray.array}       252    1.018    0.004    1.018    0.004 {_gdal_array.bandrasterionumpy}         1    0.533    0.533  225.886  225.886 pygeoprocessing\tests\helper_driver.py:10(interpolate)       252    0.336    0.001  216.716    0.860 c:\python27\lib\site-packages\scipy\interpolate\rbf.py:225( __call__) 

nearest neighbor run:

     27539 function calls in 405.624 seconds     ordered by: internal time    list reduced 309 10 due restriction <10>     ncalls  tottime  percall  cumtime  percall filename:lineno(function)       252  397.806    1.579  397.822    1.579 {method 'query' of 'ckdtree.ckdtree' objects}       252    1.875    0.007    1.881    0.007 {scipy.interpolate.interpnd._ndim_coords_from_arrays}       252    1.831    0.007    2.101    0.008 c:\python27\lib\site-packages\numpy\lib\function_base.py:3285(meshgrid)       252    1.034    0.004  400.739    1.590 c:\python27\lib\site-packages\scipy\interpolate\ndgriddata.py:60(__call__)         1    1.009    1.009  405.030  405.030 c:\python27\lib\site-packages\pygeoprocessing-0.3.0a8.post28+n5b1ee2de0d07-py2.7-win32.egg\pygeoprocessing\geoprocessing.py:333(interpolate_points_uri)       252    0.719    0.003    0.719    0.003 {_gdal_array.bandrasterionumpy}         1    0.509    0.509  405.624  405.624 pygeoprocessing\tests\helper_driver.py:10(interpolate)       252    0.261    0.001    0.261    0.001 {numpy.core.multiarray.copyto}        27    0.125    0.005    0.125    0.005 {_ogr.layer_createfeature}         1    0.116    0.116    0.254    0.254 c:\python27\lib\site-packages\pygeoprocessing-0.3.0a8.post28+n5b1ee2de0d07-py2.7-win32.egg\pygeoprocessing\geoprocessing.py:362(_parse_point_data) 

for reference, i'm including visual result of these 2 test cases.

nearest

nearest

linear

linear

running example in griddata doc:

in [47]: def func(x, y):           return x*(1-x)*np.cos(4*np.pi*x) * np.sin(4*np.pi*y**2)**2    ....:  in [48]: points = np.random.rand(1000, 2) in [49]: values = func(points[:,0], points[:,1]) in [50]: grid_x, grid_y = np.mgrid[0:1:100j, 0:1:200j] 

so have 1000 scattered points, , interpolate @ 20,000.

in [52]: timeit interpolate.griddata(points, values, (grid_x, grid_y),     method='nearest') 10 loops, best of 3: 83.6 ms per loop  in [53]: timeit interpolate.griddata(points, values, (grid_x, grid_y),     method='linear') 1 loops, best of 3: 24.6 ms per loop  in [54]: timeit interpolate.griddata(points, values, (grid_x, grid_y),      method='cubic') 10 loops, best of 3: 42.7 ms per loop 

and 2 stage interpolators:

in [55]: %%timeit  rbfi = interpolate.rbf(points[:,0],points[:,1],values) dl = rbfi(grid_x.ravel(),grid_y.ravel())    ....:  1 loops, best of 3: 3.89 s per loop  in [56]: %%timeit  ndi=interpolate.nearestndinterpolator(points, values) dl=ndi(grid_x.ravel(),grid_y.ravel())    ....:  10 loops, best of 3: 82.6 ms per loop  in [57]: %%timeit  ldi=interpolate.linearndinterpolator(points, values) dl=ldi(grid_x.ravel(),grid_y.ravel())  .... 10 loops, best of 3: 25.1 ms per loop 

griddata 1 step cover call these last 2 versions.

griddata describes methods as:

nearest return value @ data point closest point of    interpolation. see nearestndinterpolator more details.    uses scipy.spatial.ckdtree  linear tesselate input point set n-dimensional simplices,     , interpolate linearly on each simplex.     linearndinterpolator details are:       interpolant constructed triangulating        input data qhull [r37], , on each triangle        performing linear barycentric interpolation.  cubic (2-d) return value determined piecewise cubic, continuously     differentiable (c1), , approximately curvature-minimizing     polynomial surface. see cloughtocher2dinterpolator more details. 

further tests on 2 stage versions shows setting nearest ckttree fast; of time spent in 2nd interpolation state.

on other hand, setting triangulated surface takes longer linear interpolation.

i don't know enough of rbf method why slower. underlying methods different intuitions developed simple manual methods of interpolation don't mean much.

your example starts fewer scattered points, , interpolates on finer grid.


Comments

Popular posts from this blog

html - Outlook 2010 Anchor (url/address/link) -

javascript - Why does running this loop 9 times take 100x longer than running it 8 times? -

Getting gateway time-out Rails app with Nginx + Puma running on Digital Ocean -