source link: https://www.codesd.com/item/let-s-make-a-reference-implementation-of-n-dimensional-pixel-hoeing-counting-for-python-digital.html
Let's make a reference implementation of N-dimensional pixel binning / bucketing for Python's NumPy

I frequently want to pixel bin/pixel bucket a numpy array, meaning, replace groups of N consecutive pixels with a single pixel which is the sum of the N replaced pixels. For example, start with the values:

x = np.array([1, 3, 7, 3, 2, 9])

with a bucket size of 2, this transforms into:

bucket(x, bucket_size=2)
= [1+3, 7+3, 2+9]
= [4, 10, 11]

As far as I know, there's no numpy function that specifically does this (please correct me if I'm wrong!), so I frequently roll my own. For 1d numpy arrays, this isn't bad:

import numpy as np

def bucket(x, bucket_size):
    return x.reshape(x.size // bucket_size, bucket_size).sum(axis=1)

bucket_me = np.array([3, 4, 5, 5, 1, 3, 2, 3])
print(bucket(bucket_me, bucket_size=2)) #[ 7 10  4  5]

...however, I get confused easily for the multidimensional case, and I end up rolling my own buggy, half-assed solution to this "easy" problem over and over again. I'd love it if we could establish a nice N-dimensional reference implementation.

  • Preferably the function call would allow different bin sizes along different axes (perhaps something like bucket(x, bucket_size=(2, 2, 3)))

  • Preferably the solution would be reasonably efficient (reshape and sum are fairly quick in numpy)

  • Bonus points for handling edge effects when the array doesn't divide nicely into an integer number of buckets.

  • Bonus points for allowing the user to choose the initial bin edge offset.

As suggested by Divakar, here's my desired behavior in a sample 2-D case:

x = np.array([[1, 2, 3, 4],
              [2, 3, 7, 9],
              [8, 9, 1, 0],
              [0, 0, 3, 4]])

bucket(x, bucket_size=(2, 2))
= [[1 + 2 + 2 + 3, 3 + 4 + 7 + 9],
   [8 + 9 + 0 + 0, 1 + 0 + 3 + 4]]
= [[8, 23],
   [17, 8]]

...hopefully I did my arithmetic correctly ;)
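The arithmetic for this 4x4 example can be checked with a plain reshape-and-sum. This is only a sanity check for this specific shape and bucket size (the variable name `expected` is made up for illustration), not a general solution:

```python
import numpy as np

x = np.array([[1, 2, 3, 4],
              [2, 3, 7, 9],
              [8, 9, 1, 0],
              [0, 0, 3, 4]])

# Split each axis of length 4 into (2 buckets, 2 pixels per bucket),
# so element [i, a, j, b] of the reshaped array is x[2*i + a, 2*j + b].
# Summing over the "pixels per bucket" axes (1 and 3) gives the bucket sums.
expected = x.reshape(2, 2, 2, 2).sum(axis=(1, 3))
print(expected)
# [[ 8 23]
#  [17  8]]
```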


Using as_strided directly:

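import numpy as np
from numpy.lib.stride_tricks import as_strided

x = np.array([[1, 2, 3, 4],
              [2, 3, 7, 9],
              [8, 9, 1, 0],
              [0, 0, 3, 4]])

def bucket(x, bucket_size):
    x = np.ascontiguousarray(x)
    oldshape = np.array(x.shape)
    # New shape: the grid of buckets, followed by the shape of one bucket.
    newshape = np.concatenate((oldshape // bucket_size, bucket_size))
    oldstrides = np.array(x.strides)
    # Moving to the next bucket skips bucket_size elements along that axis;
    # moving inside a bucket uses the original strides.
    newstrides = np.concatenate((oldstrides * bucket_size, oldstrides))
    # Sum over the trailing axes, i.e. over the elements inside each bucket.
    axis = tuple(range(x.ndim, 2 * x.ndim))
    return as_strided(x, newshape, newstrides).sum(axis)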

If a bucket size does not divide evenly into the corresponding dimension of x, the remaining elements are dropped.
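If dropping the remainder is not acceptable, one possible alternative is to zero-pad the trailing edge of each axis so that the last partial bucket is kept. The sketch below is not part of the answer above: `bucket_padded` and its `offset` parameter are hypothetical names, and it uses reshape and sum rather than as_strided. It assumes that elements before the initial bin edge should be discarded, which is only one way to read the "offset" request in the question.

```python
import numpy as np

def bucket_padded(x, bucket_size, offset=None):
    """Sum-bin x along every axis, zero-padding the trailing edge.

    Hypothetical helper for illustration. `bucket_size` is an int or one
    int per axis. `offset` gives, per axis, how many leading elements to
    skip before the first bin edge (those elements are dropped).
    """
    x = np.asarray(x)
    bucket_size = np.broadcast_to(bucket_size, (x.ndim,))
    if offset is None:
        offset = (0,) * x.ndim
    # Drop the leading elements before the first bin edge on each axis.
    x = x[tuple(slice(int(o), None) for o in offset)]
    # Zero-pad the end of each axis up to the next multiple of its bin size.
    pad = [(0, int(-n % b)) for n, b in zip(x.shape, bucket_size)]
    x = np.pad(x, pad, mode="constant")
    # Split every axis into (number of bins, bin size) ...
    split_shape = []
    for n, b in zip(x.shape, bucket_size):
        split_shape += [n // int(b), int(b)]
    # ... and sum over the bin-size axes, which sit at the odd positions.
    return x.reshape(split_shape).sum(axis=tuple(range(1, 2 * x.ndim, 2)))

print(bucket_padded(np.array([1, 3, 7, 3, 2]), 2))                    # [ 4 10  2]
print(bucket_padded(np.array([1, 3, 7, 3, 2, 9]), 2, offset=(1,)))    # [10  5  9]
```

Zero-padding keeps the sums correct but means the edge buckets cover fewer real pixels than the others, which matters if the result is later normalized by bucket area.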

Verification:

In [9]: bucket(x,(2,2))
Out[9]:
array([[ 8, 23],
       [17,  8]])