Categorize

class numcodecs.categorize.Categorize(labels, dtype, astype='u1')

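Filter that encodes categorical string data as integers. Values that match no label are encoded as 0 and decode to an empty string, so encoding is lossy for such values.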

Parameters:

labels : sequence of strings

Category labels. Each label is encoded as its 1-based position in this sequence.

dtype : dtype

Data type to use for decoded data.

astype : dtype, optional

Data type to use for encoded data. Defaults to 'u1' (unsigned 8-bit integer).

Examples

>>> import numcodecs as codecs
>>> import numpy as np
>>> x = np.array([b'male', b'female', b'female', b'male', b'unexpected'])
>>> x
array([b'male', b'female', b'female', b'male', b'unexpected'],
      dtype='|S10')
>>> f = codecs.Categorize(labels=[b'female', b'male'], dtype=x.dtype)
>>> y = f.encode(x)
>>> y
array([2, 1, 1, 2, 0], dtype=uint8)
>>> z = f.decode(y)
>>> z
array([b'male', b'female', b'female', b'male', b''],
      dtype='|S10')
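
The mapping shown in the example above can be sketched in plain Python. This is an illustration of the observed behaviour only, not the numcodecs implementation; the names code_of, label_of, encode and decode are hypothetical helpers introduced here.

```python
# Illustrative sketch only: reproduces the mapping seen in the example,
# not the library's actual code.
labels = [b'female', b'male']

# Hypothetical lookup tables: label -> 1-based code, and code -> label.
code_of = {label: i for i, label in enumerate(labels, start=1)}
label_of = {i: label for label, i in code_of.items()}


def encode(values):
    # Values that match no label fall back to code 0.
    return [code_of.get(v, 0) for v in values]


def decode(codes):
    # Code 0 has no label, so it decodes to an empty value.
    return [label_of.get(c, b'') for c in codes]


x = [b'male', b'female', b'female', b'male', b'unexpected']
y = encode(x)
z = decode(y)
print(y)
print(z)
```

Because code 0 is reserved for unmatched values, decoding cannot recover b'unexpected'; only values listed in labels survive the round trip.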

codec_id = 'categorize'