Categorize

class numcodecs.categorize.Categorize(labels, dtype, astype='u1')

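Filter that encodes categorical string data as integers. Labels are assigned integer codes starting from 1, in the order given. As the example below shows, a value that is not among the labels is encoded as 0 and decoded as an empty string.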

Parameters:

labels : sequence of strings

Category labels.

dtype : dtype

Data type to use for decoded data.

astype : dtype, optional

Data type to use for encoded data. Defaults to 'u1' (unsigned 8-bit integer).

Examples

>>> import numcodecs
>>> import numpy as np
>>> x = np.array([b'male', b'female', b'female', b'male', b'unexpected'])
>>> x
array([b'male', b'female', b'female', b'male', b'unexpected'],
      dtype='|S10')
>>> codec = numcodecs.Categorize(labels=[b'female', b'male'], dtype=x.dtype)
>>> y = codec.encode(x)
>>> y
array([2, 1, 1, 2, 0], dtype=uint8)
>>> z = codec.decode(y)
>>> z
array([b'male', b'female', b'female', b'male', b''],
      dtype='|S10')
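
The mapping seen in the example above can be sketched in plain Python. This is only an illustration of the scheme implied by the output (codes start at 1, 0 is reserved for unknown values); it is not the library's implementation, and `make_categorizer` is a hypothetical helper, not part of numcodecs.

```python
def make_categorizer(labels):
    """Return (encode, decode) functions for the given labels.

    Hypothetical helper for illustration only; not part of numcodecs.
    """
    # Codes start at 1; 0 is reserved for values that are not in `labels`.
    code_of = {label: i + 1 for i, label in enumerate(labels)}

    def encode(values):
        # Unknown values fall back to code 0.
        return [code_of.get(v, 0) for v in values]

    def decode(codes):
        # Code 0 (or any out-of-range code) decodes to an empty string.
        return [labels[c - 1] if 1 <= c <= len(labels) else b"" for c in codes]

    return encode, decode


encode, decode = make_categorizer([b"female", b"male"])
x = [b"male", b"female", b"female", b"male", b"unexpected"]
y = encode(x)
z = decode(y)
```

Here `y` holds the same codes as the encoded array in the example, and `z` reproduces the decoded values, with the unexpected entry replaced by an empty byte string. The round trip is therefore lossy for any value outside the label list.
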
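
codec_id = 'categorize'

Codec identifier.

encode(buf)

Encode data in buf.

decode(buf, out=None)

Decode data in buf. If out is given, the decoded data are written into it.

get_config()

Return a dictionary holding configuration parameters for this codec.

from_config(config)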

Instantiate codec from a configuration object.