Codecs for variable-length objects

VLenUTF8

class numcodecs.vlen.VLenUTF8

Encode variable-length unicode string objects via UTF-8.

Notes

The UTF-8 encoded bytes of each string are packed into a Parquet-style byte array.
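As a rough illustration of what a length-prefixed, Parquet-style layout looks like, the pure-Python sketch below packs UTF-8 strings into a single buffer. It assumes a 4-byte little-endian item count followed by a 4-byte little-endian length in front of each item. The helper names `pack_strings` and `unpack_strings` are hypothetical, and this is not the library's implementation.

```python
import struct


def pack_strings(strings):
    """Pack strings as: item count, then (length, UTF-8 bytes) per item."""
    # Assumed header: number of items as a little-endian unsigned 32-bit int.
    parts = [struct.pack("<I", len(strings))]
    for s in strings:
        data = s.encode("utf-8")
        # Assumed per-item prefix: byte length as a little-endian 32-bit int.
        parts.append(struct.pack("<I", len(data)))
        parts.append(data)
    return b"".join(parts)


def unpack_strings(buf):
    """Reverse pack_strings, returning a list of str."""
    (count,) = struct.unpack_from("<I", buf, 0)
    offset = 4
    out = []
    for _ in range(count):
        (length,) = struct.unpack_from("<I", buf, offset)
        offset += 4
        out.append(buf[offset:offset + length].decode("utf-8"))
        offset += length
    return out


packed = pack_strings(["foo", "bar", "baz"])
restored = unpack_strings(packed)
```

Because every item carries its own length, strings of different sizes can sit back to back in one contiguous buffer without padding or terminators.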

Examples

>>> import numcodecs
>>> import numpy as np
>>> x = np.array(['foo', 'bar', 'baz'], dtype='object')
>>> codec = numcodecs.VLenUTF8()
>>> codec.decode(codec.encode(x))
array(['foo', 'bar', 'baz'], dtype=object)

codec_id = 'vlen-utf8'
encode(self, buf)
decode(self, buf, out=None)
get_config()

Return a dictionary holding configuration parameters for this codec. Must include an ‘id’ field with the codec identifier. All values must be compatible with JSON encoding.

classmethod from_config(config)

Instantiate codec from a configuration object.

VLenBytes

class numcodecs.vlen.VLenBytes

Encode variable-length byte string objects.

Notes

The bytes of each byte string are packed into a Parquet-style byte array.

Examples

>>> import numcodecs
>>> import numpy as np
>>> x = np.array([b'foo', b'bar', b'baz'], dtype='object')
>>> codec = numcodecs.VLenBytes()
>>> codec.decode(codec.encode(x))
array([b'foo', b'bar', b'baz'], dtype=object)

codec_id = 'vlen-bytes'
encode(self, buf)
decode(self, buf, out=None)
get_config()

Return a dictionary holding configuration parameters for this codec. Must include an ‘id’ field with the codec identifier. All values must be compatible with JSON encoding.

classmethod from_config(config)

Instantiate codec from a configuration object.

VLenArray

class numcodecs.vlen.VLenArray

Encode variable-length 1-dimensional arrays.

Notes

The binary data for each array is packed into a Parquet-style byte array.

Examples

>>> import numcodecs
>>> import numpy as np
>>> x = np.array([[1, 3, 5], [4], [7, 9]], dtype='object')
>>> codec = numcodecs.VLenArray('<i4')
>>> codec.decode(codec.encode(x))
array([array([1, 3, 5], dtype=int32), array([4], dtype=int32),
       array([7, 9], dtype=int32)], dtype=object)

codec_id = 'vlen-array'
encode(self, buf)
decode(self, buf, out=None)
get_config(self)
classmethod from_config(config)

Instantiate codec from a configuration object.