Codecs for variable-length objects#
VLenUTF8#
- class numcodecs.vlen.VLenUTF8#
Encode variable-length unicode string objects via UTF-8.
Notes
The encoded bytes values for each string are packed into a parquet-style byte array.
Examples
>>> import numcodecs >>> import numpy as np >>> x = np.array(['foo', 'bar', 'baz'], dtype='object') >>> codec = numcodecs.VLenUTF8() >>> codec.decode(codec.encode(x)) array(['foo', 'bar', 'baz'], dtype=object)
- codec_id: str | None = 'vlen-utf8'#
Codec identifier.
- encode(self, buf)#
- decode(self, buf, out=None)#
- get_config()#
Return a dictionary holding configuration parameters for this codec. Must include an ‘id’ field with the codec identifier. All values must be compatible with JSON encoding.
- classmethod from_config(config)#
Instantiate codec from a configuration object.
VLenBytes#
- class numcodecs.vlen.VLenBytes#
Encode variable-length byte string objects.
See also
Notes
The bytes values for each string are packed into a parquet-style byte array.
Examples
>>> import numcodecs >>> import numpy as np >>> x = np.array([b'foo', b'bar', b'baz'], dtype='object') >>> codec = numcodecs.VLenBytes() >>> codec.decode(codec.encode(x)) array([b'foo', b'bar', b'baz'], dtype=object)
- codec_id: str | None = 'vlen-bytes'#
Codec identifier.
- encode(self, buf)#
- decode(self, buf, out=None)#
- get_config()#
Return a dictionary holding configuration parameters for this codec. Must include an ‘id’ field with the codec identifier. All values must be compatible with JSON encoding.
- classmethod from_config(config)#
Instantiate codec from a configuration object.
VLenArray#
- class numcodecs.vlen.VLenArray(dtype)#
Encode variable-length 1-dimensional arrays via UTF-8.
Notes
The binary data for each array are packed into a parquet-style byte array.
Examples
>>> import numcodecs >>> import numpy as np >>> x1 = np.array([1, 3, 5], dtype=np.int16) >>> x2 = np.array([4], dtype=np.int16) >>> x3 = np.array([7, 9], dtype=np.int16) >>> x = np.array([x1, x2, x3], dtype='object') >>> codec = numcodecs.VLenArray('<i2') >>> codec.decode(codec.encode(x)) array([array([1, 3, 5], dtype=int16), array([4], dtype=int16), array([7, 9], dtype=int16)], dtype=object)
- codec_id: str | None = 'vlen-array'#
Codec identifier.
- encode(self, buf)#
- decode(self, buf, out=None)#
- get_config(self)#
- classmethod from_config(config)#
Instantiate codec from a configuration object.