Jazz 1.25.+
A block is a moveable BlockHeader followed by a Tensor and a StringBuffer.
#include <block.h>
Public Member Functions | |
void | set_dimensions (int *p_dim) |
void | get_dimensions (int *p_dim) |
bool | validate_index (int *p_idx) |
int | validate_offset (int offset) |
int | get_offset (int *p_idx) |
void | get_index (int offset, int *p_idx) |
char * | get_string (int *p_idx) |
char * | get_string (int offset) |
void | set_string (int *p_idx, const char *p_str) |
void | set_string (int offset, const char *p_str) |
char * | get_attribute (int attribute_id) |
void | set_attributes (AttributeMap *all_att) |
void | get_attributes (AttributeMap *all_att) |
void | init_string_buffer () |
bool | find_NAs_in_tensor () |
int * | align64bit (uintptr_t ipt) |
Align a pointer (as uintptr_t) to the next 8 byte boundary assuming the block is aligned. | |
int * | p_attribute_keys () |
pStringBuffer | p_string_buffer () |
int | get_string_offset (pStringBuffer psb, const char *p_str) |
bool | is_a_filter () |
Check (in depth) the validity of a filter. | |
bool | can_filter (pBlock p_block) |
void | close_block (int set_has_NA=SET_HAS_NA_FALSE, bool set_hash=true, bool set_time=true) |
bool | check_hash () |
Additional Inherited Members | |
int | cell_type |
The type for the cells in the tensor. See CELL_TYPE_*. | |
int | size |
The total number of cells in the tensor. | |
TimePoint | created |
Timestamp when the block was created. | |
int | rank |
The number of dimensions. | |
TensorDim | range |
The dimensions of the tensor in terms of ranges (maximum size is 2 GB). | |
int | num_attributes |
Number of elements in the JazzAttributesMap. | |
int | total_bytes |
Total size of the block, everything included. | |
bool | has_NA |
If true, at least one value is an NA and the block requires NA-aware arithmetic. | |
uint64_t | hash64 |
Hash of everything but the header. | |
Tensor | tensor |
A tensor for type cell_type and dimensions set by Block.set_dimensions() | |
Anything in Jazz is a block. A block is a BlockHeader, followed by a tensor, then two arrays of int of length == num_attributes, then a StringBuffer. Nothing in a Block is a pointer, so Blocks can be copied or stored 'as is'. Every RAM location in a block is defined by its BlockHeader and computed by the methods in Block.
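The layout described above can be sketched as plain offset arithmetic. This is only an illustration under stated assumptions: MiniHeader, MiniLayout, layout_of() and the cell_bytes field are hypothetical names, the real BlockHeader has more fields, and the 8-byte rounding between parts is assumed here rather than taken from block.h.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, reduced header (NOT the real BlockHeader).
struct MiniHeader {
    int cell_bytes;      // assumed: size of one tensor cell in bytes
    int size;            // number of cells in the tensor
    int num_attributes;  // number of attributes
};

// Byte offsets of each part, measured from the start of the block.
struct MiniLayout {
    std::size_t tensor;
    std::size_t attribute_ints;  // the two int arrays, stored back to back
    std::size_t string_buffer;
};

// Round n up to the next multiple of 8 (assumes the block itself is aligned).
inline std::size_t round_up_8(std::size_t n) {
    return (n + 7) & ~static_cast<std::size_t>(7);
}

// Everything is computed from the header alone, so no pointers are stored.
inline MiniLayout layout_of(const MiniHeader &h) {
    assert(h.cell_bytes > 0 && h.size >= 0 && h.num_attributes >= 0);
    MiniLayout l;
    l.tensor = round_up_8(sizeof(MiniHeader));
    l.attribute_ints = round_up_8(
        l.tensor + static_cast<std::size_t>(h.cell_bytes) * h.size);
    l.string_buffer = round_up_8(
        l.attribute_ints + 2 * sizeof(int) * h.num_attributes);
    return l;
}
```

Because every location is derived from header fields, copying the raw bytes of such a block to another address keeps it valid, which is the property the paragraph above relies on.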
At this level, you only have the BlockHeader fields, which you may read but should probably only write through the provided methods. This is the lowest level: it does not even provide support for allocation. What it does provide is support for manipulating the StringBuffer to read and write strings, and the JazzAttributesMap to read and write attributes.
A filter (no longer a separate class) is just a Block with a strict structure and extra methods.
The structure of a filter is strictly:
Details:
void jazz_elements::Block::set_dimensions (int *p_dim) [inline]
Sets the tensor dimensions from a TensorDim array.
p_dim | A pointer to the TensorDim containing the dimensions. |
NOTES: 1. This writes: rank, range[] and size.
void jazz_elements::Block::get_dimensions (int *p_dim) [inline]
Returns the tensor dimensions as a TensorDim array.
p_dim | A pointer to the TensorDim containing the dimensions. |
NOTES: See notes on set_dimensions() to understand why in case of 0 and 1, it may return different values than those passed when the block was created with a set_dimensions() call.
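bool jazz_elements::Block::validate_index (int *p_idx) [inline]
int jazz_elements::Block::validate_offset (int offset) [inline]
Returns whether an offset (as an integer) is valid for the tensor.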
offset | An offset corresponding to the cell as if the tensor was a linear vector. |
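An offset addresses a cell as if the tensor were a linear vector. A minimal sketch of how such an offset can relate to a multi-dimensional index, assuming row-major order (last dimension varies fastest). The helper names are hypothetical and the real get_offset(), get_index() and validate_offset() may compute this differently, for example from the stored range[] values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: true if offset addresses a cell of a tensor with
// the given dimension sizes.
inline bool offset_is_valid(const std::vector<int> &dims, int offset) {
    long long size = 1;
    for (int d : dims) size *= d;
    return offset >= 0 && offset < size;
}

// Hypothetical helper: row-major index -> linear offset.
inline int index_to_offset(const std::vector<int> &dims,
                           const std::vector<int> &idx) {
    assert(dims.size() == idx.size());
    int offset = 0;
    for (std::size_t i = 0; i < dims.size(); ++i)
        offset = offset * dims[i] + idx[i];  // Horner-style accumulation
    return offset;
}

// Hypothetical helper: linear offset -> row-major index.
inline std::vector<int> offset_to_index(const std::vector<int> &dims,
                                        int offset) {
    std::vector<int> idx(dims.size());
    for (std::size_t i = dims.size(); i-- > 0;) {
        idx[i] = offset % dims[i];
        offset /= dims[i];
    }
    return idx;
}
```

For a tensor of dimensions {2, 3, 4}, the index {1, 2, 3} maps to offset 23, the last valid offset, and converting 23 back yields the same index.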
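int jazz_elements::Block::get_offset (int *p_idx) [inline]
void jazz_elements::Block::get_index (int offset, int *p_idx) [inline]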
char * jazz_elements::Block::get_string (int *p_idx) [inline]
Get a string from the tensor by index without checking index range.
p_idx | A pointer to the TensorDim containing the index. |
NOTE: Use the pointer as read-only (more than one cell may point to the same value) and never try to free it.
char * jazz_elements::Block::get_string (int offset) [inline]
Get a string from the tensor by offset without checking offset range.
offset | An offset corresponding to the cell as if the tensor was a linear vector. |
NOTE: Use the pointer as read-only (more than one cell may point to the same value) and never try to free it.
void jazz_elements::Block::set_string (int *p_idx, const char *p_str) [inline]
Set a string in the tensor, if there is enough allocation space to contain it, by index without checking index range.
p_idx | A pointer to the TensorDim containing the index. |
p_str | A pointer to a (zero-terminated) string that will be allocated inside the Block. |
NOTE: Allocation inside a Block is typically hard because Blocks are created with "just enough space" and are typically immutable. jazz_alloc.h contains methods that make a Block bigger if necessary; this method does not. The 100% safe way is to create a new block from the immutable one using jazz_alloc.h methods. Otherwise, use it at your own risk or not at all. When this fails, it sets the variable alloc_failed in the StringBuffer. Once alloc_failed is true, it does not even try to allocate.
void jazz_elements::Block::set_string (int offset, const char *p_str) [inline]
Set a string in the tensor, if there is enough allocation space to contain it, by offset without checking offset range.
offset | An offset corresponding to the cell as if the tensor was a linear vector. |
p_str | A pointer to a (zero-terminated) string that will be allocated inside the Block. |
NOTE: Allocation inside a Block is typically hard because Blocks are created with "just enough space" and are typically immutable. jazz_alloc.h contains methods that make a Block bigger if necessary; this method does not. The 100% safe way is to create a new block from the immutable one using jazz_alloc.h methods. Otherwise, use it at your own risk or not at all. When this fails, it sets the variable alloc_failed in the StringBuffer. Once alloc_failed is true, it does not even try to allocate.
char * jazz_elements::Block::get_attribute (int attribute_id) [inline]
Find an attribute by its attribute_id (key).
attribute_id | A unique 32-bit id defining what the attribute is: e.g., a Jazz class, a url, a mime type, ... |
NOTE: The search is linear since no assumptions can be made about how keys are ordered. Typically, Blocks have few (fewer than 5) attributes, so that is okay. There is no limit on the number of attributes, so if you want to use the block's attributes as a dictionary containing, e.g., thousands of config keys, use get_attributes() instead, which returns a map with all the attributes.
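A minimal sketch of such a linear search, assuming the array layout described under p_attribute_keys() (keys in the lower half, offsets to the values in the upper half). The function name find_attribute() and the nullptr result for a missing key are assumptions of this sketch, not the documented behavior of get_attribute().

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper. keys_and_offsets holds 2 * num_attributes ints:
// [0, n) are the keys, [n, 2n) are offsets of the values in string_buffer.
inline const char *find_attribute(const int *keys_and_offsets,
                                  int num_attributes,
                                  const char *string_buffer,
                                  int attribute_id) {
    assert(num_attributes >= 0);
    for (int i = 0; i < num_attributes; ++i) {
        if (keys_and_offsets[i] == attribute_id)
            return string_buffer + keys_and_offsets[num_attributes + i];
    }
    return nullptr;  // assumed result when the key is not present
}
```

The cost grows with the number of attributes on every lookup, which is why copying everything into a map once is the better choice for large attribute sets.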
void jazz_elements::Block::set_attributes (AttributeMap *all_att) [inline]
Set all attributes of a Block, only when creating it, using a map.
all_att | A map containing all the attributes for the block. A call with nullptr is required for initialization. |
NOTE: This function is public because it has to be called by jazz_alloc.h methods. set_attributes() can only be called once, so it will do nothing if called after a Block is built. Blocks are near-immutable objects; if you need to change a Block's attributes, create a new object using jazz_alloc.h methods.
void jazz_elements::Block::get_attributes (AttributeMap *all_att) [inline]
Get all attributes of a Block storing them inside a map.
all_att | A (typically empty) map where all the attributes will be stored. |
NOTE: You can use a non-empty map. This will keep existing key/values not found in the Block and create/override those in the Block by using a normal 'map[key] = value' instruction.
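The merge behavior in the note can be sketched with a standard map. MiniAttributeMap and copy_attributes() are hypothetical stand-ins; the real AttributeMap type is defined by Jazz and may differ.

```cpp
#include <cassert>
#include <cstring>
#include <map>

// Hypothetical stand-in for AttributeMap.
using MiniAttributeMap = std::map<int, const char *>;

// Hypothetical helper. keys_and_offsets holds 2 * n ints: keys first, then
// the offsets of the values inside string_buffer.
inline void copy_attributes(const int *keys_and_offsets, int n,
                            const char *string_buffer,
                            MiniAttributeMap &out) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        // Plain map[key] = value: creates new keys, overrides existing ones.
        out[keys_and_offsets[i]] = string_buffer + keys_and_offsets[n + i];
    }
}
```

Keys that are already in the map but not in the block are left untouched, so a caller can pre-load defaults and let the block's attributes override them.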
void jazz_elements::Block::init_string_buffer () [inline]
Initialize the StringBuffer of a Block, only when creating it.
NOTE: This function is public because it has to be called by jazz_alloc.h methods. Never call it to construct your own Blocks except for test cases or in jazz_alloc.h methods. Use those to build Blocks instead.
bool jazz_elements::Block::find_NAs_in_tensor ()
Scan a tensor object to see if it contains any NA value of the type specified in cell_type.
Note: For boolean types, everything other than true (1) or false (0) is considered NA.
Note: For floating point types, only binary identity with F_NA or R_NA counts as NA.
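The two rules above can be sketched as follows. MINI_NA_BITS is an arbitrary NaN payload chosen for this sketch, not the actual value of F_NA or R_NA, and the helper names are hypothetical. The point is that the comparison is done on the bit pattern, so an ordinary NaN is not treated as NA.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical NA bit pattern for doubles (NOT the real R_NA).
constexpr std::uint64_t MINI_NA_BITS = 0x7ff80000000007a2ULL;

// Build a double from raw bits without any numeric conversion.
inline double double_from_bits(std::uint64_t bits) {
    double x;
    std::memcpy(&x, &bits, sizeof x);
    return x;
}

// True only if some cell is bit-for-bit identical to the NA pattern.
inline bool doubles_have_na(const double *cells, int n) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        std::uint64_t bits;
        std::memcpy(&bits, &cells[i], sizeof bits);
        if (bits == MINI_NA_BITS) return true;
    }
    return false;
}

// For boolean cells, anything other than 0 or 1 counts as NA.
inline bool bools_have_na(const std::uint8_t *cells, int n) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i)
        if (cells[i] > 1) return true;
    return false;
}
```

A plain `x != x` test would flag every NaN, which is why the sketch compares raw bits instead.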
int * jazz_elements::Block::align64bit (uintptr_t ipt) [inline]
Align a pointer (as uintptr_t) to the next 8 byte boundary assuming the block is aligned.
ipt | The input pointer as an integer. |
However, in the case of misaligned blocks (like those returned by lmdb), it is mandatory that all the arithmetic works just as if the block were aligned in another base (called gap). So this will return the same misalignment the block has. Of course, blocks generated by Containers (other than lmdb) allocated via malloc will have no misalignment.
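A minimal sketch of alignment that preserves the block's own misalignment, assuming the gap is simply the block's base address modulo 8. The function name and the explicit block_base parameter are hypothetical; align64bit() takes only ipt and obtains the block address itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: round ipt up to the next address that sits at a
// multiple of 8 bytes from the start of the block, wherever the block is.
inline std::uintptr_t align8_with_gap(std::uintptr_t block_base,
                                      std::uintptr_t ipt) {
    assert(ipt >= block_base);
    const std::uintptr_t gap = block_base & 7;  // 0 for malloc'ed blocks
    return ((ipt - gap + 7) & ~static_cast<std::uintptr_t>(7)) + gap;
}
```

With a base of 0x1000 the result is an ordinary 8-byte aligned address. With a base of 0x1003 the results are 0x1003, 0x100B, 0x1013 and so on, so offsets inside the block stay the same as in an aligned copy.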
int * jazz_elements::Block::p_attribute_keys () [inline]
Return the address of the vector containing both the attribute keys and the offsets of the attribute values in the StringBuffer.
NOTE: The actual values (which are strings) are stored in the same StringBuffer containing the strings of the tensor (if any). This array has double the num_attributes size and stores the keys in the lower part and the offsets to the values in the upper part.
pStringBuffer jazz_elements::Block::p_string_buffer () [inline]
Return the address of the StringBuffer containing the strings in the tensor and the attribute values.
NOTE: The StringBuffer is the last part of the Block and contains all the strings in the tensor and the attributes. When the Block is created, the StringBuffer is initialized with two zero bytes (STRING_NA and STRING_EMPTY).
int jazz_elements::Block::get_string_offset (pStringBuffer psb, const char *p_str)
Find an existing string in a block, or allocate a new one and return its offset in the StringBuffer.buffer.
psb | The address of the pStringBuffer (passed to avoid calling p_string_buffer repeatedly). |
p_str | The string to find or allocate in the StringBuffer. |
NOTE: This function is private, called by set_attributes() and set_string(). Use these functions instead and read their NOTES.
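The find-or-allocate behavior, together with the alloc_failed flag described in the set_string() notes, can be sketched with a small fixed-capacity buffer. MiniStringBuffer, its fields other than alloc_failed, the capacity, and returning the NA offset on failure are all assumptions of this sketch.

```cpp
#include <cassert>
#include <cstring>

constexpr int MINI_CAPACITY     = 32;  // hypothetical fixed capacity
constexpr int MINI_STRING_NA    = 0;   // offset of the first reserved zero byte
constexpr int MINI_STRING_EMPTY = 1;   // offset of the second reserved zero byte

// Hypothetical stand-in for StringBuffer.
struct MiniStringBuffer {
    bool alloc_failed = false;
    int  used = 2;                      // two zero bytes are reserved up front
    char buffer[MINI_CAPACITY] = {0};
};

// Return the offset of p_str, reusing an identical stored string if any.
inline int find_or_add(MiniStringBuffer &sb, const char *p_str) {
    assert(p_str != nullptr);
    if (p_str[0] == 0) return MINI_STRING_EMPTY;

    // Linear scan over the zero-terminated strings already stored.
    for (int off = 2; off < sb.used;
         off += static_cast<int>(std::strlen(sb.buffer + off)) + 1) {
        if (std::strcmp(sb.buffer + off, p_str) == 0) return off;
    }

    if (sb.alloc_failed) return MINI_STRING_NA;  // do not even try again

    const int len = static_cast<int>(std::strlen(p_str)) + 1;
    if (sb.used + len > MINI_CAPACITY) {
        sb.alloc_failed = true;                  // assumed failure signal
        return MINI_STRING_NA;
    }
    std::memcpy(sb.buffer + sb.used, p_str, len);
    const int offset = sb.used;
    sb.used += len;
    return offset;
}
```

Because identical strings share one offset, several cells can point at the same bytes, which is why the pointers returned by get_string() must be treated as read-only.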
bool jazz_elements::Block::is_a_filter ()
Check (in depth) the validity of a filter.
Essentially, this checks that an integer filter is sorted or that a boolean filter has no NA. When using a filter, can_filter() does not check that. Instead, an invalid filter fails when filtering a block, since that step checks the order anyway. This avoids checking the same thing twice. Checking (is_a_filter() && can_filter()) checks it twice, but ensures that selecting will not fail.
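A minimal sketch of the two deep checks, under stated assumptions: "sorted" is taken to mean strictly increasing, boolean NA follows the rule given for find_NAs_in_tensor() (anything other than 0 or 1), and both helper names are hypothetical. The real is_a_filter() also verifies the strict structure of the block, which is not shown here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical deep check for an integer filter: indices must be sorted
// (assumed here: strictly increasing, so no index is selected twice).
inline bool int_filter_is_sorted(const int *idx, int n) {
    assert(n >= 0);
    for (int i = 1; i < n; ++i)
        if (idx[i] <= idx[i - 1]) return false;
    return true;
}

// Hypothetical deep check for a boolean filter: every cell must be 0 or 1.
inline bool bool_filter_has_no_na(const std::uint8_t *cells, int n) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i)
        if (cells[i] > 1) return false;
    return true;
}
```

Both checks are linear in the filter length, which is the cost that can_filter() skips by assuming they already hold.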
bool jazz_elements::Block::can_filter (pBlock p_block) [inline]
Check (fast) if a filter is valid and can be applied to filter inside a specific Block.
This verifies sizes and types, assuming there are no NAs and integer values are sorted.
p_block | The block. |
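void jazz_elements::Block::close_block (int set_has_NA=SET_HAS_NA_FALSE, bool set_hash=true, bool set_time=true) [inline]
Set has_NA, the creation time and the hash64 of a JazzBlock based on the content of the tensor.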
set_has_NA | SET_HAS_NA_FALSE (set the attribute as no NA without checking), SET_HAS_NA_TRUE (set it as true which is always safe) or SET_HAS_NA_AUTO (search the whole tensor for NA and set accordingly). |
set_hash | Compute MurmurHash64A and set attribute hash64 accordingly. |
set_time | Set attribute created as the current time. |
bool jazz_elements::Block::check_hash () [inline]
Check the hash of a JazzBlock based on the content of the tensor.
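The close-then-verify pattern behind close_block() and check_hash() can be sketched as follows. To keep the example short, it uses 64-bit FNV-1a as a stand-in hash; Jazz itself uses MurmurHash64A, as stated in the set_hash parameter. MiniBlock, mini_close() and mini_check() are hypothetical names, and the payload vector stands for everything in the block except the header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical block: a stored hash plus the bytes that follow the header.
struct MiniBlock {
    std::uint64_t hash64 = 0;
    std::vector<unsigned char> payload;
};

// 64-bit FNV-1a, used here only as a simple stand-in for MurmurHash64A.
inline std::uint64_t fnv1a64(const unsigned char *p, std::size_t n) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (std::size_t i = 0; i < n; ++i) {
        h ^= p[i];
        h *= 1099511628211ULL;                  // FNV prime
    }
    return h;
}

// Store the hash of the payload, as the last step of building a block.
inline void mini_close(MiniBlock &b) {
    b.hash64 = fnv1a64(b.payload.data(), b.payload.size());
}

// Recompute the hash and compare it with the stored one.
inline bool mini_check(const MiniBlock &b) {
    return b.hash64 == fnv1a64(b.payload.data(), b.payload.size());
}
```

Since the hash covers everything but the header, any later change to the payload makes the check fail until the block is closed again.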