Jazz 1.25.+
jazz_elements::Block Class Reference

A block is a moveable BlockHeader followed by a Tensor and a StringBuffer.

#include <block.h>

Inheritance diagram for jazz_elements::Block:
jazz_elements::StaticBlockHeader jazz_elements::Kind jazz_elements::Tuple jazz_bebop::Snippet jazz_models::Concept

Public Member Functions

void set_dimensions (int *p_dim)
 
void get_dimensions (int *p_dim)
 
bool validate_index (int *p_idx)
 
int validate_offset (int offset)
 
int get_offset (int *p_idx)
 
void get_index (int offset, int *p_idx)
 
char * get_string (int *p_idx)
 
char * get_string (int offset)
 
void set_string (int *p_idx, const char *p_str)
 
void set_string (int offset, const char *p_str)
 
char * get_attribute (int attribute_id)
 
void set_attributes (AttributeMap *all_att)
 
void get_attributes (AttributeMap *all_att)
 
void init_string_buffer ()
 
bool find_NAs_in_tensor ()
 
int * align64bit (uintptr_t ipt)
 Align a pointer (as uintptr_t) to the next 8 byte boundary assuming the block is aligned.
 
int * p_attribute_keys ()
 
pStringBuffer p_string_buffer ()
 
int get_string_offset (pStringBuffer psb, const char *p_str)
 
bool is_a_filter ()
 Check (in depth) the validity of a filter.
 
bool can_filter (pBlock p_block)
 
void close_block (int set_has_NA=SET_HAS_NA_FALSE, bool set_hash=true, bool set_time=true)
 
bool check_hash ()
 

Additional Inherited Members

- Data Fields inherited from jazz_elements::StaticBlockHeader
int cell_type
 The type for the cells in the tensor. See CELL_TYPE_*.
 
int size
 The total number of cells in the tensor.
 
TimePoint created
 Timestamp when the block was created.
 
int rank
 The number of dimensions.
 
TensorDim range
 The dimensions of the tensor in terms of ranges (Max. size is 2 Gb.)
 
int num_attributes
 Number of elements in the JazzAttributesMap.
 
int total_bytes
 Total size of the block everything included.
 
bool has_NA
 If true, at least one value is a NA and block requires NA-aware arithmetic.
 
uint64_t hash64
 Hash of everything but the header.
 
Tensor tensor
 A tensor for type cell_type and dimensions set by Block.set_dimensions()
 

Detailed Description

A block is a moveable BlockHeader followed by a Tensor and a StringBuffer.

Anything in Jazz is a block. A block is a BlockHeader, followed by a tensor, then two arrays of int of length == num_attributes, then a StringBuffer. Nothing in a Block is a pointer, so Blocks can be copied or stored 'as is'. Every RAM location in a block is defined by its BlockHeader and computed by the methods in Block.

At this level, you only have the BlockHeader fields, which you may read directly but should normally write only through the methods below. This is the lowest level: it does not even provide support for allocation. What it does provide is support for manipulating the StringBuffer to read and write strings, and the JazzAttributesMap to read and write attributes.

A filter is no longer a separate class; it is just a Block with a strict structure and extra methods.

The structure of a filter is strictly:

  1. A filter is a block of rank == 1 and type CELL_TYPE_BYTE_BOOLEAN or CELL_TYPE_INTEGER.
  2. A vector of CELL_TYPE_BYTE_BOOLEAN specifies which rows are selected (cell == true) and must have the size == number of rows.
  3. A vector of ordered CELL_TYPE_INTEGER in 0..(number of rows - 1) can be used to filter a tensor.
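The three rules above can be sketched as two stand-alone checks. This is only an illustration under stated assumptions: the function names and the use of std::vector are hypothetical and not part of Jazz, and "ordered" is read here as strictly ascending.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for rule 2: a boolean filter has exactly one cell
// per row, and each cell is strictly 0 (false) or 1 (true).
bool is_valid_bool_filter(const std::vector<uint8_t> &filter, int num_rows) {
    assert(num_rows >= 0);
    if (static_cast<int>(filter.size()) != num_rows) return false;
    for (uint8_t cell : filter)
        if (cell > 1) return false;  // anything other than 0/1 is treated as NA
    return true;
}

// Hypothetical stand-in for rule 3: an integer filter lists row numbers in
// 0..num_rows-1. Assumption: "ordered" means strictly ascending.
bool is_valid_int_filter(const std::vector<int> &filter, int num_rows) {
    assert(num_rows >= 0);
    int previous = -1;
    for (int row : filter) {
        if (row <= previous || row >= num_rows) return false;
        previous = row;
    }
    return true;
}
```

In the real class, is_a_filter() and can_filter() split this work between a deep check and a fast check; the sketch merges them for brevity.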

Member Function Documentation

◆ set_dimensions()

void jazz_elements::Block::set_dimensions ( int *  p_dim)
inline

Sets the tensor dimensions from a TensorDim array.

Parameters
p_dim: A pointer to the TensorDim containing the dimensions.

NOTES:

  1. This writes: rank, range[] and size.
  2. rank counts the number of dimensions > 0, except that when all dimensions == 0 it produces rank == 1, size == 0.
  3. size is the size in number of cells, not bytes.
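A minimal sketch of how rank and size could be derived from the requested dimensions, following the notes above. The names Shape, shape_from_dims and kMaxRank are hypothetical, and the sketch assumes that unused trailing dimensions are passed as 0; it does not reproduce how range[] is filled.

```cpp
#include <cassert>

// Hypothetical maximum rank for this sketch; the real limit comes from TensorDim.
constexpr int kMaxRank = 6;

struct Shape {
    int rank;  // number of dimensions in use
    int size;  // number of cells, not bytes
};

// Derive rank and size from an array of kMaxRank requested dimensions.
// Assumption: unused trailing dimensions are passed as 0.
Shape shape_from_dims(const int *dim) {
    assert(dim != nullptr);

    int rank = 0;
    while (rank < kMaxRank && dim[rank] > 0) rank++;  // count leading dimensions > 0

    if (rank == 0) return {1, 0};  // all dimensions == 0: rank 1, size 0

    int size = 1;
    for (int i = 0; i < rank; i++) size *= dim[i];
    return {rank, size};
}
```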

◆ get_dimensions()

void jazz_elements::Block::get_dimensions ( int *  p_dim)
inline

Returns the tensor dimensions as a TensorDim array.

Parameters
p_dim: A pointer to the TensorDim where the dimensions are returned.

NOTES: See the notes on set_dimensions() to understand why, for dimensions of 0 and 1, this may return different values from those passed to set_dimensions() when the block was created.

◆ validate_index()

bool jazz_elements::Block::validate_index ( int *  p_idx)
inline

Returns whether an index (as a TensorDim array) is valid for the tensor.

Parameters
p_idx: A pointer to the TensorDim containing the index.
Returns
True if the index is valid.

◆ validate_offset()

int jazz_elements::Block::validate_offset ( int  offset)
inline

Returns whether an offset (as an integer) is valid for the tensor.

Parameters
offset: An offset corresponding to the cell as if the tensor was a linear vector.
Returns
Nonzero (true) if the offset is valid. Note that the declared return type is int, not bool.

◆ get_offset()

int jazz_elements::Block::get_offset ( int *  p_idx)
inline

Convert an index (as a TensorDim array) to the corresponding offset without checking its validity.

Parameters
p_idx: A pointer to the TensorDim containing the index.
Returns
The offset of the cell addressed by the index. The result is meaningful only if the index is in a valid range.

◆ get_index()

void jazz_elements::Block::get_index ( int  offset,
int *  p_idx 
)
inline

Convert an offset to a tensor cell into its corresponding index (as a TensorDim array) without checking its validity.

Parameters
offset: The input offset.
p_idx: A pointer to the TensorDim where the result is returned.
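The relationship between get_offset() and get_index() can be illustrated with a small stand-alone sketch. It assumes a row-major layout in which the last index varies fastest, which this page does not state; the function names and the explicit dim/rank parameters are hypothetical. Like the methods above, it performs no range checking on the index.

```cpp
#include <cassert>

// Convert an index to a linear offset, assuming row-major order
// (last index varies fastest). No validity check on idx[].
int offset_from_index(const int *dim, int rank, const int *idx) {
    assert(rank > 0);
    int offset = 0;
    for (int i = 0; i < rank; i++) offset = offset * dim[i] + idx[i];
    return offset;
}

// Inverse conversion: recover the index from a linear offset.
void index_from_offset(const int *dim, int rank, int offset, int *idx) {
    assert(rank > 0);
    for (int i = rank - 1; i >= 0; i--) {
        assert(dim[i] > 0);  // avoid division by zero on empty dimensions
        idx[i] = offset % dim[i];
        offset /= dim[i];
    }
}
```

For a tensor of dimensions {2, 3, 4}, the index {1, 2, 3} maps to offset (1 * 3 + 2) * 4 + 3 = 23, and converting 23 back yields the same index.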

◆ get_string() [1/2]

char * jazz_elements::Block::get_string ( int *  p_idx)
inline

Get a string from the tensor by index without checking index range.

Parameters
p_idx: A pointer to the TensorDim containing the index.
Returns
A pointer to where the (zero-terminated) string is stored in the Block.

NOTE: Use the pointer as read-only (more than one cell may point to the same value) and never try to free it.

◆ get_string() [2/2]

char * jazz_elements::Block::get_string ( int  offset)
inline

Get a string from the tensor by offset without checking offset range.

Parameters
offset: An offset corresponding to the cell as if the tensor was a linear vector.
Returns
A pointer to where the (zero-terminated) string is stored in the Block.

NOTE: Use the pointer as read-only (more than one cell may point to the same value) and never try to free it.

◆ set_string() [1/2]

void jazz_elements::Block::set_string ( int *  p_idx,
const char *  p_str 
)
inline

Set a string in the tensor, if there is enough allocation space to contain it, by index without checking index range.

Parameters
p_idx: A pointer to the TensorDim containing the index.
p_str: A pointer to a (zero-terminated) string that will be allocated inside the Block.

NOTE: Allocation inside a Block is typically hard because Blocks are created with "just enough space" and are typically immutable. jazz_alloc.h contains methods that make a Block bigger if that is necessary; this method does not. The 100% safe way is to create a new block from the immutable one using jazz_alloc.h methods. Otherwise, use at your own risk or not at all. When allocation fails, this method sets the variable alloc_failed in the StringBuffer. Once alloc_failed is true, it does not even try to allocate.

◆ set_string() [2/2]

void jazz_elements::Block::set_string ( int  offset,
const char *  p_str 
)
inline

Set a string in the tensor, if there is enough allocation space to contain it, by offset without checking offset range.

Parameters
offset: An offset corresponding to the cell as if the tensor was a linear vector.
p_str: A pointer to a (zero-terminated) string that will be allocated inside the Block.

NOTE: Allocation inside a Block is typically hard because Blocks are created with "just enough space" and are typically immutable. jazz_alloc.h contains methods that make a Block bigger if that is necessary; this method does not. The 100% safe way is to create a new block from the immutable one using jazz_alloc.h methods. Otherwise, use at your own risk or not at all. When allocation fails, this method sets the variable alloc_failed in the StringBuffer. Once alloc_failed is true, it does not even try to allocate.

◆ get_attribute()

char * jazz_elements::Block::get_attribute ( int  attribute_id)
inline

Find an attribute by its attribute_id (key).

Parameters
attribute_id: A unique 32-bit id defining what the attribute is: e.g., a Jazz class, a url, a mime type, ...
Returns
A pointer to where the (zero-terminated) string (the attribute value) is stored in the Block, or nullptr if the key is not found.

NOTE: Search is linear since no assumptions can be made about how keys are ordered. Blocks typically have few (fewer than 5) attributes, so that is acceptable. There is no limit on the number of attributes, so if you want to use the block's attributes as a dictionary containing, e.g., thousands of config keys, use get_attributes() instead, which returns a map with all the attributes.
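The linear search described in the note can be sketched as follows. The layout used (keys in the lower half of one int array, offsets to the values in the upper half, values stored as zero-terminated strings in a character buffer) follows the description given for p_attribute_keys(). The function name and its parameters are hypothetical, and the plain char buffer is a simplification of the real StringBuffer.

```cpp
#include <cassert>

// Hypothetical helper: look up attribute_id among num_attributes keys.
// keys_and_offsets holds 2 * num_attributes ints: keys first, then the
// offset of each value inside string_buffer.
const char *find_attribute(const int *keys_and_offsets, int num_attributes,
                           const char *string_buffer, int attribute_id) {
    assert(keys_and_offsets != nullptr && string_buffer != nullptr);
    for (int i = 0; i < num_attributes; i++) {
        if (keys_and_offsets[i] == attribute_id)
            return string_buffer + keys_and_offsets[num_attributes + i];
    }
    return nullptr;  // key not found
}
```

With a handful of attributes, this scan costs less than building a map; for large attribute sets, get_attributes() is the better fit.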

◆ set_attributes()

void jazz_elements::Block::set_attributes ( AttributeMap *  all_att)
inline

Set all attributes of a Block, only when creating it, using a map.

Parameters
all_att: A map containing all the attributes for the block. A call with nullptr is required for initialization.

NOTE: This function is public because it has to be called by jazz_alloc.h methods. set_attributes() can only be called once, so it does nothing if called after a Block is built. Blocks are near-immutable objects: if you need to change a Block's attributes, create a new object using jazz_alloc.h methods.

◆ get_attributes()

void jazz_elements::Block::get_attributes ( AttributeMap *  all_att)
inline

Get all attributes of a Block storing them inside a map.

Parameters
all_att: A (typically empty) map where all the attributes will be stored.

NOTE: You can use a non-empty map. This will keep existing key/values not found in the Block and create/override those in the Block by using a normal 'map[key] = value' instruction.
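The merge behavior described in the note can be illustrated with a plain std::map. This sketch assumes a map from int keys to C strings; ToyAttributeMap and merge_attributes are hypothetical names, and the actual definition of AttributeMap is not shown on this page.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for AttributeMap.
using ToyAttributeMap = std::map<int, const char *>;

// Copy n key/value pairs into out with 'map[key] = value'.
// Entries already in out survive unless their key is also in keys[].
void merge_attributes(const int *keys, const char *const *values, int n,
                      ToyAttributeMap &out) {
    assert(n == 0 || (keys != nullptr && values != nullptr));
    for (int i = 0; i < n; i++) out[keys[i]] = values[i];
}
```

Starting from a map holding keys 1 and 2, merging the pairs (1, "new") and (3, "added") leaves three entries: key 1 overridden, key 2 kept, key 3 created.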

◆ init_string_buffer()

void jazz_elements::Block::init_string_buffer ( )
inline

Initialize the StringBuffer of a Block, only when creating it.

NOTE: This function is public because it has to be called by jazz_alloc.h methods. Never call it to construct your own Blocks except for test cases or in jazz_alloc.h methods. Use those to build Blocks instead.

◆ find_NAs_in_tensor()

bool jazz_elements::Block::find_NAs_in_tensor ( )

Scan a tensor object to see if it contains any NA values of the type specified in cell_type.

Returns
True if NA values of the given type were found.

Note: For boolean types, everything other than true (1) or false (0) is considered NA.
Note: For floating point types, only binary identity with F_NA or R_NA counts as NA.
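The two notes can be sketched as stand-alone scans. The bit pattern used for the floating point NA marker below is a made-up value chosen for this example, not the actual F_NA constant, and all function names are hypothetical. The point is the comparison technique: bit patterns are compared, so an ordinary NaN with a different payload does not count as NA.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes 32-bit float");

// Hypothetical NA marker: one specific NaN bit pattern (NOT Jazz's F_NA).
constexpr uint32_t kToyFloatNaBits = 0x7FC00001u;

// Build a float from raw bits without violating aliasing rules.
float float_from_bits(uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Boolean cells: anything other than 0 or 1 counts as NA.
bool has_bool_na(const uint8_t *cells, int size) {
    assert(cells != nullptr);
    for (int i = 0; i < size; i++)
        if (cells[i] > 1) return true;
    return false;
}

// Float cells: only binary identity with the NA marker counts as NA.
bool has_float_na(const float *cells, int size) {
    assert(cells != nullptr);
    for (int i = 0; i < size; i++) {
        uint32_t bits;
        std::memcpy(&bits, &cells[i], sizeof bits);
        if (bits == kToyFloatNaBits) return true;
    }
    return false;
}
```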

◆ align64bit()

int * jazz_elements::Block::align64bit ( uintptr_t  ipt)
inline

Align a pointer (as uintptr_t) to the next 8 byte boundary assuming the block is aligned.

Parameters
ipt: The input pointer as an integer.
Returns
The aligned pointer.

BUT: in the case of misaligned blocks (like those returned by lmdb), it is mandatory that all the arithmetic works just as if the block were aligned in another base (called gap), so this returns a pointer with the same misalignment the block has. Blocks generated by Containers (other than lmdb) and allocated via malloc have no misalignment.
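One way to picture the gap arithmetic is the following sketch, which rounds an address up to the next 8-byte boundary measured relative to the block's own misalignment. The function name and the explicit block_base parameter are hypothetical; the real method is a member of Block and may compute the gap differently.

```cpp
#include <cassert>
#include <cstdint>

// Round ipt up to the next address that has the same residue modulo 8
// as block_base. For an aligned block (gap == 0) this is ordinary
// 8-byte alignment.
uintptr_t align_up_with_gap(uintptr_t ipt, uintptr_t block_base) {
    assert(ipt >= block_base);
    const uintptr_t gap = block_base & 7;  // misalignment of the block itself
    return ((ipt - gap + 7) & ~static_cast<uintptr_t>(7)) + gap;
}
```

With an aligned block at 0x1000, address 0x1001 rounds up to 0x1008. With a block at 0x1003 (gap 3), address 0x1004 rounds up to 0x100B, which is 8 bytes past the block start.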

◆ p_attribute_keys()

int * jazz_elements::Block::p_attribute_keys ( )
inline

Return the address of the vector containing both the attribute keys and the offsets of the attribute values in the StringBuffer.

Returns
A pointer to the vector containing the attributes.

NOTE: The actual values (which are strings) are stored in the same StringBuffer containing the strings of the tensor (if any). This array has double the num_attributes size and stores the keys in the lower part and the offsets to the values in the upper part.

◆ p_string_buffer()

pStringBuffer jazz_elements::Block::p_string_buffer ( )
inline

Return the address of the StringBuffer containing the strings in the tensor and the attribute values.

Returns
A pointer to the StringBuffer.

NOTE: The StringBuffer is the last part of the Block and contains all the strings in the tensor and the attributes. When the Block is created, the StringBuffer is initialized with two zero bytes (STRING_NA and STRING_EMPTY).

◆ get_string_offset()

int jazz_elements::Block::get_string_offset ( pStringBuffer  psb,
const char *  p_str 
)

Find an existing string in a block, or allocate a new one and return its offset in the StringBuffer.buffer.

Parameters
psb: The address of the pStringBuffer (passed to avoid calling p_string_buffer repeatedly).
p_str: The string to find or allocate in the StringBuffer.
Returns
The offset to the (zero terminated) string inside psb->buffer[] or -1 if allocation failed.

NOTE: This function is intended for internal use: it is called by set_attributes() and set_string(). Use those functions instead and read their NOTES.
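The find-or-allocate behavior, together with the alloc_failed flag mentioned in the set_string() notes, can be sketched with a toy fixed-capacity buffer. ToyStringBuffer, its fields other than alloc_failed, and find_or_add are hypothetical; the real StringBuffer has its own layout and starts with two reserved zero bytes, which this sketch omits. Whether the real method still finds existing strings after a failed allocation is not stated here; the sketch assumes it does.

```cpp
#include <cassert>
#include <cstring>

constexpr int kCapacity = 32;  // hypothetical fixed capacity

struct ToyStringBuffer {
    bool alloc_failed = false;    // set once an allocation does not fit
    int  used = 0;                // bytes already occupied in buffer[]
    char buffer[kCapacity] = {};  // zero-terminated strings, back to back
};

// Return the offset of str inside sb.buffer, adding it if needed.
// Returns -1 when the string is absent and cannot be allocated.
int find_or_add(ToyStringBuffer &sb, const char *str) {
    assert(str != nullptr);

    // Linear scan over the strings already stored.
    int pos = 0;
    while (pos < sb.used) {
        if (std::strcmp(sb.buffer + pos, str) == 0) return pos;
        pos += static_cast<int>(std::strlen(sb.buffer + pos)) + 1;
    }

    if (sb.alloc_failed) return -1;  // do not even try after a failure

    const int need = static_cast<int>(std::strlen(str)) + 1;
    if (sb.used + need > kCapacity) {
        sb.alloc_failed = true;
        return -1;
    }
    std::memcpy(sb.buffer + sb.used, str, need);
    const int at = sb.used;
    sb.used += need;
    return at;
}
```

Reusing an existing offset is what lets several tensor cells point to the same stored string, which is why get_string() results must be treated as read-only.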

◆ is_a_filter()

bool jazz_elements::Block::is_a_filter ( )

Check (in depth) the validity of a filter.

Essentially, this checks that an integer filter is sorted or that a boolean filter has no NA. can_filter() does not check this when a filter is used; instead, the selection itself fails, because filtering a block checks the order anyway. This avoids checking the same thing twice. Checking (is_a_filter() && can_filter()) beforehand means the order is checked twice, but it guarantees that selecting will not fail.

Returns
true if the block can be used as a filter.

◆ can_filter()

bool jazz_elements::Block::can_filter ( pBlock  p_block)
inline

Check (fast) whether a filter is valid and can be applied to a specific Block.

This verifies sizes and types, assuming there are no NAs and integer values are sorted.

Parameters
p_block: The block.
Returns
true if it is a valid filter for that block.

◆ close_block()

void jazz_elements::Block::close_block ( int  set_has_NA = SET_HAS_NA_FALSE,
bool  set_hash = true,
bool  set_time = true 
)
inline

Set has_NA, the creation time and the hash64 of a JazzBlock based on the content of the tensor.

Parameters
set_has_NA: SET_HAS_NA_FALSE (set the attribute as no NA without checking), SET_HAS_NA_TRUE (set it as true which is always safe) or SET_HAS_NA_AUTO (search the whole tensor for NA and set accordingly).
set_hash: Compute MurmurHash64A and set attribute hash64 accordingly.
set_time: Set attribute created as the current time.

◆ check_hash()

bool jazz_elements::Block::check_hash ( )
inline

Check the hash of a JazzBlock based on the content of the tensor.

Returns
true if the hash is correct.
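The interplay between close_block() and check_hash() can be sketched with a toy block. Note the substitution: Jazz uses MurmurHash64A, while this example swaps in 64-bit FNV-1a purely to stay short. ToyBlock and the toy_ functions are hypothetical, and the payload stands in for "everything but the header".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct ToyBlock {
    uint64_t      hash64;       // stored hash (header field)
    unsigned char payload[16];  // stand-in for everything but the header
};

// 64-bit FNV-1a, used here ONLY as a stand-in for MurmurHash64A.
uint64_t fnv1a64(const unsigned char *data, size_t len) {
    assert(data != nullptr);
    uint64_t h = 14695981039346656037ull;  // FNV offset basis
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 1099511628211ull;             // FNV prime
    }
    return h;
}

// Store the hash of the payload, as closing a block with set_hash would.
void toy_close_block(ToyBlock &b) {
    b.hash64 = fnv1a64(b.payload, sizeof b.payload);
}

// Recompute the hash and compare it with the stored one.
bool toy_check_hash(const ToyBlock &b) {
    return b.hash64 == fnv1a64(b.payload, sizeof b.payload);
}
```

Any change to the payload after closing makes the check fail until the block is closed again.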

The documentation for this class was generated from the following files: