SDSL: Succinct Data Structure Library
A C++ template library for succinct data structures
|
A Wavelet Tree class for byte sequences. More...
#include <wt_huff.hpp>
Public Types | |
typedef int_vector::size_type | size_type |
typedef unsigned char | value_type |
typedef BitVector | bit_vector_type |
typedef RankSupport | rank_1_type |
typedef SelectSupport | select_1_type |
typedef SelectSupportZero | select_0_type |
Public Member Functions | |
template<typename RandomAccessContainer > | |
wt_huff (const RandomAccessContainer &rac, size_type size) | |
Constructor. | |
template<uint8_t w> | |
wt_huff (const int_vector< w > &rac) | |
template<typename RandomAccessContainer > | |
void | construct (const RandomAccessContainer &rac, size_type size) |
template<class size_type_class > | |
wt_huff (int_vector_file_buffer< 8, size_type_class > &rac, size_type size) | |
template<class size_type_class > | |
void | construct (int_vector_file_buffer< 8, size_type_class > &rac, size_type size) |
Construct the wavelet tree from a random access container. | |
wt_huff (const wt_huff &wt) | |
Copy constructor. | |
wt_huff & | operator= (const wt_huff &wt) |
Assignment operator. | |
void | swap (wt_huff &wt) |
Swap operator. | |
size_type | size () const |
Returns the size of the original vector. | |
bool | empty () const |
Returns whether the wavelet tree contains no data. | |
value_type | operator[] (size_type i) const |
Recovers the ith symbol of the original vector. | |
size_type | rank (size_type i, value_type c) const |
Calculates how many symbols c are in the prefix [0..i-1] of the supported vector. | |
size_type | rank_ith_symbol (size_type i, value_type &c) const |
Calculates how many occurrences of symbol wt[i] are in the prefix [0..i-1] of the original sequence. | |
size_type | select (size_type i, value_type c) const |
Calculates the ith occurrence of the symbol c in the supported vector. | |
void | interval_symbols (size_type i, size_type j, size_type &k, std::vector< unsigned char > &cs, std::vector< size_type > &rank_c_i, std::vector< size_type > &rank_c_j) const |
Calculates for each symbol c in wt[i..j-1], how many times c occurs in wt[0..i-1] and wt[0..j-1]. | |
size_type | serialize (std::ostream &out, structure_tree_node *v=NULL, std::string name="") const |
Serializes the data structure into the given ostream. | |
void | load (std::istream &in) |
Loads the data structure from the given istream. | |
Public Attributes | |
const size_type & | sigma |
const bit_vector_type & | tree |
A Wavelet Tree class for byte sequences.
A wavelet tree is build for a vector of characters over the alphabet . This class should be used only for small alphabets (see int_wavelet_tree for a wavelet tree for big alphabets). The wavelet tree consists of a tree of bitvectors and provides three efficient methods:
The idea of using a Huffman shaped wavelet was first mentioned on page 17 of the following technical report: Veli Mäkinen and Gonzalo Navarro: Succinct Suffix Arrays based on Run-Length Encoding. Available under: http://swp.dcc.uchile.cl/TR/2005/TR_DCC-2005-004.pdf
\par Space complexity
bits, where is the size of the vector the wavelet tree was build for.
sdsl::wt_huff< BitVector, RankSupport, SelectSupport, SelectSupportZero, dfs_shape >::wt_huff | ( | const RandomAccessContainer & | rac, |
size_type | size | ||
) | [inline] |
Constructor.
\param rac Reference to the vector (or unsigned char array) for which the wavelet tree should be build. \param size Size of the prefix of the vector (or unsigned char array) for which the wavelet tree should be build. \par Time complexity
, where
void sdsl::wt_huff< BitVector, RankSupport, SelectSupport, SelectSupportZero, dfs_shape >::construct | ( | int_vector_file_buffer< 8, size_type_class > & | rac, |
size_type | size | ||
) | [inline] |
Construct the wavelet tree from a random access container.
rac | A random access container |
size | The length of the prefix of the random access container, for which the wavelet tree should be build |
void sdsl::wt_huff< BitVector, RankSupport, SelectSupport, SelectSupportZero, dfs_shape >::interval_symbols | ( | size_type | i, |
size_type | j, | ||
size_type & | k, | ||
std::vector< unsigned char > & | cs, | ||
std::vector< size_type > & | rank_c_i, | ||
std::vector< size_type > & | rank_c_j | ||
) | const [inline] |
Calculates for each symbol c in wt[i..j-1], how many times c occurs in wt[0..i-1] and wt[0..j-1].
\param i The start index (inclusive) of the interval. \param j The end index (exclusive) of the interval. \param k Reference that will contain the number of different symbols in wt[i..j-1].
cs | Reference to a vector of size k that will contain all symbols that occur in wt[i..j-1] in arbitrary order. |
rank_c_i | Reference to a vector which equals rank_c_i[p] = rank(i,cs[p]), for |
rank_c_j | Reference to a vector which equals rank_c_j[p] = rank(j,cs[p]), for |
value_type sdsl::wt_huff< BitVector, RankSupport, SelectSupport, SelectSupportZero, dfs_shape >::operator[] | ( | size_type | i | ) | const [inline] |
Recovers the ith symbol of the original vector.
i | The index of the symbol in the original vector. |
size_type sdsl::wt_huff< BitVector, RankSupport, SelectSupport, SelectSupportZero, dfs_shape >::rank | ( | size_type | i, |
value_type | c | ||
) | const [inline] |
Calculates how many symbols c are in the prefix [0..i-1] of the supported vector.
i | The exclusive index of the prefix range [0..i-1], so . |
c | The symbol to count the occurrences in the prefix. |
size_type sdsl::wt_huff< BitVector, RankSupport, SelectSupport, SelectSupportZero, dfs_shape >::rank_ith_symbol | ( | size_type | i, |
value_type & | c | ||
) | const [inline] |
Calculates how many occurrences of symbol wt[i] are in the prefix [0..i-1] of the original sequence.
\param i The index of the symbol.
c | Reference that will contain symbol wt[i]. |
size_type sdsl::wt_huff< BitVector, RankSupport, SelectSupport, SelectSupportZero, dfs_shape >::select | ( | size_type | i, |
value_type | c | ||
) | const [inline] |
Calculates the ith occurrence of the symbol c in the supported vector.
i | The ith occurrence. . |
c | The symbol c. |