SDSL: Succinct Data Structure Library
A C++ template library for succinct data structures
|
A wavelet tree class. More...
#include <wt.hpp>
Public Types | |
typedef wt_trait < RandomAccessContainer > ::size_type | size_type |
typedef wt_trait < RandomAccessContainer > ::value_type | value_type |
typedef wt_trait < RandomAccessContainer > ::map_type | map_type |
typedef wt_trait < RandomAccessContainer > ::inv_map_type | inv_map_type |
Public Member Functions | |
wt (const typename wt_trait< RandomAccessContainer >::reference_type rac, size_type size) | |
Constructor. | |
template<class size_type_class > | |
wt (int_vector_file_buffer< 8, size_type_class > &rac, size_type size) | |
template<class size_type_class > | |
void | construct (int_vector_file_buffer< 8, size_type_class > &rac, size_type size) |
Construct the wavelet tree from a random access container. | |
wt (const wt &wt) | |
Copy constructor. | |
wt & | operator= (const wt &wt) |
Assignment operator. | |
void | swap (wt &wt) |
Swap operator. | |
size_type | size () const |
Returns the size of the original vector. | |
bool | empty () const |
Returns whether the wavelet tree contains no data. | |
value_type | operator[] (size_type i) const |
Recovers the ith symbol of the original vector. | |
size_type | rank (size_type i, value_type c) const |
Calculates how many symbols c are in the prefix [0..i-1] of the supported vector. | |
size_type | rank_ith_symbol (size_type i, value_type &c) const |
Calculates how many occurrences of symbol wt[i] are in the prefix [0..i-1] of the supported sequence. | |
void | _interval_symbols (size_type i, size_type j, size_type &k, std::vector< unsigned char > &cs, std::vector< size_type > &rank_c_i, std::vector< size_type > &rank_c_j, size_type lex_idx, size_type sigma, size_type node) const |
size_type | count_lex_smaller (size_type i, value_type c) const |
Counts the characters in the range [0..i-1] which are smaller than character c. | |
size_type | count_lex_smaller (size_type i, size_type j, value_type c) const |
Counts the characters in the range [i..j-1] which are smaller than character c. | |
void | interval_symbols (size_type i, size_type j, size_type &k, std::vector< unsigned char > &cs, std::vector< size_type > &rank_c_i, std::vector< size_type > &rank_c_j) const |
Calculates for each symbol c in wt[i..j-1], how many times c occures in wt[0..i-1] and wt[0..j-1]. | |
size_type | select (size_type i, value_type c) const |
Calculates the ith occurence of the symbol c in the supported vector. | |
void | range_search_2d (size_type lb, size_type rb, value_type c1, value_type c2, std::vector< size_type > &result) const |
void | _range_search_2d (size_type node, size_type lb, size_type rb, size_type lex_idx_1, size_type lex_idx_2, size_type sigma, std::vector< size_type > &result) const |
size_type | serialize (std::ostream &out, structure_tree_node *v=NULL, std::string name="") const |
Serializes the data structure into the given ostream. | |
void | load (std::istream &in) |
Loads the data structure from the given istream. | |
Public Attributes | |
const size_type & | sigma |
A wavelet tree class.
A wavelet tree is build for a vector of characters over the alphabet . This class should be used only for small alphabets (see wt_int for a wavelet tree for big alphabets). The wavelet tree consists of a tree of bitvectors and provides three efficient methods:
The select method: returns the index of the jth occurence of symbol .
RandomAccessContainer | Type of the input sequence. |
BitVector | Type of the bitvector used for representing the wavelet tree. |
RankSupport | Type of the support structure for rank on ones. |
SelectSupport | Type of the support structure for select on ones. |
SelectSupport | Type of the support structure for select on ones. |
The wavelet tree was proposed first by Grossi et al. 2003 and applied to the BWT in Foschini et al. 2004.
sdsl::wt< RandomAccessContainer, BitVector, RankSupport, SelectSupport, SelectSupportZero >::wt | ( | const typename wt_trait< RandomAccessContainer >::reference_type | rac, |
size_type | size | ||
) | [inline] |
Constructor.
\param rac Reference to the vector (or unsigned char array) for which the wavelet tree should be build. \param size Size of the prefix of the vector (or unsigned char array) for which the wavelet tree should be build. \par Time complexity
, where
void sdsl::wt< RandomAccessContainer, BitVector, RankSupport, SelectSupport, SelectSupportZero >::construct | ( | int_vector_file_buffer< 8, size_type_class > & | rac, |
size_type | size | ||
) | [inline] |
Construct the wavelet tree from a random access container.
rac | A random access container |
size | The length of the prefix of the random access container, for which the wavelet tree should be build |
void sdsl::wt< RandomAccessContainer, BitVector, RankSupport, SelectSupport, SelectSupportZero >::interval_symbols | ( | size_type | i, |
size_type | j, | ||
size_type & | k, | ||
std::vector< unsigned char > & | cs, | ||
std::vector< size_type > & | rank_c_i, | ||
std::vector< size_type > & | rank_c_j | ||
) | const [inline] |
Calculates for each symbol c in wt[i..j-1], how many times c occures in wt[0..i-1] and wt[0..j-1].
\param i The start index (inclusive) of the interval. \param j The end index (exclusive) of the interval. \param k Reference that will contain the number of different symbols in wt[i..j-1].
cs | Reference to a vector of size k that will contain all symbols that occur in wt[i..j-1] in ascending order. |
rank_c_i | Reference to a vector which equals rank_c_i[p] = rank(cs[p],i), for |
rank_c_j | Reference to a vector which equals rank_c_j[p] = rank(cs[p],j), for |
value_type sdsl::wt< RandomAccessContainer, BitVector, RankSupport, SelectSupport, SelectSupportZero >::operator[] | ( | size_type | i | ) | const [inline] |
Recovers the ith symbol of the original vector.
i | The index of the symbol in the original vector. |
size_type sdsl::wt< RandomAccessContainer, BitVector, RankSupport, SelectSupport, SelectSupportZero >::rank | ( | size_type | i, |
value_type | c | ||
) | const [inline] |
Calculates how many symbols c are in the prefix [0..i-1] of the supported vector.
i | The exclusive index of the prefix range [0..i-1], so . |
c | The symbol to count the occurences in the prefix. |
size_type sdsl::wt< RandomAccessContainer, BitVector, RankSupport, SelectSupport, SelectSupportZero >::rank_ith_symbol | ( | size_type | i, |
value_type & | c | ||
) | const [inline] |
Calculates how many occurrences of symbol wt[i] are in the prefix [0..i-1] of the supported sequence.
\param i The index of the symbol.
size_type sdsl::wt< RandomAccessContainer, BitVector, RankSupport, SelectSupport, SelectSupportZero >::select | ( | size_type | i, |
value_type | c | ||
) | const [inline] |
Calculates the ith occurence of the symbol c in the supported vector.
i | The ith occurence. . |
c | The symbol c. |