SDSL: Succinct Data Structure Library
A C++ template library for succinct data structures
|
A class for the Compressed Suffix Array (CSA) based on a Wavelet Tree (WT) of the Burrow Wheeler Transform of the orignal text. More...
#include <csa_wt.hpp>
Public Types | |
enum | { sa_sample_dens = SampleDens, isa_sample_dens = InvSampleDens } |
typedef uint64_t | value_type |
typedef random_access_const_iterator < csa_wt > | const_iterator |
typedef const_iterator | iterator |
typedef const value_type | const_reference |
typedef const_reference | reference |
typedef const_reference * | pointer |
typedef const pointer | const_pointer |
typedef int_vector::size_type | size_type |
typedef size_type | csa_size_type |
typedef ptrdiff_t | difference_type |
typedef psi_of_csa_wt< csa_wt > | psi_type |
typedef bwt_of_csa_wt< csa_wt > | bwt_type |
typedef WaveletTree | wavelet_tree_type |
typedef charType | char_type |
typedef const char_type * | pattern_type |
typedef csa_wt_trait < fixedIntWidth > ::int_vector_type | sa_sample_type |
typedef csa_wt_trait < fixedIntWidth > ::int_vector_type | isa_sample_type |
typedef csa_tag | index_category |
Public Member Functions | |
csa_wt () | |
Default Constructor. | |
~csa_wt () | |
Default Destructor. | |
csa_wt (const csa_wt &csa) | |
Copy constructor. | |
template<typename RandomAccessContainer > | |
csa_wt (const RandomAccessContainer &sa, const char_type *str) | |
Construct the csa_wt from another compressed or uncompressed suffix array. | |
csa_wt (const char_type *str) | |
Constructor for the csa_wt taking a string for that the CSA should be calculated. | |
template<class size_type_class , uint8_t int_width, class size_type_class_1 > | |
csa_wt (int_vector_file_buffer< 8, size_type_class > &bwt_buf, int_vector_file_buffer< int_width, size_type_class_1 > &sa_buf) | |
Construct the csa_wt from the int_vector_file_buffers of the bwt of the text and the suffix array. | |
template<class size_type_class > | |
csa_wt (int_vector_file_buffer< 8, size_type_class > &bwt_buf) | |
Construct the bwt part of the csa_wt from the int_vector_file_buffers of the bwt of the text. | |
csa_wt (tMSS &file_map, const std::string &dir, const std::string &id) | |
void | construct (tMSS &file_map, const std::string &dir, const std::string &id) |
size_type | size () const |
Number of elements in the . | |
bool | empty () const |
Returns if the data strucutre is empty. | |
void | swap (csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType > &csa) |
Swap method for csa_wt. | |
const_iterator | begin () const |
Returns a const_iterator to the first element. | |
const_iterator | end () const |
Returns a const_iterator to the element after the last element. | |
value_type | operator[] (size_type i) const |
[]-operator | |
value_type | operator() (size_type i) const |
()-operator return inverse suffix array values | |
csa_wt & | operator= (const csa_wt &csa) |
Assignment Operator. | |
bool | operator== (const csa_wt &csa) const |
Equality Operator. | |
bool | operator!= (const csa_wt &csa) const |
Unequality Operator. | |
size_type | serialize (std::ostream &out, structure_tree_node *v=NULL, std::string name="") const |
Serialize to a stream. | |
void | load (std::istream &in) |
Load from a stream. | |
uint32_t | get_psi_sample_dens () const |
void | set_psi_sample_dens (const uint32_t sample_dens) |
size_type | rank_bwt (size_type i, const char_type c) const |
Calculates how many symbols c are in the prefix [0..i-1] of the BWT of the original text. | |
size_type | select_bwt (size_type i, const char_type c) const |
Calculates the ith occurrence of symbol c in the BWT of the original text. | |
Static Public Member Functions | |
static size_type | max_size () |
Returns the largest size that csa_wt can ever have. | |
Public Attributes | |
const int_vector< 8 > & | char2comp |
const int_vector< 8 > & | comp2char |
const int_vector< 64 > & | C |
const uint16_t & | sigma |
const psi_type & | psi |
const bwt_type & | bwt |
const sa_sample_type & | sa_sample |
const isa_sample_type & | isa_sample |
const wavelet_tree_type & | wavelet_tree |
Friends | |
class | psi_of_csa_wt< csa_wt > |
class | bwt_of_csa_wt< csa_wt > |
A class for the Compressed Suffix Array (CSA) based on a Wavelet Tree (WT) of the Burrow Wheeler Transform of the orignal text.
The CSA is parameterized with an WavletTree, the sample density SampleDens ( ), and the sample density for inverse suffix array entries ( ). I.e. every value from the original suffix array is explicitely stored with bits.
csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::const_iterator sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::begin | ( | ) | const |
bool sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::empty | ( | ) | const [inline] |
Returns if the data strucutre is empty.
Required for the Container Concept of the STL.
csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::const_iterator sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::end | ( | ) | const |
Returns a const_iterator to the element after the last element.
Required for the STL Container Concept.
void sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::load | ( | std::istream & | in | ) |
Load from a stream.
in | Inputstream to load the data structure from. |
static size_type sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::max_size | ( | ) | [inline, static] |
bool sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::operator!= | ( | const csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType > & | csa | ) | const |
Unequality Operator.
Two Instances of csa_wt are equal if not all their members are equal.
csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::value_type sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::operator() | ( | size_type | i | ) | const [inline] |
()-operator return inverse suffix array values
i | Index of the value. . |
csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType > & sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::operator= | ( | const csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType > & | csa | ) |
Assignment Operator.
Required for the Assignable Concept of the STL.
bool sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::operator== | ( | const csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType > & | csa | ) | const |
Equality Operator.
Two Instances of csa_wt are equal if all their members are equal.
csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::value_type sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::operator[] | ( | size_type | i | ) | const [inline] |
[]-operator
i | Index of the value. . Required for the STL Random Access Container Concept. |
size_type sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::rank_bwt | ( | size_type | i, |
const char_type | c | ||
) | const [inline] |
Calculates how many symbols c are in the prefix [0..i-1] of the BWT of the original text.
i | The exclusive index of the prefix range [0..i-1], so . |
c | The symbol to count the occurences in the prefix. |
size_type sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::select_bwt | ( | size_type | i, |
const char_type | c | ||
) | const [inline] |
Calculates the ith occurrence of symbol c in the BWT of the original text.
i | The ith occurrence. . |
c | The symbol c. |
csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::size_type sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::serialize | ( | std::ostream & | out, |
structure_tree_node * | v = NULL , |
||
std::string | name = "" |
||
) | const |
Serialize to a stream.
out | Outstream to write the data structure. |
size_type sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::size | ( | ) | const [inline] |
void sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::swap | ( | csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType > & | csa | ) |
Swap method for csa_wt.
The swap method can be defined in terms of assignment. This requires three assignments, each of which, for a container type, is linear in the container's size. In a sense, then, a.swap(b) is redundant. This implementation guaranties a run-time complexity that is constant rather than linear.
csa | csa_wt to swap. |
Required for the Assignable Conecpt of the STL.