SDSL: Succinct Data Structure Library
A C++ template library for succinct data structures
|
A class for the Compressed Suffix Array (CSA) proposed by Sadakane for practical implementation. More...
#include <csa_sada.hpp>
Public Types | |
enum | { sa_sample_dens = SampleDens, isa_sample_dens = InvSampleDens } |
typedef uint64_t | value_type |
typedef random_access_const_iterator < csa_sada > | const_iterator |
typedef const_iterator | iterator |
typedef const value_type | const_reference |
typedef const_reference | reference |
typedef const_reference * | pointer |
typedef const pointer | const_pointer |
typedef int_vector::size_type | size_type |
typedef size_type | csa_size_type |
typedef ptrdiff_t | difference_type |
typedef EncVector | enc_vector_type |
typedef psi_of_csa_psi< csa_sada > | psi_type |
typedef bwt_of_csa_psi< csa_sada > | bwt_type |
typedef const unsigned char * | pattern_type |
typedef unsigned char | char_type |
typedef csa_sada_trait < fixedIntWidth > ::int_vector_type | sa_sample_type |
typedef csa_sada_trait < fixedIntWidth > ::int_vector_type | isa_sample_type |
typedef csa_tag | index_category |
Public Member Functions | |
csa_sada () | |
Default Constructor. | |
~csa_sada () | |
Default Destructor. | |
csa_sada (const csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth > &csa) | |
Copy constructor. | |
template<typename RandomAccessContainer > | |
csa_sada (const RandomAccessContainer &sa, const unsigned char *str) | |
Construct the csa_sada from another (compressed) suffix array and the original text. | |
template<typename RandomAccessContainer > | |
csa_sada (RandomAccessContainer &sa, const unsigned char *str) | |
Construct the csa_sada from another (compressed) suffix array and the original text. | |
template<class size_type_class , uint8_t int_width, class size_type_class_1 > | |
csa_sada (int_vector_file_buffer< 8, size_type_class > &bwt_buf, int_vector_file_buffer< int_width, size_type_class_1 > &sa_buf) | |
Construct the csa_sada from the int_vector_file_buffers of the text, the suffix array and the inverse suffix array. | |
csa_sada (tMSS &file_map, const std::string &dir, const std::string &id) | |
void | construct (tMSS &file_map, const std::string &dir, const std::string &id) |
csa_sada (const unsigned char *str) | |
Constructor for the CSA taking a string for that the CSA should be calculated. | |
size_type | size () const |
Number of elements in the . | |
bool | empty () const |
Returns if the data strucutre is empty. | |
void | swap (csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth > &csa) |
Swap method for csa_sada. | |
const_iterator | begin () const |
Returns a const_iterator to the first element. | |
const_iterator | end () const |
Returns a const_iterator to the element after the last element. | |
value_type | operator[] (size_type i) const |
[]-operator | |
value_type | operator() (size_type i) const |
()-operator return inverse suffix array values | |
csa_sada & | operator= (const csa_sada &csa) |
Assignment Operator. | |
bool | operator== (const csa_sada &csa) const |
Equality Operator. | |
bool | operator!= (const csa_sada &csa) const |
Unequality Operator. | |
size_type | serialize (std::ostream &out, structure_tree_node *v=NULL, std::string name="") const |
Serialize to a stream. | |
void | load (std::istream &in) |
Load from a stream. | |
uint32_t | get_sample_dens () const |
uint32_t | get_psi_sample_dens () const |
void | set_psi_sample_dens (const uint32_t sample_dens) |
size_type | rank_bwt (size_type i, const unsigned char c) const |
Calculates how many symbols c are in the prefix [0..i-1] of the BWT of the original text. | |
size_type | select_bwt (size_type i, const unsigned char c) const |
Calculates the ith occurence of symbol c in the BWT of the original text. | |
Static Public Member Functions | |
static size_type | max_size () |
Returns the largest size that csa_sada can ever have. | |
Public Attributes | |
const int_vector< 8 > & | char2comp |
const int_vector< 8 > & | comp2char |
const int_vector< 64 > & | C |
const uint16_t & | sigma |
const psi_type & | psi |
const bwt_type & | bwt |
const sa_sample_type & | sa_sample |
const isa_sample_type & | isa_sample |
Static Public Attributes | |
static const int | linear_decode_limit = 100000 |
Friends | |
class | psi_of_csa_psi< csa_sada > |
class | bwt_of_csa_psi< csa_sada > |
A class for the Compressed Suffix Array (CSA) proposed by Sadakane for practical implementation.
The CSA is parameterized with an EncVector and the sample density SampleDens ( ). I.e. every value from the original suffix array is explicitely stored with bits.
The EncVector (default is sdsl::enc_vector) holds the -function and can be parametrized with .
sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::csa_sada | ( | const RandomAccessContainer & | sa, |
const unsigned char * | str | ||
) |
Construct the csa_sada from another (compressed) suffix array and the original text.
sa | The existing (compressed) suffix array for the string str. |
str | The text for which sa was build. |
sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::csa_sada | ( | RandomAccessContainer & | sa, |
const unsigned char * | str | ||
) |
Construct the csa_sada from another (compressed) suffix array and the original text.
sa | The existing (compressed) suffix array for the string str. |
str | The text for which the suffix array sa was build. |
sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::csa_sada | ( | const unsigned char * | str | ) |
Constructor for the CSA taking a string for that the CSA should be calculated.
str | The text for which the CSA should be constructed. |
csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::const_iterator sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::begin | ( | ) | const |
bool sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::empty | ( | ) | const [inline] |
Returns if the data strucutre is empty.
Required for the Container Concept of the STL.A
csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::const_iterator sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::end | ( | ) | const |
Returns a const_iterator to the element after the last element.
Required for the STL Container Concept.
void sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::load | ( | std::istream & | in | ) |
Load from a stream.
in | Inputstream to load the data structure from. |
static size_type sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::max_size | ( | ) | [inline, static] |
bool sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::operator!= | ( | const csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth > & | csa | ) | const |
Unequality Operator.
Two Instances of csa_sada are equal if not all their members are equal.
csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::value_type sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::operator() | ( | size_type | i | ) | const [inline] |
()-operator return inverse suffix array values
i | Index of the value. . |
csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth > & sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::operator= | ( | const csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth > & | csa | ) |
Assignment Operator.
Required for the Assignable Concept of the STL.
bool sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::operator== | ( | const csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth > & | csa | ) | const |
Equality Operator.
Two Instances of csa_sada are equal if all their members are equal.
csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::value_type sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::operator[] | ( | size_type | i | ) | const [inline] |
[]-operator
i | Index of the value. . Required for the STL Random Access Container Concept. |
size_type sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::rank_bwt | ( | size_type | i, |
const unsigned char | c | ||
) | const [inline] |
Calculates how many symbols c are in the prefix [0..i-1] of the BWT of the original text.
i | The exclusive index of the prefix range [0..i-1], so . |
c | The symbol to count the occurences in the prefix. |
size_type sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::select_bwt | ( | size_type | i, |
const unsigned char | c | ||
) | const [inline] |
Calculates the ith occurence of symbol c in the BWT of the original text.
"
i | The ith occurence. . |
c | The symbol c. |
csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::size_type sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::serialize | ( | std::ostream & | out, |
structure_tree_node * | v = NULL , |
||
std::string | name = "" |
||
) | const |
Serialize to a stream.
out | Outstream to write the data structure. |
size_type sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::size | ( | ) | const [inline] |
void sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::swap | ( | csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth > & | csa | ) |
Swap method for csa_sada.
The swap method can be defined in terms of assignment. This requires three assignments, each of which, for a container type, is linear in the container's size. In a sense, then, a.swap(b) is redundant. This implementation guaranties a run-time complexity that is constant rather than linear.
csa | csa_sada to swap. |
Required for the Assignable Conecpt of the STL.