|
SDSL: Succinct Data Structure Library
A C++ template library for succinct data structures
|
A class for the Compressed Suffix Array (CSA) proposed by Sadakane for practical implementation. More...
#include <csa_sada.hpp>
Public Types | |
| enum | { sa_sample_dens = SampleDens, isa_sample_dens = InvSampleDens } |
| typedef uint64_t | value_type |
|
typedef random_access_const_iterator < csa_sada > | const_iterator |
| typedef const_iterator | iterator |
| typedef const value_type | const_reference |
| typedef const_reference | reference |
| typedef const_reference * | pointer |
| typedef const pointer | const_pointer |
| typedef int_vector::size_type | size_type |
| typedef size_type | csa_size_type |
| typedef ptrdiff_t | difference_type |
| typedef EncVector | enc_vector_type |
| typedef psi_of_csa_psi< csa_sada > | psi_type |
| typedef bwt_of_csa_psi< csa_sada > | bwt_type |
| typedef const unsigned char * | pattern_type |
| typedef unsigned char | char_type |
|
typedef csa_sada_trait < fixedIntWidth > ::int_vector_type | sa_sample_type |
|
typedef csa_sada_trait < fixedIntWidth > ::int_vector_type | isa_sample_type |
| typedef csa_tag | index_category |
Public Member Functions | |
| csa_sada () | |
| Default Constructor. | |
| ~csa_sada () | |
| Default Destructor. | |
| csa_sada (const csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth > &csa) | |
| Copy constructor. | |
| template<typename RandomAccessContainer > | |
| csa_sada (const RandomAccessContainer &sa, const unsigned char *str) | |
| Construct the csa_sada from another (compressed) suffix array and the original text. | |
| template<typename RandomAccessContainer > | |
| csa_sada (RandomAccessContainer &sa, const unsigned char *str) | |
| Construct the csa_sada from another (compressed) suffix array and the original text. | |
| template<class size_type_class , uint8_t int_width, class size_type_class_1 > | |
| csa_sada (int_vector_file_buffer< 8, size_type_class > &bwt_buf, int_vector_file_buffer< int_width, size_type_class_1 > &sa_buf) | |
| Construct the csa_sada from the int_vector_file_buffers of the text, the suffix array and the inverse suffix array. | |
| csa_sada (tMSS &file_map, const std::string &dir, const std::string &id) | |
| void | construct (tMSS &file_map, const std::string &dir, const std::string &id) |
| csa_sada (const unsigned char *str) | |
| Constructor for the CSA taking a string for that the CSA should be calculated. | |
| size_type | size () const |
Number of elements in the . | |
| bool | empty () const |
| Returns if the data strucutre is empty. | |
| void | swap (csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth > &csa) |
| Swap method for csa_sada. | |
| const_iterator | begin () const |
| Returns a const_iterator to the first element. | |
| const_iterator | end () const |
| Returns a const_iterator to the element after the last element. | |
| value_type | operator[] (size_type i) const |
| []-operator | |
| value_type | operator() (size_type i) const |
| ()-operator return inverse suffix array values | |
| csa_sada & | operator= (const csa_sada &csa) |
| Assignment Operator. | |
| bool | operator== (const csa_sada &csa) const |
| Equality Operator. | |
| bool | operator!= (const csa_sada &csa) const |
| Unequality Operator. | |
| size_type | serialize (std::ostream &out, structure_tree_node *v=NULL, std::string name="") const |
| Serialize to a stream. | |
| void | load (std::istream &in) |
| Load from a stream. | |
| uint32_t | get_sample_dens () const |
| uint32_t | get_psi_sample_dens () const |
| void | set_psi_sample_dens (const uint32_t sample_dens) |
| size_type | rank_bwt (size_type i, const unsigned char c) const |
| Calculates how many symbols c are in the prefix [0..i-1] of the BWT of the original text. | |
| size_type | select_bwt (size_type i, const unsigned char c) const |
| Calculates the ith occurence of symbol c in the BWT of the original text. | |
Static Public Member Functions | |
| static size_type | max_size () |
| Returns the largest size that csa_sada can ever have. | |
Public Attributes | |
| const int_vector< 8 > & | char2comp |
| const int_vector< 8 > & | comp2char |
| const int_vector< 64 > & | C |
| const uint16_t & | sigma |
| const psi_type & | psi |
| const bwt_type & | bwt |
| const sa_sample_type & | sa_sample |
| const isa_sample_type & | isa_sample |
Static Public Attributes | |
| static const int | linear_decode_limit = 100000 |
Friends | |
| class | psi_of_csa_psi< csa_sada > |
| class | bwt_of_csa_psi< csa_sada > |
A class for the Compressed Suffix Array (CSA) proposed by Sadakane for practical implementation.
The CSA is parameterized with an EncVector and the sample density SampleDens (
). I.e. every
value from the original suffix array is explicitely stored with
bits.
The EncVector (default is sdsl::enc_vector) holds the
-function and can be parametrized with
.
| sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::csa_sada | ( | const RandomAccessContainer & | sa, |
| const unsigned char * | str | ||
| ) |
Construct the csa_sada from another (compressed) suffix array and the original text.
| sa | The existing (compressed) suffix array for the string str. |
| str | The text for which sa was build. |
| sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::csa_sada | ( | RandomAccessContainer & | sa, |
| const unsigned char * | str | ||
| ) |
Construct the csa_sada from another (compressed) suffix array and the original text.
| sa | The existing (compressed) suffix array for the string str. |
| str | The text for which the suffix array sa was build. |
| sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::csa_sada | ( | const unsigned char * | str | ) |
Constructor for the CSA taking a string for that the CSA should be calculated.
| str | The text for which the CSA should be constructed. |
| csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::const_iterator sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::begin | ( | ) | const |
| bool sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::empty | ( | ) | const [inline] |
Returns if the data strucutre is empty.
Required for the Container Concept of the STL.A
| csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::const_iterator sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::end | ( | ) | const |
Returns a const_iterator to the element after the last element.
Required for the STL Container Concept.
| void sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::load | ( | std::istream & | in | ) |
Load from a stream.
| in | Inputstream to load the data structure from. |
| static size_type sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::max_size | ( | ) | [inline, static] |
| bool sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::operator!= | ( | const csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth > & | csa | ) | const |
Unequality Operator.
Two Instances of csa_sada are equal if not all their members are equal.
| csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::value_type sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::operator() | ( | size_type | i | ) | const [inline] |
()-operator return inverse suffix array values
| i | Index of the value. . |
, where every
th suffix array entry is sampled and
is the access time for an element in the
-function. | csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth > & sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::operator= | ( | const csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth > & | csa | ) |
Assignment Operator.
Required for the Assignable Concept of the STL.
| bool sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::operator== | ( | const csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth > & | csa | ) | const |
Equality Operator.
Two Instances of csa_sada are equal if all their members are equal.
| csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::value_type sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::operator[] | ( | size_type | i | ) | const [inline] |
[]-operator
| i | Index of the value. . Required for the STL Random Access Container Concept. |
, where every
th suffix array entry is sampled and
is the access time for an element in the
-function. | size_type sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::rank_bwt | ( | size_type | i, |
| const unsigned char | c | ||
| ) | const [inline] |
Calculates how many symbols c are in the prefix [0..i-1] of the BWT of the original text.
| i | The exclusive index of the prefix range [0..i-1], so . |
| c | The symbol to count the occurences in the prefix. |
| size_type sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::select_bwt | ( | size_type | i, |
| const unsigned char | c | ||
| ) | const [inline] |
Calculates the ith occurence of symbol c in the BWT of the original text.
"
| i | The ith occurence. . |
| c | The symbol c. |
| csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::size_type sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::serialize | ( | std::ostream & | out, |
| structure_tree_node * | v = NULL, |
||
| std::string | name = "" |
||
| ) | const |
Serialize to a stream.
| out | Outstream to write the data structure. |
| size_type sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::size | ( | ) | const [inline] |
| void sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::swap | ( | csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth > & | csa | ) |
Swap method for csa_sada.
The swap method can be defined in terms of assignment. This requires three assignments, each of which, for a container type, is linear in the container's size. In a sense, then, a.swap(b) is redundant. This implementation guaranties a run-time complexity that is constant rather than linear.
| csa | csa_sada to swap. |
Required for the Assignable Conecpt of the STL.
1.8.0