SDSL: Succinct Data Structure Library
A C++ template library for succinct data structures
 All Classes Namespaces Files Functions Variables Typedefs Friends
Public Types | Public Member Functions | Static Public Member Functions | Public Attributes | Static Public Attributes | Friends
sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth > Class Template Reference

A class for the Compressed Suffix Array (CSA) proposed by Sadakane for practical implementation. More...

#include <csa_sada.hpp>

List of all members.

Public Types

enum  { sa_sample_dens = SampleDens, isa_sample_dens = InvSampleDens }
typedef uint64_t value_type
typedef
random_access_const_iterator
< csa_sada
const_iterator
typedef const_iterator iterator
typedef const value_type const_reference
typedef const_reference reference
typedef const_reference * pointer
typedef const pointer const_pointer
typedef int_vector::size_type size_type
typedef size_type csa_size_type
typedef ptrdiff_t difference_type
typedef EncVector enc_vector_type
typedef psi_of_csa_psi< csa_sadapsi_type
typedef bwt_of_csa_psi< csa_sadabwt_type
typedef const unsigned char * pattern_type
typedef unsigned char char_type
typedef csa_sada_trait
< fixedIntWidth >
::int_vector_type 
sa_sample_type
typedef csa_sada_trait
< fixedIntWidth >
::int_vector_type 
isa_sample_type
typedef csa_tag index_category

Public Member Functions

 csa_sada ()
 Default Constructor.
 ~csa_sada ()
 Default Destructor.
 csa_sada (const csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth > &csa)
 Copy constructor.
template<typename RandomAccessContainer >
 csa_sada (const RandomAccessContainer &sa, const unsigned char *str)
 Construct the csa_sada from another (compressed) suffix array and the original text.
template<typename RandomAccessContainer >
 csa_sada (RandomAccessContainer &sa, const unsigned char *str)
 Construct the csa_sada from another (compressed) suffix array and the original text.
template<class size_type_class , uint8_t int_width, class size_type_class_1 >
 csa_sada (int_vector_file_buffer< 8, size_type_class > &bwt_buf, int_vector_file_buffer< int_width, size_type_class_1 > &sa_buf)
 Construct the csa_sada from the int_vector_file_buffers of the text, the suffix array and the inverse suffix array.
 csa_sada (tMSS &file_map, const std::string &dir, const std::string &id)
void construct (tMSS &file_map, const std::string &dir, const std::string &id)
 csa_sada (const unsigned char *str)
 Constructor for the CSA taking a string for that the CSA should be calculated.
size_type size () const
 Number of elements in the $\CSA$.
bool empty () const
 Returns if the data strucutre is empty.
void swap (csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth > &csa)
 Swap method for csa_sada.
const_iterator begin () const
 Returns a const_iterator to the first element.
const_iterator end () const
 Returns a const_iterator to the element after the last element.
value_type operator[] (size_type i) const
 []-operator
value_type operator() (size_type i) const
 ()-operator return inverse suffix array values
csa_sadaoperator= (const csa_sada &csa)
 Assignment Operator.
bool operator== (const csa_sada &csa) const
 Equality Operator.
bool operator!= (const csa_sada &csa) const
 Unequality Operator.
size_type serialize (std::ostream &out, structure_tree_node *v=NULL, std::string name="") const
 Serialize to a stream.
void load (std::istream &in)
 Load from a stream.
uint32_t get_sample_dens () const
uint32_t get_psi_sample_dens () const
void set_psi_sample_dens (const uint32_t sample_dens)
size_type rank_bwt (size_type i, const unsigned char c) const
 Calculates how many symbols c are in the prefix [0..i-1] of the BWT of the original text.
size_type select_bwt (size_type i, const unsigned char c) const
 Calculates the ith occurence of symbol c in the BWT of the original text.

Static Public Member Functions

static size_type max_size ()
 Returns the largest size that csa_sada can ever have.

Public Attributes

const int_vector< 8 > & char2comp
const int_vector< 8 > & comp2char
const int_vector< 64 > & C
const uint16_t & sigma
const psi_typepsi
const bwt_typebwt
const sa_sample_typesa_sample
const isa_sample_typeisa_sample

Static Public Attributes

static const int linear_decode_limit = 100000

Friends

class psi_of_csa_psi< csa_sada >
class bwt_of_csa_psi< csa_sada >

Detailed Description

template<class EncVector = enc_vector<>, uint32_t SampleDens = 32, uint32_t InvSampleDens = 64, uint8_t fixedIntWidth = 0>
class sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >

A class for the Compressed Suffix Array (CSA) proposed by Sadakane for practical implementation.

The CSA is parameterized with an EncVector and the sample density SampleDens ( $s_{SA}$). I.e. every $s_{SA}th$ value from the original suffix array is explicitely stored with $\log n$ bits.

The EncVector (default is sdsl::enc_vector) holds the $\Psi$-function and can be parametrized with $s_{\Psi}$.

Todo:
example, code example
See also:
csa_sada_theo

Constructor & Destructor Documentation

template<class EncVector , uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth>
template<typename RandomAccessContainer >
sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::csa_sada ( const RandomAccessContainer &  sa,
const unsigned char *  str 
)

Construct the csa_sada from another (compressed) suffix array and the original text.

Parameters:
saThe existing (compressed) suffix array for the string str.
strThe text for which sa was build.
template<class EncVector , uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth>
template<typename RandomAccessContainer >
sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::csa_sada ( RandomAccessContainer &  sa,
const unsigned char *  str 
)

Construct the csa_sada from another (compressed) suffix array and the original text.

Parameters:
saThe existing (compressed) suffix array for the string str.
strThe text for which the suffix array sa was build.
template<class EncVector, uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth>
sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::csa_sada ( const unsigned char *  str)

Constructor for the CSA taking a string for that the CSA should be calculated.

Parameters:
strThe text for which the CSA should be constructed.

Member Function Documentation

template<class EncVector , uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth>
csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::const_iterator sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::begin ( ) const

Returns a const_iterator to the first element.

Required for the STL Container Concept.

See also:
end
template<class EncVector = enc_vector<>, uint32_t SampleDens = 32, uint32_t InvSampleDens = 64, uint8_t fixedIntWidth = 0>
bool sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::empty ( ) const [inline]

Returns if the data strucutre is empty.

Required for the Container Concept of the STL.A

See also:
size
template<class EncVector , uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth>
csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::const_iterator sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::end ( ) const

Returns a const_iterator to the element after the last element.

Required for the STL Container Concept.

See also:
begin.
template<class EncVector , uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth>
void sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::load ( std::istream &  in)

Load from a stream.

Parameters:
inInputstream to load the data structure from.
template<class EncVector = enc_vector<>, uint32_t SampleDens = 32, uint32_t InvSampleDens = 64, uint8_t fixedIntWidth = 0>
static size_type sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::max_size ( ) [inline, static]

Returns the largest size that csa_sada can ever have.

Required for the Container Concept of the STL.

See also:
size
template<class EncVector , uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth>
bool sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::operator!= ( const csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth > &  csa) const

Unequality Operator.

Two Instances of csa_sada are equal if not all their members are equal.

Required for the Equality Comparable Concept of the STL.
See also:
operator==
template<class EncVector , uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth>
csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::value_type sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::operator() ( size_type  i) const [inline]

()-operator return inverse suffix array values

Parameters:
iIndex of the value. $ i \in [0..size()-1]$.
Time complexity
$ \Order{s_{SA^{-1}}\cdot t_{\Psi}} $, where every $s_{SA^{-1}}$th suffix array entry is sampled and $t_{\Psi}$ is the access time for an element in the $\Psi$-function.
template<class EncVector , uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth>
csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth > & sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::operator= ( const csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth > &  csa)

Assignment Operator.

Required for the Assignable Concept of the STL.

template<class EncVector , uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth>
bool sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::operator== ( const csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth > &  csa) const

Equality Operator.

Two Instances of csa_sada are equal if all their members are equal.

Required for the Equality Comparable Concept of the STL.
See also:
operator!=
template<class EncVector , uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth>
csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::value_type sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::operator[] ( size_type  i) const [inline]

[]-operator

Parameters:
iIndex of the value. $ i \in [0..size()-1]$. Required for the STL Random Access Container Concept.
Time complexity
$ \Order{s_{SA}\cdot t_{\Psi}} $, where every $s_{SA}$th suffix array entry is sampled and $t_{\Psi}$ is the access time for an element in the $\Psi$-function.
template<class EncVector = enc_vector<>, uint32_t SampleDens = 32, uint32_t InvSampleDens = 64, uint8_t fixedIntWidth = 0>
size_type sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::rank_bwt ( size_type  i,
const unsigned char  c 
) const [inline]

Calculates how many symbols c are in the prefix [0..i-1] of the BWT of the original text.

Parameters:
iThe exclusive index of the prefix range [0..i-1], so $i\in [0..size()]$.
cThe symbol to count the occurences in the prefix.
Returns:
The number of occurences of symbol c in the prefix [0..i-1] of the BWT.
Time complexity
$ \Order{\log n t_{\Psi}} $
template<class EncVector = enc_vector<>, uint32_t SampleDens = 32, uint32_t InvSampleDens = 64, uint8_t fixedIntWidth = 0>
size_type sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::select_bwt ( size_type  i,
const unsigned char  c 
) const [inline]

Calculates the ith occurence of symbol c in the BWT of the original text.

"

Parameters:
iThe ith occurence. $i\in [1..rank(size(),c)]$.
cThe symbol c.
Returns:
The ith occurence symbol c in the the BWT or size() if no ith occurence of the symbol exists in the BWT.
Time complexity
$ \Order{t_{\Psi}} $
template<class EncVector , uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth>
csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::size_type sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::serialize ( std::ostream &  out,
structure_tree_node v = NULL,
std::string  name = "" 
) const

Serialize to a stream.

Parameters:
outOutstream to write the data structure.
Returns:
The number of written bytes.
template<class EncVector = enc_vector<>, uint32_t SampleDens = 32, uint32_t InvSampleDens = 64, uint8_t fixedIntWidth = 0>
size_type sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::size ( ) const [inline]

Number of elements in the $\CSA$.

Required for the Container Concept of the STL.

See also:
max_size, empty
Time complexity
$ \Order{1} $
template<class EncVector, uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth>
void sdsl::csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth >::swap ( csa_sada< EncVector, SampleDens, InvSampleDens, fixedIntWidth > &  csa)

Swap method for csa_sada.

The swap method can be defined in terms of assignment. This requires three assignments, each of which, for a container type, is linear in the container's size. In a sense, then, a.swap(b) is redundant. This implementation guaranties a run-time complexity that is constant rather than linear.

Parameters:
csacsa_sada to swap.

Required for the Assignable Conecpt of the STL.


The documentation for this class was generated from the following file: