SDSL: Succinct Data Structure Library
A C++ template library for succinct data structures
 All Classes Namespaces Files Functions Variables Typedefs Friends
Public Types | Public Member Functions | Static Public Member Functions | Public Attributes | Friends
sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType > Class Template Reference

A class for the Compressed Suffix Array (CSA) based on a Wavelet Tree (WT) of the Burrow Wheeler Transform of the orignal text. More...

#include <csa_wt.hpp>

List of all members.

Public Types

enum  { sa_sample_dens = SampleDens, isa_sample_dens = InvSampleDens }
typedef uint64_t value_type
typedef
random_access_const_iterator
< csa_wt
const_iterator
typedef const_iterator iterator
typedef const value_type const_reference
typedef const_reference reference
typedef const_reference * pointer
typedef const pointer const_pointer
typedef int_vector::size_type size_type
typedef size_type csa_size_type
typedef ptrdiff_t difference_type
typedef psi_of_csa_wt< csa_wtpsi_type
typedef bwt_of_csa_wt< csa_wtbwt_type
typedef WaveletTree wavelet_tree_type
typedef charType char_type
typedef const char_type * pattern_type
typedef csa_wt_trait
< fixedIntWidth >
::int_vector_type 
sa_sample_type
typedef csa_wt_trait
< fixedIntWidth >
::int_vector_type 
isa_sample_type
typedef csa_tag index_category

Public Member Functions

 csa_wt ()
 Default Constructor.
 ~csa_wt ()
 Default Destructor.
 csa_wt (const csa_wt &csa)
 Copy constructor.
template<typename RandomAccessContainer >
 csa_wt (const RandomAccessContainer &sa, const char_type *str)
 Construct the csa_wt from another compressed or uncompressed suffix array.
 csa_wt (const char_type *str)
 Constructor for the csa_wt taking a string for that the CSA should be calculated.
template<class size_type_class , uint8_t int_width, class size_type_class_1 >
 csa_wt (int_vector_file_buffer< 8, size_type_class > &bwt_buf, int_vector_file_buffer< int_width, size_type_class_1 > &sa_buf)
 Construct the csa_wt from the int_vector_file_buffers of the bwt of the text and the suffix array.
template<class size_type_class >
 csa_wt (int_vector_file_buffer< 8, size_type_class > &bwt_buf)
 Construct the bwt part of the csa_wt from the int_vector_file_buffers of the bwt of the text.
 csa_wt (tMSS &file_map, const std::string &dir, const std::string &id)
void construct (tMSS &file_map, const std::string &dir, const std::string &id)
size_type size () const
 Number of elements in the $\CSA$.
bool empty () const
 Returns if the data strucutre is empty.
void swap (csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType > &csa)
 Swap method for csa_wt.
const_iterator begin () const
 Returns a const_iterator to the first element.
const_iterator end () const
 Returns a const_iterator to the element after the last element.
value_type operator[] (size_type i) const
 []-operator
value_type operator() (size_type i) const
 ()-operator return inverse suffix array values
csa_wtoperator= (const csa_wt &csa)
 Assignment Operator.
bool operator== (const csa_wt &csa) const
 Equality Operator.
bool operator!= (const csa_wt &csa) const
 Unequality Operator.
size_type serialize (std::ostream &out, structure_tree_node *v=NULL, std::string name="") const
 Serialize to a stream.
void load (std::istream &in)
 Load from a stream.
uint32_t get_psi_sample_dens () const
void set_psi_sample_dens (const uint32_t sample_dens)
size_type rank_bwt (size_type i, const char_type c) const
 Calculates how many symbols c are in the prefix [0..i-1] of the BWT of the original text.
size_type select_bwt (size_type i, const char_type c) const
 Calculates the ith occurrence of symbol c in the BWT of the original text.

Static Public Member Functions

static size_type max_size ()
 Returns the largest size that csa_wt can ever have.

Public Attributes

const int_vector< 8 > & char2comp
const int_vector< 8 > & comp2char
const int_vector< 64 > & C
const uint16_t & sigma
const psi_typepsi
const bwt_typebwt
const sa_sample_typesa_sample
const isa_sample_typeisa_sample
const wavelet_tree_type & wavelet_tree

Friends

class psi_of_csa_wt< csa_wt >
class bwt_of_csa_wt< csa_wt >

Detailed Description

template<class WaveletTree, uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth, class charType>
class sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >

A class for the Compressed Suffix Array (CSA) based on a Wavelet Tree (WT) of the Burrow Wheeler Transform of the orignal text.

The CSA is parameterized with an WavletTree, the sample density SampleDens ( $s_{SA}$), and the sample density for inverse suffix array entries ( $s_{SA^{-1}}$). I.e. every $s_{SA}th$ value from the original suffix array is explicitely stored with $\log n$ bits.

Todo:
example, code example
See also:
sdsl::csa_sada, sdsl::csa_uncompressed

Member Function Documentation

template<class WaveletTree , uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth, class charType >
csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::const_iterator sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::begin ( ) const

Returns a const_iterator to the first element.

Required for the STL Container Concept.

See also:
end
template<class WaveletTree, uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth, class charType>
bool sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::empty ( ) const [inline]

Returns if the data strucutre is empty.

Required for the Container Concept of the STL.

See also:
size
template<class WaveletTree , uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth, class charType >
csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::const_iterator sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::end ( ) const

Returns a const_iterator to the element after the last element.

Required for the STL Container Concept.

See also:
begin.
template<class WaveletTree , uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth, class charType >
void sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::load ( std::istream &  in)

Load from a stream.

Parameters:
inInputstream to load the data structure from.
template<class WaveletTree, uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth, class charType>
static size_type sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::max_size ( ) [inline, static]

Returns the largest size that csa_wt can ever have.

Required for the Container Concept of the STL.

See also:
size
template<class WaveletTree, uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth, class charType>
bool sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::operator!= ( const csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType > &  csa) const

Unequality Operator.

Two Instances of csa_wt are equal if not all their members are equal.

Required for the Equality Comparable Concept of the STL.
See also:
operator==
template<class WaveletTree , uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth, class charType >
csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::value_type sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::operator() ( size_type  i) const [inline]

()-operator return inverse suffix array values

Parameters:
iIndex of the value. $ i \in [0..size()-1]$.
Time complexity
$ \Order{s_{SA^{-1}}\cdot t_{\Psi}} $, where every $s_{SA^{-1}}$th suffix array entry is sampled and $t_{\Psi}$ is the access time for an element in the $\Psi$-function.
template<class WaveletTree , uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth, class charType >
csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType > & sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::operator= ( const csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType > &  csa)

Assignment Operator.

Required for the Assignable Concept of the STL.

template<class WaveletTree, uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth, class charType>
bool sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::operator== ( const csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType > &  csa) const

Equality Operator.

Two Instances of csa_wt are equal if all their members are equal.

Required for the Equality Comparable Concept of the STL.
See also:
operator!=
template<class WaveletTree , uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth, class charType >
csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::value_type sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::operator[] ( size_type  i) const [inline]

[]-operator

Parameters:
iIndex of the value. $ i \in [0..size()-1]$. Required for the STL Random Access Container Concept.
Time complexity
$ \Order{s_{SA}\cdot t_{\Psi}} $, where every $s_{SA}$th suffix array entry is sampled and $t_{\Psi}$ is the access time for an element in the $\Psi$-function.
template<class WaveletTree, uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth, class charType>
size_type sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::rank_bwt ( size_type  i,
const char_type  c 
) const [inline]

Calculates how many symbols c are in the prefix [0..i-1] of the BWT of the original text.

Parameters:
iThe exclusive index of the prefix range [0..i-1], so $i\in [0..size()]$.
cThe symbol to count the occurences in the prefix.
Returns:
The number of occurences of symbol c in the prefix [0..i-1] of the BWT.
Time complexity
$ \Order{\log |\Sigma|} $
template<class WaveletTree, uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth, class charType>
size_type sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::select_bwt ( size_type  i,
const char_type  c 
) const [inline]

Calculates the ith occurrence of symbol c in the BWT of the original text.

Parameters:
iThe ith occurrence. $i\in [1..rank(size(),c)]$.
cThe symbol c.
Returns:
The ith occurrence symbol c in the the BWT or size() if no ith occurence of the symbol exists in the BWT.
Time complexity
$ \Order{t_{\Psi}} $
template<class WaveletTree , uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth, class charType >
csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::size_type sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::serialize ( std::ostream &  out,
structure_tree_node v = NULL,
std::string  name = "" 
) const

Serialize to a stream.

Parameters:
outOutstream to write the data structure.
Returns:
The number of written bytes.
template<class WaveletTree, uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth, class charType>
size_type sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::size ( ) const [inline]

Number of elements in the $\CSA$.

Required for the Container Concept of the STL.

See also:
max_size, empty
Time complexity
$ \Order{1} $
template<class WaveletTree, uint32_t SampleDens, uint32_t InvSampleDens, uint8_t fixedIntWidth, class charType>
void sdsl::csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType >::swap ( csa_wt< WaveletTree, SampleDens, InvSampleDens, fixedIntWidth, charType > &  csa)

Swap method for csa_wt.

The swap method can be defined in terms of assignment. This requires three assignments, each of which, for a container type, is linear in the container's size. In a sense, then, a.swap(b) is redundant. This implementation guaranties a run-time complexity that is constant rather than linear.

Parameters:
csacsa_wt to swap.

Required for the Assignable Conecpt of the STL.


The documentation for this class was generated from the following file: