PbiBuilder

#include <pbbam/PbiBuilder.h>
class PacBio::BAM::PbiBuilder

The PbiBuilder class constructs PBI index data from BAM record data.

Records are added one-by-one. This allows for either whole-file indexing of existing BAM files or for indexing “on-the-fly” alongside a BAM file as it is generated.

For simple PBI creation from existing BAM files, see PbiFile::CreateFrom. This is the recommended approach, unless finer control or additional processing is needed.

Constructors & Related Methods

PbiBuilder(const std::string &pbiFilename, const PbiBuilder::CompressionLevel compressionLevel = PbiBuilder::DefaultCompression, const size_t numThreads = 4)

Initializes builder to write data to pbiFilename.

Parameters
  • pbiFilename: output filename
  • compressionLevel: zlib compression level
  • numThreads: number of threads for compression. If set to 0, PbiBuilder will attempt to determine a reasonable estimate. If set to 1, this will force single-threaded execution. No checks are made against an upper limit.
Exceptions
  • std::runtime_error: if PBI file cannot be opened for writing
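
The numThreads rule described above can be pictured with a small standalone sketch. ResolveThreadCount is a hypothetical helper, not part of pbbam, and falling back to std::thread::hardware_concurrency() is only an assumption about what a "reasonable estimate" might mean; the library's actual heuristic may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>

// Hypothetical helper (NOT part of pbbam): one possible reading of the
// numThreads rule. The hardware_concurrency() fallback is an assumption.
inline std::size_t ResolveThreadCount(std::size_t requested)
{
    // Any non-zero request is used as-is: 1 means single-threaded,
    // and no upper limit is enforced.
    if (requested != 0) return requested;

    // 0 means "choose for me". hardware_concurrency() may itself return 0
    // when the value cannot be determined, so fall back to one thread.
    const unsigned int detected = std::thread::hardware_concurrency();
    const std::size_t resolved = (detected == 0) ? 1 : detected;
    assert(resolved >= 1);
    return resolved;
}
```

Because no upper limit is checked, a caller that passes a large explicit value gets exactly that many threads; capping it is left to the caller.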

PbiBuilder(const std::string &pbiFilename, const size_t numReferenceSequences, const PbiBuilder::CompressionLevel compressionLevel = PbiBuilder::DefaultCompression, const size_t numThreads = 4)

Initializes builder to write data to pbiFilename.

Reference data-tracking structures will be initialized to expect numReferenceSequences. This allows any references that lack observed data to be marked appropriately.

Parameters
  • pbiFilename: output filename
  • numReferenceSequences: number of possible reference sequences, e.g. BamHeader::NumSequences
  • compressionLevel: zlib compression level
  • numThreads: number of threads for compression. If set to 0, PbiBuilder will attempt to determine a reasonable estimate. If set to 1, this will force single-threaded execution. No checks are made against an upper limit.
Exceptions
  • std::runtime_error: if PBI file cannot be opened for writing

PbiBuilder(const std::string &pbiFilename, const size_t numReferenceSequences, const bool isCoordinateSorted, const PbiBuilder::CompressionLevel compressionLevel = PbiBuilder::DefaultCompression, const size_t numThreads = 4)

Initializes builder to write data to pbiFilename.

Reference data-tracking structures will be initialized to expect numReferenceSequences, but only if isCoordinateSorted is true.

Parameters
  • pbiFilename: output filename
  • numReferenceSequences: number of possible reference sequences, e.g. BamHeader::NumSequences
  • isCoordinateSorted: if false, disables reference sequence tracking (BamHeader::SortOrder != “coordinate”)
  • compressionLevel: zlib compression level
  • numThreads: number of threads for compression. If set to 0, PbiBuilder will attempt to determine a reasonable estimate. If set to 1, this will force single-threaded execution. No checks are made against an upper limit.
Exceptions
  • std::runtime_error: if PBI file cannot be opened for writing

~PbiBuilder()

Destroys builder, writing its data out to PBI file.

Note
Exceptions are swallowed. Use Close() if you want to catch them.

Index Building

void AddRecord(const BamRecord &record, const int64_t vOffset)

Adds record's data to underlying raw data structure.

To build a PBI index while generating a BAM file:

BamWriter writer(...);
PbiBuilder pbiBuilder(...);
int64_t vOffset;
BamRecord record;
while (...) {

    // ... populate record data ...

    // write record to BAM and add PBI entry
    writer.Write(record, &vOffset);
    pbiBuilder.AddRecord(record, vOffset);
}
Note
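vOffset is a BGZF virtual offset into the BAM file. To get this value, use BamWriter::Write(record, &vOffset) when generating a BAM file, or BamReader::VirtualTell() before reading each record from an existing BAM file, as the two examples in this section show.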

To build a PBI index from an existing BAM file:

// To simply create a PBI file from BAM, the following is the easiest method:
//
#include <pbbam/BamFile.h>
#include <pbbam/PbiFile.h>

BamFile bamFile("data.bam");
PbiFile::CreateFrom(bamFile);


// However if you need to perform additional operations while reading the BAM file, 
// you can do something like the following:
//
{
    BamFile bamFile("data.bam");
    PbiBuilder builder(bamFile.PacBioIndexFilename(), 
                       bamFile.Header().Sequences().size());
    BamReader reader(bamFile);
    BamRecord b;
    int64_t offset = reader.VirtualTell(); // first record's vOffset
    while (reader.GetNext(b)) {

        // store PBI record entry & get next record's vOffset
        builder.AddRecord(b, offset);
        offset = reader.VirtualTell();
   
        // ... additional stuff as needed ...
    }

} // <-- PBI data will only be written here, as PbiBuilder goes out of scope

Parameters
  • record: input BamRecord to pull index data from
  • vOffset: virtual offset into BAM file where record begins
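
For orientation, a BGZF virtual offset (as defined in the SAM/BAM specification) packs two values into 64 bits: the upper 48 bits hold the byte offset of the compressed block's start in the file, and the lower 16 bits hold the position within that block's uncompressed data. The sketch below only illustrates that layout. BgzfPosition, MakeVirtualOffset and SplitVirtualOffset are hypothetical names, not pbbam API; in practice the value should come from BamWriter::Write or BamReader::VirtualTell rather than being computed by hand.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical type (NOT part of pbbam): the two halves of a virtual offset.
struct BgzfPosition
{
    std::uint64_t blockStart;   // byte offset of the BGZF block in the file
    std::uint16_t withinBlock;  // offset inside the block's uncompressed data
};

// Pack the two halves: block start in the upper 48 bits, in-block offset
// in the lower 16 bits.
inline std::int64_t MakeVirtualOffset(std::uint64_t blockStart,
                                      std::uint16_t withinBlock)
{
    assert(blockStart < (std::uint64_t{1} << 48));  // must fit in 48 bits
    return static_cast<std::int64_t>((blockStart << 16) | withinBlock);
}

// Unpack a virtual offset back into its two halves.
inline BgzfPosition SplitVirtualOffset(std::int64_t vOffset)
{
    const auto bits = static_cast<std::uint64_t>(vOffset);
    return BgzfPosition{bits >> 16, static_cast<std::uint16_t>(bits & 0xFFFF)};
}
```

One consequence of this layout is that a virtual offset is not a plain byte position, so a value obtained from an ordinary file-position call is not a valid vOffset.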

void Close()

Writes data out to PBI file & closes builder.

Note
Any exceptions are thrown to the caller. If you do not need to catch file I/O exceptions, just let the builder go out of scope: the data will still be written, but any exceptions will be swallowed (to avoid throwing from the destructor).
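
The difference between an explicit Close() and relying on the destructor can be illustrated with a standalone stand-in. SketchBuilder and CloseAndReport below are hypothetical and do not reflect pbbam's implementation; they only mimic the documented behavior (Close() propagates errors, the destructor swallows them). Making a second Close() a no-op is an assumption of this sketch.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a builder (NOT pbbam code).
class SketchBuilder
{
public:
    explicit SketchBuilder(bool failOnWrite) : failOnWrite_(failOnWrite) {}

    // Destructor still attempts the write, but never lets an exception escape.
    ~SketchBuilder() noexcept
    {
        try {
            Close();
        } catch (...) {
            // swallowed: throwing from a destructor would be unsafe
        }
        assert(closed_);
    }

    // Explicit close: errors are thrown to the caller.
    void Close()
    {
        if (closed_) return;  // assumption: repeated calls do nothing
        closed_ = true;
        if (failOnWrite_) throw std::runtime_error("could not write index");
    }

    bool IsClosed() const { return closed_; }

private:
    bool failOnWrite_;
    bool closed_ = false;
};

// Calls Close() explicitly and returns the error message, or an empty
// string if the write succeeded.
inline std::string CloseAndReport(SketchBuilder& builder)
{
    try {
        builder.Close();
        return std::string();
    } catch (const std::exception& e) {
        return e.what();
    }
}
```

With a real PbiBuilder the same shape applies: wrap the Close() call in a try block when a failed index write must be detected, since a failure during destruction goes unreported.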

Public Types

enum CompressionLevel

This enum allows you to control the compression level of the output PBI file.

Values are equivalent to zlib compression levels. See the zlib documentation for more details: http://www.zlib.net/manual.html

Values:

CompressionLevel_0 = 0
CompressionLevel_1 = 1
CompressionLevel_2 = 2
CompressionLevel_3 = 3
CompressionLevel_4 = 4
CompressionLevel_5 = 5
CompressionLevel_6 = 6
CompressionLevel_7 = 7
CompressionLevel_8 = 8
CompressionLevel_9 = 9
DefaultCompression = -1
NoCompression = CompressionLevel_0
FastCompression = CompressionLevel_1
BestCompression = CompressionLevel_9
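
When the level comes from user input, such as a command-line flag, it may help to validate the integer before converting it. The sketch below mirrors only the named values listed above in a local enum so that it compiles without pbbam. ToCompressionLevel is a hypothetical helper, not part of the library, and rejecting out-of-range input with an exception is a design assumption.

```cpp
#include <cassert>
#include <stdexcept>

// Local mirror of the named values listed above, so the sketch compiles
// without pbbam. With the library, use PbiBuilder::CompressionLevel.
enum CompressionLevel : int
{
    DefaultCompression = -1,
    NoCompression      = 0,
    FastCompression    = 1,
    BestCompression    = 9
};

// Hypothetical helper (NOT part of pbbam): accept -1 (default) through 9,
// reject anything else.
inline CompressionLevel ToCompressionLevel(int level)
{
    if (level < DefaultCompression || level > BestCompression)
        throw std::out_of_range("compression level must be in [-1, 9]");
    return static_cast<CompressionLevel>(level);
}

// Example use: an intermediate level such as 6 passes through unchanged.
inline void ToCompressionLevelExample()
{
    assert(static_cast<int>(ToCompressionLevel(6)) == 6);
}
```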