PbiBuilder
#include <pbbam/PbiBuilder.h>
-
class PacBio::BAM::PbiBuilder
The PbiBuilder class constructs PBI index data from BAM record data.
Records are added one-by-one. This allows for either whole-file indexing of existing BAM files or for indexing “on-the-fly” alongside a BAM file as it is generated.
For simple PBI creation from existing BAM files, see PbiFile::CreateFrom. This is the recommended approach, unless finer control or additional processing is needed.
Constructors & Related Methods
-
PbiBuilder(const std::string &pbiFilename, const PbiBuilder::CompressionLevel compressionLevel = PbiBuilder::DefaultCompression, const size_t numThreads = 4)
Initializes builder to write data to pbiFilename.
- Parameters
pbiFilename: output filename
compressionLevel: zlib compression level
numThreads: number of threads for compression. If set to 0, PbiBuilder will attempt to determine a reasonable estimate. If set to 1, this will force single-threaded execution. No checks are made against an upper limit.
- Exceptions
std::runtime_error: if PBI file cannot be opened for writing
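The handling of numThreads described above can be pictured with a minimal sketch. ResolveThreadCount is a hypothetical helper and not part of pbbam, and using std::thread::hardware_concurrency for the 0 case is an assumption: the documentation only states that a reasonable estimate is attempted.

```cpp
#include <cstddef>
#include <thread>

// Hypothetical helper (not part of pbbam): turns a requested thread count
// into the count that would actually be used.
inline std::size_t ResolveThreadCount(const std::size_t requested)
{
    // Any non-zero request is taken as-is: 1 forces single-threaded
    // execution, and no upper limit is enforced.
    if (requested != 0) return requested;

    // Assumption: 0 means "estimate from the hardware". The standard allows
    // hardware_concurrency() to return 0 when the value is not computable,
    // so fall back to a single thread in that case.
    const unsigned int detected = std::thread::hardware_concurrency();
    return detected == 0 ? std::size_t{1} : std::size_t{detected};
}
```

Under these assumptions, ResolveThreadCount(1) yields 1, ResolveThreadCount(8) yields 8, and ResolveThreadCount(0) yields a machine-dependent value of at least 1.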
-
PbiBuilder(const std::string &pbiFilename, const size_t numReferenceSequences, const PbiBuilder::CompressionLevel compressionLevel = PbiBuilder::DefaultCompression, const size_t numThreads = 4)
Initializes builder to write data to pbiFilename.
Reference data-tracking structures will be initialized to expect numReferenceSequences. (This is useful so that we can mark any references that lack observed data appropriately).
- Parameters
pbiFilename: output filename
numReferenceSequences: number of possible reference sequences, e.g. BamHeader::NumSequences
compressionLevel: zlib compression level
numThreads: number of threads for compression. If set to 0, PbiBuilder will attempt to determine a reasonable estimate. If set to 1, this will force single-threaded execution. No checks are made against an upper limit.
- Exceptions
std::runtime_error: if PBI file cannot be opened for writing
-
PbiBuilder(const std::string &pbiFilename, const size_t numReferenceSequences, const bool isCoordinateSorted, const PbiBuilder::CompressionLevel compressionLevel = PbiBuilder::DefaultCompression, const size_t numThreads = 4)
Initializes builder to write data to pbiFilename.
Reference data-tracking structures will be initialized to expect numReferenceSequences, but only if isCoordinateSorted is true.
- Parameters
pbiFilename: output filename
numReferenceSequences: number of possible reference sequences, e.g. BamHeader::NumSequences
isCoordinateSorted: if false, disables reference sequence tracking (BamHeader::SortOrder != “coordinate”)
compressionLevel: zlib compression level
numThreads: number of threads for compression. If set to 0, PbiBuilder will attempt to determine a reasonable estimate. If set to 1, this will force single-threaded execution. No checks are made against an upper limit.
- Exceptions
std::runtime_error: if PBI file cannot be opened for writing
Index Building
-
void AddRecord(const BamRecord &record, const int64_t vOffset)
Adds record's data to underlying raw data structure.
To build a PBI index while generating a BAM file:
    BamWriter writer(...);
    PbiBuilder pbiBuilder(...);
    int64_t vOffset;
    BamRecord record;
    while (...) {

        // ... populate record data ...

        // write record to BAM and add PBI entry
        writer.Write(record, &vOffset);
        pbiBuilder.AddRecord(record, vOffset);
    }
- Note
vOffset is a BGZF virtual offset into the BAM file. To get this value, you should use one of the following:
- while reading existing BAM: BamReader::VirtualTell
- while writing new BAM: BamWriter::Write(const BamRecord& record, int64_t* vOffset)
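For orientation, a BGZF virtual offset as defined by the SAM/BAM specification packs two values into one 64-bit integer: the file offset of the compressed block in the upper 48 bits and the position within the uncompressed block in the lower 16 bits. The sketch below illustrates that layout only. VirtualOffsetParts, MakeVirtualOffset, and SplitVirtualOffset are hypothetical names and not part of pbbam; in practice the value should come from BamReader::VirtualTell or BamWriter::Write as noted above.

```cpp
#include <cassert>
#include <cstdint>

// The two fields packed into a BGZF virtual offset.
struct VirtualOffsetParts
{
    uint64_t blockStart;   // file offset of the compressed BGZF block
    uint16_t withinBlock;  // byte offset inside the uncompressed block
};

// Hypothetical helper: packs both fields into a single virtual offset.
inline int64_t MakeVirtualOffset(const uint64_t blockStart, const uint16_t withinBlock)
{
    assert(blockStart < (uint64_t{1} << 48));  // only 48 bits are available
    return static_cast<int64_t>((blockStart << 16) | withinBlock);
}

// Hypothetical helper: recovers both fields from a virtual offset.
inline VirtualOffsetParts SplitVirtualOffset(const int64_t vOffset)
{
    const auto raw = static_cast<uint64_t>(vOffset);
    return {raw >> 16, static_cast<uint16_t>(raw & 0xFFFF)};
}
```

Because of this packing, a virtual offset is not a plain byte position and should be treated as an opaque value when passed to AddRecord.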
To build a PBI index from an existing BAM file:
    // To simply create a PBI file from BAM, the following is the easiest method:
    //
    #include <pbbam/BamFile.h>
    #include <pbbam/PbiFile.h>

    BamFile bamFile("data.bam");
    PbiFile::CreateFrom(bamFile);

    // However if you need to perform additional operations while reading the BAM file,
    // you can do something like the following:
    //
    {
        BamFile bamFile("data.bam");
        PbiBuilder builder(bamFile.PacBioIndexFilename(),
                           bamFile.Header().Sequences().size());
        BamReader reader(bamFile);
        BamRecord b;
        int64_t offset = reader.VirtualTell(); // first record's vOffset
        while (reader.GetNext(b)) {

            // store PBI record entry & get next record's vOffset
            builder.AddRecord(b, offset);
            offset = reader.VirtualTell();

            // ... additional stuff as needed ...
        }

    } // <-- PBI data will only be written here, as PbiBuilder goes out of scope
- Parameters
record: input BamRecord to pull index data from
vOffset: virtual offset into BAM file where record begins
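Since vOffset must point to where the record begins, the offset has to be captured before the record is read, not after. The following sketch illustrates that ordering with an in-memory stand-in. MockReader and CollectStartOffsets are hypothetical and not part of pbbam, and Tell() returns a plain byte count here, which only plays the role of BamReader::VirtualTell.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical reader over in-memory "records" (not a pbbam class).
class MockReader
{
public:
    explicit MockReader(std::vector<std::string> records)
        : records_{std::move(records)}
    {}

    // Position of the next unread record.
    int64_t Tell() const { return position_; }

    // Reads the next record and advances the position past it.
    bool GetNext(std::string& out)
    {
        if (next_ == records_.size()) return false;
        out = records_[next_++];
        position_ += static_cast<int64_t>(out.size());
        return true;
    }

private:
    std::vector<std::string> records_;
    std::size_t next_ = 0;
    int64_t position_ = 0;
};

// Pairs each record with the offset observed *before* it was read.
inline std::vector<int64_t> CollectStartOffsets(MockReader& reader)
{
    std::vector<int64_t> offsets;
    std::string record;
    int64_t offset = reader.Tell();   // first record's start
    while (reader.GetNext(record)) {
        offsets.push_back(offset);    // start of the record just read
        offset = reader.Tell();       // start of the next record
    }
    return offsets;
}
```

Calling Tell() after GetNext() and using that value for the same record would instead store the start of the following record, shifting every index entry by one.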
-
void Close()
Writes data out to PBI file & closes builder.
- Note
Any exceptions are thrown to the caller. If you don't need to catch file I/O exceptions, just let the builder go out of scope: the data will still be written, but any exceptions will be swallowed (to avoid throwing from the destructor).
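The difference between calling Close() explicitly and relying on the destructor can be sketched with a small stand-in class. SketchWriter, CloseReportsError, and DestructorStaysSilent are hypothetical names used only to illustrate the pattern; this is not PbiBuilder's actual implementation.

```cpp
#include <stdexcept>

// Hypothetical stand-in for a builder that flushes its data on close.
class SketchWriter
{
public:
    explicit SketchWriter(const bool failOnFlush) : failOnFlush_{failOnFlush} {}

    // Explicit close: any error propagates to the caller.
    void Close()
    {
        if (closed_) return;
        closed_ = true;  // mark first so the destructor does not retry
        if (failOnFlush_) throw std::runtime_error{"could not write output"};
    }

    // Implicit close: errors are swallowed so the destructor never throws.
    ~SketchWriter() noexcept
    {
        try {
            Close();
        } catch (...) {
            // intentionally ignored
        }
    }

private:
    bool failOnFlush_;
    bool closed_ = false;
};

// Returns true if an explicit Close() reported an error to the caller.
inline bool CloseReportsError(const bool failOnFlush)
{
    SketchWriter writer{failOnFlush};
    try {
        writer.Close();
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}

// Lets a failing writer go out of scope; no exception escapes.
inline bool DestructorStaysSilent()
{
    {
        SketchWriter writer{true};
    }
    return true;
}
```

The design choice this illustrates: callers that need to know whether the index was written successfully should call Close() inside a try block, since a failure during destruction goes unreported.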
Public Types
-
enum CompressionLevel
This enum allows you to control the compression level of the output PBI file.
Values are equivalent to zlib compression levels. See its documentation for more details: http://www.zlib.net/manual.html
Values:
CompressionLevel_0 = 0
CompressionLevel_1 = 1
CompressionLevel_2 = 2
CompressionLevel_3 = 3
CompressionLevel_4 = 4
CompressionLevel_5 = 5
CompressionLevel_6 = 6
CompressionLevel_7 = 7
CompressionLevel_8 = 8
CompressionLevel_9 = 9
DefaultCompression = -1
NoCompression = CompressionLevel_0
FastCompression = CompressionLevel_1
BestCompression = CompressionLevel_9