PbiBuilder
#include <pbbam/PbiBuilder.h>
class PacBio::BAM::PbiBuilder
The PbiBuilder class constructs PBI index data from BAM record data.
Records are added one-by-one. This allows for either whole-file indexing of existing BAM files or for indexing “on-the-fly” alongside a BAM file as it is generated.
For simple PBI creation from existing BAM files, see PbiFile::CreateFrom. This is the recommended approach, unless finer control or additional processing is needed.
Constructors & Related Methods
PbiBuilder(const std::string &pbiFilename, const PbiBuilder::CompressionLevel compressionLevel = PbiBuilder::DefaultCompression, const size_t numThreads = 4)
Initializes builder to write data to pbiFilename.
- Parameters
  pbiFilename: output filename
  compressionLevel: zlib compression level
  numThreads: number of threads for compression. If set to 0, PbiBuilder will attempt to determine a reasonable estimate. If set to 1, this will force single-threaded execution. No checks are made against an upper limit.
- Exceptions
  std::runtime_error: if PBI file cannot be opened for writing
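The numThreads rule described for this constructor (0 requests an estimate, 1 forces single-threaded execution, larger values are taken as given) can be sketched as a small standalone helper. ResolveThreadCount and its hardwareHint parameter are hypothetical names used only for illustration; they are not part of pbbam, and how pbbam actually derives its estimate is not stated here.

```cpp
#include <cstddef>

// Hypothetical helper, not part of pbbam: maps a requested thread count to
// the count that would actually be used.
//   requested == 0 -> fall back to an estimate (hardwareHint, but at least 1)
//   requested >= 1 -> used as-is; no upper limit is enforced
constexpr std::size_t ResolveThreadCount(std::size_t requested,
                                         std::size_t hardwareHint)
{
    return requested != 0 ? requested
                          : (hardwareHint != 0 ? hardwareHint : 1);
}
```

A caller could pass std::thread::hardware_concurrency() as the hint. That function may return 0 when the value cannot be determined, which is why the sketch falls back to 1.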
PbiBuilder(const std::string &pbiFilename, const size_t numReferenceSequences, const PbiBuilder::CompressionLevel compressionLevel = PbiBuilder::DefaultCompression, const size_t numThreads = 4)
Initializes builder to write data to pbiFilename.
Reference data-tracking structures will be initialized to expect numReferenceSequences. (This is useful so that any references that lack observed data can be marked appropriately.)
- Parameters
  pbiFilename: output filename
  numReferenceSequences: number of possible reference sequences, e.g. BamHeader::NumSequences
  compressionLevel: zlib compression level
  numThreads: number of threads for compression. If set to 0, PbiBuilder will attempt to determine a reasonable estimate. If set to 1, this will force single-threaded execution. No checks are made against an upper limit.
- Exceptions
  std::runtime_error: if PBI file cannot be opened for writing
PbiBuilder(const std::string &pbiFilename, const size_t numReferenceSequences, const bool isCoordinateSorted, const PbiBuilder::CompressionLevel compressionLevel = PbiBuilder::DefaultCompression, const size_t numThreads = 4)
Initializes builder to write data to pbiFilename.
Reference data-tracking structures will be initialized to expect numReferenceSequences, but only if isCoordinateSorted is true.
- Parameters
  pbiFilename: output filename
  numReferenceSequences: number of possible reference sequences, e.g. BamHeader::NumSequences
  isCoordinateSorted: if false, disables reference sequence tracking (BamHeader::SortOrder != “coordinate”)
  compressionLevel: zlib compression level
  numThreads: number of threads for compression. If set to 0, PbiBuilder will attempt to determine a reasonable estimate. If set to 1, this will force single-threaded execution. No checks are made against an upper limit.
- Exceptions
  std::runtime_error: if PBI file cannot be opened for writing
Index Building
void AddRecord(const BamRecord &record, const int64_t vOffset)
Adds record's data to underlying raw data structure.
To build a PBI index while generating a BAM file:

    BamWriter writer(...);
    PbiBuilder pbiBuilder(...);
    int64_t vOffset;
    BamRecord record;
    while (...) {

        // ... populate record data ...

        // write record to BAM and add PBI entry
        writer.Write(record, &vOffset);
        pbiBuilder.AddRecord(record, vOffset);
    }

- Note
  vOffset is a BGZF virtual offset into the BAM file. To get this value, you should use one of the following:
  - while reading existing BAM: BamReader::VirtualTell
  - while writing new BAM: BamWriter::Write(const BamRecord& record, int64_t* vOffset)
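As background on what a BGZF virtual offset encodes: the BGZF format packs the file offset of the compressed block into the upper 48 bits and the position within that block's uncompressed data into the lower 16 bits. The sketch below only illustrates that layout; the function names are hypothetical and are not part of pbbam, which hands you the finished value through the calls listed above.

```cpp
#include <cstdint>

// Hypothetical helpers (not pbbam API) illustrating the BGZF layout:
//   upper 48 bits: byte offset of the compressed block within the file
//   lower 16 bits: byte offset within that block's uncompressed data
constexpr std::int64_t MakeVirtualOffset(std::uint64_t blockOffset,
                                         std::uint16_t withinBlock)
{
    return static_cast<std::int64_t>((blockOffset << 16) | withinBlock);
}

// Extracts the compressed-block file offset from a virtual offset.
constexpr std::uint64_t BlockOffset(std::int64_t vOffset)
{
    return static_cast<std::uint64_t>(vOffset) >> 16;
}

// Extracts the position inside the uncompressed block.
constexpr std::uint16_t WithinBlockOffset(std::int64_t vOffset)
{
    return static_cast<std::uint16_t>(static_cast<std::uint64_t>(vOffset) & 0xFFFFu);
}
```

One practical consequence is that a virtual offset is not a plain byte position, so it should be passed to AddRecord exactly as reported rather than computed by adding record sizes.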
To build a PBI index from an existing BAM file:

    // To simply create a PBI file from BAM, the following is the easiest method:
    //
    #include <pbbam/BamFile.h>
    #include <pbbam/PbiFile.h>

    BamFile bamFile("data.bam");
    PbiFile::CreateFrom(bamFile);

    // However if you need to perform additional operations while reading the BAM file,
    // you can do something like the following (this variant also needs
    // <pbbam/BamReader.h> and <pbbam/PbiBuilder.h>):
    //
    {
        BamFile bamFile("data.bam");
        PbiBuilder builder(bamFile.PacBioIndexFilename(),
                           bamFile.Header().Sequences().size());
        BamReader reader(bamFile);
        BamRecord b;
        int64_t offset = reader.VirtualTell(); // first record's vOffset
        while (reader.GetNext(b)) {

            // store PBI record entry & get next record's vOffset
            builder.AddRecord(b, offset);
            offset = reader.VirtualTell();

            // ... additional stuff as needed ...
        }

    } // <-- PBI data will only be written here, as PbiBuilder goes out of scope

- Parameters
  record: input BamRecord to pull index data from
  vOffset: virtual offset into BAM file where record begins
void Close()
Writes data out to PBI file & closes builder.
- Note
  Any exceptions are thrown to the caller. If you do not need to handle file I/O exceptions, just let the builder go out of scope: the data will still be written, but any exceptions are swallowed (to avoid throwing from the destructor).
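The difference between calling Close explicitly and relying on the destructor can be illustrated with a stand-in class. SketchBuilder below is hypothetical and is not PbiBuilder; it only mimics the contract described in the note (an explicit close that throws, and a destructor that swallows), under the assumption that the final write is where a failure would surface.

```cpp
#include <stdexcept>

// Hypothetical stand-in, not part of pbbam. The failOnWrite flag simulates
// an I/O error during the final write.
class SketchBuilder
{
public:
    explicit SketchBuilder(bool failOnWrite) : failOnWrite_{failOnWrite} {}

    // Explicit close: errors propagate to the caller.
    void Close()
    {
        if (closed_) return;  // closing twice is a no-op
        closed_ = true;
        if (failOnWrite_) throw std::runtime_error{"could not write index"};
    }

    // Implicit close: errors are swallowed so the destructor never throws.
    ~SketchBuilder() noexcept
    {
        try {
            Close();
        } catch (...) {
        }
    }

private:
    bool failOnWrite_;
    bool closed_ = false;
};

// Returns true if Close() reported the simulated failure to the caller.
bool CloseReportsFailure()
{
    SketchBuilder builder{true};
    try {
        builder.Close();
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}

// Returns true once destruction has completed without an exception escaping.
bool DestructorSwallowsFailure()
{
    {
        SketchBuilder builder{true};
    }  // destructor runs here; the simulated failure is silently dropped
    return true;
}
```

If a missing or truncated index would be a problem downstream, calling Close inside a try block is the only way, under this contract, to learn that the write failed.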
Public Types
enum CompressionLevel
This enum allows you to control the compression level of the output PBI file.
Values are equivalent to zlib compression levels. See its documentation for more details: http://www.zlib.net/manual.html
Values:
- CompressionLevel_0 = 0
- CompressionLevel_1 = 1
- CompressionLevel_2 = 2
- CompressionLevel_3 = 3
- CompressionLevel_4 = 4
- CompressionLevel_5 = 5
- CompressionLevel_6 = 6
- CompressionLevel_7 = 7
- CompressionLevel_8 = 8
- CompressionLevel_9 = 9
- DefaultCompression = -1
- NoCompression = CompressionLevel_0
- FastCompression = CompressionLevel_1
- BestCompression = CompressionLevel_9
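The named values are aliases for numeric levels, while DefaultCompression (-1) is a sentinel that matches zlib's Z_DEFAULT_COMPRESSION and leaves the choice of level to zlib. The sketch below mirrors the enum locally so that these relationships can be checked at compile time. SketchCompressionLevel and IsExplicitLevel are hypothetical names for illustration; real code would use PacBio::BAM::PbiBuilder::CompressionLevel.

```cpp
// Local mirror of the documented values, for illustration only (not pbbam).
enum SketchCompressionLevel
{
    CompressionLevel_0 = 0,
    CompressionLevel_1 = 1,
    CompressionLevel_2 = 2,
    CompressionLevel_3 = 3,
    CompressionLevel_4 = 4,
    CompressionLevel_5 = 5,
    CompressionLevel_6 = 6,
    CompressionLevel_7 = 7,
    CompressionLevel_8 = 8,
    CompressionLevel_9 = 9,
    DefaultCompression = -1,
    NoCompression = CompressionLevel_0,
    FastCompression = CompressionLevel_1,
    BestCompression = CompressionLevel_9
};

// True if the level is an explicit zlib level (0..9) rather than the
// "let zlib decide" sentinel (-1).
constexpr bool IsExplicitLevel(SketchCompressionLevel level)
{
    return level >= CompressionLevel_0 && level <= CompressionLevel_9;
}
```

Because the aliases share values with the numbered levels, passing FastCompression and passing CompressionLevel_1 are interchangeable.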