BamHeader

#include <pbbam/BamHeader.h>
class PacBio::BAM::BamHeader

The BamHeader class represents the header section of the BAM file.

It provides metadata about the file including file version, reference sequences, read groups, comments, etc.

A BamHeader may be fetched from a BamFile to view an existing file’s metadata, or it may be created or edited for use when writing a new file (via BamWriter).

Note
A particular BamHeader is likely to be reused in many places throughout the library for read-only purposes. For this reason, even though a BamHeader may be returned by value, it is essentially a thin wrapper around a shared pointer to the actual data. Consequently, if you need to edit an existing BamHeader for use with a BamWriter, call BamHeader::DeepCopy first and edit the copy. Otherwise, any modifications will affect all BamHeader objects that share its underlying data.

Constructors & Related Methods

BamHeader()

Creates an empty BamHeader.

BamHeader(const std::string &samHeaderText)

Creates a BamHeader from SAM-formatted text.

Parameters
  • samHeaderText: SAM-formatted header text

BamHeader(const BamHeader &other)
BamHeader(BamHeader &&other)
BamHeader &operator=(const BamHeader &other)
BamHeader &operator=(BamHeader &&other)
~BamHeader()
BamHeader DeepCopy() const

Detaches the underlying data from the shared pointer, returning an independent copy of the header contents.

This ensures that any modifications to the newly returned BamHeader do not affect other BamHeader objects that were sharing its underlying data.
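The difference between an ordinary copy and a deep copy can be illustrated with a small stand-in class. This is only a sketch of the shared-data idea described above: ToyHeader, its Data struct, and DemoDeepCopy are hypothetical names invented for this example and are not part of pbbam, and the real BamHeader internals may differ.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Toy stand-in for illustration only; this is NOT pbbam's BamHeader.
class ToyHeader
{
private:
    struct Data
    {
        std::vector<std::string> comments;
    };
    std::shared_ptr<Data> d_;

public:
    ToyHeader() : d_{std::make_shared<Data>()} {}

    // Default copies duplicate the pointer, so both objects share one Data.
    ToyHeader(const ToyHeader&) = default;
    ToyHeader& operator=(const ToyHeader&) = default;

    // Returns a header that owns an independent copy of the data.
    ToyHeader DeepCopy() const
    {
        ToyHeader result;
        *result.d_ = *d_;
        return result;
    }

    ToyHeader& AddComment(const std::string& comment)
    {
        d_->comments.push_back(comment);
        return *this;
    }

    std::vector<std::string> Comments() const { return d_->comments; }
};

// Shows that edits through a plain copy are visible in the original,
// while edits through a deep copy are not.
inline void DemoDeepCopy()
{
    ToyHeader original;
    original.AddComment("first");

    ToyHeader shared = original;               // shares data with original
    ToyHeader detached = original.DeepCopy();  // owns its own data

    shared.AddComment("seen by original");
    detached.AddComment("not seen by original");

    assert(original.Comments().size() == 2);
    assert(detached.Comments().size() == 2);
    assert(original.Comments().back() == "seen by original");
}
```

In this sketch, the edit made through shared changes original as well, which is the situation DeepCopy is meant to avoid.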

Operators

BamHeader &operator+=(const BamHeader &other)

Merges another header with this one.

Headers must be compatible for merging. This means that their Version, SortOrder, PacBioBamVersion (and in the case of aligned BAM data, Sequences) must all match. If not, an exception will be thrown.

Return
reference to this header
Parameters
  • other: header to merge with this one
Exceptions
  • std::runtime_error: if the headers are not compatible

BamHeader operator+(const BamHeader &other) const

Creates a new, merged header.

Headers must be compatible for merging. This means that their Version, SortOrder, PacBioBamVersion (and in the case of aligned BAM data, Sequences) must all match. If not, an exception will be thrown.

Neither original header (this header or other) is modified.

Return
merged header
Parameters
  • other: header to merge with this one
Exceptions
  • std::runtime_error: if the headers are not compatible
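The compatibility rule shared by operator+= and operator+ can be sketched with a toy type. MergeableHeader and its fields are hypothetical stand-ins that mirror the documented attribute names. What the merge combines is an assumption here: this sketch simply concatenates comments, whereas the real library decides which entries are merged.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Toy stand-in; not pbbam code. Field names mirror the documented attributes.
struct MergeableHeader
{
    std::string version;
    std::string sortOrder;
    std::string pacBioBamVersion;
    std::vector<std::string> sequenceNames;  // relevant for aligned data
    std::vector<std::string> comments;       // assumed payload to merge

    // In-place merge: throws if the two headers are not compatible.
    MergeableHeader& operator+=(const MergeableHeader& other)
    {
        const bool compatible = version == other.version &&
                                sortOrder == other.sortOrder &&
                                pacBioBamVersion == other.pacBioBamVersion &&
                                sequenceNames == other.sequenceNames;
        if (!compatible) {
            throw std::runtime_error{"headers are not compatible for merging"};
        }
        comments.insert(comments.end(), other.comments.begin(), other.comments.end());
        return *this;
    }

    // Non-modifying merge: works on a copy, so neither operand changes.
    MergeableHeader operator+(const MergeableHeader& other) const
    {
        MergeableHeader result = *this;
        result += other;
        return result;
    }
};

inline void DemoMerge()
{
    MergeableHeader a{"1.5", "unknown", "3.0.1", {}, {"from a"}};
    MergeableHeader b{"1.5", "unknown", "3.0.1", {}, {"from b"}};

    const MergeableHeader merged = a + b;
    assert(merged.comments.size() == 2);
    assert(a.comments.size() == 1);  // operands are untouched
}
```

Implementing operator+ in terms of operator+= keeps the compatibility check in one place.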

General Attributes

std::string PacBioBamVersion() const

Return
the PacBio BAM version number (@HD:pb)
Note
This is different from the SAM/BAM version number
See
BamHeader::Version.

std::string SortOrder() const

Valid values: “unknown”, “unsorted”, “queryname”, or “coordinate”

Return
the sort order used (@HD:SO)

std::string Version() const

Return
the SAM/BAM version number (@HD:VN)
Note
This is different from the PacBio BAM version number
See
BamHeader::PacBioBamVersion

BamHeader &PacBioBamVersion(const std::string &version)

Sets this header’s PacBioBAM version number (@HD:pb).

Return
reference to this object
Exceptions
  • std::runtime_error: if version number cannot be parsed or is less than the minimum version allowed.
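A version check of the kind described above might look like the following sketch. ParseVersion and CheckVersion are hypothetical helpers, not pbbam functions. The "major.minor.revision" format is an assumption, and the minimum version is passed in as a parameter because the library's actual minimum is not stated here.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>
#include <tuple>

// Hypothetical helper: parses "major.minor.revision" (assumed format).
inline std::tuple<int, int, int> ParseVersion(const std::string& text)
{
    std::istringstream in{text};
    int major = 0;
    int minor = 0;
    int revision = 0;
    char dot1 = 0;
    char dot2 = 0;

    const bool parsed = static_cast<bool>(in >> major >> dot1 >> minor >> dot2 >> revision);
    const bool atEnd = parsed && in.peek() == std::char_traits<char>::eof();
    if (!atEnd || dot1 != '.' || dot2 != '.') {
        throw std::runtime_error{"cannot parse version: " + text};
    }
    return std::make_tuple(major, minor, revision);
}

// Hypothetical helper: throws if text is unparsable or older than minimum.
inline void CheckVersion(const std::string& text, const std::string& minimum)
{
    // std::tuple compares lexicographically: major, then minor, then revision.
    if (ParseVersion(text) < ParseVersion(minimum)) {
        throw std::runtime_error{"version " + text + " is older than minimum " + minimum};
    }
}

inline void DemoVersionCheck()
{
    CheckVersion("3.0.5", "3.0.1");  // accepted: no exception
    assert(ParseVersion("3.0.1") == std::make_tuple(3, 0, 1));
}
```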

BamHeader &SortOrder(const std::string &order)

Sets this header’s sort order label (@HD:SO).

Valid values: “unknown”, “unsorted”, “queryname”, or “coordinate”

Return
reference to this object

BamHeader &Version(const std::string &version)

Sets this header’s SAM/BAM version number (@HD:VN).

Return
reference to this object

Read Groups

bool HasReadGroup(const std::string &id) const

Return
true if the header contains a read group with id (@RG:ID)

ReadGroupInfo ReadGroup(const std::string &id) const

Return
a ReadGroupInfo object representing the read group matching id (@RG:ID)
Exceptions
  • std::runtime_error: if id is unknown

std::vector<std::string> ReadGroupIds() const

Return
vector of read group IDs listed in this header

std::vector<ReadGroupInfo> ReadGroups() const

Return
vector of ReadGroupInfo objects, representing all read groups listed in this header

BamHeader &AddReadGroup(const ReadGroupInfo &readGroup)

Appends a read group entry (@RG) to this header.

Return
reference to this object

BamHeader &ClearReadGroups()

Removes all read group entries from this header.

Return
reference to this object

BamHeader &ReadGroups(const std::vector<ReadGroupInfo> &readGroups)

Replaces this header’s list of read group entries with those in readGroups.

Return
reference to this object

Sequences

bool HasSequence(const std::string &name) const

Return
true if header contains a sequence with name (@SQ:SN)

size_t NumSequences() const

Return
number of sequences (@SQ entries) stored in this header

int32_t SequenceId(const std::string &name) const

This is the numeric ID used elsewhere throughout the API.

Return
numeric ID for sequence matching name (@SQ:SN)

See
BamReader::ReferenceId, PbiReferenceIdFilter, PbiRawMappedData::tId_
Exceptions
  • std::runtime_error: if name is unknown

std::string SequenceLength(const int32_t id) const

Return
the length of the sequence (@SQ:LN, e.g. chromosome length) at index id
See
SequenceInfo::Length, BamHeader::SequenceId

std::string SequenceName(const int32_t id) const

Return
the name of the sequence (@SQ:SN) at index id
See
SequenceInfo::Name, BamHeader::SequenceId

std::vector<std::string> SequenceNames() const

Position in the vector is equivalent to SequenceId.

Return
vector of sequence names (@SQ:SN) stored in this header
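The relationship between sequence names and numeric IDs can be sketched with a toy table in which an entry's ID is simply its position in the vector. ToySequenceTable, IdOf, and NameAt are hypothetical names for this illustration only. The exception types follow the ones documented in this section for name and index lookups, but the lookup strategy itself is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Toy stand-in for the name/ID relationship; not pbbam code.
class ToySequenceTable
{
public:
    // The new entry's ID is its position in the vector.
    void Add(const std::string& name) { names_.push_back(name); }

    // Looks up the numeric ID for a name; throws if the name is unknown.
    std::int32_t IdOf(const std::string& name) const
    {
        for (std::size_t i = 0; i < names_.size(); ++i) {
            if (names_[i] == name) return static_cast<std::int32_t>(i);
        }
        throw std::runtime_error{"unknown sequence name: " + name};
    }

    // Looks up the name for an ID; std::vector::at throws std::out_of_range.
    std::string NameAt(const std::int32_t id) const
    {
        return names_.at(static_cast<std::size_t>(id));
    }

    std::vector<std::string> Names() const { return names_; }

private:
    std::vector<std::string> names_;
};

inline void DemoSequenceIds()
{
    ToySequenceTable table;
    table.Add("chr1");
    table.Add("chr2");

    // Round trip: name -> ID -> name.
    assert(table.IdOf("chr2") == 1);
    assert(table.NameAt(table.IdOf("chr2")) == "chr2");
    assert(table.Names()[1] == "chr2");
}
```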

SequenceInfo Sequence(const int32_t id) const

Return
SequenceInfo object at index id
See
BamHeader::SequenceId
Exceptions
  • std::out_of_range: if id is an invalid or unknown index

SequenceInfo Sequence(const std::string &name) const

Return
SequenceInfo for the sequence matching name

std::vector<SequenceInfo> Sequences() const

Return
vector of SequenceInfo objects representing the sequences (@SQ entries) stored in this header

BamHeader &AddSequence(const SequenceInfo &sequence)

Appends a sequence entry (@SQ) to this header.

Return
reference to this object

BamHeader &ClearSequences()

Removes all sequence entries from this header.

Return
reference to this object

BamHeader &Sequences(const std::vector<SequenceInfo> &sequences)

Replaces this header’s list of sequence entries with those in sequences.

Return
reference to this object

Programs

bool HasProgram(const std::string &id) const

Return
true if this header contains a program entry with ID (@PG:ID) matching id

ProgramInfo Program(const std::string &id) const

Return
ProgramInfo object for the program entry matching id
Exceptions
  • std::runtime_error: if id is unknown

std::vector<std::string> ProgramIds() const

Return
vector of program IDs (@PG:ID)

std::vector<ProgramInfo> Programs() const

Return
vector of ProgramInfo objects representing program entries (@PG) stored in this header

BamHeader &AddProgram(const ProgramInfo &pg)

Appends a program entry (@PG) to this header.

Return
reference to this object

BamHeader &ClearPrograms()

Removes all program entries from this header.

Return
reference to this object

BamHeader &Programs(const std::vector<ProgramInfo> &programs)

Replaces this header’s list of program entries with those in programs.

Return
reference to this object

Comments

std::vector<std::string> Comments() const

Return
vector of comment (@CO) strings

BamHeader &AddComment(const std::string &comment)

Appends a comment (@CO) to this header.

Return
reference to this object

BamHeader &ClearComments()

Removes all comments from this header.

Return
reference to this object

BamHeader &Comments(const std::vector<std::string> &comments)

Replaces this header’s list of comments with those in comments.

Return
reference to this object

Conversion Methods

std::string ToSam() const

Return
SAM-header-formatted string representing this header’s data
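To show what SAM-header-formatted text looks like, the sketch below formats a few header fields by hand using the tags mentioned in this section (@HD with VN, SO, and pb; @SQ with SN and LN; @CO). ToySequence, ToyHeaderData, and ToSamText are hypothetical names and not part of pbbam. The exact tag order and the set of fields that the real ToSam emits may differ.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Toy stand-ins for illustration only; not pbbam types.
struct ToySequence
{
    std::string name;
    std::string length;
};

struct ToyHeaderData
{
    std::string version;
    std::string sortOrder;
    std::string pacBioBamVersion;
    std::vector<ToySequence> sequences;
    std::vector<std::string> comments;
};

// Writes one tab-separated line per header record, each ending in '\n'.
inline std::string ToSamText(const ToyHeaderData& header)
{
    std::ostringstream out;
    out << "@HD\tVN:" << header.version << "\tSO:" << header.sortOrder
        << "\tpb:" << header.pacBioBamVersion << '\n';
    for (const auto& sq : header.sequences) {
        out << "@SQ\tSN:" << sq.name << "\tLN:" << sq.length << '\n';
    }
    for (const auto& co : header.comments) {
        out << "@CO\t" << co << '\n';
    }
    return out.str();
}

inline void DemoToSamText()
{
    ToyHeaderData header;
    header.version = "1.5";
    header.sortOrder = "unknown";
    header.pacBioBamVersion = "3.0.1";
    header.sequences.push_back(ToySequence{"chr1", "1000"});

    const std::string text = ToSamText(header);
    assert(text == "@HD\tVN:1.5\tSO:unknown\tpb:3.0.1\n@SQ\tSN:chr1\tLN:1000\n");
}
```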