libBigWig
Data Structures | |
struct | bwZoomHdr_t |
BigWig files have multiple "zoom" levels, each of which has its own header. This holds those headers. More... | |
struct | bigWigHdr_t |
The header section of a bigWig file. More... | |
struct | chromList_t |
Holds the chromosomes and their lengths. More... | |
struct | bwWriteBuffer_t |
This is only needed for writing bigWig files (and won't be created otherwise). This should be removed from bigWig.h. More... | |
struct | bigWigFile_t |
A structure that holds everything needed to access a bigWig file. More... | |
struct | bwOverlappingIntervals_t |
Holds interval:value associations. More... | |
struct | bbOverlappingEntries_t |
Holds interval:str associations. More... | |
struct | bwOverlapIterator_t |
A structure to hold iterations. One of intervals and entries should be used to access records from bigWig or bigBed files, respectively. More... | |
Macros | |
#define | LIBBIGWIG_VERSION 0.4.5 |
#define | LIBBIGWIG_CURL 1 |
#define | BIGWIG_MAGIC 0x888FFC26 |
#define | BIGBED_MAGIC 0x8789F2EB |
#define | CIRTREE_MAGIC 0x78ca8c91 |
#define | IDX_MAGIC 0x2468ace0 |
#define | DEFAULT_nCHILDREN 64 |
#define | DEFAULT_BLOCKSIZE 32768 |
Enumerations | |
enum | bwStatsType { doesNotExist = -1, mean = 0, average = 0, stdev = 1, dev = 1, max = 2, min = 3, cov = 4, coverage = 4, sum = 5 } |
Functions | |
int | bwInit (size_t bufSize) |
Initializes curl and global variables. This MUST be called before other functions (at least if you want to connect to remote files). For remote files, curl must be initialized and regions of a file read into an internal buffer. If the buffer is too small then an excessive number of connections will be made. If the buffer is too large then more data than required is fetched. 128KiB is likely sufficient for most needs. More... | |
void | bwCleanup (void) |
The counterpart to bwInit, this cleans up curl. More... | |
int | bwIsBigWig (char *fname, CURLcode(*callBack)(CURL *)) |
Determine if a file is a bigWig file. This function will quickly check either local or remote files to determine if they appear to be valid bigWig files. This can be determined by reading the first 4 bytes of the file. More... | |
int | bbIsBigBed (char *fname, CURLcode(*callBack)(CURL *)) |
Determine if a file is a bigBed file. This function will quickly check either local or remote files to determine if they appear to be valid bigBed files. This can be determined by reading the first 4 bytes of the file. More... | |
bigWigFile_t * | bwOpen (char *fname, CURLcode(*callBack)(CURL *), const char *mode) |
Opens a local or remote bigWig file. This will open a local or remote bigWig file. Writing of local bigWig files is also supported. More... | |
bigWigFile_t * | bbOpen (char *fname, CURLcode(*callBack)(CURL *)) |
Opens a local or remote bigBed file. This will open a local or remote bigBed file. Note that this file format can only be read and NOT written! More... | |
char * | bbGetSQL (bigWigFile_t *fp) |
Returns a string containing the SQL entry (or NULL). The "auto SQL" field contains the names and value types of the entries in each bigBed entry. If you need to parse a particular value out of each entry, then you'll need to first parse this. More... | |
void | bwClose (bigWigFile_t *fp) |
Closes a bigWigFile_t and frees up allocated memory. This closes both bigWig and bigBed files. More... | |
uint32_t | bwGetTid (bigWigFile_t *fp, char *chrom) |
Converts between chromosome name and ID. More... | |
void | bwDestroyOverlappingIntervals (bwOverlappingIntervals_t *o) |
Frees space allocated by bwGetOverlappingIntervals More... | |
void | bbDestroyOverlappingEntries (bbOverlappingEntries_t *o) |
Frees space allocated by bbGetOverlappingEntries More... | |
bwOverlappingIntervals_t * | bwGetOverlappingIntervals (bigWigFile_t *fp, char *chrom, uint32_t start, uint32_t end) |
Return bigWig entries overlapping an interval. Find all bigWig entries overlapping a range and returns them, including their associated values. More... | |
bbOverlappingEntries_t * | bbGetOverlappingEntries (bigWigFile_t *fp, char *chrom, uint32_t start, uint32_t end, int withString) |
Return bigBed entries overlapping an interval. Find all bigBed entries overlapping a range and returns them. More... | |
bwOverlapIterator_t * | bwOverlappingIntervalsIterator (bigWigFile_t *fp, char *chrom, uint32_t start, uint32_t end, uint32_t blocksPerIteration) |
Creates an iterator over intervals in a bigWig file. Iterators can be traversed with bwIteratorNext() and destroyed with bwIteratorDestroy(). Intervals are in the intervals member and data can be used to determine when to end iteration. More... | |
bwOverlapIterator_t * | bbOverlappingEntriesIterator (bigWigFile_t *fp, char *chrom, uint32_t start, uint32_t end, int withString, uint32_t blocksPerIteration) |
Creates an iterator over entries in a bigBed file. Iterators can be traversed with bwIteratorNext() and destroyed with bwIteratorDestroy(). Entries are in the entries member and data can be used to determine when to end iteration. More... | |
bwOverlapIterator_t * | bwIteratorNext (bwOverlapIterator_t *iter) |
Traverses to the entries/intervals in the next group of blocks. More... | |
void | bwIteratorDestroy (bwOverlapIterator_t *iter) |
Destroys a bwOverlapIterator_t. More... | |
bwOverlappingIntervals_t * | bwGetValues (bigWigFile_t *fp, char *chrom, uint32_t start, uint32_t end, int includeNA) |
Return all per-base bigWig values in a given interval. Given an interval (e.g., chr1:0-100), return the value at each position in a bigWig file. Positions without associated values are suppressed by default, but may be returned if includeNA is not 0. More... | |
double * | bwStats (bigWigFile_t *fp, char *chrom, uint32_t start, uint32_t end, uint32_t nBins, enum bwStatsType type) |
Determines per-interval bigWig statistics. Can determine mean/min/max/coverage/standard deviation of values in one or more intervals in a bigWig file. You can optionally give it an interval and ask for values from X number of sub-intervals. More... | |
double * | bwStatsFromFull (bigWigFile_t *fp, char *chrom, uint32_t start, uint32_t end, uint32_t nBins, enum bwStatsType type) |
Determines per-interval bigWig statistics. Can determine mean/min/max/coverage/standard deviation of values in one or more intervals in a bigWig file. You can optionally give it an interval and ask for values from X number of sub-intervals. The difference from bwStats is that zoom levels are never used. More... | |
int | bwCreateHdr (bigWigFile_t *fp, int32_t maxZooms) |
Create a largely empty bigWig header. Every bigWig file has a header; this creates the template for one. It also takes care of space allocation in the output write buffer. More... | |
chromList_t * | bwCreateChromList (char **chroms, uint32_t *lengths, int64_t n) |
Take a list of chromosome names and lengths and return a pointer to a chromList_t. This MUST be run before bwWriteHdr(). Note that the input is NOT free()d! More... | |
int | bwWriteHdr (bigWigFile_t *bw) |
Write the header to a bigWig file. You must have already opened the output file, created a header and a chromosome list. More... | |
int | bwAddIntervals (bigWigFile_t *fp, char **chrom, uint32_t *start, uint32_t *end, float *values, uint32_t n) |
Write a new block of bedGraph-like intervals to a bigWig file. Adds entries of the form: chromosome start end value to the file. These will always be added in a new block, so you may have previously used a different storage type. More... | |
int | bwAppendIntervals (bigWigFile_t *fp, uint32_t *start, uint32_t *end, float *values, uint32_t n) |
Append bedGraph-like intervals to a previous block of bedGraph-like intervals in a bigWig file. If you have previously used bwAddIntervals() then this will append additional entries into the previous block (or start a new one if needed). More... | |
int | bwAddIntervalSpans (bigWigFile_t *fp, char *chrom, uint32_t *start, uint32_t span, float *values, uint32_t n) |
Add a new block of variable-step entries to a bigWig file. Adds entries of the form chromosome start value to the file. Each block of such entries has an associated "span", so each value describes the region chromosome:start-(start+span). More... | |
int | bwAppendIntervalSpans (bigWigFile_t *fp, uint32_t *start, float *values, uint32_t n) |
Append to a previous block of variable-step entries. If you previously used bwAddIntervalSpans() , this will continue appending more values to the block(s) it created. More... | |
int | bwAddIntervalSpanSteps (bigWigFile_t *fp, char *chrom, uint32_t start, uint32_t span, uint32_t step, float *values, uint32_t n) |
Add a new block of fixed-step entries to a bigWig file. Adds entries of the form value to the file. Each block of such entries has an associated "span", "step", chromosome and start position. See the wiggle format for more details. More... | |
int | bwAppendIntervalSpanSteps (bigWigFile_t *fp, float *values, uint32_t n) |
Append to a previous block of fixed-step entries. If you previously used bwAddIntervalSpanSteps() , this will continue appending more values to the block(s) it created. More... | |
These are the functions and structures that should be used by external users. While I don't particularly recommend dealing with some of the structures (e.g., a bigWigHdr_t), they're described here in case you need them.
BTW, this library doesn't switch endianness as appropriate, since I kind of assume that there's only one type produced these days.
#define BIGBED_MAGIC 0x8789F2EB |
The magic number of a bigBed file.
#define BIGWIG_MAGIC 0x888FFC26 |
The magic number of a bigWig file.
#define CIRTREE_MAGIC 0x78ca8c91 |
The magic number of a "cirTree" block in a file.
#define DEFAULT_BLOCKSIZE 32768 |
The default decompression buffer size in bytes. This is used to determin
#define DEFAULT_nCHILDREN 64 |
The default number of children per block.
#define IDX_MAGIC 0x2468ace0 |
The magic number of an index block in a file.
#define LIBBIGWIG_CURL 1 |
If 1, then this library was compiled with remote file support.
#define LIBBIGWIG_VERSION 0.4.5 |
The library version number.
enum bwStatsType |
An enum that dictates the type of statistic to fetch for a given interval
void bbDestroyOverlappingEntries | ( | bbOverlappingEntries_t * | o | ) |
Frees space allocated by bbGetOverlappingEntries
o | A valid bbOverlappingEntries_t pointer. |
bbOverlappingEntries_t* bbGetOverlappingEntries | ( | bigWigFile_t * | fp, |
char * | chrom, | ||
uint32_t | start, | ||
uint32_t | end, | ||
int | withString | ||
) |
Return bigBed entries overlapping an interval. Find all bigBed entries overlapping a range and returns them.
fp | A valid bigWigFile_t pointer. This MUST be for a bigBed file! |
chrom | A valid chromosome name. |
start | The start position of the interval. This is 0-based half open, so 0 is the first base. |
end | The end position of the interval. Again, this is 0-based half open, so 100 will include the 100th base...which is at position 99. |
withString | If not 0, return the string associated with each entry in the output. If 0, there are no associated strings returned. This is useful if the only information needed are the locations of the entries, which require significantly less memory. |
Returns a bbOverlappingEntries_t * holding the intervals and (optionally) the associated string.
char* bbGetSQL | ( | bigWigFile_t * | fp | ) |
Returns a string containing the SQL entry (or NULL). The "auto SQL" field contains the names and value types of the entries in each bigBed entry. If you need to parse a particular value out of each entry, then you'll need to first parse this.
fp | The file pointer to a valid bigWigFile_t |
int bbIsBigBed | ( | char * | fname, |
CURLcode(*)(CURL *) | callBack | ||
) |
Determine if a file is a bigBed file. This function will quickly check either local or remote files to determine if they appear to be valid bigBed files. This can be determined by reading the first 4 bytes of the file.
fname | The file name or URL (http, https, and ftp are supported) |
callBack | An optional user-supplied function. This is applied to remote connections so users can specify things like proxy and password information. See test/testRemote for an example. |
bigWigFile_t* bbOpen | ( | char * | fname, |
CURLcode(*)(CURL *) | callBack | ||
) |
Opens a local or remote bigBed file. This will open a local or remote bigBed file. Note that this file format can only be read and NOT written!
fname | The file name or URL (http, https, and ftp are supported) |
callBack | An optional user-supplied function. This is applied to remote connections so users can specify things like proxy and password information. See test/testRemote for an example. |
bwOverlapIterator_t* bbOverlappingEntriesIterator | ( | bigWigFile_t * | fp, |
char * | chrom, | ||
uint32_t | start, | ||
uint32_t | end, | ||
int | withString, | ||
uint32_t | blocksPerIteration | ||
) |
Creates an iterator over entries in a bigBed file. Iterators can be traversed with bwIteratorNext() and destroyed with bwIteratorDestroy(). Entries are in the entries member and data can be used to determine when to end iteration.
fp | A valid bigWigFile_t pointer. This MUST be for a bigBed file! |
chrom | A valid chromosome name. |
start | The start position of the interval. This is 0-based half open, so 0 is the first base. |
end | The end position of the interval. Again, this is 0-based half open, so 100 will include the 100th base...which is at position 99. |
withString | Whether the returned entries should include their associated strings. |
blocksPerIteration | The number of blocks (internal groupings of entries in bigBed files) to return per iteration. |
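The start and end parameters throughout this page use 0-based half-open coordinates. As a minimal sketch of what that convention implies (the helper names below are hypothetical and not part of libBigWig), the length of an interval is simply end - start, and two intervals overlap only if each starts before the other ends:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of bases in the 0-based half-open
 * interval [start, end). For chr1:0-100 this is 100 bases, the last
 * of which sits at position 99. */
uint32_t interval_length(uint32_t start, uint32_t end) {
    assert(start <= end);
    return end - start;
}

/* Hypothetical helper: nonzero if [aStart, aEnd) and [bStart, bEnd)
 * share at least one base. Because the end is exclusive, intervals
 * that merely touch (aEnd == bStart) do not overlap. */
int intervals_overlap(uint32_t aStart, uint32_t aEnd,
                      uint32_t bStart, uint32_t bEnd) {
    return aStart < bEnd && bStart < aEnd;
}
```

For instance, a query of 0-100 would overlap an entry at 99-150 but not one at 100-150.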
int bwAddIntervals | ( | bigWigFile_t * | fp, |
char ** | chrom, | ||
uint32_t * | start, | ||
uint32_t * | end, | ||
float * | values, | ||
uint32_t | n | ||
) |
Write a new block of bedGraph-like intervals to a bigWig file. Adds entries of the form: chromosome start end value to the file. These will always be added in a new block, so you may have previously used a different storage type.
In general it's more efficient to use the bwAppend* functions, but then you MUST know that the previously written block is of the same type. In other words, you can only use bwAppendIntervals() after bwAddIntervals() or a previous bwAppendIntervals().
fp | The output file pointer. |
chrom | A list of chromosomes, of length n. |
start | A list of start positions of length n. |
end | A list of end positions of length n. |
values | A list of values of length n. |
n | The length of the aforementioned lists. |
int bwAddIntervalSpans | ( | bigWigFile_t * | fp, |
char * | chrom, | ||
uint32_t * | start, | ||
uint32_t | span, | ||
float * | values, | ||
uint32_t | n | ||
) |
Add a new block of variable-step entries to a bigWig file. Adds entries of the form chromosome start value to the file. Each block of such entries has an associated "span", so each value describes the region chromosome:start-(start+span).
This will always start a new block of values.
fp | The output file pointer. |
chrom | The chromosome that the entries describe. |
start | A list of start positions of length n. |
span | The span of each entry (they must all be the same). |
values | A list of values of length n. |
n | The length of the aforementioned lists. |
int bwAddIntervalSpanSteps | ( | bigWigFile_t * | fp, |
char * | chrom, | ||
uint32_t | start, | ||
uint32_t | span, | ||
uint32_t | step, | ||
float * | values, | ||
uint32_t | n | ||
) |
Add a new block of fixed-step entries to a bigWig file. Adds entries of the form value to the file. Each block of such entries has an associated "span", "step", chromosome and start position. See the wiggle format for more details.
This will always start a new block of values.
fp | The output file pointer. |
chrom | The chromosome that the entries describe. |
start | The starting position of the block of entries. |
span | The span of each entry (i.e., the number of bases it describes). |
step | The step between entry start positions. |
values | A list of values of length n. |
n | The length of the values list. |
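To make the relationship between start, span and step concrete, here is a small sketch of how the i-th value of a fixed-step block maps to a genomic interval under the wiggle format's usual interpretation. The type and function names are hypothetical and not part of libBigWig:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type, not part of libBigWig. */
typedef struct {
    uint32_t start; /* 0-based, inclusive */
    uint32_t end;   /* exclusive */
} hypo_interval_t;

/* Region described by the i-th value (0-based) of a fixed-step block:
 * consecutive entries begin `step` bases apart and each one covers
 * `span` bases. */
hypo_interval_t fixed_step_interval(uint32_t start, uint32_t span,
                                    uint32_t step, uint32_t i) {
    hypo_interval_t iv;
    assert(span > 0 && step > 0);
    iv.start = start + i * step;
    iv.end = iv.start + span;
    return iv;
}
```

With start = 100, span = 5 and step = 10, the third value (i = 2) would describe positions 120-125. The variable-step case is the same idea, except that each entry carries its own start instead of deriving it from a step.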
int bwAppendIntervals | ( | bigWigFile_t * | fp, |
uint32_t * | start, | ||
uint32_t * | end, | ||
float * | values, | ||
uint32_t | n | ||
) |
Append bedGraph-like intervals to a previous block of bedGraph-like intervals in a bigWig file. If you have previously used bwAddIntervals() then this will append additional entries into the previous block (or start a new one if needed).
fp | The output file pointer. |
start | A list of start positions of length n. |
end | A list of end positions of length n. |
values | A list of values of length n. |
n | The length of the aforementioned lists. |
Warning: Do NOT use this after bwAddIntervalSpans(), bwAppendIntervalSpans(), bwAddIntervalSpanSteps(), or bwAppendIntervalSpanSteps().
int bwAppendIntervalSpans | ( | bigWigFile_t * | fp, |
uint32_t * | start, | ||
float * | values, | ||
uint32_t | n | ||
) |
Append to a previous block of variable-step entries. If you previously used bwAddIntervalSpans(), this will continue appending more values to the block(s) it created.
fp | The output file pointer. |
start | A list of start positions of length n. |
values | A list of values of length n. |
n | The length of the aforementioned lists. |
Warning: Do NOT use this after bwAddIntervals(), bwAppendIntervals(), bwAddIntervalSpanSteps() or bwAppendIntervalSpanSteps().
int bwAppendIntervalSpanSteps | ( | bigWigFile_t * | fp, |
float * | values, | ||
uint32_t | n | ||
) |
Append to a previous block of fixed-step entries. If you previously used bwAddIntervalSpanSteps(), this will continue appending more values to the block(s) it created.
fp | The output file pointer. |
values | A list of values of length n. |
n | The length of the values list. |
Warning: Do NOT use this after bwAddIntervals(), bwAppendIntervals(), bwAddIntervalSpans() or bwAppendIntervalSpans().
void bwCleanup | ( | void | ) |
The counterpart to bwInit, this cleans up curl.
void bwClose | ( | bigWigFile_t * | fp | ) |
Closes a bigWigFile_t and frees up allocated memory. This closes both bigWig and bigBed files.
fp | The file pointer. |
chromList_t* bwCreateChromList | ( | char ** | chroms, |
uint32_t * | lengths, | ||
int64_t | n | ||
) |
Take a list of chromosome names and lengths and return a pointer to a chromList_t. This MUST be run before bwWriteHdr(). Note that the input is NOT free()d!
chroms | A list of chromosomes. |
lengths | The length of each chromosome. |
n | The number of chromosomes (thus, the length of chroms and lengths). |
int bwCreateHdr | ( | bigWigFile_t * | fp, |
int32_t | maxZooms | ||
) |
Create a largely empty bigWig header. Every bigWig file has a header; this creates the template for one. It also takes care of space allocation in the output write buffer.
fp | The bigWigFile_t* that you want to write to. |
maxZooms | The maximum number of zoom levels. If you specify 0 then there will be no zoom levels. A value <0 or > 65535 will result in a maximum of 10. |
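The rule for maxZooms can be restated as a tiny function. This is only a sketch of the behaviour described above, not the library's implementation, and the names are hypothetical:

```c
#include <stdint.h>

/* Hypothetical constant: the fallback maximum named in the
 * description of maxZooms. */
#define HYPO_FALLBACK_ZOOMS 10

/* Maximum number of zoom levels that a given maxZooms argument would
 * yield according to the documented rule: 0 disables zoom levels and
 * out-of-range values fall back to 10. */
int32_t effective_max_zooms(int32_t maxZooms) {
    if (maxZooms < 0 || maxZooms > 65535) {
        return HYPO_FALLBACK_ZOOMS;
    }
    return maxZooms;
}
```

So passing -1 or 70000 would both end up with a maximum of 10 zoom levels, while 0 means none.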
void bwDestroyOverlappingIntervals | ( | bwOverlappingIntervals_t * | o | ) |
Frees space allocated by bwGetOverlappingIntervals
o | A valid bwOverlappingIntervals_t pointer. |
bwOverlappingIntervals_t* bwGetOverlappingIntervals | ( | bigWigFile_t * | fp, |
char * | chrom, | ||
uint32_t | start, | ||
uint32_t | end | ||
) |
Return bigWig entries overlapping an interval. Find all bigWig entries overlapping a range and returns them, including their associated values.
fp | A valid bigWigFile_t pointer. This MUST be for a bigWig file! |
chrom | A valid chromosome name. |
start | The start position of the interval. This is 0-based half open, so 0 is the first base. |
end | The end position of the interval. Again, this is 0-based half open, so 100 will include the 100th base...which is at position 99. |
Returns a bwOverlappingIntervals_t * holding the values and intervals.
uint32_t bwGetTid | ( | bigWigFile_t * | fp, |
char * | chrom | ||
) |
Converts between chromosome name and ID.
fp | A valid bigWigFile_t pointer |
chrom | A chromosome name |
bwOverlappingIntervals_t* bwGetValues | ( | bigWigFile_t * | fp, |
char * | chrom, | ||
uint32_t | start, | ||
uint32_t | end, | ||
int | includeNA | ||
) |
Return all per-base bigWig values in a given interval. Given an interval (e.g., chr1:0-100), return the value at each position in a bigWig file. Positions without associated values are suppressed by default, but may be returned if includeNA is not 0.
fp | A valid bigWigFile_t pointer. |
chrom | A valid chromosome name. |
start | The start position of the interval. This is 0-based half open, so 0 is the first base. |
end | The end position of the interval. Again, this is 0-based half open, so 100 will include the 100th base...which is at position 99. |
includeNA | If not 0, report NA values as well (as NA). |
Returns a bwOverlappingIntervals_t * holding the values and positions.
int bwInit | ( | size_t | bufSize | ) |
Initializes curl and global variables. This MUST be called before other functions (at least if you want to connect to remote files). For remote files, curl must be initialized and regions of a file read into an internal buffer. If the buffer is too small then an excessive number of connections will be made. If the buffer is too large then more data than required is fetched. 128KiB is likely sufficient for most needs.
bufSize | The internal buffer size used for remote connection. |
int bwIsBigWig | ( | char * | fname, |
CURLcode(*)(CURL *) | callBack | ||
) |
Determine if a file is a bigWig file. This function will quickly check either local or remote files to determine if they appear to be valid bigWig files. This can be determined by reading the first 4 bytes of the file.
fname | The file name or URL (http, https, and ftp are supported) |
callBack | An optional user-supplied function. This is applied to remote connections so users can specify things like proxy and password information. See test/testRemote for an example. |
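The "first 4 bytes" check is easy to picture for a local file. The sketch below is not the library's code: the names are hypothetical, the two constants are copied from the BIGWIG_MAGIC and BIGBED_MAGIC macros listed on this page, and it assumes the file was written in the reader's native byte order (in line with the note above that the library does not switch endianness).

```c
#include <stdint.h>
#include <stdio.h>

/* Values copied from the BIGWIG_MAGIC and BIGBED_MAGIC macros;
 * renamed here so the sketch does not clash with bigWig.h. */
#define HYPO_BIGWIG_MAGIC 0x888FFC26u
#define HYPO_BIGBED_MAGIC 0x8789F2EBu

/* Hypothetical result type, not part of libBigWig. */
typedef enum { HYPO_UNKNOWN = 0, HYPO_BIGWIG = 1, HYPO_BIGBED = 2 } hypo_kind_t;

/* Classify a 32-bit magic number. */
hypo_kind_t classify_magic(uint32_t magic) {
    if (magic == HYPO_BIGWIG_MAGIC) return HYPO_BIGWIG;
    if (magic == HYPO_BIGBED_MAGIC) return HYPO_BIGBED;
    return HYPO_UNKNOWN;
}

/* Read the first 4 bytes from an already opened stream, in native
 * byte order (an assumption of this sketch), and classify them. */
hypo_kind_t classify_stream(FILE *fp) {
    uint32_t magic = 0;
    if (fp == NULL) return HYPO_UNKNOWN;
    if (fread(&magic, sizeof(magic), 1, fp) != 1) return HYPO_UNKNOWN;
    return classify_magic(magic);
}
```

In practice bwIsBigWig() and bbIsBigBed() should be preferred, since they also handle remote files and the user-supplied callback.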
void bwIteratorDestroy | ( | bwOverlapIterator_t * | iter | ) |
Destroys a bwOverlapIterator_t.
iter | The bwOverlapIterator_t that should be destroyed |
bwOverlapIterator_t* bwIteratorNext | ( | bwOverlapIterator_t * | iter | ) |
Traverses to the entries/intervals in the next group of blocks.
iter | A bwOverlapIterator_t pointer that is updated (or destroyed on error) |
bigWigFile_t* bwOpen | ( | char * | fname, |
CURLcode(*)(CURL *) | callBack, | ||
const char * | mode | ||
) |
Opens a local or remote bigWig file. This will open a local or remote bigWig file. Writing of local bigWig files is also supported.
fname | The file name or URL (http, https, and ftp are supported) |
callBack | An optional user-supplied function. This is applied to remote connections so users can specify things like proxy and password information. See test/testRemote for an example. |
mode | The mode, by default "r". Both local and remote files can be read, but only local files can be written. For files being written the callback function is ignored. If and only if the mode contains "w" will the file be opened for writing (in all other cases the file will be opened for reading). |
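The "contains w" rule for mode can be written as a one-line test. The function below is a hypothetical illustration of that rule, not libBigWig code, and treating a NULL mode as the default "r" is an assumption of this sketch:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: nonzero if a mode string asks for writing,
 * i.e. if it contains the character 'w' anywhere. A NULL mode is
 * treated as the default "r" here, which is an assumption. */
int mode_requests_write(const char *mode) {
    if (mode == NULL) return 0;
    return strchr(mode, 'w') != NULL;
}
```

Under this rule "w" and "rw" would both open the file for writing, while "r" and any other string without a "w" would open it for reading.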
bwOverlapIterator_t* bwOverlappingIntervalsIterator | ( | bigWigFile_t * | fp, |
char * | chrom, | ||
uint32_t | start, | ||
uint32_t | end, | ||
uint32_t | blocksPerIteration | ||
) |
Creates an iterator over intervals in a bigWig file. Iterators can be traversed with bwIteratorNext() and destroyed with bwIteratorDestroy(). Intervals are in the intervals member and data can be used to determine when to end iteration.
fp | A valid bigWigFile_t pointer. This MUST be for a bigWig file! |
chrom | A valid chromosome name. |
start | The start position of the interval. This is 0-based half open, so 0 is the first base. |
end | The end position of the interval. Again, this is 0-based half open, so 100 will include the 100th base...which is at position 99. |
blocksPerIteration | The number of blocks (internal groupings of intervals in bigWig files) to return per iteration. |
double* bwStats | ( | bigWigFile_t * | fp, |
char * | chrom, | ||
uint32_t | start, | ||
uint32_t | end, | ||
uint32_t | nBins, | ||
enum bwStatsType | type | ||
) |
Determines per-interval bigWig statistics. Can determine mean/min/max/coverage/standard deviation of values in one or more intervals in a bigWig file. You can optionally give it an interval and ask for values from X number of sub-intervals.
fp | The file from which to extract statistics. |
chrom | A valid chromosome name. |
start | The start position of the interval. This is 0-based half open, so 0 is the first base. |
end | The end position of the interval. Again, this is 0-based half open, so 100 will include the 100th base...which is at position 99. |
nBins | The number of bins within the interval to calculate statistics for. |
type | The type of statistic. |
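To illustrate what nBins means, the following sketch splits a half-open interval into nBins roughly equal sub-intervals. It shows one plausible way to place the bin edges; the function name is hypothetical and the library's own rounding may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: position of the i-th bin edge when the
 * half-open interval [start, end) is split into nBins bins.
 * Bin i then covers [bin_edge(.., i), bin_edge(.., i + 1)).
 * 64-bit arithmetic avoids overflow in width * i. */
uint32_t bin_edge(uint32_t start, uint32_t end, uint32_t nBins, uint32_t i) {
    uint64_t width;
    assert(nBins > 0 && start <= end && i <= nBins);
    width = (uint64_t)(end - start);
    return start + (uint32_t)((width * i) / nBins);
}
```

With start = 0, end = 100 and nBins = 4 the bins would be 0-25, 25-50, 50-75 and 75-100, and one statistic would be reported for each of them. When the width is not divisible by nBins (e.g. 10-20 in 3 bins), the bins differ in size by at most one base under this scheme.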
double* bwStatsFromFull | ( | bigWigFile_t * | fp, |
char * | chrom, | ||
uint32_t | start, | ||
uint32_t | end, | ||
uint32_t | nBins, | ||
enum bwStatsType | type | ||
) |
Determines per-interval bigWig statistics. Can determine mean/min/max/coverage/standard deviation of values in one or more intervals in a bigWig file. You can optionally give it an interval and ask for values from X number of sub-intervals. The difference from bwStats is that zoom levels are never used.
fp | The file from which to extract statistics. |
chrom | A valid chromosome name. |
start | The start position of the interval. This is 0-based half open, so 0 is the first base. |
end | The end position of the interval. Again, this is 0-based half open, so 100 will include the 100th base...which is at position 99. |
nBins | The number of bins within the interval to calculate statistics for. |
type | The type of statistic. |
int bwWriteHdr | ( | bigWigFile_t * | bw | ) |
Write the header to a bigWig file. You must have already opened the output file, created a header and a chromosome list.
bw | The output bigWigFile_t pointer. |